Vladimir Davydov <vdavydov.dev@gmail.com> <vdavydov@parallels.com>
Takashi YOSHII <takashi.yoshii.zj@renesas.com>
Will Deacon <will@kernel.org> <will.deacon@arm.com>
+Wolfram Sang <wsa@kernel.org> <wsa@the-dreams.de>
+Wolfram Sang <wsa@kernel.org> <w.sang@pengutronix.de>
Yakir Yang <kuankuan.y@gmail.com> <ykk@rock-chips.com>
Yusuke Goda <goda.yusuke@renesas.com>
Gustavo Padovan <gustavo@las.ic.unicamp.br>
Description: Dump debug registers from the HPRE.
Only available for PF.
-What: /sys/kernel/debug/hisi_hpre/<bdf>/qm/qm_regs
+What: /sys/kernel/debug/hisi_hpre/<bdf>/qm/regs
Date: Sep 2019
Contact: linux-crypto@vger.kernel.org
Description: Dump debug registers from the QM.
Date: Sep 2019
Contact: linux-crypto@vger.kernel.org
Description: One QM may contain multiple queues. Select specific queue to
- show its debug registers in above qm_regs.
+ show its debug registers in above regs.
Only available for PF.
What: /sys/kernel/debug/hisi_hpre/<bdf>/qm/clear_enable
Date: Sep 2019
Contact: linux-crypto@vger.kernel.org
-Description: QM debug registers(qm_regs) read clear control. 1 means enable
+Description: QM debug registers(regs) read clear control. 1 means enable
register read clear, otherwise 0.
Writing to this file has no functional effect, only enable or
disable counters clear after reading of these registers.
Only available for PF.
+
+What: /sys/kernel/debug/hisi_hpre/<bdf>/qm/err_irq
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the number of invalid interrupts for
+ QM task completion.
+ Available for both PF and VF, and take no other effect on HPRE.
+
+What: /sys/kernel/debug/hisi_hpre/<bdf>/qm/aeq_irq
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the number of QM async event queue interrupts.
+ Available for both PF and VF, and take no other effect on HPRE.
+
+What: /sys/kernel/debug/hisi_hpre/<bdf>/qm/abnormal_irq
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the number of interrupts for QM abnormal event.
+ Available for both PF and VF, and take no other effect on HPRE.
+
+What: /sys/kernel/debug/hisi_hpre/<bdf>/qm/create_qp_err
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the number of queue allocation errors.
+ Available for both PF and VF, and take no other effect on HPRE.
+
+What: /sys/kernel/debug/hisi_hpre/<bdf>/qm/mb_err
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the number of failed QM mailbox commands.
+ Available for both PF and VF, and take no other effect on HPRE.
+
+What: /sys/kernel/debug/hisi_hpre/<bdf>/qm/status
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the status of the QM.
+ Four states: initiated, started, stopped and closed.
+ Available for both PF and VF, and take no other effect on HPRE.
+
+What: /sys/kernel/debug/hisi_hpre/<bdf>/hpre_dfx/send_cnt
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the total number of sent requests.
+ Available for both PF and VF, and take no other effect on HPRE.
+
+What: /sys/kernel/debug/hisi_hpre/<bdf>/hpre_dfx/recv_cnt
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the total number of received requests.
+ Available for both PF and VF, and take no other effect on HPRE.
+
+What: /sys/kernel/debug/hisi_hpre/<bdf>/hpre_dfx/send_busy_cnt
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the total number of requests sent
+ with returning busy.
+ Available for both PF and VF, and take no other effect on HPRE.
+
+What: /sys/kernel/debug/hisi_hpre/<bdf>/hpre_dfx/send_fail_cnt
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the total number of completed but error requests.
+ Available for both PF and VF, and take no other effect on HPRE.
+
+What: /sys/kernel/debug/hisi_hpre/<bdf>/hpre_dfx/invalid_req_cnt
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the total number of invalid requests being received.
+ Available for both PF and VF, and take no other effect on HPRE.
+
+What: /sys/kernel/debug/hisi_hpre/<bdf>/hpre_dfx/overtime_thrhld
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Set the threshold time for counting the request which is
+ processed longer than the threshold.
+ 0: disable(default), 1: 1 microsecond.
+ Available for both PF and VF, and take no other effect on HPRE.
+
+What: /sys/kernel/debug/hisi_hpre/<bdf>/hpre_dfx/over_thrhld_cnt
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the total number of time out requests.
+ Available for both PF and VF, and take no other effect on HPRE.
-What: /sys/kernel/debug/hisi_sec/<bdf>/sec_dfx
-Date: Oct 2019
-Contact: linux-crypto@vger.kernel.org
-Description: Dump the debug registers of SEC cores.
- Only available for PF.
-
-What: /sys/kernel/debug/hisi_sec/<bdf>/clear_enable
+What: /sys/kernel/debug/hisi_sec2/<bdf>/clear_enable
Date: Oct 2019
Contact: linux-crypto@vger.kernel.org
Description: Enabling/disabling of clear action after reading
0: disable, 1: enable.
Only available for PF, and take no other effect on SEC.
-What: /sys/kernel/debug/hisi_sec/<bdf>/current_qm
+What: /sys/kernel/debug/hisi_sec2/<bdf>/current_qm
Date: Oct 2019
Contact: linux-crypto@vger.kernel.org
Description: One SEC controller has one PF and multiple VFs, each function
qm refers to.
Only available for PF.
-What: /sys/kernel/debug/hisi_sec/<bdf>/qm/qm_regs
+What: /sys/kernel/debug/hisi_sec2/<bdf>/qm/qm_regs
Date: Oct 2019
Contact: linux-crypto@vger.kernel.org
Description: Dump of QM related debug registers.
Available for PF and VF in host. VF in guest currently only
has one debug register.
-What: /sys/kernel/debug/hisi_sec/<bdf>/qm/current_q
+What: /sys/kernel/debug/hisi_sec2/<bdf>/qm/current_q
Date: Oct 2019
Contact: linux-crypto@vger.kernel.org
Description: One QM of SEC may contain multiple queues. Select specific
- queue to show its debug registers in above 'qm_regs'.
+ queue to show its debug registers in above 'regs'.
Only available for PF.
-What: /sys/kernel/debug/hisi_sec/<bdf>/qm/clear_enable
+What: /sys/kernel/debug/hisi_sec2/<bdf>/qm/clear_enable
Date: Oct 2019
Contact: linux-crypto@vger.kernel.org
Description: Enabling/disabling of clear action after reading
the SEC's QM debug registers.
0: disable, 1: enable.
Only available for PF, and take no other effect on SEC.
+
+What: /sys/kernel/debug/hisi_sec2/<bdf>/qm/err_irq
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the number of invalid interrupts for
+ QM task completion.
+ Available for both PF and VF, and take no other effect on SEC.
+
+What: /sys/kernel/debug/hisi_sec2/<bdf>/qm/aeq_irq
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the number of QM async event queue interrupts.
+ Available for both PF and VF, and take no other effect on SEC.
+
+What: /sys/kernel/debug/hisi_sec2/<bdf>/qm/abnormal_irq
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the number of interrupts for QM abnormal event.
+ Available for both PF and VF, and take no other effect on SEC.
+
+What: /sys/kernel/debug/hisi_sec2/<bdf>/qm/create_qp_err
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the number of queue allocation errors.
+ Available for both PF and VF, and take no other effect on SEC.
+
+What: /sys/kernel/debug/hisi_sec2/<bdf>/qm/mb_err
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the number of failed QM mailbox commands.
+ Available for both PF and VF, and take no other effect on SEC.
+
+What: /sys/kernel/debug/hisi_sec2/<bdf>/qm/status
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the status of the QM.
+ Four states: initiated, started, stopped and closed.
+ Available for both PF and VF, and take no other effect on SEC.
+
+What: /sys/kernel/debug/hisi_sec2/<bdf>/sec_dfx/send_cnt
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the total number of sent requests.
+ Available for both PF and VF, and take no other effect on SEC.
+
+What: /sys/kernel/debug/hisi_sec2/<bdf>/sec_dfx/recv_cnt
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the total number of received requests.
+ Available for both PF and VF, and take no other effect on SEC.
+
+What: /sys/kernel/debug/hisi_sec2/<bdf>/sec_dfx/send_busy_cnt
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the total number of requests sent with returning busy.
+ Available for both PF and VF, and take no other effect on SEC.
+
+What: /sys/kernel/debug/hisi_sec2/<bdf>/sec_dfx/err_bd_cnt
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the total number of BD type error requests
+ to be received.
+ Available for both PF and VF, and take no other effect on SEC.
+
+What: /sys/kernel/debug/hisi_sec2/<bdf>/sec_dfx/invalid_req_cnt
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the total number of invalid requests being received.
+ Available for both PF and VF, and take no other effect on SEC.
+
+What: /sys/kernel/debug/hisi_sec2/<bdf>/sec_dfx/done_flag_cnt
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the total number of completed but marked error requests
+ to be received.
+ Available for both PF and VF, and take no other effect on SEC.
has a QM. Select the QM which below qm refers to.
Only available for PF.
-What: /sys/kernel/debug/hisi_zip/<bdf>/qm/qm_regs
+What: /sys/kernel/debug/hisi_zip/<bdf>/qm/regs
Date: Nov 2018
Contact: linux-crypto@vger.kernel.org
Description: Dump of QM related debug registers.
Date: Nov 2018
Contact: linux-crypto@vger.kernel.org
Description: One QM may contain multiple queues. Select specific queue to
- show its debug registers in above qm_regs.
+ show its debug registers in above regs.
Only available for PF.
What: /sys/kernel/debug/hisi_zip/<bdf>/qm/clear_enable
Date: Nov 2018
Contact: linux-crypto@vger.kernel.org
-Description: QM debug registers(qm_regs) read clear control. 1 means enable
+Description: QM debug registers(regs) read clear control. 1 means enable
register read clear, otherwise 0.
Writing to this file has no functional effect, only enable or
disable counters clear after reading of these registers.
Only available for PF.
+
+What: /sys/kernel/debug/hisi_zip/<bdf>/qm/err_irq
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the number of invalid interrupts for
+ QM task completion.
+ Available for both PF and VF, and take no other effect on ZIP.
+
+What: /sys/kernel/debug/hisi_zip/<bdf>/qm/aeq_irq
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the number of QM async event queue interrupts.
+ Available for both PF and VF, and take no other effect on ZIP.
+
+What: /sys/kernel/debug/hisi_zip/<bdf>/qm/abnormal_irq
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the number of interrupts for QM abnormal event.
+ Available for both PF and VF, and take no other effect on ZIP.
+
+What: /sys/kernel/debug/hisi_zip/<bdf>/qm/create_qp_err
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the number of queue allocation errors.
+ Available for both PF and VF, and take no other effect on ZIP.
+
+What: /sys/kernel/debug/hisi_zip/<bdf>/qm/mb_err
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the number of failed QM mailbox commands.
+ Available for both PF and VF, and take no other effect on ZIP.
+
+What: /sys/kernel/debug/hisi_zip/<bdf>/qm/status
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the status of the QM.
+ Four states: initiated, started, stopped and closed.
+ Available for both PF and VF, and take no other effect on ZIP.
+
+What: /sys/kernel/debug/hisi_zip/<bdf>/zip_dfx/send_cnt
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the total number of sent requests.
+ Available for both PF and VF, and take no other effect on ZIP.
+
+What: /sys/kernel/debug/hisi_zip/<bdf>/zip_dfx/recv_cnt
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the total number of received requests.
+ Available for both PF and VF, and take no other effect on ZIP.
+
+What: /sys/kernel/debug/hisi_zip/<bdf>/zip_dfx/send_busy_cnt
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the total number of requests received
+ with returning busy.
+ Available for both PF and VF, and take no other effect on ZIP.
+
+What: /sys/kernel/debug/hisi_zip/<bdf>/zip_dfx/err_bd_cnt
+Date: Apr 2020
+Contact: linux-crypto@vger.kernel.org
+Description: Dump the total number of BD type error requests
+ to be received.
+ Available for both PF and VF, and take no other effect on ZIP.
seek after the last record available at the time
the last SYSLOG_ACTION_CLEAR was issued.
+ Due to the record nature of this interface with a "read all"
+ behavior and the specific positions each seek operation sets,
+ SEEK_CUR is not supported, returning -ESPIPE (invalid seek) to
+ errno whenever requested.
+
The output format consists of a prefix carrying the syslog
prefix including priority and facility, the 64 bit message
sequence number and the monotonic timestamp in microseconds,
Scheduler and RCU
~~~~~~~~~~~~~~~~~
-RCU depends on the scheduler, and the scheduler uses RCU to protect some
-of its data structures. The preemptible-RCU ``rcu_read_unlock()``
-implementation must therefore be written carefully to avoid deadlocks
-involving the scheduler's runqueue and priority-inheritance locks. In
-particular, ``rcu_read_unlock()`` must tolerate an interrupt where the
-interrupt handler invokes both ``rcu_read_lock()`` and
-``rcu_read_unlock()``. This possibility requires ``rcu_read_unlock()``
-to use negative nesting levels to avoid destructive recursion via
-interrupt handler's use of RCU.
-
-This scheduler-RCU requirement came as a `complete
-surprise <https://lwn.net/Articles/453002/>`__.
-
-As noted above, RCU makes use of kthreads, and it is necessary to avoid
-excessive CPU-time accumulation by these kthreads. This requirement was
-no surprise, but RCU's violation of it when running context-switch-heavy
-workloads when built with ``CONFIG_NO_HZ_FULL=y`` `did come as a
-surprise
+RCU makes use of kthreads, and it is necessary to avoid excessive CPU-time
+accumulation by these kthreads. This requirement was no surprise, but
+RCU's violation of it when running context-switch-heavy workloads when
+built with ``CONFIG_NO_HZ_FULL=y`` `did come as a surprise
[PDF] <http://www.rdrop.com/users/paulmck/scalability/paper/BareMetal.2015.01.15b.pdf>`__.
RCU has made good progress towards meeting this requirement, even for
context-switch-heavy ``CONFIG_NO_HZ_FULL=y`` workloads, but there is
room for further improvement.
-It is forbidden to hold any of scheduler's runqueue or
-priority-inheritance spinlocks across an ``rcu_read_unlock()`` unless
-interrupts have been disabled across the entire RCU read-side critical
-section, that is, up to and including the matching ``rcu_read_lock()``.
-Violating this restriction can result in deadlocks involving these
-scheduler spinlocks. There was hope that this restriction might be
-lifted when interrupt-disabled calls to ``rcu_read_unlock()`` started
-deferring the reporting of the resulting RCU-preempt quiescent state
-until the end of the corresponding interrupts-disabled region.
-Unfortunately, timely reporting of the corresponding quiescent state to
-expedited grace periods requires a call to ``raise_softirq()``, which
-can acquire these scheduler spinlocks. In addition, real-time systems
-using RCU priority boosting need this restriction to remain in effect
-because deferred quiescent-state reporting would also defer deboosting,
-which in turn would degrade real-time latencies.
-
-In theory, if a given RCU read-side critical section could be guaranteed
-to be less than one second in duration, holding a scheduler spinlock
-across that critical section's ``rcu_read_unlock()`` would require only
-that preemption be disabled across the entire RCU read-side critical
-section, not interrupts. Unfortunately, given the possibility of vCPU
-preemption, long-running interrupts, and so on, it is not possible in
-practice to guarantee that a given RCU read-side critical section will
-complete in less than one second. Therefore, as noted above, if
-scheduler spinlocks are held across a given call to
-``rcu_read_unlock()``, interrupts must be disabled across the entire RCU
-read-side critical section.
+There is no longer any prohibition against holding any of
+scheduler's runqueue or priority-inheritance spinlocks across an
+``rcu_read_unlock()``, even if interrupts and preemption were enabled
+somewhere within the corresponding RCU read-side critical section.
+Therefore, it is now perfectly legal to execute ``rcu_read_lock()``
+with preemption enabled, acquire one of the scheduler locks, and hold
+that lock across the matching ``rcu_read_unlock()``.
+
+Similarly, the RCU flavor consolidation has removed the need for negative
+nesting. The fact that interrupt-disabled regions of code act as RCU
+read-side critical sections implicitly avoids earlier issues that used
+to result in destructive recursion via interrupt handler's use of RCU.
Tracing and RCU
~~~~~~~~~~~~~~~
initrd= [BOOT] Specify the location of the initial ramdisk
+ initrdmem= [KNL] Specify a physical address and size from which to
+ load the initrd. If an initrd is compiled in or
+ specified in the bootparams, it takes priority over this
+ setting.
+ Format: ss[KMG],nn[KMG]
+ Default is 0, 0
+
init_on_alloc= [MM] Fill newly allocated pages and heap objects with
zeroes.
Format: 0 | 1
Duration of CPU stall (s) to test RCU CPU stall
warnings, zero to disable.
+ rcutorture.stall_cpu_block= [KNL]
+ Sleep while stalling if set. This will result
+ in warnings from preemptible RCU in addition
+ to any other stall-related activity.
+
rcutorture.stall_cpu_holdoff= [KNL]
Time to wait (s) after boot before inducing stall.
rcutorture.stall_cpu_irqsoff= [KNL]
Disable interrupts while stalling if set.
+ rcutorture.stall_gp_kthread= [KNL]
+ Duration (s) of forced sleep within RCU
+ grace-period kthread to test RCU CPU stall
+ warnings, zero to disable. If both stall_cpu
+ and stall_gp_kthread are specified, the
+ kthread is starved first, then the CPU.
+
rcutorture.stat_interval= [KNL]
Time (s) between statistics printk()s.
only normal grace-period primitives. No effect
on CONFIG_TINY_RCU kernels.
+ rcupdate.rcu_task_ipi_delay= [KNL]
+ Set time in jiffies during which RCU tasks will
+ avoid sending IPIs, starting with the beginning
+ of a given grace period. Setting a large
+ number avoids disturbing real-time workloads,
+ but lengthens grace periods.
+
rcupdate.rcu_task_stall_timeout= [KNL]
Set timeout in jiffies for RCU task stall warning
messages. Disable with a value less than or equal
.. _perf_security:
-Perf Events and tool security
+Perf events and tool security
=============================
Overview
Data that belong to the fourth category can potentially contain
sensitive process data. If PMUs in some monitoring modes capture values
of execution context registers or data from process memory then access
-to such monitoring capabilities requires to be ordered and secured
-properly. So, perf_events/Perf performance monitoring is the subject for
-security access control management [5]_ .
+to such monitoring modes requires to be ordered and secured properly.
+So, perf_events performance monitoring and observability operations are
+the subject for security access control management [5]_ .
-perf_events/Perf access control
+perf_events access control
-------------------------------
To perform security checks, the Linux implementation splits processes
independently enabled and disabled on per-thread basis for processes and
files of unprivileged users.
-Unprivileged processes with enabled CAP_SYS_ADMIN capability are treated
+Unprivileged processes with enabled CAP_PERFMON capability are treated
as privileged processes with respect to perf_events performance
-monitoring and bypass *scope* permissions checks in the kernel.
-
-Unprivileged processes using perf_events system call API is also subject
+monitoring and observability operations, thus, bypass *scope* permissions
+checks in the kernel. CAP_PERFMON implements the principle of least
+privilege [13]_ (POSIX 1003.1e: 2.2.2.39) for performance monitoring and
+observability operations in the kernel and provides a secure approach to
+perfomance monitoring and observability in the system.
+
+For backward compatibility reasons the access to perf_events monitoring and
+observability operations is also open for CAP_SYS_ADMIN privileged
+processes but CAP_SYS_ADMIN usage for secure monitoring and observability
+use cases is discouraged with respect to the CAP_PERFMON capability.
+If system audit records [14]_ for a process using perf_events system call
+API contain denial records of acquiring both CAP_PERFMON and CAP_SYS_ADMIN
+capabilities then providing the process with CAP_PERFMON capability singly
+is recommended as the preferred secure approach to resolve double access
+denial logging related to usage of performance monitoring and observability.
+
+Unprivileged processes using perf_events system call are also subject
for PTRACE_MODE_READ_REALCREDS ptrace access mode check [7]_ , whose
outcome determines whether monitoring is permitted. So unprivileged
processes provided with CAP_SYS_PTRACE capability are effectively
CAP_SYSLOG capability permits reading kernel space memory addresses from
/proc/kallsyms file.
-perf_events/Perf privileged users
+Privileged Perf users groups
---------------------------------
Mechanisms of capabilities, privileged capability-dumb files [6]_ and
-file system ACLs [10]_ can be used to create a dedicated group of
-perf_events/Perf privileged users who are permitted to execute
-performance monitoring without scope limits. The following steps can be
-taken to create such a group of privileged Perf users.
+file system ACLs [10]_ can be used to create dedicated groups of
+privileged Perf users who are permitted to execute performance monitoring
+and observability without scope limits. The following steps can be
+taken to create such groups of privileged Perf users.
1. Create perf_users group of privileged Perf users, assign perf_users
group to Perf tool executable and limit access to the executable for
-rwxr-x--- 2 root perf_users 11M Oct 19 15:12 perf
2. Assign the required capabilities to the Perf tool executable file and
- enable members of perf_users group with performance monitoring
+ enable members of perf_users group with monitoring and observability
privileges [6]_ :
::
- # setcap "cap_sys_admin,cap_sys_ptrace,cap_syslog=ep" perf
- # setcap -v "cap_sys_admin,cap_sys_ptrace,cap_syslog=ep" perf
+ # setcap "cap_perfmon,cap_sys_ptrace,cap_syslog=ep" perf
+ # setcap -v "cap_perfmon,cap_sys_ptrace,cap_syslog=ep" perf
perf: OK
# getcap perf
- perf = cap_sys_ptrace,cap_sys_admin,cap_syslog+ep
+ perf = cap_sys_ptrace,cap_syslog,cap_perfmon+ep
+
+If the libcap installed doesn't yet support "cap_perfmon", use "38" instead,
+i.e.:
+
+::
+
+ # setcap "38,cap_ipc_lock,cap_sys_ptrace,cap_syslog=ep" perf
+
+Note that you may need to have 'cap_ipc_lock' in the mix for tools such as
+'perf top', alternatively use 'perf top -m N', to reduce the memory that
+it uses for the perf ring buffer, see the memory allocation section below.
+
+Using a libcap without support for CAP_PERFMON will make cap_get_flag(caps, 38,
+CAP_EFFECTIVE, &val) fail, which will lead the default event to be 'cycles:u',
+so as a workaround explicitly ask for the 'cycles' event, i.e.:
+
+::
+
+ # perf top -e cycles
+
+To get kernel and user samples with a perf binary with just CAP_PERFMON.
As a result, members of perf_users group are capable of conducting
-performance monitoring by using functionality of the configured Perf
-tool executable that, when executes, passes perf_events subsystem scope
-checks.
+performance monitoring and observability by using functionality of the
+configured Perf tool executable that, when executes, passes perf_events
+subsystem scope checks.
This specific access control management is only available to superuser
or root running processes with CAP_SETPCAP, CAP_SETFCAP [6]_
capabilities.
-perf_events/Perf unprivileged users
+Unprivileged users
-----------------------------------
-perf_events/Perf *scope* and *access* control for unprivileged processes
+perf_events *scope* and *access* control for unprivileged processes
is governed by perf_event_paranoid [2]_ setting:
-1:
perf_event_mlock_kb locking limit is imposed but ignored for
unprivileged processes with CAP_IPC_LOCK capability.
-perf_events/Perf resource control
+Resource control
---------------------------------
Open file descriptors
.. [10] `<http://man7.org/linux/man-pages/man5/acl.5.html>`_
.. [11] `<http://man7.org/linux/man-pages/man2/getrlimit.2.html>`_
.. [12] `<http://man7.org/linux/man-pages/man5/limits.conf.5.html>`_
-
+.. [13] `<https://sites.google.com/site/fullycapable>`_
+.. [14] `<http://man7.org/linux/man-pages/man8/auditd.8.html>`_
--- /dev/null
+.. SPDX-License-Identifier: GPL-2.0
+
+pstore block oops/panic logger
+==============================
+
+Introduction
+------------
+
+pstore block (pstore/blk) is an oops/panic logger that writes its logs to a
+block device and non-block device before the system crashes. You can get
+these log files by mounting pstore filesystem like::
+
+ mount -t pstore pstore /sys/fs/pstore
+
+
+pstore block concepts
+---------------------
+
+pstore/blk provides efficient configuration method for pstore/blk, which
+divides all configurations into two parts, configurations for user and
+configurations for driver.
+
+Configurations for user determine how pstore/blk works, such as pmsg_size,
+kmsg_size and so on. All of them support both Kconfig and module parameters,
+but module parameters have priority over Kconfig.
+
+Configurations for driver are all about block device and non-block device,
+such as total_size of block device and read/write operations.
+
+Configurations for user
+-----------------------
+
+All of these configurations support both Kconfig and module parameters, but
+module parameters have priority over Kconfig.
+
+Here is an example for module parameters::
+
+ pstore_blk.blkdev=179:7 pstore_blk.kmsg_size=64
+
+The detail of each configurations may be of interest to you.
+
+blkdev
+~~~~~~
+
+The block device to use. Most of the time, it is a partition of block device.
+It's required for pstore/blk. It is also used for MTD device.
+
+It accepts the following variants for block device:
+
+1. <hex_major><hex_minor> device number in hexadecimal represents itself; no
+ leading 0x, for example b302.
+#. /dev/<disk_name> represents the device number of disk
+#. /dev/<disk_name><decimal> represents the device number of partition - device
+ number of disk plus the partition number
+#. /dev/<disk_name>p<decimal> - same as the above; this form is used when disk
+ name of partitioned disk ends with a digit.
+#. PARTUUID=00112233-4455-6677-8899-AABBCCDDEEFF represents the unique id of
+ a partition if the partition table provides it. The UUID may be either an
+ EFI/GPT UUID, or refer to an MSDOS partition using the format SSSSSSSS-PP,
+ where SSSSSSSS is a zero-filled hex representation of the 32-bit
+ "NT disk signature", and PP is a zero-filled hex representation of the
+ 1-based partition number.
+#. PARTUUID=<UUID>/PARTNROFF=<int> to select a partition in relation to a
+ partition with a known unique id.
+#. <major>:<minor> major and minor number of the device separated by a colon.
+
+It accepts the following variants for MTD device:
+
+1. <device name> MTD device name. "pstore" is recommended.
+#. <device number> MTD device number.
+
+kmsg_size
+~~~~~~~~~
+
+The chunk size in KB for oops/panic front-end. It **MUST** be a multiple of 4.
+It's optional if you do not care oops/panic log.
+
+There are multiple chunks for oops/panic front-end depending on the remaining
+space except other pstore front-ends.
+
+pstore/blk will log to oops/panic chunks one by one, and always overwrite the
+oldest chunk if there is no more free chunk.
+
+pmsg_size
+~~~~~~~~~
+
+The chunk size in KB for pmsg front-end. It **MUST** be a multiple of 4.
+It's optional if you do not care pmsg log.
+
+Unlike oops/panic front-end, there is only one chunk for pmsg front-end.
+
+Pmsg is a user space accessible pstore object. Writes to */dev/pmsg0* are
+appended to the chunk. On reboot the contents are available in
+*/sys/fs/pstore/pmsg-pstore-blk-0*.
+
+console_size
+~~~~~~~~~~~~
+
+The chunk size in KB for console front-end. It **MUST** be a multiple of 4.
+It's optional if you do not care console log.
+
+Similar to pmsg front-end, there is only one chunk for console front-end.
+
+All log of console will be appended to the chunk. On reboot the contents are
+available in */sys/fs/pstore/console-pstore-blk-0*.
+
+ftrace_size
+~~~~~~~~~~~
+
+The chunk size in KB for ftrace front-end. It **MUST** be a multiple of 4.
+It's optional if you do not care console log.
+
+Similar to oops front-end, there are multiple chunks for ftrace front-end
+depending on the count of cpu processors. Each chunk size is equal to
+ftrace_size / processors_count.
+
+All log of ftrace will be appended to the chunk. On reboot the contents are
+combined and available in */sys/fs/pstore/ftrace-pstore-blk-0*.
+
+Persistent function tracing might be useful for debugging software or hardware
+related hangs. Here is an example of usage::
+
+ # mount -t pstore pstore /sys/fs/pstore
+ # mount -t debugfs debugfs /sys/kernel/debug/
+ # echo 1 > /sys/kernel/debug/pstore/record_ftrace
+ # reboot -f
+ [...]
+ # mount -t pstore pstore /sys/fs/pstore
+ # tail /sys/fs/pstore/ftrace-pstore-blk-0
+ CPU:0 ts:5914676 c0063828 c0063b94 call_cpuidle <- cpu_startup_entry+0x1b8/0x1e0
+ CPU:0 ts:5914678 c039ecdc c006385c cpuidle_enter_state <- call_cpuidle+0x44/0x48
+ CPU:0 ts:5914680 c039e9a0 c039ecf0 cpuidle_enter_freeze <- cpuidle_enter_state+0x304/0x314
+ CPU:0 ts:5914681 c0063870 c039ea30 sched_idle_set_state <- cpuidle_enter_state+0x44/0x314
+ CPU:1 ts:5916720 c0160f59 c015ee04 kernfs_unmap_bin_file <- __kernfs_remove+0x140/0x204
+ CPU:1 ts:5916721 c05ca625 c015ee0c __mutex_lock_slowpath <- __kernfs_remove+0x148/0x204
+ CPU:1 ts:5916723 c05c813d c05ca630 yield_to <- __mutex_lock_slowpath+0x314/0x358
+ CPU:1 ts:5916724 c05ca2d1 c05ca638 __ww_mutex_lock <- __mutex_lock_slowpath+0x31c/0x358
+
+max_reason
+~~~~~~~~~~
+
+Limiting which kinds of kmsg dumps are stored can be controlled via
+the ``max_reason`` value, as defined in include/linux/kmsg_dump.h's
+``enum kmsg_dump_reason``. For example, to store both Oopses and Panics,
+``max_reason`` should be set to 2 (KMSG_DUMP_OOPS), to store only Panics
+``max_reason`` should be set to 1 (KMSG_DUMP_PANIC). Setting this to 0
+(KMSG_DUMP_UNDEF), means the reason filtering will be controlled by the
+``printk.always_kmsg_dump`` boot param: if unset, it'll be KMSG_DUMP_OOPS,
+otherwise KMSG_DUMP_MAX.
+
+Configurations for driver
+-------------------------
+
+Only a block device driver cares about these configurations. A block device
+driver uses ``register_pstore_blk`` to register to pstore/blk.
+
+.. kernel-doc:: fs/pstore/blk.c
+ :identifiers: register_pstore_blk
+
+A non-block device driver uses ``register_pstore_device`` with
+``struct pstore_device_info`` to register to pstore/blk.
+
+.. kernel-doc:: fs/pstore/blk.c
+ :identifiers: register_pstore_device
+
+.. kernel-doc:: include/linux/pstore_blk.h
+ :identifiers: pstore_device_info
+
+Compression and header
+----------------------
+
+Block device is large enough for uncompressed oops data. Actually we do not
+recommend data compression because pstore/blk will insert some information into
+the first line of oops/panic data. For example::
+
+ Panic: Total 16 times
+
+It means that it's OOPS|Panic for the 16th time since the first booting.
+Sometimes the number of occurrences of oops|panic since the first booting is
+important to judge whether the system is stable.
+
+The following line is inserted by pstore filesystem. For example::
+
+ Oops#2 Part1
+
+It means that it's OOPS for the 2nd time on the last boot.
+
+Reading the data
+----------------
+
+The dump data can be read from the pstore filesystem. The format for these
+files is ``dmesg-pstore-blk-[N]`` for oops/panic front-end,
+``pmsg-pstore-blk-0`` for pmsg front-end and so on. The timestamp of the
+dump file records the trigger time. To delete a stored record from block
+device, simply unlink the respective pstore file.
+
+Attentions in panic read/write APIs
+-----------------------------------
+
+If on panic, the kernel is not going to run for much longer, the tasks will not
+be scheduled and most kernel resources will be out of service. It
+looks like a single-threaded program running on a single-core computer.
+
+The following points require special attention for panic read/write APIs:
+
+1. Can **NOT** allocate any memory.
+ If you need memory, just allocate while the block driver is initializing
+ rather than waiting until the panic.
+#. Must be polled, **NOT** interrupt driven.
+ No task schedule any more. The block driver should delay to ensure the write
+ succeeds, but NOT sleep.
+#. Can **NOT** take any lock.
+ There is no other task, nor any shared resource; you are safe to break all
+ locks.
+#. Just use CPU to transfer.
+ Do not use DMA to transfer unless you are sure that DMA will not keep lock.
+#. Control registers directly.
+ Please control registers directly rather than use Linux kernel resources.
+ Do I/O map while initializing rather than wait until a panic occurs.
+#. Reset your block device and controller if necessary.
+ If you are not sure of the state of your block device and controller when
+ a panic occurs, you are safe to stop and reset them.
+
+pstore/blk supports psblk_blkdev_info(), which is defined in
+*linux/pstore_blk.h*, to get information of using block device, such as the
+device number, sector count and start sector of the whole disk.
+
+pstore block internals
+----------------------
+
+For developer reference, here are all the important structures and APIs:
+
+.. kernel-doc:: fs/pstore/zone.c
+ :internal:
+
+.. kernel-doc:: include/linux/pstore_zone.h
+ :internal:
+
+.. kernel-doc:: fs/pstore/blk.c
+ :export:
+
+.. kernel-doc:: include/linux/pstore_blk.h
+ :internal:
memory are implementation defined, and won't work on many ARMs such as omaps.
The memory area is divided into ``record_size`` chunks (also rounded down to
-power of two) and each oops/panic writes a ``record_size`` chunk of
+power of two) and each kmesg dump writes a ``record_size`` chunk of
information.
-Dumping both oopses and panics can be done by setting 1 in the ``dump_oops``
-variable while setting 0 in that variable dumps only the panics.
+Limiting which kinds of kmsg dumps are stored can be controlled via
+the ``max_reason`` value, as defined in include/linux/kmsg_dump.h's
+``enum kmsg_dump_reason``. For example, to store both Oopses and Panics,
+``max_reason`` should be set to 2 (KMSG_DUMP_OOPS), to store only Panics
+``max_reason`` should be set to 1 (KMSG_DUMP_PANIC). Setting this to 0
+(KMSG_DUMP_UNDEF), means the reason filtering will be controlled by the
+``printk.always_kmsg_dump`` boot param: if unset, it'll be KMSG_DUMP_OOPS,
+otherwise KMSG_DUMP_MAX.
The module uses a counter to record multiple dumps but the counter gets reset
on restart (i.e. new dumps after the restart will overwrite old ones).
.mem_address = <...>,
.mem_type = <...>,
.record_size = <...>,
- .dump_oops = <...>,
+ .max_reason = <...>,
.ecc = <...>,
};
===================
Controls use of the performance events system by unprivileged
-users (without CAP_SYS_ADMIN). The default value is 2.
+users (without CAP_PERFMON). The default value is 2.
+
+For backward compatibility reasons access to system performance
+monitoring and observability remains open for CAP_SYS_ADMIN
+privileged processes but CAP_SYS_ADMIN usage for secure system
+performance monitoring and observability operations is discouraged
+with respect to CAP_PERFMON use cases.
=== ==================================================================
-1 Allow use of (almost) all events by all users.
``CAP_IPC_LOCK``.
>=0 Disallow ftrace function tracepoint by users without
- ``CAP_SYS_ADMIN``.
+ ``CAP_PERFMON``.
- Disallow raw tracepoint access by users without ``CAP_SYS_ADMIN``.
+ Disallow raw tracepoint access by users without ``CAP_PERFMON``.
->=1 Disallow CPU event access by users without ``CAP_SYS_ADMIN``.
+>=1 Disallow CPU event access by users without ``CAP_PERFMON``.
->=2 Disallow kernel profiling by users without ``CAP_SYS_ADMIN``.
+>=2 Disallow kernel profiling by users without ``CAP_PERFMON``.
=== ==================================================================
consideration the effect of compiler optimisations which may occur
when tail-calls are used and marked with the noreturn GCC attribute.
+Probed Pointers from BPF / tracing
+----------------------------------
+
+::
+
+ %pks kernel string
+ %pus user string
+
+The ``k`` and ``u`` specifiers are used for printing prior probed memory from
+either kernel memory (k) or user memory (u). The subsequent ``s`` specifier
+results in printing a string. For direct use in regular vsnprintf() the (k)
+and (u) annotation is ignored, however, when used out of BPF's bpf_trace_printk(),
+for example, it reads the memory it is pointing to without faulting.
+
Kernel Pointers
---------------
%pfwf /ocp@68000000/i2c@48072000/camera@10/port/endpoint - Full name
%pfwP endpoint - Node name
-Time and date (struct rtc_time)
--------------------------------
+Time and date
+-------------
::
- %ptR YYYY-mm-ddTHH:MM:SS
- %ptRd YYYY-mm-dd
- %ptRt HH:MM:SS
- %ptR[dt][r]
+ %pt[RT] YYYY-mm-ddTHH:MM:SS
+ %pt[RT]d YYYY-mm-dd
+ %pt[RT]t HH:MM:SS
+ %pt[RT][dt][r]
-For printing date and time as represented by struct rtc_time structure in
-human readable format.
+For printing date and time as represented by
+ R struct rtc_time structure
+ T time64_t type
+in human readable format.
-By default year will be incremented by 1900 and month by 1. Use %ptRr (raw)
-to suppress this behaviour.
+By default year will be incremented by 1900 and month by 1.
+Use %pt[RT]r (raw) to suppress this behaviour.
Passed by reference.
======================
Memory Protection Keys for Userspace (PKU aka PKEYs) is a feature
-which is found on Intel's Skylake "Scalable Processor" Server CPUs.
-It will be avalable in future non-server parts.
+which is found on Intel's Skylake (and later) "Scalable Processor"
+Server CPUs. It will be available in future non-server Intel parts
+and future AMD processors.
For anyone wishing to test or use this feature, it is available in
Amazon's EC2 C5 instances and is known to work there using an Ubuntu
- compatible :
- "fsl,vf610-edma" for eDMA used similar to that on Vybrid vf610 SoC
- "fsl,imx7ulp-edma" for eDMA2 used similar to that on i.mx7ulp
- - "fsl,fsl,ls1028a-edma" for eDMA used similar to that on Vybrid vf610 SoC
+ - "fsl,ls1028a-edma" followed by "fsl,vf610-edma" for eDMA used on the
+ LS1028A SoC.
- reg : Specifies base physical address(s) and size of the eDMA registers.
The 1st region is eDMA control register's address and size.
The 2nd and the 3rd regions are programmable channel multiplexing
--- /dev/null
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+# Copyright (C) 2020 BAIKAL ELECTRONICS, JSC
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/hwmon/baikal,bt1-pvt.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Baikal-T1 PVT Sensor
+
+maintainers:
+ - Serge Semin <fancer.lancer@gmail.com>
+
+description: |
+ Baikal-T1 SoC provides an embedded process, voltage and temperature
+ sensor to monitor an internal SoC environment (chip temperature, supply
+ voltage and process monitor) and on time detect critical situations,
+ which may cause the system instability and even damages. The IP-block
+ is based on the Analog Bits PVT sensor, but is equipped with a dedicated
+ control wrapper, which provides a MMIO registers-based access to the
+ sensor core functionality (APB3-bus based) and exposes an additional
+ functions like thresholds/data ready interrupts, its status and masks,
+ measurements timeout. Its internal structure is depicted on the next
+ diagram:
+
+ Analog Bits core Bakal-T1 PVT control block
+ +--------------------+ +------------------------+
+ | Temperature sensor |-+ +------| Sensors control |
+ |--------------------| |<---En---| |------------------------|
+ | Voltage sensor |-|<--Mode--| +--->| Sampled data |
+ |--------------------| |<--Trim--+ | |------------------------|
+ | Low-Vt sensor |-| | +--| Thresholds comparator |
+ |--------------------| |---Data----| | |------------------------|
+ | High-Vt sensor |-| | +->| Interrupts status |
+ |--------------------| |--Valid--+-+ | |------------------------|
+ | Standard-Vt sensor |-+ +---+--| Interrupts mask |
+ +--------------------+ |------------------------|
+ ^ | Interrupts timeout |
+ | +------------------------+
+ | ^ ^
+ Rclk-----+----------------------------------------+ |
+ APB3-------------------------------------------------+
+
+ This bindings describes the external Baikal-T1 PVT control interfaces
+ like MMIO registers space, interrupt request number and clocks source.
+ These are then used by the corresponding hwmon device driver to
+ implement the sysfs files-based access to the sensors functionality.
+
+properties:
+ compatible:
+ const: baikal,bt1-pvt
+
+ reg:
+ maxItems: 1
+
+ interrupts:
+ maxItems: 1
+
+ clocks:
+ items:
+ - description: PVT reference clock
+ - description: APB3 interface clock
+
+ clock-names:
+ items:
+ - const: ref
+ - const: pclk
+
+ "#thermal-sensor-cells":
+ description: Baikal-T1 can be referenced as the CPU thermal-sensor
+ const: 0
+
+ baikal,pvt-temp-offset-millicelsius:
+ description: |
+ Temperature sensor trimming factor. It can be used to manually adjust the
+ temperature measurements within 7.130 degrees Celsius.
+ maxItems: 1
+ items:
+ default: 0
+ minimum: 0
+ maximum: 7130
+
+unevaluatedProperties: false
+
+required:
+ - compatible
+ - reg
+ - interrupts
+ - clocks
+ - clock-names
+
+examples:
+ - |
+ #include <dt-bindings/interrupt-controller/mips-gic.h>
+
+ pvt@1f200000 {
+ compatible = "baikal,bt1-pvt";
+ reg = <0x1f200000 0x1000>;
+ #thermal-sensor-cells = <0>;
+
+ interrupts = <GIC_SHARED 31 IRQ_TYPE_LEVEL_HIGH>;
+
+ baikal,pvt-temp-trim-millicelsius = <1000>;
+
+ clocks = <&ccu_sys>, <&ccu_sys>;
+ clock-names = "ref", "pclk";
+ };
+...
--- /dev/null
+# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/mfd/gateworks-gsc.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Gateworks System Controller
+
+description: |
+ The Gateworks System Controller (GSC) is a device present across various
+ Gateworks product families that provides a set of system related features
+ such as the following (refer to the board hardware user manuals to see what
+ features are present)
+ - Watchdog Timer
+ - GPIO
+ - Pushbutton controller
+ - Hardware monitor with ADC's for temperature and voltage rails and
+ fan controller
+
+maintainers:
+ - Tim Harvey <tharvey@gateworks.com>
+ - Robert Jones <rjones@gateworks.com>
+
+properties:
+ $nodename:
+ pattern: "gsc@[0-9a-f]{1,2}"
+ compatible:
+ const: gw,gsc
+
+ reg:
+ description: I2C device address
+ maxItems: 1
+
+ interrupts:
+ maxItems: 1
+
+ interrupt-controller: true
+
+ "#interrupt-cells":
+ const: 1
+
+ "#address-cells":
+ const: 1
+
+ "#size-cells":
+ const: 0
+
+ adc:
+ type: object
+ description: Optional hardware monitoring module
+
+ properties:
+ compatible:
+ const: gw,gsc-adc
+
+ "#address-cells":
+ const: 1
+
+ "#size-cells":
+ const: 0
+
+ patternProperties:
+ "^channel@[0-9]+$":
+ type: object
+ description: |
+ Properties for a single ADC which can report cooked values
+ (i.e. temperature sensor based on thermister), raw values
+ (i.e. voltage rail with a pre-scaling resistor divider).
+
+ properties:
+ reg:
+ description: Register of the ADC
+ maxItems: 1
+
+ label:
+ description: Name of the ADC input
+
+ gw,mode:
+ description: |
+ conversion mode:
+ 0 - temperature, in C*10
+ 1 - pre-scaled voltage value
+ 2 - scaled voltage based on an optional resistor divider
+ and optional offset
+ $ref: /schemas/types.yaml#/definitions/uint32
+ enum: [0, 1, 2]
+
+ gw,voltage-divider-ohms:
+ description: Values of resistors for divider on raw ADC input
+ maxItems: 2
+ items:
+ minimum: 1000
+ maximum: 1000000
+
+ gw,voltage-offset-microvolt:
+ description: |
+ A positive voltage offset to apply to a raw ADC
+ (i.e. to compensate for a diode drop).
+ minimum: 0
+ maximum: 1000000
+
+ required:
+ - gw,mode
+ - reg
+ - label
+
+ required:
+ - compatible
+ - "#address-cells"
+ - "#size-cells"
+
+patternProperties:
+ "^fan-controller@[0-9a-f]+$":
+ type: object
+ description: Optional fan controller
+
+ properties:
+ compatible:
+ const: gw,gsc-fan
+
+ "#address-cells":
+ const: 1
+
+ "#size-cells":
+ const: 0
+
+ reg:
+ description: The fan controller base address
+ maxItems: 1
+
+ required:
+ - compatible
+ - reg
+ - "#address-cells"
+ - "#size-cells"
+
+required:
+ - compatible
+ - reg
+ - interrupts
+ - interrupt-controller
+ - "#interrupt-cells"
+ - "#address-cells"
+ - "#size-cells"
+
+examples:
+ - |
+ #include <dt-bindings/gpio/gpio.h>
+ i2c {
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ gsc@20 {
+ compatible = "gw,gsc";
+ reg = <0x20>;
+ interrupt-parent = <&gpio1>;
+ interrupts = <4 GPIO_ACTIVE_LOW>;
+ interrupt-controller;
+ #interrupt-cells = <1>;
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ adc {
+ compatible = "gw,gsc-adc";
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ channel@0 { /* A0: Board Temperature */
+ reg = <0x00>;
+ label = "temp";
+ gw,mode = <0>;
+ };
+
+ channel@2 { /* A1: Input Voltage (raw ADC) */
+ reg = <0x02>;
+ label = "vdd_vin";
+ gw,mode = <1>;
+ gw,voltage-divider-ohms = <22100 1000>;
+ gw,voltage-offset-microvolt = <800000>;
+ };
+
+ channel@b { /* A2: Battery voltage */
+ reg = <0x0b>;
+ label = "vdd_bat";
+ gw,mode = <1>;
+ };
+ };
+
+ fan-controller@2c {
+ #address-cells = <1>;
+ #size-cells = <0>;
+ compatible = "gw,gsc-fan";
+ reg = <0x2c>;
+ };
+ };
+ };
- ESAFEOUT1: (ldo19)
- ESAFEOUT2: (ld020)
+ - CHARGER: main battery charger current control
+
Standard regulator bindings are used inside regulator subnodes. Check
Documentation/devicetree/bindings/regulator/regulator.txt
for more details.
regulator-always-on;
regulator-boot-on;
};
+
+ charger_reg: CHARGER {
+ regulator-name = "CHARGER";
+ regulator-min-microamp = <90000>;
+ regulator-max-microamp = <800000>;
+ };
};
};
#size-cells = <0>;
ports {
+ #address-cells = <1>;
+ #size-cells = <0>;
+
port0@0 {
reg = <0>;
label = "lan1";
+++ /dev/null
-Anatop Voltage regulators
-
-Required properties:
-- compatible: Must be "fsl,anatop-regulator"
-- regulator-name: A string used as a descriptive name for regulator outputs
-- anatop-reg-offset: Anatop MFD register offset
-- anatop-vol-bit-shift: Bit shift for the register
-- anatop-vol-bit-width: Number of bits used in the register
-- anatop-min-bit-val: Minimum value of this register
-- anatop-min-voltage: Minimum voltage of this regulator
-- anatop-max-voltage: Maximum voltage of this regulator
-
-Optional properties:
-- anatop-delay-reg-offset: Anatop MFD step time register offset
-- anatop-delay-bit-shift: Bit shift for the step time register
-- anatop-delay-bit-width: Number of bits used in the step time register
-- vin-supply: The supply for this regulator
-- anatop-enable-bit: Regulator enable bit offset
-
-Any property defined as part of the core regulator
-binding, defined in regulator.txt, can also be used.
-
-Example:
-
- regulator-vddpu {
- compatible = "fsl,anatop-regulator";
- regulator-name = "vddpu";
- regulator-min-microvolt = <725000>;
- regulator-max-microvolt = <1300000>;
- regulator-always-on;
- anatop-reg-offset = <0x140>;
- anatop-vol-bit-shift = <9>;
- anatop-vol-bit-width = <5>;
- anatop-delay-reg-offset = <0x170>;
- anatop-delay-bit-shift = <24>;
- anatop-delay-bit-width = <2>;
- anatop-min-bit-val = <1>;
- anatop-min-voltage = <725000>;
- anatop-max-voltage = <1300000>;
- };
--- /dev/null
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/regulator/anatop-regulator.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Freescale Anatop Voltage Regulators
+
+maintainers:
+ - Ying-Chun Liu (PaulLiu) <paul.liu@linaro.org>
+
+allOf:
+ - $ref: "regulator.yaml#"
+
+properties:
+ compatible:
+ const: fsl,anatop-regulator
+
+ regulator-name: true
+
+ anatop-reg-offset:
+ $ref: '/schemas/types.yaml#/definitions/uint32'
+ description: u32 value representing the anatop MFD register offset.
+
+ anatop-vol-bit-shift:
+ $ref: '/schemas/types.yaml#/definitions/uint32'
+ description: u32 value representing the bit shift for the register.
+
+ anatop-vol-bit-width:
+ $ref: '/schemas/types.yaml#/definitions/uint32'
+ description: u32 value representing the number of bits used in the register.
+
+ anatop-min-bit-val:
+ $ref: '/schemas/types.yaml#/definitions/uint32'
+ description: u32 value representing the minimum value of this register.
+
+ anatop-min-voltage:
+ $ref: '/schemas/types.yaml#/definitions/uint32'
+ description: u32 value representing the minimum voltage of this regulator.
+
+ anatop-max-voltage:
+ $ref: '/schemas/types.yaml#/definitions/uint32'
+ description: u32 value representing the maximum voltage of this regulator.
+
+ anatop-delay-reg-offset:
+ $ref: '/schemas/types.yaml#/definitions/uint32'
+ description: u32 value representing the anatop MFD step time register offset.
+
+ anatop-delay-bit-shift:
+ $ref: '/schemas/types.yaml#/definitions/uint32'
+ description: u32 value representing the bit shift for the step time register.
+
+ anatop-delay-bit-width:
+ $ref: '/schemas/types.yaml#/definitions/uint32'
+ description: u32 value representing the number of bits used in the step time register.
+
+ anatop-enable-bit:
+ $ref: '/schemas/types.yaml#/definitions/uint32'
+ description: u32 value representing regulator enable bit offset.
+
+ vin-supply:
+ $ref: '/schemas/types.yaml#/definitions/phandle'
+ description: input supply phandle.
+
+required:
+ - compatible
+ - regulator-name
+ - anatop-reg-offset
+ - anatop-vol-bit-shift
+ - anatop-vol-bit-width
+ - anatop-min-bit-val
+ - anatop-min-voltage
+ - anatop-max-voltage
+
+unevaluatedProperties: false
+
+examples:
+ - |
+ regulator-vddpu {
+ compatible = "fsl,anatop-regulator";
+ regulator-name = "vddpu";
+ regulator-min-microvolt = <725000>;
+ regulator-max-microvolt = <1300000>;
+ regulator-always-on;
+ anatop-reg-offset = <0x140>;
+ anatop-vol-bit-shift = <9>;
+ anatop-vol-bit-width = <5>;
+ anatop-delay-reg-offset = <0x170>;
+ anatop-delay-bit-shift = <24>;
+ anatop-delay-bit-width = <2>;
+ anatop-min-bit-val = <1>;
+ anatop-min-voltage = <725000>;
+ anatop-max-voltage = <1300000>;
+ };
--- /dev/null
+# SPDX-License-Identifier: GPL-2.0-only OR BSD-2-Clause
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/regulator/maxim,max77826.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Maxim Integrated MAX77826 PMIC
+
+maintainers:
+ - Iskren Chernev <iskren.chernev@gmail.com>
+
+properties:
+ $nodename:
+ pattern: "pmic@[0-9a-f]{1,2}"
+ compatible:
+ enum:
+ - maxim,max77826
+
+ reg:
+ maxItems: 1
+
+ regulators:
+ type: object
+ allOf:
+ - $ref: regulator.yaml#
+ description: |
+ list of regulators provided by this controller, must be named
+ after their hardware counterparts LDO[1-15], BUCK and BUCKBOOST
+
+ patternProperties:
+ "^LDO([1-9]|1[0-5])$":
+ type: object
+ allOf:
+ - $ref: regulator.yaml#
+
+ "^BUCK|BUCKBOOST$":
+ type: object
+ allOf:
+ - $ref: regulator.yaml#
+
+ additionalProperties: false
+
+required:
+ - compatible
+ - reg
+ - regulators
+
+additionalProperties: false
+
+examples:
+ - |
+ i2c {
+ #address-cells = <1>;
+ #size-cells = <0>;
+
+ pmic@69 {
+ compatible = "maxim,max77826";
+ reg = <0x69>;
+
+ regulators {
+ LDO2 {
+ regulator-min-microvolt = <650000>;
+ regulator-max-microvolt = <3587500>;
+ };
+ };
+ };
+ };
+...
- ecc-size: enables ECC support and specifies ECC buffer size in bytes
(defaults to 0: no ECC)
-- record-size: maximum size in bytes of each dump done on oops/panic
+- record-size: maximum size in bytes of each kmsg dump.
(defaults to 0: disabled)
- console-size: size in bytes of log buffer reserved for kernel messages
- unbuffered: if present, use unbuffered mappings to map the reserved region
(defaults to buffered mappings)
-- no-dump-oops: if present, only dump panics (defaults to panics and oops)
+- max-reason: if present, sets maximum type of kmsg dump reasons to store
+ (defaults to 2: log Oopses and Panics). This can be set to INT_MAX to
+ store all kmsg dumps. See include/linux/kmsg_dump.h KMSG_DUMP_* for other
+ kmsg dump reason values. Setting this to 0 (KMSG_DUMP_UNDEF), means the
+ reason filtering will be controlled by the printk.always_kmsg_dump boot
+ param: if unset, it will be KMSG_DUMP_OOPS, otherwise KMSG_DUMP_MAX.
+
+- no-dump-oops: deprecated, use max_reason instead. If present, and
+ max_reason is not specified, it is equivalent to max_reason = 1
+ (KMSG_DUMP_PANIC).
- flags: if present, pass ramoops behavioral flags (defaults to 0,
see include/linux/pstore_ram.h RAMOOPS_FLAG_* for flag values).
--- /dev/null
+# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/rng/arm-cctrng.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Arm TrustZone CryptoCell TRNG engine
+
+maintainers:
+ - Hadar Gat <hadar.gat@arm.com>
+
+description: |+
+ Arm TrustZone CryptoCell TRNG (True Random Number Generator) engine.
+
+properties:
+ compatible:
+ enum:
+ - arm,cryptocell-713-trng
+ - arm,cryptocell-703-trng
+
+ interrupts:
+ maxItems: 1
+
+ reg:
+ maxItems: 1
+
+ arm,rosc-ratio:
+ description:
+ Arm TrustZone CryptoCell TRNG engine has 4 ring oscillators.
+ Sampling ratio values for these 4 ring oscillators. (from calibration)
+ allOf:
+ - $ref: /schemas/types.yaml#/definitions/uint32-array
+ - items:
+ maxItems: 4
+
+ clocks:
+ maxItems: 1
+
+required:
+ - compatible
+ - interrupts
+ - reg
+ - arm,rosc-ratio
+
+additionalProperties: false
+
+examples:
+ - |
+ arm_cctrng: rng@60000000 {
+ compatible = "arm,cryptocell-713-trng";
+ interrupts = <0 29 4>;
+ reg = <0x60000000 0x10000>;
+ arm,rosc-ratio = <5000 1000 500 0>;
+ };
"brcm,spi-bcm-qspi", "brcm,spi-brcmstb-qspi" : MSPI+BSPI on BRCMSTB SoCs
"brcm,spi-bcm-qspi", "brcm,spi-brcmstb-mspi" : Second Instance of MSPI
BRCMSTB SoCs
+ "brcm,spi-bcm7425-qspi", "brcm,spi-bcm-qspi", "brcm,spi-brcmstb-mspi" : Second Instance of MSPI
+ BRCMSTB SoCs
+ "brcm,spi-bcm7429-qspi", "brcm,spi-bcm-qspi", "brcm,spi-brcmstb-mspi" : Second Instance of MSPI
+ BRCMSTB SoCs
+ "brcm,spi-bcm7435-qspi", "brcm,spi-bcm-qspi", "brcm,spi-brcmstb-mspi" : Second Instance of MSPI
+ BRCMSTB SoCs
+ "brcm,spi-bcm7216-qspi", "brcm,spi-bcm-qspi", "brcm,spi-brcmstb-mspi" : Second Instance of MSPI
+ BRCMSTB SoCs
+ "brcm,spi-bcm7278-qspi", "brcm,spi-bcm-qspi", "brcm,spi-brcmstb-mspi" : Second Instance of MSPI
+ BRCMSTB SoCs
"brcm,spi-bcm-qspi", "brcm,spi-nsp-qspi" : MSPI+BSPI on Cygnus, NSP
"brcm,spi-bcm-qspi", "brcm,spi-ns2-qspi" : NS2 SoCs
--- /dev/null
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/spi/mikrotik,rb4xx-spi.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: MikroTik RB4xx series SPI master
+
+maintainers:
+ - Gabor Juhos <juhosg@openwrt.org>
+ - Bert Vermeulen <bert@biot.com>
+
+allOf:
+ - $ref: "spi-controller.yaml#"
+
+properties:
+ compatible:
+ const: mikrotik,rb4xx-spi
+
+ reg:
+ maxItems: 1
+
+required:
+ - compatible
+ - reg
+
+examples:
+ - |
+ spi: spi@1f000000 {
+ #address-cells = <1>;
+ #size-cells = <0>;
+ compatible = "mikrotik,rb4xx-spi";
+ reg = <0x1f000000 0x10>;
+ };
+
+...
\ No newline at end of file
--- /dev/null
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/spi/renesas,rspi.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Renesas (Quad) Serial Peripheral Interface (RSPI/QSPI)
+
+maintainers:
+ - Geert Uytterhoeven <geert+renesas@glider.be>
+
+properties:
+ compatible:
+ oneOf:
+ - items:
+ - enum:
+ - renesas,rspi-sh7757 # SH7757
+ - const: renesas,rspi # Legacy SH
+
+ - items:
+ - enum:
+ - renesas,rspi-r7s72100 # RZ/A1H
+ - renesas,rspi-r7s9210 # RZ/A2
+ - const: renesas,rspi-rz # RZ/A
+
+ - items:
+ - enum:
+ - renesas,qspi-r8a7743 # RZ/G1M
+ - renesas,qspi-r8a7744 # RZ/G1N
+ - renesas,qspi-r8a7745 # RZ/G1E
+ - renesas,qspi-r8a77470 # RZ/G1C
+ - renesas,qspi-r8a7790 # R-Car H2
+ - renesas,qspi-r8a7791 # R-Car M2-W
+ - renesas,qspi-r8a7792 # R-Car V2H
+ - renesas,qspi-r8a7793 # R-Car M2-N
+ - renesas,qspi-r8a7794 # R-Car E2
+ - const: renesas,qspi # R-Car Gen2 and RZ/G1
+
+ reg:
+ maxItems: 1
+
+ interrupts:
+ oneOf:
+ - items:
+ - description: A combined interrupt
+ - items:
+ - description: Error interrupt (SPEI)
+ - description: Receive Interrupt (SPRI)
+ - description: Transmit Interrupt (SPTI)
+
+ interrupt-names:
+ oneOf:
+ - items:
+ - const: mux
+ - items:
+ - const: error
+ - const: rx
+ - const: tx
+
+ clocks:
+ maxItems: 1
+
+ power-domains:
+ maxItems: 1
+
+ resets:
+ maxItems: 1
+
+ dmas:
+ description:
+ Must contain a list of pairs of references to DMA specifiers, one for
+ transmission, and one for reception.
+
+ dma-names:
+ minItems: 2
+ maxItems: 4
+ items:
+ enum:
+ - tx
+ - rx
+
+ num-cs:
+ description: |
+ Total number of native chip selects.
+ Hardware limitations related to chip selects:
+ - When using GPIO chip selects, at least one native chip select must
+ be left unused, as it will be driven anyway.
+ minimum: 1
+ maximum: 2
+ default: 1
+
+required:
+ - compatible
+ - reg
+ - interrupts
+ - clocks
+ - power-domains
+ - '#address-cells'
+ - '#size-cells'
+
+allOf:
+ - $ref: spi-controller.yaml#
+ - if:
+ properties:
+ compatible:
+ contains:
+ enum:
+ - renesas,rspi-rz
+ then:
+ properties:
+ interrupts:
+ minItems: 3
+ required:
+ - interrupt-names
+
+ - if:
+ properties:
+ compatible:
+ contains:
+ enum:
+ - renesas,qspi
+ then:
+ required:
+ - resets
+
+examples:
+ - |
+ #include <dt-bindings/clock/r8a7791-cpg-mssr.h>
+ #include <dt-bindings/interrupt-controller/arm-gic.h>
+ #include <dt-bindings/power/r8a7791-sysc.h>
+
+ qspi: spi@e6b10000 {
+ compatible = "renesas,qspi-r8a7791", "renesas,qspi";
+ reg = <0xe6b10000 0x2c>;
+ interrupts = <GIC_SPI 184 IRQ_TYPE_LEVEL_HIGH>;
+ clocks = <&cpg CPG_MOD 917>;
+ dmas = <&dmac0 0x17>, <&dmac0 0x18>, <&dmac1 0x17>, <&dmac1 0x18>;
+ dma-names = "tx", "rx", "tx", "rx";
+ power-domains = <&sysc R8A7791_PD_ALWAYS_ON>;
+ resets = <&cpg 917>;
+ num-cs = <1>;
+ #address-cells = <1>;
+ #size-cells = <0>;
+ };
+++ /dev/null
-Synopsys DesignWare AMBA 2.0 Synchronous Serial Interface.
-
-Required properties:
-- compatible : "snps,dw-apb-ssi" or "mscc,<soc>-spi", where soc is "ocelot" or
- "jaguar2", or "amazon,alpine-dw-apb-ssi"
-- reg : The register base for the controller. For "mscc,<soc>-spi", a second
- register set is required (named ICPU_CFG:SPI_MST)
-- interrupts : One interrupt, used by the controller.
-- #address-cells : <1>, as required by generic SPI binding.
-- #size-cells : <0>, also as required by generic SPI binding.
-- clocks : phandles for the clocks, see the description of clock-names below.
- The phandle for the "ssi_clk" is required. The phandle for the "pclk" clock
- is optional. If a single clock is specified but no clock-name, it is the
- "ssi_clk" clock. If both clocks are listed, the "ssi_clk" must be first.
-
-Optional properties:
-- clock-names : Contains the names of the clocks:
- "ssi_clk", for the core clock used to generate the external SPI clock.
- "pclk", the interface clock, required for register access. If a clock domain
- used to enable this clock then it should be named "pclk_clkdomain".
-- cs-gpios : Specifies the gpio pins to be used for chipselects.
-- num-cs : The number of chipselects. If omitted, this will default to 4.
-- reg-io-width : The I/O register width (in bytes) implemented by this
- device. Supported values are 2 or 4 (the default).
-
-Child nodes as per the generic SPI binding.
-
-Example:
-
- spi@fff00000 {
- compatible = "snps,dw-apb-ssi";
- reg = <0xfff00000 0x1000>;
- interrupts = <0 154 4>;
- #address-cells = <1>;
- #size-cells = <0>;
- clocks = <&spi_m_clk>;
- num-cs = <2>;
- cs-gpios = <&gpio0 13 0>,
- <&gpio0 14 0>;
- };
-
--- /dev/null
+# SPDX-License-Identifier: GPL-2.0-only
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/spi/snps,dw-apb-ssi.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Synopsys DesignWare AMBA 2.0 Synchronous Serial Interface
+
+maintainers:
+ - Mark Brown <broonie@kernel.org>
+
+allOf:
+ - $ref: "spi-controller.yaml#"
+ - if:
+ properties:
+ compatible:
+ contains:
+ enum:
+ - mscc,ocelot-spi
+ - mscc,jaguar2-spi
+ then:
+ properties:
+ reg:
+ minItems: 2
+
+properties:
+ compatible:
+ oneOf:
+ - description: Generic DW SPI Controller
+ enum:
+ - snps,dw-apb-ssi
+ - snps,dwc-ssi-1.01a
+ - description: Microsemi Ocelot/Jaguar2 SoC SPI Controller
+ items:
+ - enum:
+ - mscc,ocelot-spi
+ - mscc,jaguar2-spi
+ - const: snps,dw-apb-ssi
+ - description: Amazon Alpine SPI Controller
+ const: amazon,alpine-dw-apb-ssi
+ - description: Renesas RZ/N1 SPI Controller
+ items:
+ - const: renesas,rzn1-spi
+ - const: snps,dw-apb-ssi
+ - description: Intel Keem Bay SPI Controller
+ const: intel,keembay-ssi
+
+ reg:
+ minItems: 1
+ items:
+ - description: DW APB SSI controller memory mapped registers
+ - description: SPI MST region map
+
+ interrupts:
+ maxItems: 1
+
+ clocks:
+ minItems: 1
+ items:
+ - description: SPI Controller reference clock source
+ - description: APB interface clock source
+
+ clock-names:
+ minItems: 1
+ items:
+ - const: ssi_clk
+ - const: pclk
+
+ resets:
+ maxItems: 1
+
+ reset-names:
+ const: spi
+
+ reg-io-width:
+ $ref: /schemas/types.yaml#/definitions/uint32
+ description: I/O register width (in bytes) implemented by this device
+ default: 4
+ enum: [ 2, 4 ]
+
+ num-cs:
+ default: 4
+ minimum: 1
+ maximum: 4
+
+ dmas:
+ items:
+ - description: TX DMA Channel
+ - description: RX DMA Channel
+
+ dma-names:
+ items:
+ - const: tx
+ - const: rx
+
+patternProperties:
+ "^.*@[0-9a-f]+$":
+ type: object
+ properties:
+ reg:
+ minimum: 0
+ maximum: 3
+
+ spi-rx-bus-width:
+ const: 1
+
+ spi-tx-bus-width:
+ const: 1
+
+unevaluatedProperties: false
+
+required:
+ - compatible
+ - reg
+ - "#address-cells"
+ - "#size-cells"
+ - interrupts
+ - clocks
+
+examples:
+ - |
+ spi@fff00000 {
+ compatible = "snps,dw-apb-ssi";
+ reg = <0xfff00000 0x1000>;
+ #address-cells = <1>;
+ #size-cells = <0>;
+ interrupts = <0 154 4>;
+ clocks = <&spi_m_clk>;
+ num-cs = <2>;
+ cs-gpios = <&gpio0 13 0>,
+ <&gpio0 14 0>;
+ };
+...
--- /dev/null
+# SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause)
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/spi/socionext,uniphier-spi.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Socionext UniPhier SPI controller
+
+description: |
+ UniPhier SoCs have SCSSI which supports SPI single channel.
+
+maintainers:
+ - Kunihiko Hayashi <hayashi.kunihiko@socionext.com>
+ - Keiji Hayashibara <hayashibara.keiji@socionext.com>
+
+allOf:
+ - $ref: spi-controller.yaml#
+
+properties:
+ "#address-cells": true
+ "#size-cells": true
+
+ compatible:
+ const: socionext,uniphier-scssi
+
+ reg:
+ maxItems: 1
+
+ interrupts:
+ maxItems: 1
+
+ clocks:
+ maxItems: 1
+
+ resets:
+ maxItems: 1
+
+required:
+ - compatible
+ - reg
+ - interrupts
+ - clocks
+ - resets
+ - "#address-cells"
+ - "#size-cells"
+
+examples:
+ - |
+ spi0: spi@54006000 {
+ compatible = "socionext,uniphier-scssi";
+ reg = <0x54006000 0x100>;
+ #address-cells = <1>;
+ #size-cells = <0>;
+ interrupts = <0 39 4>;
+ clocks = <&peri_clk 11>;
+ resets = <&peri_rst 11>;
+ };
+++ /dev/null
-Synopsys DesignWare SPI master
-
-Required properties:
-- compatible: should be "snps,designware-spi"
-- #address-cells: see spi-bus.txt
-- #size-cells: see spi-bus.txt
-- reg: address and length of the spi master registers
-- interrupts: should contain one interrupt
-- clocks: spi clock phandle
-- num-cs: see spi-bus.txt
-
-Optional properties:
-- cs-gpios: see spi-bus.txt
-
-Example:
-
-spi: spi@4020a000 {
- compatible = "snps,designware-spi";
- interrupts = <11 1>;
- reg = <0x4020a000 0x1000>;
- clocks = <&pclk>;
- num-cs = <2>;
- cs-gpios = <&banka 0 0>;
-};
+++ /dev/null
-Device tree configuration for Renesas RSPI/QSPI driver
-
-Required properties:
-- compatible : For Renesas Serial Peripheral Interface on legacy SH:
- "renesas,rspi-<soctype>", "renesas,rspi" as fallback.
- For Renesas Serial Peripheral Interface on RZ/A:
- "renesas,rspi-<soctype>", "renesas,rspi-rz" as fallback.
- For Quad Serial Peripheral Interface on R-Car Gen2 and
- RZ/G1 devices:
- "renesas,qspi-<soctype>", "renesas,qspi" as fallback.
- Examples with soctypes are:
- - "renesas,rspi-sh7757" (SH)
- - "renesas,rspi-r7s72100" (RZ/A1H)
- - "renesas,rspi-r7s9210" (RZ/A2)
- - "renesas,qspi-r8a7743" (RZ/G1M)
- - "renesas,qspi-r8a7744" (RZ/G1N)
- - "renesas,qspi-r8a7745" (RZ/G1E)
- - "renesas,qspi-r8a77470" (RZ/G1C)
- - "renesas,qspi-r8a7790" (R-Car H2)
- - "renesas,qspi-r8a7791" (R-Car M2-W)
- - "renesas,qspi-r8a7792" (R-Car V2H)
- - "renesas,qspi-r8a7793" (R-Car M2-N)
- - "renesas,qspi-r8a7794" (R-Car E2)
-- reg : Address start and address range size of the device
-- interrupts : A list of interrupt-specifiers, one for each entry in
- interrupt-names.
- If interrupt-names is not present, an interrupt specifier
- for a single muxed interrupt.
-- interrupt-names : A list of interrupt names. Should contain (if present):
- - "error" for SPEI,
- - "rx" for SPRI,
- - "tx" to SPTI,
- - "mux" for a single muxed interrupt.
-- num-cs : Number of chip selects. Some RSPI cores have more than 1.
-- #address-cells : Must be <1>
-- #size-cells : Must be <0>
-
-Optional properties:
-- clocks : Must contain a reference to the functional clock.
-- dmas : Must contain a list of two references to DMA specifiers,
- one for transmission, and one for reception.
-- dma-names : Must contain a list of two DMA names, "tx" and "rx".
-
-Pinctrl properties might be needed, too. See
-Documentation/devicetree/bindings/pinctrl/renesas,*.
-
-Examples:
-
- spi0: spi@e800c800 {
- compatible = "renesas,rspi-r7s72100", "renesas,rspi-rz";
- reg = <0xe800c800 0x24>;
- interrupts = <0 238 IRQ_TYPE_LEVEL_HIGH>,
- <0 239 IRQ_TYPE_LEVEL_HIGH>,
- <0 240 IRQ_TYPE_LEVEL_HIGH>;
- interrupt-names = "error", "rx", "tx";
- interrupt-parent = <&gic>;
- num-cs = <1>;
- #address-cells = <1>;
- #size-cells = <0>;
- };
-
- spi: spi@e6b10000 {
- compatible = "renesas,qspi-r8a7791", "renesas,qspi";
- reg = <0 0xe6b10000 0 0x2c>;
- interrupt-parent = <&gic>;
- interrupts = <0 184 IRQ_TYPE_LEVEL_HIGH>;
- clocks = <&mstp9_clks R8A7791_CLK_QSPI_MOD>;
- num-cs = <1>;
- #address-cells = <1>;
- #size-cells = <0>;
- dmas = <&dmac0 0x17>, <&dmac0 0x18>;
- dma-names = "tx", "rx";
- };
+++ /dev/null
-Socionext UniPhier SPI controller driver
-
-UniPhier SoCs have SCSSI which supports SPI single channel.
-
-Required properties:
- - compatible: should be "socionext,uniphier-scssi"
- - reg: address and length of the spi master registers
- - #address-cells: must be <1>, see spi-bus.txt
- - #size-cells: must be <0>, see spi-bus.txt
- - interrupts: a single interrupt specifier
- - pinctrl-names: should be "default"
- - pinctrl-0: pin control state for the default mode
- - clocks: a phandle to the clock for the device
- - resets: a phandle to the reset control for the device
-
-Example:
-
-spi0: spi@54006000 {
- compatible = "socionext,uniphier-scssi";
- reg = <0x54006000 0x100>;
- #address-cells = <1>;
- #size-cells = <0>;
- interrupts = <0 39 4>;
- pinctrl-names = "default";
- pinctrl-0 = <&pinctrl_spi0>;
- clocks = <&peri_clk 11>;
- resets = <&peri_rst 11>;
-};
Example:
For am4372:
-qspi: qspi@4b300000 {
+qspi: qspi@47900000 {
compatible = "ti,am4372-qspi";
reg = <0x47900000 0x100>, <0x30000000 0x4000000>;
reg-names = "qspi_base", "qspi_mmap";
description: Microsoft Corporation
"^mikroe,.*":
description: MikroElektronika d.o.o.
+ "^mikrotik,.*":
+ description: MikroTik
"^miniand,.*":
description: Miniand Tech
"^minix,.*":
What is efifb?
==============
-This is a generic EFI platform driver for Intel based Apple computers.
-efifb is only for EFI booted Intel Macs.
+This is a generic EFI platform driver for systems with UEFI firmware. The
+system must be booted via the EFI stub for this to be usable. efifb supports
+both firmware with Graphics Output Protocol (GOP) displays as well as older
+systems with only Universal Graphics Adapter (UGA) displays.
Supported Hardware
==================
- Macbook
- Macbook Pro 15"/17"
- MacMini
+- ARM/ARM64/X86 systems with UEFI firmware
How to use it?
==============
-efifb does not have any kind of autodetection of your machine.
+For UGA displays, efifb does not have any kind of autodetection of your
+machine.
+
You have to add the following kernel parameters in your elilo.conf::
Macbook :
Macbook Pro 17", iMac 20" :
video=efifb:i20
+For GOP displays, efifb can autodetect the display's resolution and framebuffer
+address, so these should work out of the box without any special parameters.
+
Accepted options:
======= ===========================================================
when large amounts of console data are written.
======= ===========================================================
+Options for GOP displays:
+
+mode=n
+ The EFI stub will set the mode of the display to mode number n if
+ possible.
+
+<xres>x<yres>[-(rgb|bgr|<bpp>)]
+ The EFI stub will search for a display mode that matches the specified
+ horizontal and vertical resolution, and optionally bit depth, and set
+ the mode of the display to it if one is found. The bit depth can either
+ "rgb" or "bgr" to match specifically those pixel formats, or a number
+ for a mode with matching bits per pixel.
+
+auto
+ The EFI stub will choose the mode with the highest resolution (product
+ of horizontal and vertical resolution). If there are multiple modes
+ with the highest resolution, it will choose one with the highest color
+ depth.
+
+list
+ The EFI stub will list out all the display modes that are available. A
+ specific mode can then be chosen using one of the above options for the
+ next boot.
+
Edgar Hucek <gimli@dark-green.com>
pass, but the performance will regress. "nobarrier" is
based on "posix", but doesn't issue flush command for
non-atomic files likewise "nobarrier" mount option.
-test_dummy_encryption Enable dummy encryption, which provides a fake fscrypt
+test_dummy_encryption
+test_dummy_encryption=%s
+ Enable dummy encryption, which provides a fake fscrypt
context. The fake fscrypt context is used by xfstests.
+ The argument may be either "v1" or "v2", in order to
+ select the corresponding fscrypt policy version.
checkpoint=%s[:%u[%]] Set to "disable" to turn off checkpointing. Set to "enable"
to reenable checkpointing. Is enabled by default. While
disabled, any unmounting or unexpected shutdowns will cause
Consequently, shrinking the filesystem may not be allowed.
This format is optimized for use with inline encryption hardware
-compliant with the UFS or eMMC standards, which support only 64 IV
-bits per I/O request and may have only a small number of keyslots.
+compliant with the UFS standard, which supports only 64 IV bits per
+I/O request and may have only a small number of keyslots.
+
+IV_INO_LBLK_32 policies
+-----------------------
+
+IV_INO_LBLK_32 policies work like IV_INO_LBLK_64, except that for
+IV_INO_LBLK_32, the inode number is hashed with SipHash-2-4 (where the
+SipHash key is derived from the master key) and added to the file
+logical block number mod 2^32 to produce a 32-bit IV.
+
+This format is optimized for use with inline encryption hardware
+compliant with the eMMC v5.2 standard, which supports only 32 IV bits
+per I/O request and may have only a small number of keyslots. This
+format results in some level of IV reuse, so it should only be used
+when necessary due to hardware limitations.
Key identifiers
---------------
to 32 bits and is placed in bits 0-31 of the IV. The inode number
(which is also limited to 32 bits) is placed in bits 32-63.
+- With `IV_INO_LBLK_32 policies`_, the logical block number is limited
+ to 32 bits and is placed in bits 0-31 of the IV. The inode number
+ is then hashed and added mod 2^32.
+
Note that because file logical block numbers are included in the IVs,
filesystems must enforce that blocks are never shifted around within
encrypted files, e.g. via "collapse range" or "insert range".
(0x3).
- FSCRYPT_POLICY_FLAG_DIRECT_KEY: See `DIRECT_KEY policies`_.
- FSCRYPT_POLICY_FLAG_IV_INO_LBLK_64: See `IV_INO_LBLK_64
- policies`_. This is mutually exclusive with DIRECT_KEY and is not
- supported on v1 policies.
+ policies`_.
+ - FSCRYPT_POLICY_FLAG_IV_INO_LBLK_32: See `IV_INO_LBLK_32
+ policies`_.
+
+ v1 encryption policies only support the PAD_* and DIRECT_KEY flags.
+ The other flags are only supported by v2 encryption policies.
+
+ The DIRECT_KEY, IV_INO_LBLK_64, and IV_INO_LBLK_32 flags are
+ mutually exclusive.
- For v2 encryption policies, ``__reserved`` must be zeroed.
--- /dev/null
+.. SPDX-License-Identifier: GPL-2.0
+
+Kernel driver amd_energy
+==========================
+
+Supported chips:
+
+* AMD Family 17h Processors
+
+ Prefix: 'amd_energy'
+
+ Addresses used: RAPL MSRs
+
+ Datasheets:
+
+ - Processor Programming Reference (PPR) for AMD Family 17h Model 01h, Revision B1 Processors
+
+ https://developer.amd.com/wp-content/resources/55570-B1_PUB.zip
+
+ - Preliminary Processor Programming Reference (PPR) for AMD Family 17h Model 31h, Revision B0 Processors
+
+ https://developer.amd.com/wp-content/resources/56176_ppr_Family_17h_Model_71h_B0_pub_Rev_3.06.zip
+
+Author: Naveen Krishna Chatradhi <nchatrad@amd.com>
+
+Description
+-----------
+
+The Energy driver exposes the energy counters that are
+reported via the Running Average Power Limit (RAPL)
+Model-specific Registers (MSRs) via the hardware monitor
+(HWMON) sysfs interface.
+
+1. Power, Energy and Time Units
+ MSR_RAPL_POWER_UNIT/ C001_0299:
+ shared with all cores in the socket
+
+2. Energy consumed by each Core
+ MSR_CORE_ENERGY_STATUS/ C001_029A:
+ 32-bitRO, Accumulator, core-level power reporting
+
+3. Energy consumed by Socket
+ MSR_PACKAGE_ENERGY_STATUS/ C001_029B:
+ 32-bitRO, Accumulator, socket-level power reporting,
+ shared with all cores in socket
+
+These registers are updated every 1ms and cleared on
+reset of the system.
+
+Note: If SMT is enabled, Linux enumerates all threads as cpus.
+Since, the energy status registers are accessed at core level,
+reading those registers from the sibling threads would result
+in duplicate values. Hence, energy counter entries are not
+populated for the siblings.
+
+Energy Caluclation
+------------------
+
+Energy information (in Joules) is based on the multiplier,
+1/2^ESU; where ESU is an unsigned integer read from
+MSR_RAPL_POWER_UNIT register. Default value is 10000b,
+indicating energy status unit is 15.3 micro-Joules increment.
+
+Reported values are scaled as per the formula
+
+scaled value = ((1/2^ESU) * (Raw value) * 1000000UL) in uJoules
+
+Users calculate power for a given domain by calculating
+ dEnergy/dTime for that domain.
+
+Energy accumulation
+--------------------------
+
+Current, Socket energy status register is 32bit, assuming a 240W
+2P system, the register would wrap around in
+
+ 2^32*15.3 e-6/240 * 2 = 547.60833024 secs to wrap(~9 mins)
+
+The Core energy register may wrap around after several days.
+
+To improve the wrap around time, a kernel thread is implemented
+to accumulate the socket energy counters and one core energy counter
+per run to a respective 64-bit counter. The kernel thread starts
+running during probe, wakes up every 100secs and stops running
+when driver is removed.
+
+A socket and core energy read would return the current register
+value added to the respective energy accumulator.
+
+Sysfs attributes
+----------------
+
+=============== ======== =====================================
+Attribute Label Description
+=============== ======== =====================================
+
+* For index N between [1] and [nr_cpus]
+
+=============== ======== ======================================
+energy[N]_input EcoreX Core Energy X = [0] to [nr_cpus - 1]
+ Measured input core energy
+=============== ======== ======================================
+
+* For N between [nr_cpus] and [nr_cpus + nr_socks]
+
+=============== ======== ======================================
+energy[N]_input EsocketX Socket Energy X = [0] to [nr_socks -1]
+ Measured input socket energy
+=============== ======== ======================================
--- /dev/null
+.. SPDX-License-Identifier: GPL-2.0-only
+
+Kernel driver bt1-pvt
+=====================
+
+Supported chips:
+
+ * Baikal-T1 PVT sensor (in SoC)
+
+ Prefix: 'bt1-pvt'
+
+ Addresses scanned: -
+
+ Datasheet: Provided by BAIKAL ELECTRONICS upon request and under NDA
+
+Authors:
+ Maxim Kaurkin <maxim.kaurkin@baikalelectronics.ru>
+ Serge Semin <Sergey.Semin@baikalelectronics.ru>
+
+Description
+-----------
+
+This driver implements support for the hardware monitoring capabilities of the
+embedded into Baikal-T1 process, voltage and temperature sensors. PVT IP-core
+consists of one temperature and four voltage sensors, which can be used to
+monitor the chip internal environment like heating, supply voltage and
+transistors performance. The driver can optionally provide the hwmon alarms
+for each sensor the PVT controller supports. The alarms functionality is made
+compile-time configurable due to the hardware interface implementation
+peculiarity, which is connected with an ability to convert data from only one
+sensor at a time. Additional limitation is that the controller performs the
+thresholds checking synchronously with the data conversion procedure. Due to
+these in order to have the hwmon alarms automatically detected the driver code
+must switch from one sensor to another, read converted data and manually check
+the threshold status bits. Depending on the measurements timeout settings
+(update_interval sysfs node value) this design may cause additional burden on
+the system performance. So in case if alarms are unnecessary in your system
+design it's recommended to have them disabled to prevent the PVT IRQs being
+periodically raised to get the data cache/alarms status up to date. By default
+in alarm-less configuration the data conversion is performed by the driver
+on demand when read operation is requested via corresponding _input-file.
+
+Temperature Monitoring
+----------------------
+
+Temperature is measured with 10-bit resolution and reported in millidegree
+Celsius. The driver performs all the scaling by itself therefore reports true
+temperatures that don't need any user-space adjustments. While the data
+translation formulae isn't linear, which gives us non-linear discreteness,
+it's close to one, but giving a bit better accuracy for higher temperatures.
+The temperature input is mapped as follows (the last column indicates the input
+ranges)::
+
+ temp1: CPU embedded diode -48.38C - +147.438C
+
+In case if the alarms kernel config is enabled in the driver the temperature input
+has associated min and max limits which trigger an alarm when crossed.
+
+Voltage Monitoring
+------------------
+
+The voltage inputs are also sampled with 10-bit resolution and reported in
+millivolts. But in this case the data translation formulae is linear, which
+provides a constant measurements discreteness. The data scaling is also
+performed by the driver, so returning true millivolts. The voltage inputs are
+mapped as follows (the last column indicates the input ranges)::
+
+ in0: VDD (processor core) 0.62V - 1.168V
+ in1: Low-Vt (low voltage threshold) 0.62V - 1.168V
+ in2: High-Vt (high voltage threshold) 0.62V - 1.168V
+ in3: Standard-Vt (standard voltage threshold) 0.62V - 1.168V
+
+In case if the alarms config is enabled in the driver the voltage inputs
+have associated min and max limits which trigger an alarm when crossed.
+
+Sysfs Attributes
+----------------
+
+Following is a list of all sysfs attributes that the driver provides, their
+permissions and a short description:
+
+=============================== ======= =======================================
+Name Perm Description
+=============================== ======= =======================================
+update_interval RW Measurements update interval per
+ sensor.
+temp1_type RO Sensor type (always 1 as CPU embedded
+ diode).
+temp1_label RO CPU Core Temperature sensor.
+temp1_input RO Measured temperature in millidegree
+ Celsius.
+temp1_min RW Low limit for temp input.
+temp1_max RW High limit for temp input.
+temp1_min_alarm RO Temperature input alarm. Returns 1 if
+ temperature input went below min limit,
+ 0 otherwise.
+temp1_max_alarm RO Temperature input alarm. Returns 1 if
+ temperature input went above max limit,
+ 0 otherwise.
+temp1_offset RW Temperature offset in millidegree
+ Celsius which is added to the
+ temperature reading by the chip. It can
+ be used to manually adjust the
+ temperature measurements within 7.130
+ degrees Celsius.
+in[0-3]_label RO CPU Voltage sensor (either core or
+ low/high/standard thresholds).
+in[0-3]_input RO Measured voltage in millivolts.
+in[0-3]_min RW Low limit for voltage input.
+in[0-3]_max RW High limit for voltage input.
+in[0-3]_min_alarm RO Voltage input alarm. Returns 1 if
+ voltage input went below min limit,
+ 0 otherwise.
+in[0-3]_max_alarm RO Voltage input alarm. Returns 1 if
+ voltage input went above max limit,
+ 0 otherwise.
+=============================== ======= =======================================
--- /dev/null
+.. SPDX-License-Identifier: GPL-2.0
+
+Kernel driver gsc-hwmon
+=======================
+
+Supported chips: Gateworks GSC
+Datasheet: http://trac.gateworks.com/wiki/gsc
+Author: Tim Harvey <tharvey@gateworks.com>
+
+Description:
+------------
+
+This driver supports hardware monitoring for the temperature sensor,
+various ADC's connected to the GSC, and optional FAN controller available
+on some boards.
+
+
+Voltage Monitoring
+------------------
+
+The voltage inputs are scaled either internally or by the driver depending
+on the GSC version and firmware. The values returned by the driver do not need
+further scaling. The voltage input labels provide the voltage rail name:
+
+inX_input Measured voltage (mV).
+inX_label Name of voltage rail.
+
+
+Temperature Monitoring
+----------------------
+
+Temperatures are measured with 12-bit or 10-bit resolution and are scaled
+either internally or by the driver depending on the GSC version and firmware.
+The values returned by the driver reflect millidegree Celcius:
+
+tempX_input Measured temperature.
+tempX_label Name of temperature input.
+
+
+PWM Output Control
+------------------
+
+The GSC features 1 PWM output that operates in automatic mode where the
+PWM value will be scalled depending on 6 temperature boundaries.
+The tempeature boundaries are read-write and in millidegree Celcius and the
+read-only PWM values range from 0 (off) to 255 (full speed).
+Fan speed will be set to minimum (off) when the temperature sensor reads
+less than pwm1_auto_point1_temp and maximum when the temperature sensor
+equals or exceeds pwm1_auto_point6_temp.
+
+pwm1_auto_point[1-6]_pwm PWM value.
+pwm1_auto_point[1-6]_temp Temperature boundary.
+
------------------------------------------------
======================= ====================================================
+in0_lcrit Critical low shunt voltage
+in0_crit Critical high shunt voltage
+in0_lcrit_alarm Shunt voltage critical low alarm
+in0_crit_alarm Shunt voltage critical high alarm
+in1_lcrit Critical low bus voltage
+in1_crit Critical high bus voltage
+in1_lcrit_alarm Bus voltage critical low alarm
+in1_crit_alarm Bus voltage critical high alarm
+power1_crit Critical high power
+power1_crit_alarm Power critical high alarm
update_interval data conversion time; affects number of samples used
to average results for shunt and bus voltages.
======================= ====================================================
+
+.. note::
+
+ - Configure `shunt_resistor` before configure `power1_crit`, because power
+ value is calculated based on `shunt_resistor` set.
+ - Because of the underlying register implementation, only one `*crit` setting
+ and its `alarm` can be active. Writing to one `*crit` setting clears other
+ `*crit` settings and alarms. Writing 0 to any `*crit` setting clears all
+ `*crit` settings and alarms.
adt7470
adt7475
amc6821
+ amd_energy
asb100
asc7621
aspeed-pwm-tacho
bel-pfe
+ bt1-pvt
coretemp
da9052
da9055
ftsteutates
g760a
g762
+ gsc-hwmon
gl518sm
hih6130
ibmaem
max16064
max16065
max1619
+ max16601
max1668
max197
max20730
http://www.maxim-ic.com/quick_view2.cfm/qv_pk/3497
+ * Maxim MAX6654
+
+ Prefix: 'max6654'
+
+ Addresses scanned: I2C 0x18, 0x19, 0x1a, 0x29, 0x2a, 0x2b,
+
+ 0x4c, 0x4d and 0x4e
+
+ Datasheet: Publicly available at the Maxim website
+
+ https://www.maximintegrated.com/en/products/sensors/MAX6654.html
+
* Maxim MAX6657
Prefix: 'max6657'
* Extended temperature range (breaks compatibility)
* Lower resolution for remote temperature
+MAX6654:
+ * Better local resolution
+ * Selectable address
+ * Remote sensor type selection
+ * Extended temperature range
+ * Extended resolution only available when conversion rate <= 1 Hz
+
MAX6657 and MAX6658:
* Better local resolution
* Remote sensor type selection
All temperature values are given in degrees Celsius. Resolution
is 1.0 degree for the local temperature, 0.125 degree for the remote
-temperature, except for the MAX6657, MAX6658 and MAX6659 which have a
-resolution of 0.125 degree for both temperatures.
+temperature, except for the MAX6654, MAX6657, MAX6658 and MAX6659 which have
+a resolution of 0.125 degree for both temperatures.
Each sensor has its own high and low limits, plus a critical limit.
Additionally, there is a relative hysteresis value common to both critical
--- /dev/null
+.. SPDX-License-Identifier: GPL-2.0
+
+Kernel driver max16601
+======================
+
+Supported chips:
+
+ * Maxim MAX16601
+
+ Prefix: 'max16601'
+
+ Addresses scanned: -
+
+ Datasheet: Not published
+
+Author: Guenter Roeck <linux@roeck-us.net>
+
+
+Description
+-----------
+
+This driver supports the MAX16601 VR13.HC Dual-Output Voltage Regulator
+Chipset.
+
+The driver is a client driver to the core PMBus driver.
+Please see Documentation/hwmon/pmbus.rst for details on PMBus client drivers.
+
+
+Usage Notes
+-----------
+
+This driver does not auto-detect devices. You will have to instantiate the
+devices explicitly. Please see Documentation/i2c/instantiating-devices.rst for
+details.
+
+
+Platform data support
+---------------------
+
+The driver supports standard PMBus driver platform data.
+
+
+Sysfs entries
+-------------
+
+The following attributes are supported.
+
+======================= =======================================================
+in1_label "vin1"
+in1_input VCORE input voltage.
+in1_alarm Input voltage alarm.
+
+in2_label "vout1"
+in2_input VCORE output voltage.
+in2_alarm Output voltage alarm.
+
+curr1_label "iin1"
+curr1_input VCORE input current, derived from duty cycle and output
+ current.
+curr1_max Maximum input current.
+curr1_max_alarm Current high alarm.
+
+curr2_label "iin1.0"
+curr2_input VCORE phase 0 input current.
+
+curr3_label "iin1.1"
+curr3_input VCORE phase 1 input current.
+
+curr4_label "iin1.2"
+curr4_input VCORE phase 2 input current.
+
+curr5_label "iin1.3"
+curr5_input VCORE phase 3 input current.
+
+curr6_label "iin1.4"
+curr6_input VCORE phase 4 input current.
+
+curr7_label "iin1.5"
+curr7_input VCORE phase 5 input current.
+
+curr8_label "iin1.6"
+curr8_input VCORE phase 6 input current.
+
+curr9_label "iin1.7"
+curr9_input VCORE phase 7 input current.
+
+curr10_label "iin2"
+curr10_input VCORE input current, derived from sensor element.
+
+curr11_label "iin3"
+curr11_input VSA input current.
+
+curr12_label "iout1"
+curr12_input VCORE output current.
+curr12_crit Critical output current.
+curr12_crit_alarm Output current critical alarm.
+curr12_max Maximum output current.
+curr12_max_alarm Output current high alarm.
+
+curr13_label "iout1.0"
+curr13_input VCORE phase 0 output current.
+
+curr14_label "iout1.1"
+curr14_input VCORE phase 1 output current.
+
+curr15_label "iout1.2"
+curr15_input VCORE phase 2 output current.
+
+curr16_label "iout1.3"
+curr16_input VCORE phase 3 output current.
+
+curr17_label "iout1.4"
+curr17_input VCORE phase 4 output current.
+
+curr18_label "iout1.5"
+curr18_input VCORE phase 5 output current.
+
+curr19_label "iout1.6"
+curr19_input VCORE phase 6 output current.
+
+curr20_label "iout1.7"
+curr20_input VCORE phase 7 output current.
+
+curr21_label "iout3"
+curr21_input VSA output current.
+curr21_highest Historical maximum VSA output current.
+curr21_reset_history Write any value to reset curr21_highest.
+curr21_crit Critical output current.
+curr21_crit_alarm Output current critical alarm.
+curr21_max Maximum output current.
+curr21_max_alarm Output current high alarm.
+
+power1_label "pin1"
+power1_input Input power, derived from duty cycle and output current.
+power1_alarm Input power alarm.
+
+power2_label "pin2"
+power2_input Input power, derived from input current sensor.
+
+power3_label "pout"
+power3_input Output power.
+
+temp1_input VCORE temperature.
+temp1_crit Critical high temperature.
+temp1_crit_alarm Chip temperature critical high alarm.
+temp1_max Maximum temperature.
+temp1_max_alarm Chip temperature high alarm.
+
+temp2_input TSENSE_0 temperature
+temp3_input TSENSE_1 temperature
+temp4_input TSENSE_2 temperature
+temp5_input TSENSE_3 temperature
+
+temp6_input VSA temperature.
+temp6_crit Critical high temperature.
+temp6_crit_alarm Chip temperature critical high alarm.
+temp6_max Maximum temperature.
+temp6_max_alarm Chip temperature high alarm.
+======================= =======================================================
into two categories:
- Sleeping locks
+ - CPU local locks
- Spinning locks
This document conceptually describes these lock types and provides rules
On PREEMPT_RT kernels, these lock types are converted to sleeping locks:
+ - local_lock
- spinlock_t
- rwlock_t
+
+CPU local locks
+---------------
+
+ - local_lock
+
+On non-PREEMPT_RT kernels, local_lock functions are wrappers around
+preemption and interrupt disabling primitives. Contrary to other locking
+mechanisms, disabling preemption or interrupts are pure CPU local
+concurrency control mechanisms and not suited for inter-CPU concurrency
+control.
+
+
Spinning locks
--------------
_irqsave/restore() Save and disable / restore interrupt disabled state
=================== ====================================================
+
Owner semantics
===============
writer from starving readers.
+local_lock
+==========
+
+local_lock provides a named scope to critical sections which are protected
+by disabling preemption or interrupts.
+
+On non-PREEMPT_RT kernels local_lock operations map to the preemption and
+interrupt disabling and enabling primitives:
+
+ =========================== ======================
+ local_lock(&llock) preempt_disable()
+ local_unlock(&llock) preempt_enable()
+ local_lock_irq(&llock) local_irq_disable()
+ local_unlock_irq(&llock) local_irq_enable()
+ local_lock_save(&llock) local_irq_save()
+ local_lock_restore(&llock) local_irq_save()
+ =========================== ======================
+
+The named scope of local_lock has two advantages over the regular
+primitives:
+
+ - The lock name allows static analysis and is also a clear documentation
+ of the protection scope while the regular primitives are scopeless and
+ opaque.
+
+ - If lockdep is enabled the local_lock gains a lockmap which allows to
+ validate the correctness of the protection. This can detect cases where
+ e.g. a function using preempt_disable() as protection mechanism is
+ invoked from interrupt or soft-interrupt context. Aside of that
+ lockdep_assert_held(&llock) works as with any other locking primitive.
+
+local_lock and PREEMPT_RT
+-------------------------
+
+PREEMPT_RT kernels map local_lock to a per-CPU spinlock_t, thus changing
+semantics:
+
+ - All spinlock_t changes also apply to local_lock.
+
+local_lock usage
+----------------
+
+local_lock should be used in situations where disabling preemption or
+interrupts is the appropriate form of concurrency control to protect
+per-CPU data structures on a non PREEMPT_RT kernel.
+
+local_lock is not suitable to protect against preemption or interrupts on a
+PREEMPT_RT kernel due to the PREEMPT_RT specific spinlock_t semantics.
+
+
raw_spinlock_t and spinlock_t
=============================
PREEMPT_RT caveats
==================
+local_lock on RT
+----------------
+
+The mapping of local_lock to spinlock_t on PREEMPT_RT kernels has a few
+implications. For example, on a non-PREEMPT_RT kernel the following code
+sequence works as expected::
+
+ local_lock_irq(&local_lock);
+ raw_spin_lock(&lock);
+
+and is fully equivalent to::
+
+ raw_spin_lock_irq(&lock);
+
+On a PREEMPT_RT kernel this code sequence breaks because local_lock_irq()
+is mapped to a per-CPU spinlock_t which neither disables interrupts nor
+preemption. The following code sequence works perfectly correct on both
+PREEMPT_RT and non-PREEMPT_RT kernels::
+
+ local_lock_irq(&local_lock);
+ spin_lock(&lock);
+
+Another caveat with local locks is that each local_lock has a specific
+protection scope. So the following substitution is wrong::
+
+ func1()
+ {
+ local_irq_save(flags); -> local_lock_irqsave(&local_lock_1, flags);
+ func3();
+ local_irq_restore(flags); -> local_lock_irqrestore(&local_lock_1, flags);
+ }
+
+ func2()
+ {
+ local_irq_save(flags); -> local_lock_irqsave(&local_lock_2, flags);
+ func3();
+ local_irq_restore(flags); -> local_lock_irqrestore(&local_lock_2, flags);
+ }
+
+ func3()
+ {
+ lockdep_assert_irqs_disabled();
+ access_protected_data();
+ }
+
+On a non-PREEMPT_RT kernel this works correctly, but on a PREEMPT_RT kernel
+local_lock_1 and local_lock_2 are distinct and cannot serialize the callers
+of func3(). Also the lockdep assert will trigger on a PREEMPT_RT kernel
+because local_lock_irqsave() does not disable interrupts due to the
+PREEMPT_RT-specific semantics of spinlock_t. The correct substitution is::
+
+ func1()
+ {
+ local_irq_save(flags); -> local_lock_irqsave(&local_lock, flags);
+ func3();
+ local_irq_restore(flags); -> local_lock_irqrestore(&local_lock, flags);
+ }
+
+ func2()
+ {
+ local_irq_save(flags); -> local_lock_irqsave(&local_lock, flags);
+ func3();
+ local_irq_restore(flags); -> local_lock_irqrestore(&local_lock, flags);
+ }
+
+ func3()
+ {
+ lockdep_assert_held(&local_lock);
+ access_protected_data();
+ }
+
+
spinlock_t and rwlock_t
-----------------------
-These changes in spinlock_t and rwlock_t semantics on PREEMPT_RT kernels
+The changes in spinlock_t and rwlock_t semantics on PREEMPT_RT kernels
have a few implications. For example, on a non-PREEMPT_RT kernel the
following code sequence works as expected::
allowing things like per-CPU interrupt disabled locks to be acquired.
However, this approach should be used only where absolutely necessary.
+A typical scenario is protection of per-CPU variables in thread context::
-raw_spinlock_t
---------------
+ struct foo *p = get_cpu_ptr(&var1);
+
+ spin_lock(&p->lock);
+ p->count += this_cpu_read(var2);
+
+This is correct code on a non-PREEMPT_RT kernel, but on a PREEMPT_RT kernel
+this breaks. The PREEMPT_RT-specific change of spinlock_t semantics does
+not allow to acquire p->lock because get_cpu_ptr() implicitly disables
+preemption. The following substitution works on both kernels::
+
+ struct foo *p;
+
+ migrate_disable();
+ p = this_cpu_ptr(&var1);
+ spin_lock(&p->lock);
+ p->count += this_cpu_read(var2);
+
+On a non-PREEMPT_RT kernel migrate_disable() maps to preempt_disable()
+which makes the above code fully equivalent. On a PREEMPT_RT kernel
+migrate_disable() ensures that the task is pinned on the current CPU which
+in turn guarantees that the per-CPU access to var1 and var2 are staying on
+the same CPU.
+
+The migrate_disable() substitution is not valid for the following
+scenario::
+
+ func()
+ {
+ struct foo *p;
+
+ migrate_disable();
+ p = this_cpu_ptr(&var1);
+ p->val = func2();
+
+While correct on a non-PREEMPT_RT kernel, this breaks on PREEMPT_RT because
+here migrate_disable() does not protect against reentrancy from a
+preempting task. A correct substitution for this case is::
+
+ func()
+ {
+ struct foo *p;
+
+ local_lock(&foo_lock);
+ p = this_cpu_ptr(&var1);
+ p->val = func2();
+
+On a non-PREEMPT_RT kernel this protects against reentrancy by disabling
+preemption. On a PREEMPT_RT kernel this is achieved by acquiring the
+underlying per-CPU spinlock.
+
+
+raw_spinlock_t on RT
+--------------------
Acquiring a raw_spinlock_t disables preemption and possibly also
interrupts, so the critical section must avoid acquiring a regular
The most basic rules are:
- - Lock types of the same lock category (sleeping, spinning) can nest
- arbitrarily as long as they respect the general lock ordering rules to
- prevent deadlocks.
+ - Lock types of the same lock category (sleeping, CPU local, spinning)
+ can nest arbitrarily as long as they respect the general lock ordering
+ rules to prevent deadlocks.
+
+ - Sleeping lock types cannot nest inside CPU local and spinning lock types.
- - Sleeping lock types cannot nest inside spinning lock types.
+ - CPU local and spinning lock types can nest inside sleeping lock types.
- - Spinning lock types can nest inside sleeping lock types.
+ - Spinning lock types can nest inside all lock types
These constraints apply both in PREEMPT_RT and otherwise.
The fact that PREEMPT_RT changes the lock category of spinlock_t and
-rwlock_t from spinning to sleeping means that they cannot be acquired while
-holding a raw spinlock. This results in the following nesting ordering:
+rwlock_t from spinning to sleeping and substitutes local_lock with a
+per-CPU spinlock_t means that they cannot be acquired while holding a raw
+spinlock. This results in the following nesting ordering:
1) Sleeping locks
- 2) spinlock_t and rwlock_t
+ 2) spinlock_t, rwlock_t, local_lock
3) raw_spinlock_t and bit spinlocks
Lockdep will complain if these constraints are violated, both in
|
|
v
- disable_nonboot_cpus()
+ freeze_secondary_cpus()
/* start */
|
v
Release cpu_add_remove_lock
|
v
- /* disable_nonboot_cpus() complete */
+ /* freeze_secondary_cpus() complete */
|
v
Do suspend
Resuming back is likewise, with the counterparts being (in the order of
execution during resume):
-* enable_nonboot_cpus() which involves::
+* thaw_secondary_cpus() which involves::
| Acquire cpu_add_remove_lock
| Decrease cpu_hotplug_disabled, thereby enabling regular cpu hotplug
Coding style is all about readability and maintainability using commonly
available tools.
-The limit on the length of lines is 80 columns and this is a strongly
-preferred limit.
-
-Statements longer than 80 columns will be broken into sensible chunks, unless
-exceeding 80 columns significantly increases readability and does not hide
-information. Descendants are always substantially shorter than the parent and
-are placed substantially to the right. The same applies to function headers
-with a long argument list. However, never break user-visible strings such as
-printk messages, because that breaks the ability to grep for them.
+The preferred limit on the length of a single line is 80 columns.
+
+Statements longer than 80 columns should be broken into sensible chunks,
+unless exceeding 80 columns significantly increases readability and does
+not hide information.
+
+Descendants are always substantially shorter than the parent and are
+are placed substantially to the right. A very commonly used style
+is to align descendants to a function open parenthesis.
+
+These same rules are applied to function headers with a long argument list.
+
+However, never break user-visible strings such as printk messages because
+that breaks the ability to grep for them.
3) Placing Braces and Spaces
SipHash is a cryptographically secure PRF -- a keyed hash function -- that
performs very well for short inputs, hence the name. It was designed by
cryptographers Daniel J. Bernstein and Jean-Philippe Aumasson. It is intended
-as a replacement for some uses of: `jhash`, `md5_transform`, `sha_transform`,
+as a replacement for some uses of: `jhash`, `md5_transform`, `sha1_transform`,
and so forth.
SipHash takes a secret key filled with randomly generated numbers and either
pass the return address pointer as the 'retp' argument to
ftrace_push_return_trace().
-HAVE_FTRACE_NMI_ENTER
----------------------
-
-If you can't trace NMI functions, then skip this option.
-
-<details to be filled>
-
-
HAVE_SYSCALL_TRACEPOINTS
------------------------
3. Raw Gadget provides a way to select a UDC device/driver to bind to,
while GadgetFS currently binds to the first available UDC.
-4. Raw Gadget uses predictable endpoint names (handles) across different
- UDCs (as long as UDCs have enough endpoints of each required transfer
- type).
+4. Raw Gadget explicitly exposes information about endpoints addresses and
+ capabilities allowing a user to write UDC-agnostic gadgets.
5. Raw Gadget has ioctl-based interface instead of a filesystem-based one.
Raw Gadget and react to those depending on what kind of USB device
needs to be emulated.
+Note, that some UDC drivers have fixed addresses assigned to endpoints, and
+therefore arbitrary endpoint addresses can't be used in the descriptors.
+Nevertheles, Raw Gadget provides a UDC-agnostic way to write USB gadgets.
+Once a USB_RAW_EVENT_CONNECT event is received via USB_RAW_IOCTL_EVENT_FETCH,
+the USB_RAW_IOCTL_EPS_INFO ioctl can be used to find out information about
+endpoints that the UDC driver has. Based on that information, the user must
+chose UDC endpoints that will be used for the gadget being emulated, and
+properly assign addresses in endpoint descriptors.
+
+You can find usage examples (along with a test suite) here:
+
+https://github.com/xairy/raw-gadget
+
+Internal details
+~~~~~~~~~~~~~~~~
+
+Currently every endpoint read/write ioctl submits a USB request and waits until
+its completion. This is the desired mode for coverage-guided fuzzing (as we'd
+like all USB request processing happen during the lifetime of a syscall),
+and must be kept in the implementation. (This might be slow for real world
+applications, thus the O_NONBLOCK improvement suggestion below.)
+
Potential future improvements
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
-- Implement ioctl's for setting/clearing halt status on endpoints.
-
-- Reporting more events (suspend, resume, etc.) through
- USB_RAW_IOCTL_EVENT_FETCH.
+- Report more events (suspend, resume, etc.) through USB_RAW_IOCTL_EVENT_FETCH.
- Support O_NONBLOCK I/O.
+
+- Support USB 3 features (accept SS endpoint companion descriptor when
+ enabling endpoints; allow providing stream_id for bulk transfers).
+
+- Support ISO transfer features (expose frame_number for completed requests).
T: git git://people.freedesktop.org/~agd5f/linux
F: drivers/gpu/drm/amd/display/
+AMD ENERGY DRIVER
+M: Naveen Krishna Chatradhi <nchatrad@amd.com>
+L: linux-hwmon@vger.kernel.org
+S: Maintained
+F: Documentation/hwmon/amd_energy.rst
+F: drivers/hwmon/amd_energy.c
+
AMD FAM15H PROCESSOR POWER MONITORING DRIVER
M: Huang Rui <ray.huang@amd.com>
L: linux-hwmon@vger.kernel.org
F: drivers/gpu/drm/amd/include/vi_structs.h
F: include/uapi/linux/kfd_ioctl.h
+AMD SPI DRIVER
+M: Sanjay R Mehta <sanju.mehta@amd.com>
+S: Maintained
+F: drivers/spi/spi-amd.c
+
AMD MP2 I2C DRIVER
M: Elie Morisse <syniurge@gmail.com>
M: Nehal Shah <nehal-bakulchandra.shah@amd.com>
W: https://developer.arm.com/products/system-ip/trustzone-cryptocell/cryptocell-700-family
F: drivers/crypto/ccree/
+CCTRNG ARM TRUSTZONE CRYPTOCELL TRUE RANDOM NUMBER GENERATOR (TRNG) DRIVER
+M: Hadar Gat <hadar.gat@arm.com>
+L: linux-crypto@vger.kernel.org
+S: Supported
+F: drivers/char/hw_random/cctrng.c
+F: drivers/char/hw_random/cctrng.h
+F: Documentation/devicetree/bindings/rng/arm-cctrng.txt
+W: https://developer.arm.com/products/system-ip/trustzone-cryptocell/cryptocell-700-family
+
CEC FRAMEWORK
M: Hans Verkuil <hverkuil-cisco@xs4all.nl>
L: linux-media@vger.kernel.org
DRM DRIVER FOR VMWARE VIRTUAL GPU
M: "VMware Graphics" <linux-graphics-maintainer@vmware.com>
-M: Thomas Hellstrom <thellstrom@vmware.com>
+M: Roland Scheidegger <sroland@vmware.com>
L: dri-devel@lists.freedesktop.org
S: Supported
-T: git git://people.freedesktop.org/~thomash/linux
+T: git git://people.freedesktop.org/~sroland/linux
F: drivers/gpu/drm/vmwgfx/
F: include/uapi/drm/vmwgfx_drm.h
L: linux-edac@vger.kernel.org
S: Supported
F: drivers/edac/sifive_edac.c
-F: drivers/soc/sifive_l2_cache.c
EDAC-SKYLAKE
M: Tony Luck <tony.luck@intel.com>
F: tools/perf/bench/futex*
F: tools/testing/selftests/futex/
+GATEWORKS SYSTEM CONTROLLER (GSC) DRIVER
+M: Tim Harvey <tharvey@gateworks.com>
+M: Robert Jones <rjones@gateworks.com>
+S: Maintained
+F: Documentation/devicetree/bindings/mfd/gateworks-gsc.yaml
+F: drivers/mfd/gateworks-gsc.c
+F: include/linux/mfd/gsc.h
+F: Documentation/hwmon/gsc-hwmon.rst
+F: drivers/hwmon/gsc-hwmon.c
+F: include/linux/platform_data/gsc_hwmon.h
+
GASKET DRIVER FRAMEWORK
M: Rob Springer <rspringer@google.com>
M: Todd Poynor <toddpoynor@google.com>
F: drivers/media/platform/sti/hva
HWPOISON MEMORY FAILURE HANDLING
-M: Naoya Horiguchi <n-horiguchi@ah.jp.nec.com>
+M: Naoya Horiguchi <naoya.horiguchi@nec.com>
L: linux-mm@kvack.org
S: Maintained
F: mm/hwpoison-inject.c
F: drivers/i2c/busses/i2c-parport.c
I2C SUBSYSTEM
-M: Wolfram Sang <wsa@the-dreams.de>
+M: Wolfram Sang <wsa@kernel.org>
L: linux-i2c@vger.kernel.org
S: Maintained
W: https://i2c.wiki.kernel.org/
S: Maintained
W: http://lse.sourceforge.net/kdump/
F: Documentation/admin-guide/kdump/
+F: fs/proc/vmcore.c
+F: include/linux/crash_core.h
+F: include/linux/crash_dump.h
+F: include/uapi/linux/vmcore.h
+F: kernel/crash_*.c
KEENE FM RADIO TRANSMITTER DRIVER
M: Hans Verkuil <hverkuil@xs4all.nl>
F: include/linux/lightnvm.h
F: include/uapi/linux/lightnvm.h
+LINEAR RANGES HELPERS
+M: Mark Brown <broonie@kernel.org>
+R: Matti Vaittinen <matti.vaittinen@fi.rohmeurope.com>
+F: lib/linear_ranges.c
+F: lib/test_linear_ranges.c
+F: include/linux/linear_range.h
+
LINUX FOR POWER MACINTOSH
M: Benjamin Herrenschmidt <benh@kernel.crashing.org>
L: linuxppc-dev@lists.ozlabs.org
S: Maintained
F: drivers/net/ethernet/mediatek/
+MEDIATEK I2C CONTROLLER DRIVER
+M: Qii Wang <qii.wang@mediatek.com>
+L: linux-i2c@vger.kernel.org
+S: Maintained
+F: Documentation/devicetree/bindings/i2c/i2c-mt65xx.txt
+F: drivers/i2c/busses/i2c-mt65xx.c
+
MEDIATEK JPEG DRIVER
M: Rick Chang <rick.chang@mediatek.com>
M: Bin Liu <bin.liu@mediatek.com>
NETWORKING DRIVERS
M: "David S. Miller" <davem@davemloft.net>
+M: Jakub Kicinski <kuba@kernel.org>
L: netdev@vger.kernel.org
-S: Odd Fixes
+S: Maintained
W: http://www.linuxfoundation.org/en/Net
Q: http://patchwork.ozlabs.org/project/netdev/list/
T: git git://git.kernel.org/pub/scm/linux/kernel/git/netdev/net.git
S: Maintained
T: git git://git.kernel.org/pub/scm/linux/kernel/git/kees/linux.git for-next/pstore
F: Documentation/admin-guide/ramoops.rst
+F: Documentation/admin-guide/pstore-blk.rst
F: Documentation/devicetree/bindings/reserved-memory/ramoops.txt
F: drivers/acpi/apei/erst.c
F: drivers/firmware/efi/efi-pstore.c
L: linux-kernel@vger.kernel.org
S: Supported
F: Documentation/x86/resctrl*
-F: arch/x86/include/asm/resctrl_sched.h
+F: arch/x86/include/asm/resctrl.h
F: arch/x86/kernel/cpu/resctrl/
F: tools/testing/selftests/resctrl/
S390 IUCV NETWORK LAYER
M: Julian Wiedmann <jwi@linux.ibm.com>
+M: Karsten Graul <kgraul@linux.ibm.com>
M: Ursula Braun <ubraun@linux.ibm.com>
L: linux-s390@vger.kernel.org
S: Supported
S390 NETWORK DRIVERS
M: Julian Wiedmann <jwi@linux.ibm.com>
+M: Karsten Graul <kgraul@linux.ibm.com>
M: Ursula Braun <ubraun@linux.ibm.com>
L: linux-s390@vger.kernel.org
S: Supported
VERSION = 5
PATCHLEVEL = 7
SUBLEVEL = 0
-EXTRAVERSION = -rc5
+EXTRAVERSION =
NAME = Kleptomaniac Octopus
# *DOCUMENTATION*
CONFIG_DRM_ETNAVIV=y
CONFIG_FB=y
CONFIG_FRAMEBUFFER_CONSOLE=y
+CONFIG_USB=y
CONFIG_USB_EHCI_HCD=y
CONFIG_USB_EHCI_HCD_PLATFORM=y
CONFIG_USB_OHCI_HCD=y
/* clobbers r5 register */
.macro DSP_EARLY_INIT
+#ifdef CONFIG_ISA_ARCV2
lr r5, [ARC_AUX_DSP_BUILD]
bmsk r5, r5, 7
breq r5, 0, 1f
mov r5, DSP_CTRL_DISABLED_ALL
sr r5, [ARC_AUX_DSP_CTRL]
1:
+#endif
.endm
/* clobbers r10, r11 registers pair */
#ifdef CONFIG_ARC_IRQ_NO_AUTOSAVE
__RESTORE_REGFILE_HARD
+
+ ; SP points to PC/STAT32: hw restores them despite NO_AUTOSAVE
add sp, sp, SZ_PT_REGS - 8
#else
add sp, sp, PT_r0
# Copyright (C) 2004, 2007-2010, 2011-2012 Synopsys, Inc. (www.synopsys.com)
#
-# Pass UTS_MACHINE for user_regset definition
-CFLAGS_ptrace.o += -DUTS_MACHINE='"$(UTS_MACHINE)"'
-
obj-y := arcksyms.o setup.o irq.o reset.o ptrace.o process.o devtree.o
obj-y += signal.o traps.o sys.o troubleshoot.o stacktrace.o disasm.o
obj-$(CONFIG_ISA_ARCOMPACT) += entry-compact.o intc-compact.o
};
static const struct user_regset_view user_arc_view = {
- .name = UTS_MACHINE,
+ .name = "arc",
.e_machine = EM_ARC_INUSE,
.regsets = arc_regsets,
.n = ARRAY_SIZE(arc_regsets)
#include <linux/clocksource.h>
#include <linux/console.h>
#include <linux/module.h>
+#include <linux/sizes.h>
#include <linux/cpu.h>
#include <linux/of_clk.h>
#include <linux/of_fdt.h>
if ((unsigned int)__arc_dccm_base != cpu->dccm.base_addr)
panic("Linux built with incorrect DCCM Base address\n");
- if (CONFIG_ARC_DCCM_SZ != cpu->dccm.sz)
+ if (CONFIG_ARC_DCCM_SZ * SZ_1K != cpu->dccm.sz)
panic("Linux built with incorrect DCCM Size\n");
#endif
#ifdef CONFIG_ARC_HAS_ICCM
- if (CONFIG_ARC_ICCM_SZ != cpu->iccm.sz)
+ if (CONFIG_ARC_ICCM_SZ * SZ_1K != cpu->iccm.sz)
panic("Linux built with incorrect ICCM Size\n");
#endif
if (user_mode(regs))
show_faulting_vma(regs->ret); /* faulting code, not data */
- pr_info("ECR: 0x%08lx EFA: 0x%08lx ERET: 0x%08lx\n",
- regs->event, current->thread.fault_address, regs->ret);
-
- pr_info("STAT32: 0x%08lx", regs->status32);
+ pr_info("ECR: 0x%08lx EFA: 0x%08lx ERET: 0x%08lx\nSTAT: 0x%08lx",
+ regs->event, current->thread.fault_address, regs->ret,
+ regs->status32);
#define STS_BIT(r, bit) r->status32 & STATUS_##bit##_MASK ? #bit" " : ""
(regs->status32 & STATUS_U_MASK) ? "U " : "K ",
STS_BIT(regs, DE), STS_BIT(regs, AE));
#endif
- pr_cont(" BTA: 0x%08lx\n", regs->bta);
- pr_info("BLK: %pS\n SP: 0x%08lx FP: 0x%08lx\n",
- (void *)regs->blink, regs->sp, regs->fp);
+ pr_cont(" BTA: 0x%08lx\n SP: 0x%08lx FP: 0x%08lx BLK: %pS\n",
+ regs->bta, regs->sp, regs->fp, (void *)regs->blink);
pr_info("LPS: 0x%08lx\tLPE: 0x%08lx\tLPC: 0x%08lx\n",
- regs->lp_start, regs->lp_end, regs->lp_count);
+ regs->lp_start, regs->lp_end, regs->lp_count);
/* print regs->r0 thru regs->r12
* Sequential printing was generating horrible code
#endif
/* update frame */
-#ifndef CONFIG_AS_CFI_SIGNAL_FRAME
if (frame->call_frame
&& !UNW_DEFAULT_RA(state.regs[retAddrReg], state.dataAlign))
frame->call_frame = 0;
-#endif
cfa = FRAME_REG(state.cfa.reg, unsigned long) + state.cfa.offs;
startLoc = min_t(unsigned long, UNW_SP(frame), cfa);
endLoc = max_t(unsigned long, UNW_SP(frame), cfa);
menuconfig ARC_PLAT_EZNPS
bool "\"EZchip\" ARC dev platform"
+ depends on ISA_ARCOMPACT
select CPU_BIG_ENDIAN
select CLKSRC_NPS if !PHYS_ADDR_T_64BIT
select EZNPS_GIC
select ARCH_HAS_KEEPINITRD
select ARCH_HAS_KCOV
select ARCH_HAS_MEMBARRIER_SYNC_CORE
+ select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
select ARCH_HAS_PTE_SPECIAL if ARM_LPAE
select ARCH_HAS_PHYS_TO_DMA
select ARCH_HAS_SETUP_DMA_OPS
select UCS2_STRING
select EFI_PARAMS_FROM_FDT
select EFI_STUB
- select EFI_ARMSTUB
+ select EFI_GENERIC_STUB
select EFI_RUNTIME_WRAPPERS
---help---
This option provides support for runtime services provided
.long __pecoff_code_size @ SizeOfCode
.long __pecoff_data_size @ SizeOfInitializedData
.long 0 @ SizeOfUninitializedData
- .long efi_entry - start @ AddressOfEntryPoint
+ .long efi_pe_entry - start @ AddressOfEntryPoint
.long start_offset @ BaseOfCode
.long __pecoff_data_start - start @ BaseOfData
}
.table : ALIGN(4) {
_table_start = .;
- LONG(ZIMAGE_MAGIC(2))
+ LONG(ZIMAGE_MAGIC(4))
LONG(ZIMAGE_MAGIC(0x5a534c4b))
LONG(ZIMAGE_MAGIC(__piggy_size_addr - _start))
LONG(ZIMAGE_MAGIC(_kernel_bss_size))
* The EFI stub always executes from RAM, and runs strictly before the
* decompressor, so we can make an exception for its r/w data, and keep it
*/
- *(.data.efistub)
+ *(.data.efistub .bss.efistub)
__pecoff_data_end = .;
/*
&cpsw_emac0 {
phy-handle = <ðphy0>;
- phy-mode = "rgmii";
+ phy-mode = "rgmii-rxid";
};
&elm {
&cpsw_emac0 {
phy-handle = <ðphy0>;
- phy-mode = "rgmii";
+ phy-mode = "rgmii-rxid";
};
&rtc {
&cpsw_emac0 {
phy-handle = <ðphy0>;
- phy-mode = "rgmii";
+ phy-mode = "rgmii-rxid";
dual_emac_res_vlan = <1>;
};
&cpsw_emac1 {
phy-handle = <ðphy1>;
- phy-mode = "rgmii";
+ phy-mode = "rgmii-rxid";
dual_emac_res_vlan = <2>;
};
&cpsw_port1 {
phy-handle = <ðphy0_sw>;
- phy-mode = "rgmii";
+ phy-mode = "rgmii-rxid";
ti,dual-emac-pvid = <1>;
};
&cpsw_port2 {
phy-handle = <ðphy1_sw>;
- phy-mode = "rgmii";
+ phy-mode = "rgmii-rxid";
ti,dual-emac-pvid = <2>;
};
status = "okay";
dual_emac;
};
+
+&m_can0 {
+ status = "disabled";
+};
&cpsw_emac0 {
phy-handle = <&phy0>;
- phy-mode = "rgmii";
+ phy-mode = "rgmii-rxid";
dual_emac_res_vlan = <1>;
};
&cpsw_emac1 {
phy-handle = <&phy1>;
- phy-mode = "rgmii";
+ phy-mode = "rgmii-rxid";
dual_emac_res_vlan = <2>;
};
&cpsw_emac0 {
phy-handle = <ðphy0>;
- phy-mode = "rgmii";
+ phy-mode = "rgmii-rxid";
dual_emac_res_vlan = <1>;
};
&cpsw_emac1 {
phy-handle = <ðphy1>;
- phy-mode = "rgmii";
+ phy-mode = "rgmii-rxid";
dual_emac_res_vlan = <2>;
};
timer@20200 {
compatible = "arm,cortex-a9-global-timer";
reg = <0x20200 0x100>;
- interrupts = <GIC_PPI 11 IRQ_TYPE_LEVEL_HIGH>;
+ interrupts = <GIC_PPI 11 IRQ_TYPE_EDGE_RISING>;
clocks = <&periph_clk>;
};
compatible = "arm,cortex-a9-twd-timer";
reg = <0x20600 0x20>;
interrupts = <GIC_PPI 13 (GIC_CPU_MASK_SIMPLE(1) |
- IRQ_TYPE_LEVEL_HIGH)>;
+ IRQ_TYPE_EDGE_RISING)>;
clocks = <&periph_clk>;
};
compatible = "arm,cortex-a9-twd-wdt";
reg = <0x20620 0x20>;
interrupts = <GIC_PPI 14 (GIC_CPU_MASK_SIMPLE(1) |
- IRQ_TYPE_LEVEL_HIGH)>;
+ IRQ_TYPE_EDGE_RISING)>;
clocks = <&periph_clk>;
};
leds {
act {
- gpios = <&gpio 47 GPIO_ACTIVE_HIGH>;
+ gpios = <&gpio 47 GPIO_ACTIVE_LOW>;
};
};
davinci_mdio: mdio@800 {
compatible = "ti,cpsw-mdio", "ti,davinci_mdio";
- clocks = <&alwon_ethernet_clkctrl DM814_ETHERNET_CPGMAC0_CLKCTRL 0>;
+ clocks = <&cpsw_125mhz_gclk>;
clock-names = "fck";
#address-cells = <1>;
#size-cells = <0>;
#address-cells = <1>;
ranges = <0x51000000 0x51000000 0x3000
0x0 0x20000000 0x10000000>;
+ dma-ranges;
/**
* To enable PCI endpoint mode, disable the pcie1_rc
* node and enable pcie1_ep mode.
device_type = "pci";
ranges = <0x81000000 0 0 0x03000 0 0x00010000
0x82000000 0 0x20013000 0x13000 0 0xffed000>;
- dma-ranges = <0x02000000 0x0 0x00000000 0x00000000 0x1 0x00000000>;
bus-range = <0x00 0xff>;
#interrupt-cells = <1>;
num-lanes = <1>;
#address-cells = <1>;
ranges = <0x51800000 0x51800000 0x3000
0x0 0x30000000 0x10000000>;
+ dma-ranges;
status = "disabled";
pcie2_rc: pcie@51800000 {
reg = <0x51800000 0x2000>, <0x51802000 0x14c>, <0x1000 0x2000>;
device_type = "pci";
ranges = <0x81000000 0 0 0x03000 0 0x00010000
0x82000000 0 0x30013000 0x13000 0 0xffed000>;
- dma-ranges = <0x02000000 0x0 0x00000000 0x00000000 0x1 0x00000000>;
bus-range = <0x00 0xff>;
#interrupt-cells = <1>;
num-lanes = <1>;
imx27-phycard-s-rdk {
pinctrl_i2c1: i2c1grp {
fsl,pins = <
- MX27_PAD_I2C2_SDA__I2C2_SDA 0x0
- MX27_PAD_I2C2_SCL__I2C2_SCL 0x0
+ MX27_PAD_I2C_DATA__I2C_DATA 0x0
+ MX27_PAD_I2C_CLK__I2C_CLK 0x0
>;
};
};
&switch_ports {
- /delete-node/ port@2;
+ /delete-node/ port@3;
};
&touchscreen {
};
};
-&clks {
- assigned-clocks = <&clks IMX6QDL_CLK_LDB_DI0_SEL>,
- <&clks IMX6QDL_CLK_LDB_DI1_SEL>;
- assigned-clock-parents = <&clks IMX6QDL_CLK_PLL3_USB_OTG>,
- <&clks IMX6QDL_CLK_PLL3_USB_OTG>;
-};
-
&ldb {
status = "okay";
};
};
-&clks {
- assigned-clocks = <&clks IMX6QDL_CLK_LDB_DI0_SEL>,
- <&clks IMX6QDL_CLK_LDB_DI1_SEL>;
- assigned-clock-parents = <&clks IMX6QDL_CLK_PLL3_USB_OTG>,
- <&clks IMX6QDL_CLK_PLL3_USB_OTG>;
-};
-
&ldb {
status = "okay";
};
};
-&clks {
- assigned-clocks = <&clks IMX6QDL_CLK_LDB_DI0_SEL>,
- <&clks IMX6QDL_CLK_LDB_DI1_SEL>,
- <&clks IMX6QDL_CLK_IPU1_DI0_PRE_SEL>,
- <&clks IMX6QDL_CLK_IPU2_DI0_PRE_SEL>;
- assigned-clock-parents = <&clks IMX6QDL_CLK_PLL5_VIDEO_DIV>,
- <&clks IMX6QDL_CLK_PLL5_VIDEO_DIV>,
- <&clks IMX6QDL_CLK_PLL2_PFD2_396M>,
- <&clks IMX6QDL_CLK_PLL2_PFD2_396M>;
-};
-
&ldb {
fsl,dual-channel;
status = "okay";
#interrupt-cells = <1>;
};
};
+
+&clks {
+ assigned-clocks = <&clks IMX6QDL_CLK_LDB_DI0_SEL>,
+ <&clks IMX6QDL_CLK_LDB_DI1_SEL>,
+ <&clks IMX6QDL_CLK_IPU1_DI0_PRE_SEL>,
+ <&clks IMX6QDL_CLK_IPU1_DI1_PRE_SEL>,
+ <&clks IMX6QDL_CLK_IPU2_DI0_PRE_SEL>,
+ <&clks IMX6QDL_CLK_IPU2_DI1_PRE_SEL>;
+ assigned-clock-parents = <&clks IMX6QDL_CLK_PLL5_VIDEO_DIV>,
+ <&clks IMX6QDL_CLK_PLL5_VIDEO_DIV>,
+ <&clks IMX6QDL_CLK_PLL2_PFD0_352M>,
+ <&clks IMX6QDL_CLK_PLL2_PFD0_352M>,
+ <&clks IMX6QDL_CLK_PLL2_PFD0_352M>,
+ <&clks IMX6QDL_CLK_PLL2_PFD0_352M>;
+};
adi,input-depth = <8>;
adi,input-colorspace = "rgb";
adi,input-clock = "1x";
- adi,input-style = <1>;
- adi,input-justification = "evenly";
ports {
#address-cells = <1>;
status = "okay";
};
-&ssp3 {
+&ssp1 {
status = "okay";
- cs-gpios = <&gpio 46 GPIO_ACTIVE_HIGH>;
+ cs-gpios = <&gpio 46 GPIO_ACTIVE_LOW>;
firmware-flash@0 {
- compatible = "st,m25p80", "jedec,spi-nor";
+ compatible = "winbond,w25q32", "jedec,spi-nor";
reg = <0>;
- spi-max-frequency = <40000000>;
+ spi-max-frequency = <104000000>;
m25p,fast-read;
};
};
-&ssp4 {
- cs-gpios = <&gpio 56 GPIO_ACTIVE_HIGH>;
+&ssp2 {
+ cs-gpios = <&gpio 56 GPIO_ACTIVE_LOW>;
status = "okay";
};
};
hsic_phy0: hsic-phy@f0001800 {
- compatible = "marvell,mmp3-hsic-phy",
- "usb-nop-xceiv";
+ compatible = "marvell,mmp3-hsic-phy";
reg = <0xf0001800 0x40>;
#phy-cells = <0>;
status = "disabled";
};
hsic_phy1: hsic-phy@f0002800 {
- compatible = "marvell,mmp3-hsic-phy",
- "usb-nop-xceiv";
+ compatible = "marvell,mmp3-hsic-phy";
reg = <0xf0002800 0x40>;
#phy-cells = <0>;
status = "disabled";
};
soc_clocks: clocks@d4050000 {
- compatible = "marvell,mmp2-clock";
+ compatible = "marvell,mmp3-clock";
reg = <0xd4050000 0x1000>,
<0xd4282800 0x400>,
<0xd4015000 0x1000>;
};
&mmc3 {
+ pinctrl-names = "default";
+ pinctrl-0 = <&mmc3_pins>;
vmmc-supply = <&wl12xx_vmmc>;
/* uart2_tx.sdmmc3_dat1 pad as wakeirq */
interrupts-extended = <&wakeupgen GIC_SPI 94 IRQ_TYPE_LEVEL_HIGH
>;
};
+ /*
+ * Android uses PIN_OFF_INPUT_PULLDOWN | PIN_INPUT_PULLUP | MUX_MODE3
+ * for gpio_100, but the internal pull makes wlan flakey on some
+ * devices. Off mode value should be tested if we have off mode working
+ * later on.
+ */
+ mmc3_pins: pinmux_mmc3_pins {
+ pinctrl-single,pins = <
+ /* 0x4a10008e gpmc_wait2.gpio_100 d23 */
+ OMAP4_IOPAD(0x08e, PIN_INPUT | MUX_MODE3)
+
+ /* 0x4a100102 abe_mcbsp1_dx.sdmmc3_dat2 ab25 */
+ OMAP4_IOPAD(0x102, PIN_INPUT_PULLUP | MUX_MODE1)
+
+ /* 0x4a100104 abe_mcbsp1_fsx.sdmmc3_dat3 ac27 */
+ OMAP4_IOPAD(0x104, PIN_INPUT_PULLUP | MUX_MODE1)
+
+ /* 0x4a100118 uart2_cts.sdmmc3_clk ab26 */
+ OMAP4_IOPAD(0x118, PIN_INPUT | MUX_MODE1)
+
+ /* 0x4a10011a uart2_rts.sdmmc3_cmd ab27 */
+ OMAP4_IOPAD(0x11a, PIN_INPUT_PULLUP | MUX_MODE1)
+
+ /* 0x4a10011c uart2_rx.sdmmc3_dat0 aa25 */
+ OMAP4_IOPAD(0x11c, PIN_INPUT_PULLUP | MUX_MODE1)
+
+ /* 0x4a10011e uart2_tx.sdmmc3_dat1 aa26 */
+ OMAP4_IOPAD(0x11e, PIN_INPUT_PULLUP | MUX_MODE1)
+ >;
+ };
+
/* gpmc_ncs0.gpio_50 */
poweroff_gpio: pinmux_poweroff_pins {
pinctrl-single,pins = <
};
/*
- * As uart1 is wired to mdm6600 with rts and cts, we can use the cts pin for
- * uart1 wakeirq.
+ * The uart1 port is wired to mdm6600 with rts and cts. The modem uses gpio_149
+ * for wake-up events for both the USB PHY and the UART. We can use gpio_149
+ * pad as the shared wakeirq for the UART rather than the RX or CTS pad as we
+ * have gpio_149 trigger before the UART transfer starts.
*/
&uart1 {
pinctrl-names = "default";
pinctrl-0 = <&uart1_pins>;
interrupts-extended = <&wakeupgen GIC_SPI 72 IRQ_TYPE_LEVEL_HIGH
- &omap4_pmx_core 0xfc>;
+ &omap4_pmx_core 0x110>;
+ uart-has-rtscts;
+ current-speed = <115200>;
};
&uart3 {
reg = <0xe803b000 0x30>;
interrupts = <GIC_SPI 56 IRQ_TYPE_EDGE_RISING>;
clocks = <&cpg CPG_MOD 36>;
- clock-names = "ostm0";
power-domains = <&cpg>;
status = "disabled";
};
reg = <0xe803c000 0x30>;
interrupts = <GIC_SPI 57 IRQ_TYPE_EDGE_RISING>;
clocks = <&cpg CPG_MOD 35>;
- clock-names = "ostm1";
power-domains = <&cpg>;
status = "disabled";
};
reg = <0xe803d000 0x30>;
interrupts = <GIC_SPI 58 IRQ_TYPE_EDGE_RISING>;
clocks = <&cpg CPG_MOD 34>;
- clock-names = "ostm2";
power-domains = <&cpg>;
status = "disabled";
};
cmt1: timer@e6130000 {
compatible = "renesas,r8a73a4-cmt1", "renesas,rcar-gen2-cmt1";
reg = <0 0xe6130000 0 0x1004>;
- interrupts = <GIC_SPI 120 IRQ_TYPE_LEVEL_HIGH>;
+ interrupts = <GIC_SPI 120 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 121 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 122 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 123 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 124 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 125 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 126 IRQ_TYPE_LEVEL_HIGH>,
+ <GIC_SPI 127 IRQ_TYPE_LEVEL_HIGH>;
clocks = <&mstp3_clks R8A73A4_CLK_CMT1>;
clock-names = "fck";
power-domains = <&pd_c5>;
cpg_clocks: cpg_clocks@e6150000 {
compatible = "renesas,r8a7740-cpg-clocks";
reg = <0xe6150000 0x10000>;
- clocks = <&extal1_clk>, <&extalr_clk>;
+ clocks = <&extal1_clk>, <&extal2_clk>, <&extalr_clk>;
#clock-cells = <1>;
clock-output-names = "system", "pllc0", "pllc1",
"pllc2", "r",
adi,input-depth = <8>;
adi,input-colorspace = "rgb";
adi,input-clock = "1x";
- adi,input-style = <1>;
- adi,input-justification = "evenly";
ports {
#address-cells = <1>;
adi,input-depth = <8>;
adi,input-colorspace = "rgb";
adi,input-clock = "1x";
- adi,input-style = <1>;
- adi,input-justification = "evenly";
ports {
#address-cells = <1>;
adi,input-depth = <8>;
adi,input-colorspace = "rgb";
adi,input-clock = "1x";
- adi,input-style = <1>;
- adi,input-justification = "evenly";
ports {
#address-cells = <1>;
adi,input-depth = <8>;
adi,input-colorspace = "rgb";
adi,input-clock = "1x";
- adi,input-style = <1>;
- adi,input-justification = "evenly";
ports {
#address-cells = <1>;
adi,input-depth = <8>;
adi,input-colorspace = "rgb";
adi,input-clock = "1x";
- adi,input-style = <1>;
- adi,input-justification = "evenly";
ports {
#address-cells = <1>;
adi,input-depth = <8>;
adi,input-colorspace = "rgb";
adi,input-clock = "1x";
- adi,input-style = <1>;
- adi,input-justification = "evenly";
ports {
#address-cells = <1>;
*/
hdmi@3d {
compatible = "adi,adv7513";
- reg = <0x3d>, <0x2d>, <0x4d>, <0x5d>;
- reg-names = "main", "cec", "edid", "packet";
+ reg = <0x3d>, <0x4d>, <0x2d>, <0x5d>;
+ reg-names = "main", "edid", "cec", "packet";
adi,input-depth = <8>;
adi,input-colorspace = "rgb";
adi,input-clock = "1x";
- adi,input-style = <1>;
- adi,input-justification = "evenly";
ports {
#address-cells = <1>;
hdmi@39 {
compatible = "adi,adv7513";
- reg = <0x39>, <0x29>, <0x49>, <0x59>;
- reg-names = "main", "cec", "edid", "packet";
+ reg = <0x39>, <0x49>, <0x29>, <0x59>;
+ reg-names = "main", "edid", "cec", "packet";
adi,input-depth = <8>;
adi,input-colorspace = "rgb";
adi,input-clock = "1x";
- adi,input-style = <1>;
- adi,input-justification = "evenly";
ports {
#address-cells = <1>;
adi,input-depth = <8>;
adi,input-colorspace = "rgb";
adi,input-clock = "1x";
- adi,input-style = <1>;
- adi,input-justification = "evenly";
ports {
#address-cells = <1>;
adi,input-depth = <8>;
adi,input-colorspace = "rgb";
adi,input-clock = "1x";
- adi,input-style = <1>;
- adi,input-justification = "evenly";
ports {
#address-cells = <1>;
assigned-clocks = <&cru SCLK_GPU>;
assigned-clock-rates = <100000000>;
clocks = <&cru SCLK_GPU>, <&cru SCLK_GPU>;
- clock-names = "core", "bus";
+ clock-names = "bus", "core";
resets = <&cru SRST_GPU>;
status = "disabled";
};
#address-cells = <1>;
#size-cells = <0>;
- phy: phy@0 {
+ phy: ethernet-phy@0 {
compatible = "ethernet-phy-id1234.d400", "ethernet-phy-ieee802.3-c22";
reg = <0>;
clocks = <&cru SCLK_MAC_PHY>;
#address-cells = <1>;
#size-cells = <0>;
- phy: phy@0 {
+ phy: ethernet-phy@0 {
compatible = "ethernet-phy-id1234.d400",
"ethernet-phy-ieee802.3-c22";
reg = <0>;
"pp1",
"ppmmu1";
clocks = <&cru ACLK_GPU>, <&cru ACLK_GPU>;
- clock-names = "core", "bus";
+ clock-names = "bus", "core";
resets = <&cru SRST_GPU_A>;
status = "disabled";
};
};
};
- spi-0 {
+ spi0 {
spi0_clk: spi0-clk {
rockchip,pins = <0 RK_PB1 2 &pcfg_pull_up>;
};
};
};
- spi-1 {
+ spi1 {
spi1_clk: spi1-clk {
rockchip,pins = <0 RK_PC7 2 &pcfg_pull_up>;
};
compatible = "arm,mali-400";
reg = <0x10090000 0x10000>;
clocks = <&cru ACLK_GPU>, <&cru ACLK_GPU>;
- clock-names = "core", "bus";
+ clock-names = "bus", "core";
assigned-clocks = <&cru ACLK_GPU>;
assigned-clock-rates = <100000000>;
resets = <&cru SRST_GPU>;
CONFIG_SPI=y
CONFIG_SPI_DAVINCI=y
CONFIG_SPI_SPIDEV=y
+CONFIG_PTP_1588_CLOCK=y
CONFIG_PINCTRL_SINGLE=y
CONFIG_GPIOLIB=y
CONFIG_GPIO_SYSFS=y
CONFIG_HSI=m
CONFIG_OMAP_SSI=m
CONFIG_SSI_PROTOCOL=m
+CONFIG_PTP_1588_CLOCK=y
CONFIG_PINCTRL_SINGLE=y
CONFIG_DEBUG_GPIO=y
CONFIG_GPIO_SYSFS=y
#include <crypto/internal/hash.h>
#include <linux/init.h>
#include <linux/module.h>
-#include <linux/cryptohash.h>
#include <linux/types.h>
#include <crypto/sha.h>
#include <crypto/sha1_base.h>
#include <linux/init.h>
#include <linux/module.h>
#include <linux/mm.h>
-#include <linux/cryptohash.h>
#include <linux/types.h>
#include <crypto/sha.h>
#include <crypto/sha1_base.h>
#include <linux/init.h>
#include <linux/module.h>
#include <linux/mm.h>
-#include <linux/cryptohash.h>
#include <linux/types.h>
#include <linux/string.h>
#include <crypto/sha.h>
#include <crypto/internal/hash.h>
#include <crypto/internal/simd.h>
-#include <linux/cryptohash.h>
#include <linux/types.h>
#include <linux/string.h>
#include <crypto/sha.h>
#endif
#include <asm/ptrace.h>
-#include <asm/domain.h>
#include <asm/opcodes-virt.h>
#include <asm/asm-offsets.h>
#include <asm/page.h>
#include <asm/thread_info.h>
+#include <asm/uaccess-asm.h>
#define IOMEM(x) (x)
.size \name , . - \name
.endm
- .macro csdb
-#ifdef CONFIG_THUMB2_KERNEL
- .inst.w 0xf3af8014
-#else
- .inst 0xe320f014
-#endif
- .endm
-
- .macro check_uaccess, addr:req, size:req, limit:req, tmp:req, bad:req
-#ifndef CONFIG_CPU_USE_DOMAINS
- adds \tmp, \addr, #\size - 1
- sbcscc \tmp, \tmp, \limit
- bcs \bad
-#ifdef CONFIG_CPU_SPECTRE
- movcs \addr, #0
- csdb
-#endif
-#endif
- .endm
-
- .macro uaccess_mask_range_ptr, addr:req, size:req, limit:req, tmp:req
-#ifdef CONFIG_CPU_SPECTRE
- sub \tmp, \limit, #1
- subs \tmp, \tmp, \addr @ tmp = limit - 1 - addr
- addhs \tmp, \tmp, #1 @ if (tmp >= 0) {
- subshs \tmp, \tmp, \size @ tmp = limit - (addr + size) }
- movlo \addr, #0 @ if (tmp < 0) addr = NULL
- csdb
-#endif
- .endm
-
- .macro uaccess_disable, tmp, isb=1
-#ifdef CONFIG_CPU_SW_DOMAIN_PAN
- /*
- * Whenever we re-enter userspace, the domains should always be
- * set appropriately.
- */
- mov \tmp, #DACR_UACCESS_DISABLE
- mcr p15, 0, \tmp, c3, c0, 0 @ Set domain register
- .if \isb
- instr_sync
- .endif
-#endif
- .endm
-
- .macro uaccess_enable, tmp, isb=1
-#ifdef CONFIG_CPU_SW_DOMAIN_PAN
- /*
- * Whenever we re-enter userspace, the domains should always be
- * set appropriately.
- */
- mov \tmp, #DACR_UACCESS_ENABLE
- mcr p15, 0, \tmp, c3, c0, 0
- .if \isb
- instr_sync
- .endif
-#endif
- .endm
-
- .macro uaccess_save, tmp
-#ifdef CONFIG_CPU_SW_DOMAIN_PAN
- mrc p15, 0, \tmp, c3, c0, 0
- str \tmp, [sp, #SVC_DACR]
-#endif
- .endm
-
- .macro uaccess_restore
-#ifdef CONFIG_CPU_SW_DOMAIN_PAN
- ldr r0, [sp, #SVC_DACR]
- mcr p15, 0, r0, c3, c0, 0
-#endif
- .endm
-
.irp c,,eq,ne,cs,cc,mi,pl,vs,vc,hi,ls,ge,lt,gt,le,hs,lo
.macro ret\c, reg
#if __LINUX_ARM_ARCH__ < 6
/* arch specific definitions used by the stub code */
-#define efi_bs_call(func, ...) efi_system_table()->boottime->func(__VA_ARGS__)
-#define efi_rt_call(func, ...) efi_system_table()->runtime->func(__VA_ARGS__)
-#define efi_is_native() (true)
-
-#define efi_table_attr(inst, attr) (inst->attr)
-
-#define efi_call_proto(inst, func, ...) inst->func(inst, ##__VA_ARGS__)
-
struct screen_info *alloc_screen_info(void);
void free_screen_info(struct screen_info *si);
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0-only */
+
+#ifndef __ASM_UACCESS_ASM_H__
+#define __ASM_UACCESS_ASM_H__
+
+#include <asm/asm-offsets.h>
+#include <asm/domain.h>
+#include <asm/memory.h>
+#include <asm/thread_info.h>
+
+ .macro csdb
+#ifdef CONFIG_THUMB2_KERNEL
+ .inst.w 0xf3af8014
+#else
+ .inst 0xe320f014
+#endif
+ .endm
+
+ .macro check_uaccess, addr:req, size:req, limit:req, tmp:req, bad:req
+#ifndef CONFIG_CPU_USE_DOMAINS
+ adds \tmp, \addr, #\size - 1
+ sbcscc \tmp, \tmp, \limit
+ bcs \bad
+#ifdef CONFIG_CPU_SPECTRE
+ movcs \addr, #0
+ csdb
+#endif
+#endif
+ .endm
+
+ .macro uaccess_mask_range_ptr, addr:req, size:req, limit:req, tmp:req
+#ifdef CONFIG_CPU_SPECTRE
+ sub \tmp, \limit, #1
+ subs \tmp, \tmp, \addr @ tmp = limit - 1 - addr
+ addhs \tmp, \tmp, #1 @ if (tmp >= 0) {
+ subshs \tmp, \tmp, \size @ tmp = limit - (addr + size) }
+ movlo \addr, #0 @ if (tmp < 0) addr = NULL
+ csdb
+#endif
+ .endm
+
+ .macro uaccess_disable, tmp, isb=1
+#ifdef CONFIG_CPU_SW_DOMAIN_PAN
+ /*
+ * Whenever we re-enter userspace, the domains should always be
+ * set appropriately.
+ */
+ mov \tmp, #DACR_UACCESS_DISABLE
+ mcr p15, 0, \tmp, c3, c0, 0 @ Set domain register
+ .if \isb
+ instr_sync
+ .endif
+#endif
+ .endm
+
+ .macro uaccess_enable, tmp, isb=1
+#ifdef CONFIG_CPU_SW_DOMAIN_PAN
+ /*
+ * Whenever we re-enter userspace, the domains should always be
+ * set appropriately.
+ */
+ mov \tmp, #DACR_UACCESS_ENABLE
+ mcr p15, 0, \tmp, c3, c0, 0
+ .if \isb
+ instr_sync
+ .endif
+#endif
+ .endm
+
+#if defined(CONFIG_CPU_SW_DOMAIN_PAN) || defined(CONFIG_CPU_USE_DOMAINS)
+#define DACR(x...) x
+#else
+#define DACR(x...)
+#endif
+
+ /*
+ * Save the address limit on entry to a privileged exception.
+ *
+ * If we are using the DACR for kernel access by the user accessors
+ * (CONFIG_CPU_USE_DOMAINS=y), always reset the DACR kernel domain
+ * back to client mode, whether or not \disable is set.
+ *
+ * If we are using SW PAN, set the DACR user domain to no access
+ * if \disable is set.
+ */
+ .macro uaccess_entry, tsk, tmp0, tmp1, tmp2, disable
+ ldr \tmp1, [\tsk, #TI_ADDR_LIMIT]
+ mov \tmp2, #TASK_SIZE
+ str \tmp2, [\tsk, #TI_ADDR_LIMIT]
+ DACR( mrc p15, 0, \tmp0, c3, c0, 0)
+ DACR( str \tmp0, [sp, #SVC_DACR])
+ str \tmp1, [sp, #SVC_ADDR_LIMIT]
+ .if \disable && IS_ENABLED(CONFIG_CPU_SW_DOMAIN_PAN)
+ /* kernel=client, user=no access */
+ mov \tmp2, #DACR_UACCESS_DISABLE
+ mcr p15, 0, \tmp2, c3, c0, 0
+ instr_sync
+ .elseif IS_ENABLED(CONFIG_CPU_USE_DOMAINS)
+ /* kernel=client */
+ bic \tmp2, \tmp0, #domain_mask(DOMAIN_KERNEL)
+ orr \tmp2, \tmp2, #domain_val(DOMAIN_KERNEL, DOMAIN_CLIENT)
+ mcr p15, 0, \tmp2, c3, c0, 0
+ instr_sync
+ .endif
+ .endm
+
+ /* Restore the user access state previously saved by uaccess_entry */
+ .macro uaccess_exit, tsk, tmp0, tmp1
+ ldr \tmp1, [sp, #SVC_ADDR_LIMIT]
+ DACR( ldr \tmp0, [sp, #SVC_DACR])
+ str \tmp1, [\tsk, #TI_ADDR_LIMIT]
+ DACR( mcr p15, 0, \tmp0, c3, c0, 0)
+ .endm
+
+#undef DACR
+
+#endif /* __ASM_UACCESS_ASM_H__ */
#include <linux/export.h>
#include <linux/sched.h>
#include <linux/string.h>
-#include <linux/cryptohash.h>
#include <linux/delay.h>
#include <linux/in6.h>
#include <linux/syscalls.h>
size_t size;
if (tag->hdr.tag != ATAG_CORE) {
- pr_info("No ATAGs?");
+ pr_info("No ATAGs?\n");
return -EINVAL;
}
#include <asm/unistd.h>
#include <asm/tls.h>
#include <asm/system_info.h>
+#include <asm/uaccess-asm.h>
#include "entry-header.S"
#include <asm/entry-macro-multi.S>
stmia r7, {r2 - r6}
get_thread_info tsk
- ldr r0, [tsk, #TI_ADDR_LIMIT]
- mov r1, #TASK_SIZE
- str r1, [tsk, #TI_ADDR_LIMIT]
- str r0, [sp, #SVC_ADDR_LIMIT]
-
- uaccess_save r0
- .if \uaccess
- uaccess_disable r0
- .endif
+ uaccess_entry tsk, r0, r1, r2, \uaccess
.if \trace
#ifdef CONFIG_TRACE_IRQFLAGS
#include <asm/asm-offsets.h>
#include <asm/errno.h>
#include <asm/thread_info.h>
+#include <asm/uaccess-asm.h>
#include <asm/v7m.h>
@ Bad Abort numbers
blne trace_hardirqs_off
#endif
.endif
- ldr r1, [sp, #SVC_ADDR_LIMIT]
- uaccess_restore
- str r1, [tsk, #TI_ADDR_LIMIT]
+ uaccess_exit tsk, r0, r1
#ifndef CONFIG_THUMB2_KERNEL
@ ARM mode SVC restore
@ on the stack remains correct).
@
.macro svc_exit_via_fiq
- ldr r1, [sp, #SVC_ADDR_LIMIT]
- uaccess_restore
- str r1, [tsk, #TI_ADDR_LIMIT]
+ uaccess_exit tsk, r0, r1
#ifndef CONFIG_THUMB2_KERNEL
@ ARM mode restore
mov r0, sp
};
static struct undef_hook thumb_break_hook = {
- .instr_mask = 0xffff,
- .instr_val = 0xde01,
+ .instr_mask = 0xffffffff,
+ .instr_val = 0x0000de01,
.cpsr_mask = PSR_T_BIT,
.cpsr_val = PSR_T_BIT,
.fn = break_trap,
#define GIC_CPU_CTRL 0x00
#define GIC_CPU_CTRL_ENABLE 1
-int __init ox820_boot_secondary(unsigned int cpu, struct task_struct *idle)
+static int __init ox820_boot_secondary(unsigned int cpu,
+ struct task_struct *idle)
{
/*
* Write the address of secondary startup into the
select ARCH_HAS_KCOV
select ARCH_HAS_KEEPINITRD
select ARCH_HAS_MEMBARRIER_SYNC_CORE
+ select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
select ARCH_HAS_PTE_DEVMAP
select ARCH_HAS_PTE_SPECIAL
select ARCH_HAS_SETUP_DMA_OPS
select EFI_PARAMS_FROM_FDT
select EFI_RUNTIME_WRAPPERS
select EFI_STUB
- select EFI_ARMSTUB
+ select EFI_GENERIC_STUB
default y
help
This option provides support for runtime services provided
};
&codec_analog {
- hpvcc-supply = <®_eldo1>;
+ cpvdd-supply = <®_eldo1>;
status = "okay";
};
};
};
- sound_spdif {
- compatible = "simple-audio-card";
- simple-audio-card,name = "On-board SPDIF";
-
- simple-audio-card,cpu {
- sound-dai = <&spdif>;
- };
-
- simple-audio-card,codec {
- sound-dai = <&spdif_out>;
- };
- };
-
- spdif_out: spdif-out {
- #sound-dai-cells = <0>;
- compatible = "linux,spdif-dit";
- };
-
timer {
compatible = "arm,armv8-timer";
allwinner,erratum-unknown1;
reg = <0x0 0xff400000 0x0 0x40000>;
interrupts = <GIC_SPI 31 IRQ_TYPE_LEVEL_HIGH>;
clocks = <&clkc CLKID_USB1_DDR_BRIDGE>;
- clock-names = "ddr";
+ clock-names = "otg";
phys = <&usb2_phy1>;
phy-names = "usb2-phy";
dr_mode = "peripheral";
-
// SPDX-License-Identifier: (GPL-2.0+ OR MIT)
/*
* Copyright (c) 2019 BayLibre, SAS
clock-latency = <50000>;
};
+&frddr_a {
+ status = "okay";
+};
+
&frddr_b {
status = "okay";
};
&usb {
status = "okay";
dr_mode = "host";
- vbus-regulator = <&usb_pwr_en>;
+ vbus-supply = <&usb_pwr_en>;
};
&usb2_phy0 {
edma0: dma-controller@22c0000 {
#dma-cells = <2>;
- compatible = "fsl,ls1028a-edma";
+ compatible = "fsl,ls1028a-edma", "fsl,vf610-edma";
reg = <0x0 0x22c0000 0x0 0x10000>,
<0x0 0x22d0000 0x0 0x10000>,
<0x0 0x22e0000 0x0 0x10000>;
aips1: bus@30000000 {
compatible = "fsl,aips-bus", "simple-bus";
- reg = <0x301f0000 0x10000>;
+ reg = <0x30000000 0x400000>;
#address-cells = <1>;
#size-cells = <1>;
ranges = <0x30000000 0x30000000 0x400000>;
aips2: bus@30400000 {
compatible = "fsl,aips-bus", "simple-bus";
- reg = <0x305f0000 0x10000>;
+ reg = <0x30400000 0x400000>;
#address-cells = <1>;
#size-cells = <1>;
ranges = <0x30400000 0x30400000 0x400000>;
aips3: bus@30800000 {
compatible = "fsl,aips-bus", "simple-bus";
- reg = <0x309f0000 0x10000>;
+ reg = <0x30800000 0x400000>;
#address-cells = <1>;
#size-cells = <1>;
ranges = <0x30800000 0x30800000 0x400000>,
aips4: bus@32c00000 {
compatible = "fsl,aips-bus", "simple-bus";
- reg = <0x32df0000 0x10000>;
+ reg = <0x32c00000 0x400000>;
#address-cells = <1>;
#size-cells = <1>;
ranges = <0x32c00000 0x32c00000 0x400000>;
aips1: bus@30000000 {
compatible = "fsl,aips-bus", "simple-bus";
- reg = <0x301f0000 0x10000>;
+ reg = <0x30000000 0x400000>;
#address-cells = <1>;
#size-cells = <1>;
ranges;
aips2: bus@30400000 {
compatible = "fsl,aips-bus", "simple-bus";
- reg = <0x305f0000 0x10000>;
+ reg = <0x30400000 0x400000>;
#address-cells = <1>;
#size-cells = <1>;
ranges;
aips3: bus@30800000 {
compatible = "fsl,aips-bus", "simple-bus";
- reg = <0x309f0000 0x10000>;
+ reg = <0x30800000 0x400000>;
#address-cells = <1>;
#size-cells = <1>;
ranges;
reg = <0x30bd0000 0x10000>;
interrupts = <GIC_SPI 2 IRQ_TYPE_LEVEL_HIGH>;
clocks = <&clk IMX8MN_CLK_SDMA1_ROOT>,
- <&clk IMX8MN_CLK_SDMA1_ROOT>;
+ <&clk IMX8MN_CLK_AHB>;
clock-names = "ipg", "ahb";
#dma-cells = <3>;
fsl,sdma-ram-script-name = "imx/sdma/sdma-imx7d.bin";
aips4: bus@32c00000 {
compatible = "fsl,aips-bus", "simple-bus";
- reg = <0x32df0000 0x10000>;
+ reg = <0x32c00000 0x400000>;
#address-cells = <1>;
#size-cells = <1>;
ranges;
#define MX8MP_IOMUXC_ENET_TXC__SIM_M_HADDR22 0x070 0x2D0 0x000 0x7 0x0
#define MX8MP_IOMUXC_ENET_RX_CTL__ENET_QOS_RGMII_RX_CTL 0x074 0x2D4 0x000 0x0 0x0
#define MX8MP_IOMUXC_ENET_RX_CTL__AUDIOMIX_SAI7_TX_SYNC 0x074 0x2D4 0x540 0x2 0x0
-#define MX8MP_IOMUXC_ENET_RX_CTL__AUDIOMIX_BIT_STREAM03 0x074 0x2D4 0x4CC 0x3 0x0
+#define MX8MP_IOMUXC_ENET_RX_CTL__AUDIOMIX_BIT_STREAM03 0x074 0x2D4 0x4CC 0x3 0x1
#define MX8MP_IOMUXC_ENET_RX_CTL__GPIO1_IO24 0x074 0x2D4 0x000 0x5 0x0
#define MX8MP_IOMUXC_ENET_RX_CTL__USDHC3_DATA2 0x074 0x2D4 0x618 0x6 0x0
#define MX8MP_IOMUXC_ENET_RX_CTL__SIM_M_HADDR23 0x074 0x2D4 0x000 0x7 0x0
#define MX8MP_IOMUXC_ENET_RXC__CCM_ENET_QOS_CLOCK_GENERATE_RX_CLK 0x078 0x2D8 0x000 0x0 0x0
#define MX8MP_IOMUXC_ENET_RXC__ENET_QOS_RX_ER 0x078 0x2D8 0x000 0x1 0x0
#define MX8MP_IOMUXC_ENET_RXC__AUDIOMIX_SAI7_TX_BCLK 0x078 0x2D8 0x53C 0x2 0x0
-#define MX8MP_IOMUXC_ENET_RXC__AUDIOMIX_BIT_STREAM02 0x078 0x2D8 0x4C8 0x3 0x0
+#define MX8MP_IOMUXC_ENET_RXC__AUDIOMIX_BIT_STREAM02 0x078 0x2D8 0x4C8 0x3 0x1
#define MX8MP_IOMUXC_ENET_RXC__GPIO1_IO25 0x078 0x2D8 0x000 0x5 0x0
#define MX8MP_IOMUXC_ENET_RXC__USDHC3_DATA3 0x078 0x2D8 0x61C 0x6 0x0
#define MX8MP_IOMUXC_ENET_RXC__SIM_M_HADDR24 0x078 0x2D8 0x000 0x7 0x0
#define MX8MP_IOMUXC_ENET_RD0__ENET_QOS_RGMII_RD0 0x07C 0x2DC 0x000 0x0 0x0
#define MX8MP_IOMUXC_ENET_RD0__AUDIOMIX_SAI7_RX_DATA00 0x07C 0x2DC 0x534 0x2 0x0
-#define MX8MP_IOMUXC_ENET_RD0__AUDIOMIX_BIT_STREAM01 0x07C 0x2DC 0x4C4 0x3 0x0
+#define MX8MP_IOMUXC_ENET_RD0__AUDIOMIX_BIT_STREAM01 0x07C 0x2DC 0x4C4 0x3 0x1
#define MX8MP_IOMUXC_ENET_RD0__GPIO1_IO26 0x07C 0x2DC 0x000 0x5 0x0
#define MX8MP_IOMUXC_ENET_RD0__USDHC3_DATA4 0x07C 0x2DC 0x620 0x6 0x0
#define MX8MP_IOMUXC_ENET_RD0__SIM_M_HADDR25 0x07C 0x2DC 0x000 0x7 0x0
#define MX8MP_IOMUXC_ENET_RD1__ENET_QOS_RGMII_RD1 0x080 0x2E0 0x000 0x0 0x0
#define MX8MP_IOMUXC_ENET_RD1__AUDIOMIX_SAI7_RX_SYNC 0x080 0x2E0 0x538 0x2 0x0
-#define MX8MP_IOMUXC_ENET_RD1__AUDIOMIX_BIT_STREAM00 0x080 0x2E0 0x4C0 0x3 0x0
+#define MX8MP_IOMUXC_ENET_RD1__AUDIOMIX_BIT_STREAM00 0x080 0x2E0 0x4C0 0x3 0x1
#define MX8MP_IOMUXC_ENET_RD1__GPIO1_IO27 0x080 0x2E0 0x000 0x5 0x0
#define MX8MP_IOMUXC_ENET_RD1__USDHC3_RESET_B 0x080 0x2E0 0x000 0x6 0x0
#define MX8MP_IOMUXC_ENET_RD1__SIM_M_HADDR26 0x080 0x2E0 0x000 0x7 0x0
#define MX8MP_IOMUXC_SD2_DATA0__I2C4_SDA 0x0C8 0x328 0x5C0 0x2 0x1
#define MX8MP_IOMUXC_SD2_DATA0__UART2_DCE_RX 0x0C8 0x328 0x5F0 0x3 0x2
#define MX8MP_IOMUXC_SD2_DATA0__UART2_DTE_TX 0x0C8 0x328 0x000 0x3 0x0
-#define MX8MP_IOMUXC_SD2_DATA0__AUDIOMIX_BIT_STREAM00 0x0C8 0x328 0x4C0 0x4 0x1
+#define MX8MP_IOMUXC_SD2_DATA0__AUDIOMIX_BIT_STREAM00 0x0C8 0x328 0x4C0 0x4 0x2
#define MX8MP_IOMUXC_SD2_DATA0__GPIO2_IO15 0x0C8 0x328 0x000 0x5 0x0
#define MX8MP_IOMUXC_SD2_DATA0__CCMSRCGPCMIX_OBSERVE2 0x0C8 0x328 0x000 0x6 0x0
#define MX8MP_IOMUXC_SD2_DATA0__OBSERVE_MUX_OUT02 0x0C8 0x328 0x000 0x7 0x0
#define MX8MP_IOMUXC_SD2_DATA3__USDHC2_DATA3 0x0D4 0x334 0x000 0x0 0x0
#define MX8MP_IOMUXC_SD2_DATA3__ECSPI2_MISO 0x0D4 0x334 0x56C 0x2 0x0
#define MX8MP_IOMUXC_SD2_DATA3__AUDIOMIX_SPDIF_IN 0x0D4 0x334 0x544 0x3 0x1
-#define MX8MP_IOMUXC_SD2_DATA3__AUDIOMIX_BIT_STREAM03 0x0D4 0x334 0x4CC 0x4 0x1
+#define MX8MP_IOMUXC_SD2_DATA3__AUDIOMIX_BIT_STREAM03 0x0D4 0x334 0x4CC 0x4 0x2
#define MX8MP_IOMUXC_SD2_DATA3__GPIO2_IO18 0x0D4 0x334 0x000 0x5 0x0
#define MX8MP_IOMUXC_SD2_DATA3__CCMSRCGPCMIX_EARLY_RESET 0x0D4 0x334 0x000 0x6 0x0
#define MX8MP_IOMUXC_SD2_RESET_B__USDHC2_RESET_B 0x0D8 0x338 0x000 0x0 0x0
#define MX8MP_IOMUXC_SAI5_RXD0__AUDIOMIX_SAI1_TX_DATA02 0x134 0x394 0x000 0x1 0x0
#define MX8MP_IOMUXC_SAI5_RXD0__PWM2_OUT 0x134 0x394 0x000 0x2 0x0
#define MX8MP_IOMUXC_SAI5_RXD0__I2C5_SCL 0x134 0x394 0x5C4 0x3 0x1
-#define MX8MP_IOMUXC_SAI5_RXD0__AUDIOMIX_BIT_STREAM00 0x134 0x394 0x4C0 0x4 0x2
+#define MX8MP_IOMUXC_SAI5_RXD0__AUDIOMIX_BIT_STREAM00 0x134 0x394 0x4C0 0x4 0x3
#define MX8MP_IOMUXC_SAI5_RXD0__GPIO3_IO21 0x134 0x394 0x000 0x5 0x0
#define MX8MP_IOMUXC_SAI5_RXD1__AUDIOMIX_SAI5_RX_DATA01 0x138 0x398 0x4FC 0x0 0x0
#define MX8MP_IOMUXC_SAI5_RXD1__AUDIOMIX_SAI1_TX_DATA03 0x138 0x398 0x000 0x1 0x0
#define MX8MP_IOMUXC_SAI5_RXD1__AUDIOMIX_SAI1_TX_SYNC 0x138 0x398 0x4D8 0x2 0x0
#define MX8MP_IOMUXC_SAI5_RXD1__AUDIOMIX_SAI5_TX_SYNC 0x138 0x398 0x510 0x3 0x0
-#define MX8MP_IOMUXC_SAI5_RXD1__AUDIOMIX_BIT_STREAM01 0x138 0x398 0x4C4 0x4 0x2
+#define MX8MP_IOMUXC_SAI5_RXD1__AUDIOMIX_BIT_STREAM01 0x138 0x398 0x4C4 0x4 0x3
#define MX8MP_IOMUXC_SAI5_RXD1__GPIO3_IO22 0x138 0x398 0x000 0x5 0x0
#define MX8MP_IOMUXC_SAI5_RXD1__CAN1_TX 0x138 0x398 0x000 0x6 0x0
#define MX8MP_IOMUXC_SAI5_RXD2__AUDIOMIX_SAI5_RX_DATA02 0x13C 0x39C 0x500 0x0 0x0
#define MX8MP_IOMUXC_SAI5_RXD2__AUDIOMIX_SAI1_TX_DATA04 0x13C 0x39C 0x000 0x1 0x0
#define MX8MP_IOMUXC_SAI5_RXD2__AUDIOMIX_SAI1_TX_SYNC 0x13C 0x39C 0x4D8 0x2 0x1
#define MX8MP_IOMUXC_SAI5_RXD2__AUDIOMIX_SAI5_TX_BCLK 0x13C 0x39C 0x50C 0x3 0x0
-#define MX8MP_IOMUXC_SAI5_RXD2__AUDIOMIX_BIT_STREAM02 0x13C 0x39C 0x4C8 0x4 0x2
+#define MX8MP_IOMUXC_SAI5_RXD2__AUDIOMIX_BIT_STREAM02 0x13C 0x39C 0x4C8 0x4 0x3
#define MX8MP_IOMUXC_SAI5_RXD2__GPIO3_IO23 0x13C 0x39C 0x000 0x5 0x0
#define MX8MP_IOMUXC_SAI5_RXD2__CAN1_RX 0x13C 0x39C 0x54C 0x6 0x0
#define MX8MP_IOMUXC_SAI5_RXD3__AUDIOMIX_SAI5_RX_DATA03 0x140 0x3A0 0x504 0x0 0x0
#define MX8MP_IOMUXC_SAI5_RXD3__AUDIOMIX_SAI1_TX_DATA05 0x140 0x3A0 0x000 0x1 0x0
#define MX8MP_IOMUXC_SAI5_RXD3__AUDIOMIX_SAI1_TX_SYNC 0x140 0x3A0 0x4D8 0x2 0x2
#define MX8MP_IOMUXC_SAI5_RXD3__AUDIOMIX_SAI5_TX_DATA00 0x140 0x3A0 0x000 0x3 0x0
-#define MX8MP_IOMUXC_SAI5_RXD3__AUDIOMIX_BIT_STREAM03 0x140 0x3A0 0x4CC 0x4 0x2
+#define MX8MP_IOMUXC_SAI5_RXD3__AUDIOMIX_BIT_STREAM03 0x140 0x3A0 0x4CC 0x4 0x3
#define MX8MP_IOMUXC_SAI5_RXD3__GPIO3_IO24 0x140 0x3A0 0x000 0x5 0x0
#define MX8MP_IOMUXC_SAI5_RXD3__CAN2_TX 0x140 0x3A0 0x000 0x6 0x0
#define MX8MP_IOMUXC_SAI5_MCLK__AUDIOMIX_SAI5_MCLK 0x144 0x3A4 0x4F0 0x0 0x0
#define MX8MP_IOMUXC_SAI1_RXD0__AUDIOMIX_SAI1_RX_DATA00 0x150 0x3B0 0x000 0x0 0x0
#define MX8MP_IOMUXC_SAI1_RXD0__AUDIOMIX_SAI5_RX_DATA00 0x150 0x3B0 0x4F8 0x1 0x1
#define MX8MP_IOMUXC_SAI1_RXD0__AUDIOMIX_SAI1_TX_DATA01 0x150 0x3B0 0x000 0x2 0x0
-#define MX8MP_IOMUXC_SAI1_RXD0__AUDIOMIX_BIT_STREAM00 0x150 0x3B0 0x4C0 0x3 0x3
+#define MX8MP_IOMUXC_SAI1_RXD0__AUDIOMIX_BIT_STREAM00 0x150 0x3B0 0x4C0 0x3 0x4
#define MX8MP_IOMUXC_SAI1_RXD0__ENET1_1588_EVENT1_IN 0x150 0x3B0 0x000 0x4 0x0
#define MX8MP_IOMUXC_SAI1_RXD0__GPIO4_IO02 0x150 0x3B0 0x000 0x5 0x0
#define MX8MP_IOMUXC_SAI1_RXD1__AUDIOMIX_SAI1_RX_DATA01 0x154 0x3B4 0x000 0x0 0x0
#define MX8MP_IOMUXC_SAI1_RXD1__AUDIOMIX_SAI5_RX_DATA01 0x154 0x3B4 0x4FC 0x1 0x1
-#define MX8MP_IOMUXC_SAI1_RXD1__AUDIOMIX_BIT_STREAM01 0x154 0x3B4 0x4C4 0x3 0x3
+#define MX8MP_IOMUXC_SAI1_RXD1__AUDIOMIX_BIT_STREAM01 0x154 0x3B4 0x4C4 0x3 0x4
#define MX8MP_IOMUXC_SAI1_RXD1__ENET1_1588_EVENT1_OUT 0x154 0x3B4 0x000 0x4 0x0
#define MX8MP_IOMUXC_SAI1_RXD1__GPIO4_IO03 0x154 0x3B4 0x000 0x5 0x0
#define MX8MP_IOMUXC_SAI1_RXD2__AUDIOMIX_SAI1_RX_DATA02 0x158 0x3B8 0x000 0x0 0x0
#define MX8MP_IOMUXC_SAI1_RXD2__AUDIOMIX_SAI5_RX_DATA02 0x158 0x3B8 0x500 0x1 0x1
-#define MX8MP_IOMUXC_SAI1_RXD2__AUDIOMIX_BIT_STREAM02 0x158 0x3B8 0x4C8 0x3 0x3
+#define MX8MP_IOMUXC_SAI1_RXD2__AUDIOMIX_BIT_STREAM02 0x158 0x3B8 0x4C8 0x3 0x4
#define MX8MP_IOMUXC_SAI1_RXD2__ENET1_MDC 0x158 0x3B8 0x000 0x4 0x0
#define MX8MP_IOMUXC_SAI1_RXD2__GPIO4_IO04 0x158 0x3B8 0x000 0x5 0x0
#define MX8MP_IOMUXC_SAI1_RXD3__AUDIOMIX_SAI1_RX_DATA03 0x15C 0x3BC 0x000 0x0 0x0
#define MX8MP_IOMUXC_SAI1_RXD3__AUDIOMIX_SAI5_RX_DATA03 0x15C 0x3BC 0x504 0x1 0x1
-#define MX8MP_IOMUXC_SAI1_RXD3__AUDIOMIX_BIT_STREAM03 0x15C 0x3BC 0x4CC 0x3 0x3
+#define MX8MP_IOMUXC_SAI1_RXD3__AUDIOMIX_BIT_STREAM03 0x15C 0x3BC 0x4CC 0x3 0x4
#define MX8MP_IOMUXC_SAI1_RXD3__ENET1_MDIO 0x15C 0x3BC 0x57C 0x4 0x1
#define MX8MP_IOMUXC_SAI1_RXD3__GPIO4_IO05 0x15C 0x3BC 0x000 0x5 0x0
#define MX8MP_IOMUXC_SAI1_RXD4__AUDIOMIX_SAI1_RX_DATA04 0x160 0x3C0 0x000 0x0 0x0
#define MX8MP_IOMUXC_SAI2_RXFS__UART1_DCE_TX 0x19C 0x3FC 0x000 0x4 0x0
#define MX8MP_IOMUXC_SAI2_RXFS__UART1_DTE_RX 0x19C 0x3FC 0x5E8 0x4 0x2
#define MX8MP_IOMUXC_SAI2_RXFS__GPIO4_IO21 0x19C 0x3FC 0x000 0x5 0x0
-#define MX8MP_IOMUXC_SAI2_RXFS__AUDIOMIX_BIT_STREAM02 0x19C 0x3FC 0x4C8 0x6 0x4
+#define MX8MP_IOMUXC_SAI2_RXFS__AUDIOMIX_BIT_STREAM02 0x19C 0x3FC 0x4C8 0x6 0x5
#define MX8MP_IOMUXC_SAI2_RXFS__SIM_M_HSIZE00 0x19C 0x3FC 0x000 0x7 0x0
#define MX8MP_IOMUXC_SAI2_RXC__AUDIOMIX_SAI2_RX_BCLK 0x1A0 0x400 0x000 0x0 0x0
#define MX8MP_IOMUXC_SAI2_RXC__AUDIOMIX_SAI5_TX_BCLK 0x1A0 0x400 0x50C 0x1 0x2
#define MX8MP_IOMUXC_SAI2_RXC__UART1_DCE_RX 0x1A0 0x400 0x5E8 0x4 0x3
#define MX8MP_IOMUXC_SAI2_RXC__UART1_DTE_TX 0x1A0 0x400 0x000 0x4 0x0
#define MX8MP_IOMUXC_SAI2_RXC__GPIO4_IO22 0x1A0 0x400 0x000 0x5 0x0
-#define MX8MP_IOMUXC_SAI2_RXC__AUDIOMIX_BIT_STREAM01 0x1A0 0x400 0x4C4 0x6 0x4
+#define MX8MP_IOMUXC_SAI2_RXC__AUDIOMIX_BIT_STREAM01 0x1A0 0x400 0x4C4 0x6 0x5
#define MX8MP_IOMUXC_SAI2_RXC__SIM_M_HSIZE01 0x1A0 0x400 0x000 0x7 0x0
#define MX8MP_IOMUXC_SAI2_RXD0__AUDIOMIX_SAI2_RX_DATA00 0x1A4 0x404 0x000 0x0 0x0
#define MX8MP_IOMUXC_SAI2_RXD0__AUDIOMIX_SAI5_TX_DATA00 0x1A4 0x404 0x000 0x1 0x0
#define MX8MP_IOMUXC_SAI2_RXD0__UART1_DCE_RTS 0x1A4 0x404 0x5E4 0x4 0x2
#define MX8MP_IOMUXC_SAI2_RXD0__UART1_DTE_CTS 0x1A4 0x404 0x000 0x4 0x0
#define MX8MP_IOMUXC_SAI2_RXD0__GPIO4_IO23 0x1A4 0x404 0x000 0x5 0x0
-#define MX8MP_IOMUXC_SAI2_RXD0__AUDIOMIX_BIT_STREAM03 0x1A4 0x404 0x4CC 0x6 0x4
+#define MX8MP_IOMUXC_SAI2_RXD0__AUDIOMIX_BIT_STREAM03 0x1A4 0x404 0x4CC 0x6 0x5
#define MX8MP_IOMUXC_SAI2_RXD0__SIM_M_HSIZE02 0x1A4 0x404 0x000 0x7 0x0
#define MX8MP_IOMUXC_SAI2_TXFS__AUDIOMIX_SAI2_TX_SYNC 0x1A8 0x408 0x000 0x0 0x0
#define MX8MP_IOMUXC_SAI2_TXFS__AUDIOMIX_SAI5_TX_DATA01 0x1A8 0x408 0x000 0x1 0x0
#define MX8MP_IOMUXC_SAI2_TXFS__UART1_DCE_CTS 0x1A8 0x408 0x000 0x4 0x0
#define MX8MP_IOMUXC_SAI2_TXFS__UART1_DTE_RTS 0x1A8 0x408 0x5E4 0x4 0x3
#define MX8MP_IOMUXC_SAI2_TXFS__GPIO4_IO24 0x1A8 0x408 0x000 0x5 0x0
-#define MX8MP_IOMUXC_SAI2_TXFS__AUDIOMIX_BIT_STREAM02 0x1A8 0x408 0x4C8 0x6 0x5
+#define MX8MP_IOMUXC_SAI2_TXFS__AUDIOMIX_BIT_STREAM02 0x1A8 0x408 0x4C8 0x6 0x6
#define MX8MP_IOMUXC_SAI2_TXFS__SIM_M_HWRITE 0x1A8 0x408 0x000 0x7 0x0
#define MX8MP_IOMUXC_SAI2_TXC__AUDIOMIX_SAI2_TX_BCLK 0x1AC 0x40C 0x000 0x0 0x0
#define MX8MP_IOMUXC_SAI2_TXC__AUDIOMIX_SAI5_TX_DATA02 0x1AC 0x40C 0x000 0x1 0x0
#define MX8MP_IOMUXC_SAI2_TXC__CAN1_RX 0x1AC 0x40C 0x54C 0x3 0x1
#define MX8MP_IOMUXC_SAI2_TXC__GPIO4_IO25 0x1AC 0x40C 0x000 0x5 0x0
-#define MX8MP_IOMUXC_SAI2_TXC__AUDIOMIX_BIT_STREAM01 0x1AC 0x40C 0x4C4 0x6 0x5
+#define MX8MP_IOMUXC_SAI2_TXC__AUDIOMIX_BIT_STREAM01 0x1AC 0x40C 0x4C4 0x6 0x6
#define MX8MP_IOMUXC_SAI2_TXC__SIM_M_HREADYOUT 0x1AC 0x40C 0x000 0x7 0x0
#define MX8MP_IOMUXC_SAI2_TXD0__AUDIOMIX_SAI2_TX_DATA00 0x1B0 0x410 0x000 0x0 0x0
#define MX8MP_IOMUXC_SAI2_TXD0__AUDIOMIX_SAI5_TX_DATA03 0x1B0 0x410 0x000 0x1 0x0
#define MX8MP_IOMUXC_SAI3_RXFS__AUDIOMIX_SAI3_RX_DATA01 0x1B8 0x418 0x000 0x3 0x0
#define MX8MP_IOMUXC_SAI3_RXFS__AUDIOMIX_SPDIF_IN 0x1B8 0x418 0x544 0x4 0x2
#define MX8MP_IOMUXC_SAI3_RXFS__GPIO4_IO28 0x1B8 0x418 0x000 0x5 0x0
-#define MX8MP_IOMUXC_SAI3_RXFS__AUDIOMIX_BIT_STREAM00 0x1B8 0x418 0x4C0 0x6 0x4
+#define MX8MP_IOMUXC_SAI3_RXFS__AUDIOMIX_BIT_STREAM00 0x1B8 0x418 0x4C0 0x6 0x5
#define MX8MP_IOMUXC_SAI3_RXFS__TPSMP_HTRANS00 0x1B8 0x418 0x000 0x7 0x0
#define MX8MP_IOMUXC_SAI3_RXC__AUDIOMIX_SAI3_RX_BCLK 0x1BC 0x41C 0x000 0x0 0x0
#define MX8MP_IOMUXC_SAI3_RXC__AUDIOMIX_SAI2_RX_DATA02 0x1BC 0x41C 0x000 0x1 0x0
#define MX8MP_IOMUXC_SAI3_RXD__UART2_DCE_RTS 0x1C0 0x420 0x5EC 0x4 0x3
#define MX8MP_IOMUXC_SAI3_RXD__UART2_DTE_CTS 0x1C0 0x420 0x000 0x4 0x0
#define MX8MP_IOMUXC_SAI3_RXD__GPIO4_IO30 0x1C0 0x420 0x000 0x5 0x0
-#define MX8MP_IOMUXC_SAI3_RXD__AUDIOMIX_BIT_STREAM01 0x1C0 0x420 0x4C4 0x6 0x6
+#define MX8MP_IOMUXC_SAI3_RXD__AUDIOMIX_BIT_STREAM01 0x1C0 0x420 0x4C4 0x6 0x7
#define MX8MP_IOMUXC_SAI3_RXD__TPSMP_HDATA00 0x1C0 0x420 0x000 0x7 0x0
#define MX8MP_IOMUXC_SAI3_TXFS__AUDIOMIX_SAI3_TX_SYNC 0x1C4 0x424 0x4EC 0x0 0x1
#define MX8MP_IOMUXC_SAI3_TXFS__AUDIOMIX_SAI2_TX_DATA01 0x1C4 0x424 0x000 0x1 0x0
#define MX8MP_IOMUXC_SAI3_TXFS__UART2_DCE_RX 0x1C4 0x424 0x5F0 0x4 0x4
#define MX8MP_IOMUXC_SAI3_TXFS__UART2_DTE_TX 0x1C4 0x424 0x000 0x4 0x0
#define MX8MP_IOMUXC_SAI3_TXFS__GPIO4_IO31 0x1C4 0x424 0x000 0x5 0x0
-#define MX8MP_IOMUXC_SAI3_TXFS__AUDIOMIX_BIT_STREAM03 0x1C4 0x424 0x4CC 0x6 0x5
+#define MX8MP_IOMUXC_SAI3_TXFS__AUDIOMIX_BIT_STREAM03 0x1C4 0x424 0x4CC 0x6 0x6
#define MX8MP_IOMUXC_SAI3_TXFS__TPSMP_HDATA01 0x1C4 0x424 0x000 0x7 0x0
#define MX8MP_IOMUXC_SAI3_TXC__AUDIOMIX_SAI3_TX_BCLK 0x1C8 0x428 0x4E8 0x0 0x1
#define MX8MP_IOMUXC_SAI3_TXC__AUDIOMIX_SAI2_TX_DATA02 0x1C8 0x428 0x000 0x1 0x0
#define MX8MP_IOMUXC_SAI3_TXC__UART2_DCE_TX 0x1C8 0x428 0x000 0x4 0x0
#define MX8MP_IOMUXC_SAI3_TXC__UART2_DTE_RX 0x1C8 0x428 0x5F0 0x4 0x5
#define MX8MP_IOMUXC_SAI3_TXC__GPIO5_IO00 0x1C8 0x428 0x000 0x5 0x0
-#define MX8MP_IOMUXC_SAI3_TXC__AUDIOMIX_BIT_STREAM02 0x1C8 0x428 0x4C8 0x6 0x6
+#define MX8MP_IOMUXC_SAI3_TXC__AUDIOMIX_BIT_STREAM02 0x1C8 0x428 0x4C8 0x6 0x7
#define MX8MP_IOMUXC_SAI3_TXC__TPSMP_HDATA02 0x1C8 0x428 0x000 0x7 0x0
#define MX8MP_IOMUXC_SAI3_TXD__AUDIOMIX_SAI3_TX_DATA00 0x1CC 0x42C 0x000 0x0 0x0
#define MX8MP_IOMUXC_SAI3_TXD__AUDIOMIX_SAI2_TX_DATA03 0x1CC 0x42C 0x000 0x1 0x0
aips1: bus@30000000 {
compatible = "fsl,aips-bus", "simple-bus";
- reg = <0x301f0000 0x10000>;
+ reg = <0x30000000 0x400000>;
#address-cells = <1>;
#size-cells = <1>;
ranges;
aips2: bus@30400000 {
compatible = "fsl,aips-bus", "simple-bus";
- reg = <0x305f0000 0x400000>;
+ reg = <0x30400000 0x400000>;
#address-cells = <1>;
#size-cells = <1>;
ranges;
aips3: bus@30800000 {
compatible = "fsl,aips-bus", "simple-bus";
- reg = <0x309f0000 0x400000>;
+ reg = <0x30800000 0x400000>;
#address-cells = <1>;
#size-cells = <1>;
ranges;
bus@30000000 { /* AIPS1 */
compatible = "fsl,aips-bus", "simple-bus";
- reg = <0x301f0000 0x10000>;
+ reg = <0x30000000 0x400000>;
#address-cells = <1>;
#size-cells = <1>;
ranges = <0x30000000 0x30000000 0x400000>;
bus@30400000 { /* AIPS2 */
compatible = "fsl,aips-bus", "simple-bus";
- reg = <0x305f0000 0x10000>;
+ reg = <0x30400000 0x400000>;
#address-cells = <1>;
#size-cells = <1>;
ranges = <0x30400000 0x30400000 0x400000>;
bus@30800000 { /* AIPS3 */
compatible = "fsl,aips-bus", "simple-bus";
- reg = <0x309f0000 0x10000>;
+ reg = <0x30800000 0x400000>;
#address-cells = <1>;
#size-cells = <1>;
ranges = <0x30800000 0x30800000 0x400000>,
bus@32c00000 { /* AIPS4 */
compatible = "fsl,aips-bus", "simple-bus";
- reg = <0x32df0000 0x10000>;
+ reg = <0x32c00000 0x400000>;
#address-cells = <1>;
#size-cells = <1>;
ranges = <0x32c00000 0x32c00000 0x400000>;
"venc_lt_sel";
assigned-clocks = <&topckgen CLK_TOP_VENC_SEL>,
<&topckgen CLK_TOP_VENC_LT_SEL>;
- assigned-clock-parents = <&topckgen CLK_TOP_VENCPLL_D2>,
- <&topckgen CLK_TOP_UNIVPLL1_D2>;
+ assigned-clock-parents = <&topckgen CLK_TOP_VCODECPLL>,
+ <&topckgen CLK_TOP_VCODECPLL_370P5>;
};
jpegdec: jpegdec@18004000 {
s11 {
qcom,saw-leader;
regulator-always-on;
- regulator-min-microvolt = <1230000>;
- regulator-max-microvolt = <1230000>;
+ regulator-min-microvolt = <980000>;
+ regulator-max-microvolt = <980000>;
};
};
status = "okay";
};
+&q6asmdai {
+ dai@0 {
+ reg = <0>;
+ };
+
+ dai@1 {
+ reg = <1>;
+ };
+
+ dai@2 {
+ reg = <2>;
+ };
+};
+
&sound {
compatible = "qcom,apq8096-sndcard";
model = "DB820c";
- audio-routing = "RX_BIAS", "MCLK";
+ audio-routing = "RX_BIAS", "MCLK",
+ "MM_DL1", "MultiMedia1 Playback",
+ "MM_DL2", "MultiMedia2 Playback",
+ "MultiMedia3 Capture", "MM_UL3";
mm1-dai-link {
link-name = "MultiMedia1";
reg = <APR_SVC_ASM>;
q6asmdai: dais {
compatible = "qcom,q6asm-dais";
+ #address-cells = <1>;
+ #size-cells = <0>;
#sound-dai-cells = <1>;
iommus = <&lpass_q6_smmu 1>;
};
&q6asmdai {
dai@0 {
reg = <0>;
- direction = <2>;
};
dai@1 {
reg = <1>;
- direction = <2>;
};
dai@2 {
reg = <2>;
- direction = <1>;
};
dai@3 {
&q6asmdai {
dai@0 {
reg = <0>;
- direction = <2>;
};
dai@1 {
reg = <1>;
- direction = <1>;
};
};
adi,input-depth = <8>;
adi,input-colorspace = "rgb";
adi,input-clock = "1x";
- adi,input-style = <1>;
- adi,input-justification = "evenly";
ports {
#address-cells = <1>;
adi,input-depth = <8>;
adi,input-colorspace = "rgb";
adi,input-clock = "1x";
- adi,input-style = <1>;
- adi,input-justification = "evenly";
ports {
#address-cells = <1>;
adi,input-depth = <8>;
adi,input-colorspace = "rgb";
adi,input-clock = "1x";
- adi,input-style = <1>;
- adi,input-justification = "evenly";
ports {
#address-cells = <1>;
adi,input-depth = <8>;
adi,input-colorspace = "rgb";
adi,input-clock = "1x";
- adi,input-style = <1>;
- adi,input-justification = "evenly";
ports {
#address-cells = <1>;
ipmmu_vip0: mmu@e7b00000 {
compatible = "renesas,ipmmu-r8a77980";
reg = <0 0xe7b00000 0 0x1000>;
+ renesas,ipmmu-main = <&ipmmu_mm 4>;
power-domains = <&sysc R8A77980_PD_ALWAYS_ON>;
#iommu-cells = <1>;
};
ipmmu_vip1: mmu@e7960000 {
compatible = "renesas,ipmmu-r8a77980";
reg = <0 0xe7960000 0 0x1000>;
+ renesas,ipmmu-main = <&ipmmu_mm 11>;
power-domains = <&sysc R8A77980_PD_ALWAYS_ON>;
#iommu-cells = <1>;
};
adi,input-depth = <8>;
adi,input-colorspace = "rgb";
adi,input-clock = "1x";
- adi,input-style = <1>;
- adi,input-justification = "evenly";
ports {
#address-cells = <1>;
hdmi-encoder@39 {
compatible = "adi,adv7511w";
- reg = <0x39>, <0x3f>, <0x38>, <0x3c>;
- reg-names = "main", "edid", "packet", "cec";
+ reg = <0x39>, <0x3f>, <0x3c>, <0x38>;
+ reg-names = "main", "edid", "cec", "packet";
interrupt-parent = <&gpio1>;
interrupts = <28 IRQ_TYPE_LEVEL_LOW>;
adi,input-depth = <8>;
adi,input-colorspace = "rgb";
adi,input-clock = "1x";
- adi,input-style = <1>;
- adi,input-justification = "evenly";
ports {
#address-cells = <1>;
};
arm-pmu {
- compatible = "arm,cortex-a53-pmu";
+ compatible = "arm,cortex-a35-pmu";
interrupts = <GIC_SPI 100 IRQ_TYPE_LEVEL_HIGH>,
<GIC_SPI 101 IRQ_TYPE_LEVEL_HIGH>,
<GIC_SPI 102 IRQ_TYPE_LEVEL_HIGH>,
};
arm-pmu {
- compatible = "arm,cortex-a53-pmu";
+ compatible = "arm,cortex-a35-pmu";
interrupts = <GIC_SPI 83 IRQ_TYPE_LEVEL_HIGH>,
<GIC_SPI 84 IRQ_TYPE_LEVEL_HIGH>,
<GIC_SPI 85 IRQ_TYPE_LEVEL_HIGH>,
&gmac2phy {
phy-supply = <&vcc_phy>;
clock_in_out = "output";
- assigned-clocks = <&cru SCLK_MAC2PHY_SRC>;
assigned-clock-rate = <50000000>;
assigned-clocks = <&cru SCLK_MAC2PHY>;
assigned-clock-parents = <&cru SCLK_MAC2PHY_SRC>;
-
+ status = "okay";
};
&i2c1 {
status = "okay";
- rk805: rk805@18 {
+ rk805: pmic@18 {
compatible = "rockchip,rk805";
reg = <0x18>;
interrupt-parent = <&gpio2>;
&i2c1 {
status = "okay";
- rk805: rk805@18 {
+ rk805: pmic@18 {
compatible = "rockchip,rk805";
reg = <0x18>;
interrupt-parent = <&gpio2>;
grf: syscon@ff100000 {
compatible = "rockchip,rk3328-grf", "syscon", "simple-mfd";
reg = <0x0 0xff100000 0x0 0x1000>;
- #address-cells = <1>;
- #size-cells = <1>;
io_domains: io-domains {
compatible = "rockchip,rk3328-io-voltage-domain";
};
gmac2phy {
- fephyled_speed100: fephyled-speed100 {
- rockchip,pins = <0 RK_PD7 1 &pcfg_pull_none>;
- };
-
fephyled_speed10: fephyled-speed10 {
rockchip,pins = <0 RK_PD6 1 &pcfg_pull_none>;
};
rockchip,pins = <0 RK_PD6 2 &pcfg_pull_none>;
};
- fephyled_rxm0: fephyled-rxm0 {
- rockchip,pins = <0 RK_PD5 1 &pcfg_pull_none>;
- };
-
- fephyled_txm0: fephyled-txm0 {
- rockchip,pins = <0 RK_PD5 2 &pcfg_pull_none>;
- };
-
- fephyled_linkm0: fephyled-linkm0 {
- rockchip,pins = <0 RK_PD4 1 &pcfg_pull_none>;
- };
-
fephyled_rxm1: fephyled-rxm1 {
rockchip,pins = <2 RK_PD1 2 &pcfg_pull_none>;
};
"Speaker", "Speaker Amplifier OUTL",
"Speaker", "Speaker Amplifier OUTR";
- simple-audio-card,hp-det-gpio = <&gpio0 RK_PB0 GPIO_ACTIVE_LOW>;
+ simple-audio-card,hp-det-gpio = <&gpio0 RK_PB0 GPIO_ACTIVE_HIGH>;
simple-audio-card,aux-devs = <&speaker_amp>;
simple-audio-card,pin-switches = "Speaker";
fusb0: fusb30x@22 {
compatible = "fcs,fusb302";
reg = <0x22>;
- fcs,int_n = <&gpio1 RK_PA2 GPIO_ACTIVE_HIGH>;
+ interrupt-parent = <&gpio1>;
+ interrupts = <RK_PA2 IRQ_TYPE_LEVEL_LOW>;
pinctrl-names = "default";
pinctrl-0 = <&fusb0_int_gpio>;
vbus-supply = <&vbus_typec>;
dc-charger {
dc_det_gpio: dc-det-gpio {
- rockchip,pins = <4 RK_PD0 RK_FUNC_GPIO &pcfg_pull_none>;
+ rockchip,pins = <4 RK_PD0 RK_FUNC_GPIO &pcfg_pull_up>;
};
};
es8316 {
hp_det_gpio: hp-det-gpio {
- rockchip,pins = <0 RK_PB0 RK_FUNC_GPIO &pcfg_pull_down>;
+ rockchip,pins = <0 RK_PB0 RK_FUNC_GPIO &pcfg_pull_up>;
};
};
reset-names = "usb3-otg";
status = "disabled";
- usbdrd_dwc3_0: dwc3 {
+ usbdrd_dwc3_0: usb@fe800000 {
compatible = "snps,dwc3";
reg = <0x0 0xfe800000 0x0 0x100000>;
interrupts = <GIC_SPI 105 IRQ_TYPE_LEVEL_HIGH 0>;
reset-names = "usb3-otg";
status = "disabled";
- usbdrd_dwc3_1: dwc3 {
+ usbdrd_dwc3_1: usb@fe900000 {
compatible = "snps,dwc3";
reg = <0x0 0xfe900000 0x0 0x100000>;
interrupts = <GIC_SPI 110 IRQ_TYPE_LEVEL_HIGH 0>;
pmugrf: syscon@ff320000 {
compatible = "rockchip,rk3399-pmugrf", "syscon", "simple-mfd";
reg = <0x0 0xff320000 0x0 0x1000>;
- #address-cells = <1>;
- #size-cells = <1>;
pmu_io_domains: io-domains {
compatible = "rockchip,rk3399-pmu-io-voltage-domain";
gpu: gpu@ff9a0000 {
compatible = "rockchip,rk3399-mali", "arm,mali-t860";
reg = <0x0 0xff9a0000 0x0 0x10000>;
- interrupts = <GIC_SPI 19 IRQ_TYPE_LEVEL_HIGH 0>,
- <GIC_SPI 20 IRQ_TYPE_LEVEL_HIGH 0>,
- <GIC_SPI 21 IRQ_TYPE_LEVEL_HIGH 0>;
- interrupt-names = "gpu", "job", "mmu";
+ interrupts = <GIC_SPI 20 IRQ_TYPE_LEVEL_HIGH 0>,
+ <GIC_SPI 21 IRQ_TYPE_LEVEL_HIGH 0>,
+ <GIC_SPI 19 IRQ_TYPE_LEVEL_HIGH 0>;
+ interrupt-names = "job", "mmu", "gpu";
clocks = <&cru ACLK_GPU>;
#cooling-cells = <2>;
power-domains = <&power RK3399_PD_GPU>;
CONFIG_PCIE_ARMADA_8K=y
CONFIG_PCIE_KIRIN=y
CONFIG_PCIE_HISI_STB=y
-CONFIG_PCIE_TEGRA194=m
+CONFIG_PCIE_TEGRA194_HOST=m
CONFIG_DEVTMPFS=y
CONFIG_DEVTMPFS_MOUNT=y
CONFIG_FW_LOADER_USER_HELPER=y
CONFIG_MEDIA_SDR_SUPPORT=y
CONFIG_MEDIA_CONTROLLER=y
CONFIG_VIDEO_V4L2_SUBDEV_API=y
+CONFIG_MEDIA_PLATFORM_SUPPORT=y
# CONFIG_DVB_NET is not set
CONFIG_MEDIA_USB_SUPPORT=y
CONFIG_USB_VIDEO_CLASS=m
CONFIG_DRM_TEGRA=m
CONFIG_DRM_PANEL_LVDS=m
CONFIG_DRM_PANEL_SIMPLE=m
-CONFIG_DRM_DUMB_VGA_DAC=m
+CONFIG_DRM_SIMPLE_BRIDGE=m
CONFIG_DRM_PANEL_TRULY_NT35597_WQXGA=m
+CONFIG_DRM_DISPLAY_CONNECTOR=m
CONFIG_DRM_SII902X=m
CONFIG_DRM_THINE_THC63LVD1024=m
CONFIG_DRM_TI_SN65DSI86=m
CONFIG_ARCH_R8A774A1=y
CONFIG_ARCH_R8A774B1=y
CONFIG_ARCH_R8A774C0=y
-CONFIG_ARCH_R8A7795=y
+CONFIG_ARCH_R8A77950=y
+CONFIG_ARCH_R8A77951=y
CONFIG_ARCH_R8A77960=y
CONFIG_ARCH_R8A77961=y
CONFIG_ARCH_R8A77965=y
unsigned int key_len)
{
struct crypto_aes_essiv_cbc_ctx *ctx = crypto_skcipher_ctx(tfm);
- SHASH_DESC_ON_STACK(desc, ctx->hash);
u8 digest[SHA256_DIGEST_SIZE];
int ret;
if (ret)
return ret;
- desc->tfm = ctx->hash;
- crypto_shash_digest(desc, in_key, key_len, digest);
+ crypto_shash_tfm_digest(ctx->hash, in_key, key_len, digest);
return aes_expandkey(&ctx->key2, digest, sizeof(digest));
}
#include <asm/assembler.h>
.text
- .cpu generic+crypto
+ .arch armv8-a+crypto
init_crc .req w19
buf .req x20
#include <crypto/internal/simd.h>
#include <crypto/sha.h>
#include <crypto/sha256_base.h>
-#include <linux/cryptohash.h>
#include <linux/types.h>
#include <linux/string.h>
*/
#include <crypto/internal/hash.h>
-#include <linux/cryptohash.h>
#include <linux/types.h>
#include <linux/string.h>
#include <crypto/sha.h>
return (image_addr & ~(SZ_1G - 1UL)) + (1UL << (VA_BITS_MIN - 1));
}
-#define efi_bs_call(func, ...) efi_system_table()->boottime->func(__VA_ARGS__)
-#define efi_rt_call(func, ...) efi_system_table()->runtime->func(__VA_ARGS__)
-#define efi_is_native() (true)
-
-#define efi_table_attr(inst, attr) (inst->attr)
-
-#define efi_call_proto(inst, func, ...) inst->func(inst, ##__VA_ARGS__)
-
#define alloc_screen_info(x...) &screen_info
static inline void free_screen_info(struct screen_info *si)
struct nmi_ctx {
u64 hcr;
+ unsigned int cnt;
};
DECLARE_PER_CPU(struct nmi_ctx, nmi_contexts);
-#define arch_nmi_enter() \
- do { \
- if (is_kernel_in_hyp_mode()) { \
- struct nmi_ctx *nmi_ctx = this_cpu_ptr(&nmi_contexts); \
- nmi_ctx->hcr = read_sysreg(hcr_el2); \
- if (!(nmi_ctx->hcr & HCR_TGE)) { \
- write_sysreg(nmi_ctx->hcr | HCR_TGE, hcr_el2); \
- isb(); \
- } \
- } \
- } while (0)
+#define arch_nmi_enter() \
+do { \
+ struct nmi_ctx *___ctx; \
+ u64 ___hcr; \
+ \
+ if (!is_kernel_in_hyp_mode()) \
+ break; \
+ \
+ ___ctx = this_cpu_ptr(&nmi_contexts); \
+ if (___ctx->cnt) { \
+ ___ctx->cnt++; \
+ break; \
+ } \
+ \
+ ___hcr = read_sysreg(hcr_el2); \
+ if (!(___hcr & HCR_TGE)) { \
+ write_sysreg(___hcr | HCR_TGE, hcr_el2); \
+ isb(); \
+ } \
+ /* \
+ * Make sure the sysreg write is performed before ___ctx->cnt \
+ * is set to 1. NMIs that see cnt == 1 will rely on us. \
+ */ \
+ barrier(); \
+ ___ctx->cnt = 1; \
+ /* \
+ * Make sure ___ctx->cnt is set before we save ___hcr. We \
+ * don't want ___ctx->hcr to be overwritten. \
+ */ \
+ barrier(); \
+ ___ctx->hcr = ___hcr; \
+} while (0)
-#define arch_nmi_exit() \
- do { \
- if (is_kernel_in_hyp_mode()) { \
- struct nmi_ctx *nmi_ctx = this_cpu_ptr(&nmi_contexts); \
- if (!(nmi_ctx->hcr & HCR_TGE)) \
- write_sysreg(nmi_ctx->hcr, hcr_el2); \
- } \
- } while (0)
+#define arch_nmi_exit() \
+do { \
+ struct nmi_ctx *___ctx; \
+ u64 ___hcr; \
+ \
+ if (!is_kernel_in_hyp_mode()) \
+ break; \
+ \
+ ___ctx = this_cpu_ptr(&nmi_contexts); \
+ ___hcr = ___ctx->hcr; \
+ /* \
+ * Make sure we read ___ctx->hcr before we release \
+ * ___ctx->cnt as it makes ___ctx->hcr updatable again. \
+ */ \
+ barrier(); \
+ ___ctx->cnt--; \
+ /* \
+ * Make sure ___ctx->cnt release is visible before we \
+ * restore the sysreg. Otherwise a new NMI occurring \
+ * right after write_sysreg() can be fooled and think \
+ * we secured things for it. \
+ */ \
+ barrier(); \
+ if (!___ctx->cnt && !(___hcr & HCR_TGE)) \
+ write_sysreg(___hcr, hcr_el2); \
+} while (0)
static inline void ack_bad_irq(unsigned int irq)
{
__p = uaccess_mask_ptr(__p); \
__raw_get_user((x), __p, (err)); \
} else { \
- (x) = 0; (err) = -EFAULT; \
+ (x) = (__force __typeof__(x))0; (err) = -EFAULT; \
} \
} while (0)
SYM_CODE_START(efi_enter_kernel)
/*
- * efi_entry() will have copied the kernel image if necessary and we
+ * efi_pe_entry() will have copied the kernel image if necessary and we
* end up here with device tree address in x1 and the kernel entry
* point stored in x0. Save those values in registers which are
* callee preserved.
.long __initdata_begin - efi_header_end // SizeOfCode
.long __pecoff_data_size // SizeOfInitializedData
.long 0 // SizeOfUninitializedData
- .long __efistub_efi_entry - _head // AddressOfEntryPoint
+ .long __efistub_efi_pe_entry - _head // AddressOfEntryPoint
.long efi_header_end - _head // BaseOfCode
extra_header_fields:
* the offline CPUs. Therefore, we must use the __* variant here.
*/
__flush_icache_range((uintptr_t)reboot_code_buffer,
+ (uintptr_t)reboot_code_buffer +
arm64_relocate_new_kernel_size);
/* Flush the kimage list and its buffers. */
int syscall_trace_enter(struct pt_regs *regs)
{
- if (test_thread_flag(TIF_SYSCALL_TRACE) ||
- test_thread_flag(TIF_SYSCALL_EMU)) {
+ unsigned long flags = READ_ONCE(current_thread_info()->flags);
+
+ if (flags & (_TIF_SYSCALL_EMU | _TIF_SYSCALL_TRACE)) {
tracehook_report_syscall(regs, PTRACE_SYSCALL_ENTER);
- if (!in_syscall(regs) || test_thread_flag(TIF_SYSCALL_EMU))
+ if (!in_syscall(regs) || (flags & _TIF_SYSCALL_EMU))
return -1;
}
__sdei_handler(struct pt_regs *regs, struct sdei_registered_event *arg)
{
unsigned long ret;
- bool do_nmi_exit = false;
- /*
- * nmi_enter() deals with printk() re-entrance and use of RCU when
- * RCU believed this CPU was idle. Because critical events can
- * interrupt normal events, we may already be in_nmi().
- */
- if (!in_nmi()) {
- nmi_enter();
- do_nmi_exit = true;
- }
+ nmi_enter();
ret = _sdei_handler(regs, arg);
- if (do_nmi_exit)
- nmi_exit();
+ nmi_exit();
return ret;
}
panic("CPU%u detected unsupported configuration\n", cpu);
}
- return ret;
+ return -EIO;
}
static void init_gic_priority_masking(void)
asmlinkage void do_serror(struct pt_regs *regs, unsigned int esr)
{
- const bool was_in_nmi = in_nmi();
-
- if (!was_in_nmi)
- nmi_enter();
+ nmi_enter();
/* non-RAS errors are not containable */
if (!arm64_is_ras_serror(esr) || arm64_is_fatal_ras_serror(regs, esr))
arm64_serror_panic(regs, esr);
- if (!was_in_nmi)
- nmi_exit();
+ nmi_exit();
}
asmlinkage void enter_from_user_mode(void)
select ARCH_HAS_SYNC_DMA_FOR_DEVICE
select ARCH_USE_BUILTIN_BSWAP
select ARCH_USE_QUEUED_RWLOCKS if NR_CPUS>2
+ select ARCH_WANT_FRAME_POINTERS if !CPU_CK610
select COMMON_CLK
select CLKSRC_MMIO
select CSKY_MPINTC if CPU_CK860
select HAVE_ARCH_TRACEHOOK
select HAVE_ARCH_AUDITSYSCALL
select HAVE_COPY_THREAD_TLS
+ select HAVE_DEBUG_BUGVERBOSE
select HAVE_DYNAMIC_FTRACE
select HAVE_DYNAMIC_FTRACE_WITH_REGS
select HAVE_FUNCTION_TRACER
KBUILD_CFLAGS += -mno-stack-size
endif
-ifdef CONFIG_STACKTRACE
+ifdef CONFIG_FRAME_POINTER
KBUILD_CFLAGS += -mbacktrace
endif
.endm
.macro RESTORE_ALL
- psrclr ie
ldw lr, (sp, 4)
ldw a0, (sp, 8)
mtcr a0, epc
* BA Reserved C D V
*/
cprcr r6, cpcr30
- lsri r6, 28
- lsli r6, 28
+ lsri r6, 29
+ lsli r6, 29
addi r6, 0xe
cpwcr r6, cpcr30
movi r6, 0
cpwcr r6, cpcr31
.endm
-
-.macro ANDI_R3 rx, imm
- lsri \rx, 3
- andi \rx, (\imm >> 3)
-.endm
#endif /* __ASM_CSKY_ENTRY_H */
#define LSAVE_A1 28
#define LSAVE_A2 32
#define LSAVE_A3 36
+#define LSAVE_A4 40
+#define LSAVE_A5 44
#define KSPTOUSP
#define USPTOKSP
.endm
.macro RESTORE_ALL
- psrclr ie
ldw tls, (sp, 0)
ldw lr, (sp, 4)
ldw a0, (sp, 8)
*/
mfcr r6, cr<30, 15> /* Get MSA0 */
2:
- lsri r6, 28
- lsli r6, 28
+ lsri r6, 29
+ lsli r6, 29
addi r6, 0x1ce
mtcr r6, cr<30, 15> /* Set MSA0 */
jmpi 3f /* jump to va */
3:
.endm
-
-.macro ANDI_R3 rx, imm
- lsri \rx, 3
- andi \rx, (\imm >> 3)
-.endm
#endif /* __ASM_CSKY_ENTRY_H */
mov a0, lr
subi a0, 4
ldw a1, (sp, 24)
+ lrw a2, function_trace_op
+ ldw a2, (a2, 0)
jsr r26
#define TASK_UNMAPPED_BASE (TASK_SIZE / 3)
struct thread_struct {
- unsigned long ksp; /* kernel stack pointer */
- unsigned long sr; /* saved status register */
+ unsigned long sp; /* kernel stack pointer */
unsigned long trap_no; /* saved status register */
/* FPU regs */
};
#define INIT_THREAD { \
- .ksp = sizeof(init_stack) + (unsigned long) &init_stack, \
- .sr = DEFAULT_PSR_VALUE, \
+ .sp = sizeof(init_stack) + (unsigned long) &init_stack, \
}
/*
return regs->usp;
}
+static inline unsigned long frame_pointer(struct pt_regs *regs)
+{
+ return regs->regs[4];
+}
+static inline void frame_pointer_set(struct pt_regs *regs,
+ unsigned long val)
+{
+ regs->regs[4] = val;
+}
+
extern int regs_query_register_offset(const char *name);
extern unsigned long regs_get_kernel_stack_nth(struct pt_regs *regs,
unsigned int n);
#define THREAD_SIZE_ORDER (THREAD_SHIFT - PAGE_SHIFT)
#define thread_saved_fp(tsk) \
- ((unsigned long)(((struct switch_stack *)(tsk->thread.ksp))->r8))
+ ((unsigned long)(((struct switch_stack *)(tsk->thread.sp))->r8))
+
+#define thread_saved_sp(tsk) \
+ ((unsigned long)(tsk->thread.sp))
+
+#define thread_saved_lr(tsk) \
+ ((unsigned long)(((struct switch_stack *)(tsk->thread.sp))->r15))
static inline struct thread_info *current_thread_info(void)
{
#define TIF_SIGPENDING 0 /* signal pending */
#define TIF_NOTIFY_RESUME 1 /* callback before returning to user */
#define TIF_NEED_RESCHED 2 /* rescheduling necessary */
-#define TIF_SYSCALL_TRACE 3 /* syscall trace active */
-#define TIF_SYSCALL_TRACEPOINT 4 /* syscall tracepoint instrumentation */
-#define TIF_SYSCALL_AUDIT 5 /* syscall auditing */
-#define TIF_UPROBE 6 /* uprobe breakpoint or singlestep */
+#define TIF_UPROBE 3 /* uprobe breakpoint or singlestep */
+#define TIF_SYSCALL_TRACE 4 /* syscall trace active */
+#define TIF_SYSCALL_TRACEPOINT 5 /* syscall tracepoint instrumentation */
+#define TIF_SYSCALL_AUDIT 6 /* syscall auditing */
#define TIF_POLLING_NRFLAG 16 /* poll_idle() is TIF_NEED_RESCHED */
#define TIF_MEMDIE 18 /* is terminating due to OOM killer */
#define TIF_RESTORE_SIGMASK 20 /* restore signal mask in do_signal() */
#define _TIF_RESTORE_SIGMASK (1 << TIF_RESTORE_SIGMASK)
#define _TIF_SECCOMP (1 << TIF_SECCOMP)
+#define _TIF_WORK_MASK (_TIF_NEED_RESCHED | _TIF_SIGPENDING | \
+ _TIF_NOTIFY_RESUME | _TIF_UPROBE)
+
+#define _TIF_SYSCALL_WORK (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_AUDIT | \
+ _TIF_SYSCALL_TRACEPOINT)
+
#endif /* _ASM_CSKY_THREAD_INFO_H */
extern int __get_user_bad(void);
-#define __copy_user(to, from, n) \
+#define ___copy_to_user(to, from, n) \
do { \
int w0, w1, w2, w3; \
asm volatile( \
" subi %0, 4 \n" \
" br 3b \n" \
"5: cmpnei %0, 0 \n" /* 1B */ \
- " bf 8f \n" \
+ " bf 13f \n" \
" ldb %3, (%2, 0) \n" \
"6: stb %3, (%1, 0) \n" \
" addi %2, 1 \n" \
" addi %1, 1 \n" \
" subi %0, 1 \n" \
" br 5b \n" \
- "7: br 8f \n" \
+ "7: subi %0, 4 \n" \
+ "8: subi %0, 4 \n" \
+ "12: subi %0, 4 \n" \
+ " br 13f \n" \
".section __ex_table, \"a\" \n" \
".align 2 \n" \
- ".long 2b, 7b \n" \
- ".long 9b, 7b \n" \
- ".long 10b, 7b \n" \
+ ".long 2b, 13f \n" \
+ ".long 4b, 13f \n" \
+ ".long 6b, 13f \n" \
+ ".long 9b, 12b \n" \
+ ".long 10b, 8b \n" \
".long 11b, 7b \n" \
- ".long 4b, 7b \n" \
- ".long 6b, 7b \n" \
".previous \n" \
- "8: \n" \
+ "13: \n" \
: "=r"(n), "=r"(to), "=r"(from), "=r"(w0), \
"=r"(w1), "=r"(w2), "=r"(w3) \
: "0"(n), "1"(to), "2"(from) \
: "memory"); \
} while (0)
-#define __copy_user_zeroing(to, from, n) \
+#define ___copy_from_user(to, from, n) \
do { \
int tmp; \
int nsave; \
" addi %1, 1 \n" \
" subi %0, 1 \n" \
" br 5b \n" \
- "8: mov %3, %0 \n" \
- " movi %4, 0 \n" \
- "9: stb %4, (%1, 0) \n" \
- " addi %1, 1 \n" \
- " subi %3, 1 \n" \
- " cmpnei %3, 0 \n" \
- " bt 9b \n" \
- " br 7f \n" \
+ "8: stw %3, (%1, 0) \n" \
+ " subi %0, 4 \n" \
+ " bf 7f \n" \
+ "9: subi %0, 8 \n" \
+ " bf 7f \n" \
+ "13: stw %3, (%1, 8) \n" \
+ " subi %0, 12 \n" \
+ " bf 7f \n" \
".section __ex_table, \"a\" \n" \
".align 2 \n" \
- ".long 2b, 8b \n" \
+ ".long 2b, 7f \n" \
+ ".long 4b, 7f \n" \
+ ".long 6b, 7f \n" \
".long 10b, 8b \n" \
- ".long 11b, 8b \n" \
- ".long 12b, 8b \n" \
- ".long 4b, 8b \n" \
- ".long 6b, 8b \n" \
+ ".long 11b, 9b \n" \
+ ".long 12b,13b \n" \
".previous \n" \
"7: \n" \
: "=r"(n), "=r"(to), "=r"(from), "=r"(nsave), \
obj-y += entry.o atomic.o signal.o traps.o irq.o time.o vdso.o
obj-y += power.o syscall.o syscall_table.o setup.o
-obj-y += process.o cpu-probe.o ptrace.o dumpstack.o
+obj-y += process.o cpu-probe.o ptrace.o stacktrace.o
obj-y += probes/
obj-$(CONFIG_MODULES) += module.o
DEFINE(TASK_ACTIVE_MM, offsetof(struct task_struct, active_mm));
/* offsets into the thread struct */
- DEFINE(THREAD_KSP, offsetof(struct thread_struct, ksp));
- DEFINE(THREAD_SR, offsetof(struct thread_struct, sr));
+ DEFINE(THREAD_KSP, offsetof(struct thread_struct, sp));
DEFINE(THREAD_FESR, offsetof(struct thread_struct, user_fp.fesr));
DEFINE(THREAD_FCR, offsetof(struct thread_struct, user_fp.fcr));
DEFINE(THREAD_FPREG, offsetof(struct thread_struct, user_fp.vr));
+++ /dev/null
-// SPDX-License-Identifier: GPL-2.0
-// Copyright (C) 2018 Hangzhou C-SKY Microsystems co.,ltd.
-
-#include <linux/ptrace.h>
-
-int kstack_depth_to_print = 48;
-
-void show_trace(unsigned long *stack)
-{
- unsigned long *stack_end;
- unsigned long *stack_start;
- unsigned long *fp;
- unsigned long addr;
-
- addr = (unsigned long) stack & THREAD_MASK;
- stack_start = (unsigned long *) addr;
- stack_end = (unsigned long *) (addr + THREAD_SIZE);
-
- fp = stack;
- pr_info("\nCall Trace:");
-
- while (fp > stack_start && fp < stack_end) {
-#ifdef CONFIG_STACKTRACE
- addr = fp[1];
- fp = (unsigned long *) fp[0];
-#else
- addr = *fp++;
-#endif
- if (__kernel_text_address(addr))
- pr_cont("\n[<%08lx>] %pS", addr, (void *)addr);
- }
- pr_cont("\n");
-}
-
-void show_stack(struct task_struct *task, unsigned long *stack)
-{
- if (!stack) {
- if (task)
- stack = (unsigned long *)thread_saved_fp(task);
- else
-#ifdef CONFIG_STACKTRACE
- asm volatile("mov %0, r8\n":"=r"(stack)::"memory");
-#else
- stack = (unsigned long *)&stack;
-#endif
- }
-
- show_trace(stack);
-}
ENTRY(csky_systemcall)
SAVE_ALL TRAP0_SIZE
zero_fp
-#ifdef CONFIG_RSEQ_DEBUG
- mov a0, sp
- jbsr rseq_syscall
-#endif
psrset ee, ie
- lrw r11, __NR_syscalls
- cmphs syscallid, r11 /* Check nr of syscall */
- bt ret_from_exception
+ lrw r9, __NR_syscalls
+ cmphs syscallid, r9 /* Check nr of syscall */
+ bt 1f
- lrw r13, sys_call_table
- ixw r13, syscallid
- ldw r11, (r13)
- cmpnei r11, 0
+ lrw r9, sys_call_table
+ ixw r9, syscallid
+ ldw syscallid, (r9)
+ cmpnei syscallid, 0
bf ret_from_exception
mov r9, sp
bmaski r10, THREAD_SHIFT
andn r9, r10
- ldw r12, (r9, TINFO_FLAGS)
- ANDI_R3 r12, (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_TRACEPOINT | _TIF_SYSCALL_AUDIT)
- cmpnei r12, 0
+ ldw r10, (r9, TINFO_FLAGS)
+ lrw r9, _TIF_SYSCALL_WORK
+ and r10, r9
+ cmpnei r10, 0
bt csky_syscall_trace
#if defined(__CSKYABIV2__)
subi sp, 8
stw r5, (sp, 0x4)
stw r4, (sp, 0x0)
- jsr r11 /* Do system call */
+ jsr syscallid /* Do system call */
addi sp, 8
#else
- jsr r11
+ jsr syscallid
#endif
stw a0, (sp, LSAVE_A0) /* Save return value */
+1:
+#ifdef CONFIG_DEBUG_RSEQ
+ mov a0, sp
+ jbsr rseq_syscall
+#endif
jmpi ret_from_exception
csky_syscall_trace:
ldw a3, (sp, LSAVE_A3)
#if defined(__CSKYABIV2__)
subi sp, 8
- stw r5, (sp, 0x4)
- stw r4, (sp, 0x0)
+ ldw r9, (sp, LSAVE_A4)
+ stw r9, (sp, 0x0)
+ ldw r9, (sp, LSAVE_A5)
+ stw r9, (sp, 0x4)
+ jsr syscallid /* Do system call */
+ addi sp, 8
#else
ldw r6, (sp, LSAVE_A4)
ldw r7, (sp, LSAVE_A5)
-#endif
- jsr r11 /* Do system call */
-#if defined(__CSKYABIV2__)
- addi sp, 8
+ jsr syscallid /* Do system call */
#endif
stw a0, (sp, LSAVE_A0) /* Save return value */
+#ifdef CONFIG_DEBUG_RSEQ
+ mov a0, sp
+ jbsr rseq_syscall
+#endif
mov a0, sp /* right now, sp --> pt_regs */
jbsr syscall_trace_exit
br ret_from_exception
mov r9, sp
bmaski r10, THREAD_SHIFT
andn r9, r10
- ldw r12, (r9, TINFO_FLAGS)
- ANDI_R3 r12, (_TIF_SYSCALL_TRACE | _TIF_SYSCALL_TRACEPOINT | _TIF_SYSCALL_AUDIT)
- cmpnei r12, 0
+ ldw r10, (r9, TINFO_FLAGS)
+ lrw r9, _TIF_SYSCALL_WORK
+ and r10, r9
+ cmpnei r10, 0
bf ret_from_exception
mov a0, sp /* sp = pt_regs pointer */
jbsr syscall_trace_exit
ret_from_exception:
- ld syscallid, (sp, LSAVE_PSR)
- btsti syscallid, 31
- bt 1f
+ psrclr ie
+ ld r9, (sp, LSAVE_PSR)
+ btsti r9, 31
+ bt 1f
/*
* Load address of current->thread_info, Then get address of task_struct
* Get task_needreshed in task_struct
bmaski r10, THREAD_SHIFT
andn r9, r10
- ldw r12, (r9, TINFO_FLAGS)
- andi r12, (_TIF_SIGPENDING | _TIF_NOTIFY_RESUME | _TIF_NEED_RESCHED | _TIF_UPROBE)
- cmpnei r12, 0
+ ldw r10, (r9, TINFO_FLAGS)
+ lrw r9, _TIF_WORK_MASK
+ and r10, r9
+ cmpnei r10, 0
bt exit_work
1:
+#ifdef CONFIG_PREEMPTION
+ mov r9, sp
+ bmaski r10, THREAD_SHIFT
+ andn r9, r10
+
+ ldw r10, (r9, TINFO_PREEMPT)
+ cmpnei r10, 0
+ bt 2f
+ jbsr preempt_schedule_irq /* irq en/disable is done inside */
+2:
+#endif
+
#ifdef CONFIG_TRACE_IRQFLAGS
ld r10, (sp, LSAVE_PSR)
btsti r10, 6
RESTORE_ALL
exit_work:
- lrw syscallid, ret_from_exception
- mov lr, syscallid
+ lrw r9, ret_from_exception
+ mov lr, r9
- btsti r12, TIF_NEED_RESCHED
+ btsti r10, TIF_NEED_RESCHED
bt work_resched
+ psrset ie
mov a0, sp
- mov a1, r12
+ mov a1, r10
jmpi do_notify_resume
work_resched:
jbsr trace_hardirqs_off
#endif
-#ifdef CONFIG_PREEMPTION
- mov r9, sp /* Get current stack pointer */
- bmaski r10, THREAD_SHIFT
- andn r9, r10 /* Get thread_info */
-
- /*
- * Get task_struct->stack.preempt_count for current,
- * and increase 1.
- */
- ldw r12, (r9, TINFO_PREEMPT)
- addi r12, 1
- stw r12, (r9, TINFO_PREEMPT)
-#endif
mov a0, sp
jbsr csky_do_IRQ
-#ifdef CONFIG_PREEMPTION
- subi r12, 1
- stw r12, (r9, TINFO_PREEMPT)
- cmpnei r12, 0
- bt 2f
- ldw r12, (r9, TINFO_FLAGS)
- btsti r12, TIF_NEED_RESCHED
- bf 2f
- jbsr preempt_schedule_irq /* irq en/disable is done inside */
-#endif
-2:
jmpi ret_from_exception
/*
lrw a3, TASK_THREAD
addu a3, a0
- mfcr a2, psr /* Save PSR value */
- stw a2, (a3, THREAD_SR) /* Save PSR in task struct */
- bclri a2, 6 /* Disable interrupts */
- mtcr a2, psr
-
SAVE_SWITCH_STACK
stw sp, (a3, THREAD_KSP)
ldw sp, (a3, THREAD_KSP) /* Set next kernel sp */
- ldw a2, (a3, THREAD_SR) /* Set next PSR */
- mtcr a2, psr
-
#if defined(__CSKYABIV2__)
- addi r7, a1, TASK_THREAD_INFO
- ldw tls, (r7, TINFO_TP_VALUE)
+ addi a3, a1, TASK_THREAD_INFO
+ ldw tls, (a3, TINFO_TP_VALUE)
#endif
RESTORE_SWITCH_STACK
#endif /* CONFIG_DYNAMIC_FTRACE */
#endif /* CONFIG_FUNCTION_GRAPH_TRACER */
+#ifdef CONFIG_DYNAMIC_FTRACE
#ifndef CONFIG_CPU_HAS_ICACHE_INS
struct ftrace_modify_param {
int command;
stop_machine(__ftrace_modify_code, ¶m, cpu_online_mask);
}
#endif
+#endif /* CONFIG_DYNAMIC_FTRACE */
/* _mcount is defined in abi's mcount.S */
EXPORT_SYMBOL(_mcount);
static int unwind_frame_kernel(struct stackframe *frame)
{
- if (kstack_end((void *)frame->fp))
+ unsigned long low = (unsigned long)task_stack_page(current);
+ unsigned long high = low + THREAD_SIZE;
+
+ if (unlikely(frame->fp < low || frame->fp > high))
return -EPERM;
- if (frame->fp & 0x3 || frame->fp < TASK_SIZE)
+
+ if (kstack_end((void *)frame->fp) || frame->fp & 0x3)
return -EPERM;
*frame = *(struct stackframe *)frame->fp;
+
if (__kernel_text_address(frame->lr)) {
int graph = 0;
#define UPROBE_TRAP_NR UINT_MAX
+bool is_swbp_insn(uprobe_opcode_t *insn)
+{
+ return (*insn & 0xffff) == UPROBE_SWBP_INSN;
+}
+
unsigned long uprobe_get_swbp_addr(struct pt_regs *regs)
{
return instruction_pointer(regs);
*/
unsigned long thread_saved_pc(struct task_struct *tsk)
{
- struct switch_stack *sw = (struct switch_stack *)tsk->thread.ksp;
+ struct switch_stack *sw = (struct switch_stack *)tsk->thread.sp;
return sw->r15;
}
childstack = ((struct switch_stack *) childregs) - 1;
memset(childstack, 0, sizeof(struct switch_stack));
- /* setup ksp for switch_to !!! */
- p->thread.ksp = (unsigned long)childstack;
+ /* setup thread.sp for switch_to !!! */
+ p->thread.sp = (unsigned long)childstack;
if (unlikely(p->flags & PF_KTHREAD)) {
memset(childregs, 0, sizeof(struct pt_regs));
return 1;
}
-unsigned long get_wchan(struct task_struct *p)
-{
- unsigned long lr;
- unsigned long *fp, *stack_start, *stack_end;
- int count = 0;
-
- if (!p || p == current || p->state == TASK_RUNNING)
- return 0;
-
- stack_start = (unsigned long *)end_of_stack(p);
- stack_end = (unsigned long *)(task_stack_page(p) + THREAD_SIZE);
-
- fp = (unsigned long *) thread_saved_fp(p);
- do {
- if (fp < stack_start || fp > stack_end)
- return 0;
-#ifdef CONFIG_STACKTRACE
- lr = fp[1];
- fp = (unsigned long *)fp[0];
-#else
- lr = *fp++;
-#endif
- if (!in_sched_functions(lr) &&
- __kernel_text_address(lr))
- return lr;
- } while (count++ < 16);
-
- return 0;
-}
-EXPORT_SYMBOL(get_wchan);
-
#ifndef CONFIG_CPU_PM_NONE
void arch_cpu_idle(void)
{
regs = task_pt_regs(tsk);
regs->sr = (regs->sr & TRACE_MODE_MASK) | TRACE_MODE_RUN;
+
+ /* Enable irq */
+ regs->sr |= BIT(6);
}
static void singlestep_enable(struct task_struct *tsk)
regs = task_pt_regs(tsk);
regs->sr = (regs->sr & TRACE_MODE_MASK) | TRACE_MODE_SI;
+
+ /* Disable irq */
+ regs->sr &= ~BIT(6);
}
/*
// SPDX-License-Identifier: GPL-2.0
-/* Copyright (C) 2018 Hangzhou C-SKY Microsystems co.,ltd. */
#include <linux/sched/debug.h>
#include <linux/sched/task_stack.h>
#include <linux/stacktrace.h>
#include <linux/ftrace.h>
+#include <linux/ptrace.h>
-void save_stack_trace(struct stack_trace *trace)
+#ifdef CONFIG_FRAME_POINTER
+
+struct stackframe {
+ unsigned long fp;
+ unsigned long ra;
+};
+
+void notrace walk_stackframe(struct task_struct *task, struct pt_regs *regs,
+ bool (*fn)(unsigned long, void *), void *arg)
{
- save_stack_trace_tsk(current, trace);
+ unsigned long fp, sp, pc;
+
+ if (regs) {
+ fp = frame_pointer(regs);
+ sp = user_stack_pointer(regs);
+ pc = instruction_pointer(regs);
+ } else if (task == NULL || task == current) {
+ const register unsigned long current_sp __asm__ ("sp");
+ const register unsigned long current_fp __asm__ ("r8");
+ fp = current_fp;
+ sp = current_sp;
+ pc = (unsigned long)walk_stackframe;
+ } else {
+ /* task blocked in __switch_to */
+ fp = thread_saved_fp(task);
+ sp = thread_saved_sp(task);
+ pc = thread_saved_lr(task);
+ }
+
+ for (;;) {
+ unsigned long low, high;
+ struct stackframe *frame;
+
+ if (unlikely(!__kernel_text_address(pc) || fn(pc, arg)))
+ break;
+
+ /* Validate frame pointer */
+ low = sp;
+ high = ALIGN(sp, THREAD_SIZE);
+ if (unlikely(fp < low || fp > high || fp & 0x3))
+ break;
+ /* Unwind stack frame */
+ frame = (struct stackframe *)fp;
+ sp = fp;
+ fp = frame->fp;
+ pc = ftrace_graph_ret_addr(current, NULL, frame->ra,
+ (unsigned long *)(fp - 8));
+ }
}
-EXPORT_SYMBOL_GPL(save_stack_trace);
-void save_stack_trace_tsk(struct task_struct *tsk, struct stack_trace *trace)
+#else /* !CONFIG_FRAME_POINTER */
+
+static void notrace walk_stackframe(struct task_struct *task,
+ struct pt_regs *regs, bool (*fn)(unsigned long, void *), void *arg)
{
- unsigned long *fp, *stack_start, *stack_end;
- unsigned long addr;
- int skip = trace->skip;
- int savesched;
- int graph_idx = 0;
+ unsigned long sp, pc;
+ unsigned long *ksp;
- if (tsk == current) {
- asm volatile("mov %0, r8\n":"=r"(fp));
- savesched = 1;
+ if (regs) {
+ sp = user_stack_pointer(regs);
+ pc = instruction_pointer(regs);
+ } else if (task == NULL || task == current) {
+ const register unsigned long current_sp __asm__ ("sp");
+ sp = current_sp;
+ pc = (unsigned long)walk_stackframe;
} else {
- fp = (unsigned long *)thread_saved_fp(tsk);
- savesched = 0;
+ /* task blocked in __switch_to */
+ sp = thread_saved_sp(task);
+ pc = thread_saved_lr(task);
}
- addr = (unsigned long) fp & THREAD_MASK;
- stack_start = (unsigned long *) addr;
- stack_end = (unsigned long *) (addr + THREAD_SIZE);
-
- while (fp > stack_start && fp < stack_end) {
- unsigned long lpp, fpp;
+ if (unlikely(sp & 0x3))
+ return;
- fpp = fp[0];
- lpp = fp[1];
- if (!__kernel_text_address(lpp))
+ ksp = (unsigned long *)sp;
+ while (!kstack_end(ksp)) {
+ if (__kernel_text_address(pc) && unlikely(fn(pc, arg)))
break;
- else
- lpp = ftrace_graph_ret_addr(tsk, &graph_idx, lpp, NULL);
-
- if (savesched || !in_sched_functions(lpp)) {
- if (skip) {
- skip--;
- } else {
- trace->entries[trace->nr_entries++] = lpp;
- if (trace->nr_entries >= trace->max_entries)
- break;
- }
- }
- fp = (unsigned long *)fpp;
+ pc = (*ksp++) - 0x4;
}
}
+#endif /* CONFIG_FRAME_POINTER */
+
+static bool print_trace_address(unsigned long pc, void *arg)
+{
+ print_ip_sym(pc);
+ return false;
+}
+
+void show_stack(struct task_struct *task, unsigned long *sp)
+{
+ pr_cont("Call Trace:\n");
+ walk_stackframe(task, NULL, print_trace_address, NULL);
+}
+
+static bool save_wchan(unsigned long pc, void *arg)
+{
+ if (!in_sched_functions(pc)) {
+ unsigned long *p = arg;
+ *p = pc;
+ return true;
+ }
+ return false;
+}
+
+unsigned long get_wchan(struct task_struct *task)
+{
+ unsigned long pc = 0;
+
+ if (likely(task && task != current && task->state != TASK_RUNNING))
+ walk_stackframe(task, NULL, save_wchan, &pc);
+ return pc;
+}
+
+#ifdef CONFIG_STACKTRACE
+static bool __save_trace(unsigned long pc, void *arg, bool nosched)
+{
+ struct stack_trace *trace = arg;
+
+ if (unlikely(nosched && in_sched_functions(pc)))
+ return false;
+ if (unlikely(trace->skip > 0)) {
+ trace->skip--;
+ return false;
+ }
+
+ trace->entries[trace->nr_entries++] = pc;
+ return (trace->nr_entries >= trace->max_entries);
+}
+
+static bool save_trace(unsigned long pc, void *arg)
+{
+ return __save_trace(pc, arg, false);
+}
+
+/*
+ * Save stack-backtrace addresses into a stack_trace buffer.
+ */
+void save_stack_trace_tsk(struct task_struct *tsk, struct stack_trace *trace)
+{
+ walk_stackframe(tsk, NULL, save_trace, trace);
+}
EXPORT_SYMBOL_GPL(save_stack_trace_tsk);
+
+void save_stack_trace(struct stack_trace *trace)
+{
+ save_stack_trace_tsk(NULL, trace);
+}
+EXPORT_SYMBOL_GPL(save_stack_trace);
+
+#endif /* CONFIG_STACKTRACE */
unsigned long raw_copy_from_user(void *to, const void *from,
unsigned long n)
{
- if (access_ok(from, n))
- __copy_user_zeroing(to, from, n);
- else
- memset(to, 0, n);
+ ___copy_from_user(to, from, n);
return n;
}
EXPORT_SYMBOL(raw_copy_from_user);
unsigned long raw_copy_to_user(void *to, const void *from,
unsigned long n)
{
- if (access_ok(to, n))
- __copy_user(to, from, n);
+ ___copy_to_user(to, from, n);
return n;
}
EXPORT_SYMBOL(raw_copy_to_user);
#define _ASM_IA64_DEVICE_H
struct dev_archdata {
-#ifdef CONFIG_INTEL_IOMMU
+#ifdef CONFIG_IOMMU_API
void *iommu; /* hook for IOMMU specific extension */
#endif
};
unsigned long sal_systab_phys = EFI_INVALID_TABLE_ADDR;
static const efi_config_table_type_t arch_tables[] __initconst = {
- {ESI_TABLE_GUID, "ESI", &esi_phys},
- {HCDP_TABLE_GUID, "HCDP", &hcdp_phys},
- {MPS_TABLE_GUID, "MPS", &mps_phys},
- {PROCESSOR_ABSTRACTION_LAYER_OVERWRITE_GUID, "PALO", &palo_phys},
- {SAL_SYSTEM_TABLE_GUID, "SALsystab", &sal_systab_phys},
- {NULL_GUID, NULL, 0},
+ {ESI_TABLE_GUID, &esi_phys, "ESI" },
+ {HCDP_TABLE_GUID, &hcdp_phys, "HCDP" },
+ {MPS_TABLE_GUID, &mps_phys, "MPS" },
+ {PROCESSOR_ABSTRACTION_LAYER_OVERWRITE_GUID, &palo_phys, "PALO" },
+ {SAL_SYSTEM_TABLE_GUID, &sal_systab_phys, "SALsystab" },
+ {},
};
extern efi_status_t efi_call_phys (void *, ...);
#include <linux/export.h>
#include <linux/string.h>
-#include <linux/cryptohash.h>
#include <linux/delay.h>
#include <linux/in6.h>
#include <linux/syscalls.h>
#include <linux/module.h>
#include <linux/string.h>
#include <asm/byteorder.h>
-#include <linux/cryptohash.h>
#include <asm/octeon/octeon.h>
#include <crypto/internal/hash.h>
else
return -EFAULT;
- if (!capable(CAP_SYS_ADMIN))
+ if (!perfmon_capable())
return -EACCES;
if (count != sizeof(uint32_t))
> BITS_PER_LONG);
high_memory = __va((max_pfn << PAGE_SHIFT));
- set_max_mapnr(page_to_pfn(virt_to_page(high_memory - 1)) + 1);
+ set_max_mapnr(max_low_pfn);
memblock_free_all();
#ifdef CONFIG_PA11
select ARCH_HAS_MMIOWB if PPC64
select ARCH_HAS_PHYS_TO_DMA
select ARCH_HAS_PMEM_API
+ select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
select ARCH_HAS_PTE_DEVMAP if PPC_BOOK3S_64
select ARCH_HAS_PTE_SPECIAL
select ARCH_HAS_MEMBARRIER_CALLBACKS
select ARCH_HAS_SCALED_CPUTIME if VIRT_CPU_ACCOUNTING_NATIVE && PPC_BOOK3S_64
- select ARCH_HAS_STRICT_KERNEL_RWX if ((PPC_BOOK3S_64 || PPC32) && !HIBERNATION)
+ select ARCH_HAS_STRICT_KERNEL_RWX if (PPC32 && !HIBERNATION)
select ARCH_HAS_TICK_BROADCAST if GENERIC_CLOCKEVENTS_BROADCAST
select ARCH_HAS_UACCESS_FLUSHCACHE
select ARCH_HAS_UACCESS_MCSAFE if PPC64
#include <linux/init.h>
#include <linux/module.h>
#include <linux/mm.h>
-#include <linux/cryptohash.h>
#include <linux/types.h>
#include <crypto/md5.h>
#include <asm/byteorder.h>
#include <linux/init.h>
#include <linux/module.h>
#include <linux/mm.h>
-#include <linux/cryptohash.h>
#include <linux/types.h>
#include <crypto/sha.h>
#include <asm/byteorder.h>
#include <linux/init.h>
#include <linux/module.h>
#include <linux/mm.h>
-#include <linux/cryptohash.h>
#include <linux/types.h>
#include <crypto/sha.h>
#include <asm/byteorder.h>
-extern void powerpc_sha_transform(u32 *state, const u8 *src, u32 *temp);
+void powerpc_sha_transform(u32 *state, const u8 *src);
-static int sha1_init(struct shash_desc *desc)
+static int powerpc_sha1_init(struct shash_desc *desc)
{
struct sha1_state *sctx = shash_desc_ctx(desc);
return 0;
}
-static int sha1_update(struct shash_desc *desc, const u8 *data,
- unsigned int len)
+static int powerpc_sha1_update(struct shash_desc *desc, const u8 *data,
+ unsigned int len)
{
struct sha1_state *sctx = shash_desc_ctx(desc);
unsigned int partial, done;
src = data;
if ((partial + len) > 63) {
- u32 temp[SHA_WORKSPACE_WORDS];
if (partial) {
done = -partial;
}
do {
- powerpc_sha_transform(sctx->state, src, temp);
+ powerpc_sha_transform(sctx->state, src);
done += 64;
src = data + done;
} while (done + 63 < len);
- memzero_explicit(temp, sizeof(temp));
partial = 0;
}
memcpy(sctx->buffer + partial, src, len - done);
/* Add padding and return the message digest. */
-static int sha1_final(struct shash_desc *desc, u8 *out)
+static int powerpc_sha1_final(struct shash_desc *desc, u8 *out)
{
struct sha1_state *sctx = shash_desc_ctx(desc);
__be32 *dst = (__be32 *)out;
/* Pad out to 56 mod 64 */
index = sctx->count & 0x3f;
padlen = (index < 56) ? (56 - index) : ((64+56) - index);
- sha1_update(desc, padding, padlen);
+ powerpc_sha1_update(desc, padding, padlen);
/* Append length */
- sha1_update(desc, (const u8 *)&bits, sizeof(bits));
+ powerpc_sha1_update(desc, (const u8 *)&bits, sizeof(bits));
/* Store state in digest */
for (i = 0; i < 5; i++)
return 0;
}
-static int sha1_export(struct shash_desc *desc, void *out)
+static int powerpc_sha1_export(struct shash_desc *desc, void *out)
{
struct sha1_state *sctx = shash_desc_ctx(desc);
return 0;
}
-static int sha1_import(struct shash_desc *desc, const void *in)
+static int powerpc_sha1_import(struct shash_desc *desc, const void *in)
{
struct sha1_state *sctx = shash_desc_ctx(desc);
static struct shash_alg alg = {
.digestsize = SHA1_DIGEST_SIZE,
- .init = sha1_init,
- .update = sha1_update,
- .final = sha1_final,
- .export = sha1_export,
- .import = sha1_import,
+ .init = powerpc_sha1_init,
+ .update = powerpc_sha1_update,
+ .final = powerpc_sha1_final,
+ .export = powerpc_sha1_export,
+ .import = powerpc_sha1_import,
.descsize = sizeof(struct sha1_state),
.statesize = sizeof(struct sha1_state),
.base = {
#include <linux/init.h>
#include <linux/module.h>
#include <linux/mm.h>
-#include <linux/cryptohash.h>
#include <linux/types.h>
#include <crypto/sha.h>
#include <asm/byteorder.h>
* updating the accessed and modified bits in the page table tree.
*/
-#define _PAGE_USER 0x001 /* usermode access allowed */
-#define _PAGE_RW 0x002 /* software: user write access allowed */
-#define _PAGE_PRESENT 0x004 /* software: pte contains a translation */
+#define _PAGE_PRESENT 0x001 /* software: pte contains a translation */
+#define _PAGE_HASHPTE 0x002 /* hash_page has made an HPTE for this pte */
+#define _PAGE_USER 0x004 /* usermode access allowed */
#define _PAGE_GUARDED 0x008 /* G: prohibit speculative access */
#define _PAGE_COHERENT 0x010 /* M: enforce memory coherence (SMP systems) */
#define _PAGE_NO_CACHE 0x020 /* I: cache inhibit */
#define _PAGE_DIRTY 0x080 /* C: page changed */
#define _PAGE_ACCESSED 0x100 /* R: page referenced */
#define _PAGE_EXEC 0x200 /* software: exec allowed */
-#define _PAGE_HASHPTE 0x400 /* hash_page has made an HPTE for this pte */
+#define _PAGE_RW 0x400 /* software: user write access allowed */
#define _PAGE_SPECIAL 0x800 /* software: Special page */
#ifdef CONFIG_PTE_64BIT
.macro kuap_check current, gpr
#ifdef CONFIG_PPC_KUAP_DEBUG
- lwz \gpr2, KUAP(thread)
+ lwz \gpr, KUAP(thread)
999: twnei \gpr, 0
EMIT_BUG_ENTRY 999b, __FILE__, __LINE__, (BUGFLAG_WARNING | BUGFLAG_ONCE)
#endif
} \
} while(0)
+static inline bool __lazy_irq_pending(u8 irq_happened)
+{
+ return !!(irq_happened & ~PACA_IRQ_HARD_DIS);
+}
+
+/*
+ * Check if a lazy IRQ is pending. Should be called with IRQs hard disabled.
+ */
static inline bool lazy_irq_pending(void)
{
- return !!(get_paca()->irq_happened & ~PACA_IRQ_HARD_DIS);
+ return __lazy_irq_pending(get_paca()->irq_happened);
+}
+
+/*
+ * Check if a lazy IRQ is pending, with no debugging checks.
+ * Should be called with IRQs hard disabled.
+ * For use in RI disabled code or other constrained situations.
+ */
+static inline bool lazy_irq_pending_nocheck(void)
+{
+ return __lazy_irq_pending(local_paca->irq_happened);
}
/*
({ \
long __pu_err; \
__typeof__(*(ptr)) __user *__pu_addr = (ptr); \
+ __typeof__(*(ptr)) __pu_val = (x); \
+ __typeof__(size) __pu_size = (size); \
+ \
if (!is_kernel_addr((unsigned long)__pu_addr)) \
might_fault(); \
- __chk_user_ptr(ptr); \
+ __chk_user_ptr(__pu_addr); \
if (do_allow) \
- __put_user_size((x), __pu_addr, (size), __pu_err); \
+ __put_user_size(__pu_val, __pu_addr, __pu_size, __pu_err); \
else \
- __put_user_size_allowed((x), __pu_addr, (size), __pu_err); \
+ __put_user_size_allowed(__pu_val, __pu_addr, __pu_size, __pu_err); \
+ \
__pu_err; \
})
({ \
long __pu_err = -EFAULT; \
__typeof__(*(ptr)) __user *__pu_addr = (ptr); \
+ __typeof__(*(ptr)) __pu_val = (x); \
+ __typeof__(size) __pu_size = (size); \
+ \
might_fault(); \
- if (access_ok(__pu_addr, size)) \
- __put_user_size((x), __pu_addr, (size), __pu_err); \
+ if (access_ok(__pu_addr, __pu_size)) \
+ __put_user_size(__pu_val, __pu_addr, __pu_size, __pu_err); \
+ \
__pu_err; \
})
({ \
long __pu_err; \
__typeof__(*(ptr)) __user *__pu_addr = (ptr); \
- __chk_user_ptr(ptr); \
- __put_user_size((x), __pu_addr, (size), __pu_err); \
+ __typeof__(*(ptr)) __pu_val = (x); \
+ __typeof__(size) __pu_size = (size); \
+ \
+ __chk_user_ptr(__pu_addr); \
+ __put_user_size(__pu_val, __pu_addr, __pu_size, __pu_err); \
+ \
__pu_err; \
})
long __gu_err; \
__long_type(*(ptr)) __gu_val; \
__typeof__(*(ptr)) __user *__gu_addr = (ptr); \
- __chk_user_ptr(ptr); \
+ __typeof__(size) __gu_size = (size); \
+ \
+ __chk_user_ptr(__gu_addr); \
if (!is_kernel_addr((unsigned long)__gu_addr)) \
might_fault(); \
barrier_nospec(); \
if (do_allow) \
- __get_user_size(__gu_val, __gu_addr, (size), __gu_err); \
+ __get_user_size(__gu_val, __gu_addr, __gu_size, __gu_err); \
else \
- __get_user_size_allowed(__gu_val, __gu_addr, (size), __gu_err); \
+ __get_user_size_allowed(__gu_val, __gu_addr, __gu_size, __gu_err); \
(x) = (__typeof__(*(ptr)))__gu_val; \
+ \
__gu_err; \
})
long __gu_err = -EFAULT; \
__long_type(*(ptr)) __gu_val = 0; \
__typeof__(*(ptr)) __user *__gu_addr = (ptr); \
+ __typeof__(size) __gu_size = (size); \
+ \
might_fault(); \
- if (access_ok(__gu_addr, (size))) { \
+ if (access_ok(__gu_addr, __gu_size)) { \
barrier_nospec(); \
- __get_user_size(__gu_val, __gu_addr, (size), __gu_err); \
+ __get_user_size(__gu_val, __gu_addr, __gu_size, __gu_err); \
} \
(x) = (__force __typeof__(*(ptr)))__gu_val; \
+ \
__gu_err; \
})
long __gu_err; \
__long_type(*(ptr)) __gu_val; \
__typeof__(*(ptr)) __user *__gu_addr = (ptr); \
- __chk_user_ptr(ptr); \
+ __typeof__(size) __gu_size = (size); \
+ \
+ __chk_user_ptr(__gu_addr); \
barrier_nospec(); \
- __get_user_size(__gu_val, __gu_addr, (size), __gu_err); \
+ __get_user_size(__gu_val, __gu_addr, __gu_size, __gu_err); \
(x) = (__force __typeof__(*(ptr)))__gu_val; \
+ \
__gu_err; \
})
GCOV_PROFILE_kprobes-ftrace.o := n
KCOV_INSTRUMENT_kprobes-ftrace.o := n
UBSAN_SANITIZE_kprobes-ftrace.o := n
+GCOV_PROFILE_syscall_64.o := n
+KCOV_INSTRUMENT_syscall_64.o := n
+UBSAN_SANITIZE_syscall_64.o := n
UBSAN_SANITIZE_vdso.o := n
# Necessary for booting with kcov enabled on book3e machines
#ifdef CONFIG_PPC_BOOK3S
/*
* If MSR EE/RI was never enabled, IRQs not reconciled, NVGPRs not
- * touched, AMR not set, no exit work created, then this can be used.
+ * touched, no exit work created, then this can be used.
*/
.balign IFETCH_ALIGN_BYTES
.globl fast_interrupt_return
fast_interrupt_return:
_ASM_NOKPROBE_SYMBOL(fast_interrupt_return)
+ kuap_check_amr r3, r4
ld r4,_MSR(r1)
andi. r0,r4,MSR_PR
bne .Lfast_user_interrupt_return
+ kuap_restore_amr r3
andi. r0,r4,MSR_RI
li r3,0 /* 0 return value, no EMULATE_STACK_STORE */
bne+ .Lfast_kernel_interrupt_return
ld r10,SOFTE(r1)
stb r10,PACAIRQSOFTMASK(r13)
+ kuap_restore_amr r10
EXCEPTION_RESTORE_REGS
RFI_TO_USER_OR_KERNEL
GEN_COMMON facility_unavailable
addi r3,r1,STACK_FRAME_OVERHEAD
bl facility_unavailable_exception
+ REST_NVGPRS(r1) /* instruction emulation may change GPRs */
b interrupt_return
GEN_KVM facility_unavailable
GEN_COMMON h_facility_unavailable
addi r3,r1,STACK_FRAME_OVERHEAD
bl facility_unavailable_exception
+ REST_NVGPRS(r1) /* XXX Shouldn't be necessary in practice */
b interrupt_return
GEN_KVM h_facility_unavailable
andis. r0, r5, (DSISR_BAD_FAULT_32S | DSISR_DABRMATCH)@h
#endif
bne handle_page_fault_tramp_2 /* if not, try to put a PTE */
- rlwinm r3, r5, 32 - 24, 30, 30 /* DSISR_STORE -> _PAGE_RW */
+ rlwinm r3, r5, 32 - 15, 21, 21 /* DSISR_STORE -> _PAGE_RW */
bl hash_page
b handle_page_fault_tramp_1
FTR_SECTION_ELSE
andc. r1,r1,r0 /* check access & ~permission */
bne- InstructionAddressInvalid /* return if access not permitted */
/* Convert linux-style PTE to low word of PPC-style PTE */
+ rlwimi r0,r0,32-2,31,31 /* _PAGE_USER -> PP lsb */
ori r1, r1, 0xe06 /* clear out reserved bits */
andc r1, r0, r1 /* PP = user? 1 : 0 */
BEGIN_FTR_SECTION
* we would need to update the pte atomically with lwarx/stwcx.
*/
/* Convert linux-style PTE to low word of PPC-style PTE */
- rlwinm r1,r0,0,30,30 /* _PAGE_RW -> PP msb */
- rlwimi r0,r0,1,30,30 /* _PAGE_USER -> PP msb */
+ rlwinm r1,r0,32-9,30,30 /* _PAGE_RW -> PP msb */
+ rlwimi r0,r0,32-1,30,30 /* _PAGE_USER -> PP msb */
+ rlwimi r0,r0,32-1,31,31 /* _PAGE_USER -> PP lsb */
ori r1,r1,0xe04 /* clear out reserved bits */
andc r1,r0,r1 /* PP = user? rw? 1: 3: 0 */
BEGIN_FTR_SECTION
* we would need to update the pte atomically with lwarx/stwcx.
*/
/* Convert linux-style PTE to low word of PPC-style PTE */
+ rlwimi r0,r0,32-2,31,31 /* _PAGE_USER -> PP lsb */
li r1,0xe06 /* clear out reserved bits & PP msb */
andc r1,r0,r1 /* PP = user? 1: 0 */
BEGIN_FTR_SECTION
/* 0x0C00 - System Call Exception */
START_EXCEPTION(0x0C00, SystemCall)
SYSCALL_ENTRY 0xc00
+/* Trap_0D is commented out to get more space for system call exception */
- EXCEPTION(0x0D00, Trap_0D, unknown_exception, EXC_XFER_STD)
+/* EXCEPTION(0x0D00, Trap_0D, unknown_exception, EXC_XFER_STD) */
EXCEPTION(0x0E00, Trap_0E, unknown_exception, EXC_XFER_STD)
EXCEPTION(0x0F00, Trap_0F, unknown_exception, EXC_XFER_STD)
* to be stored as an xattr or as an appended signature.
*
* To avoid duplicate signature verification as much as possible, the IMA
- * policy rule for module appraisal is added only if CONFIG_MODULE_SIG_FORCE
+ * policy rule for module appraisal is added only if CONFIG_MODULE_SIG
* is not enabled.
*/
static const char *const secure_rules[] = {
"appraise func=KEXEC_KERNEL_CHECK appraise_flag=check_blacklist appraise_type=imasig|modsig",
-#ifndef CONFIG_MODULE_SIG_FORCE
+#ifndef CONFIG_MODULE_SIG
"appraise func=MODULE_CHECK appraise_flag=check_blacklist appraise_type=imasig|modsig",
#endif
NULL
"measure func=KEXEC_KERNEL_CHECK template=ima-modsig",
"measure func=MODULE_CHECK template=ima-modsig",
"appraise func=KEXEC_KERNEL_CHECK appraise_flag=check_blacklist appraise_type=imasig|modsig",
-#ifndef CONFIG_MODULE_SIG_FORCE
+#ifndef CONFIG_MODULE_SIG
"appraise func=MODULE_CHECK appraise_flag=check_blacklist appraise_type=imasig|modsig",
#endif
NULL
int rc = -1;
switch (reason) {
- case KMSG_DUMP_RESTART:
- case KMSG_DUMP_HALT:
- case KMSG_DUMP_POWEROFF:
+ case KMSG_DUMP_SHUTDOWN:
/* These are almost always orderly shutdowns. */
return;
case KMSG_DUMP_OOPS:
BUG_ON(!FULL_REGS(regs));
BUG_ON(regs->softe != IRQS_ENABLED);
+ kuap_check_amr();
+
account_cpu_user_entry();
#ifdef CONFIG_PPC_SPLPAR
}
#endif
- kuap_check_amr();
-
/*
* This is not required for the syscall exit path, but makes the
* stack frame look nicer. If this was initialised in the first stack
unsigned long ti_flags;
unsigned long ret = 0;
+ kuap_check_amr();
+
regs->result = r3;
/* Check whether the syscall is issued inside a restartable sequence */
/* This pattern matches prep_irq_for_idle */
__hard_EE_RI_disable();
- if (unlikely(lazy_irq_pending())) {
+ if (unlikely(lazy_irq_pending_nocheck())) {
__hard_RI_enable();
trace_hardirqs_off();
local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
local_paca->tm_scratch = regs->msr;
#endif
- kuap_check_amr();
-
account_cpu_user_exit();
return ret;
BUG_ON(!FULL_REGS(regs));
BUG_ON(regs->softe != IRQS_ENABLED);
+ kuap_check_amr();
+
local_irq_save(flags);
again:
trace_hardirqs_on();
__hard_EE_RI_disable();
- if (unlikely(lazy_irq_pending())) {
+ if (unlikely(lazy_irq_pending_nocheck())) {
__hard_RI_enable();
trace_hardirqs_off();
local_paca->irq_happened |= PACA_IRQ_HARD_DIS;
local_paca->tm_scratch = regs->msr;
#endif
- kuap_check_amr();
-
account_cpu_user_exit();
return ret;
BUG_ON(regs->msr & MSR_PR);
BUG_ON(!FULL_REGS(regs));
+ kuap_check_amr();
+
if (unlikely(*ti_flagsp & _TIF_EMULATE_STACK_STORE)) {
clear_bits(_TIF_EMULATE_STACK_STORE, ti_flagsp);
ret = 1;
trace_hardirqs_on();
__hard_EE_RI_disable();
- if (unlikely(lazy_irq_pending())) {
+ if (unlikely(lazy_irq_pending_nocheck())) {
__hard_RI_enable();
irq_soft_mask_set(IRQS_ALL_DISABLED);
trace_hardirqs_off();
void system_reset_exception(struct pt_regs *regs)
{
unsigned long hsrr0, hsrr1;
- bool nested = in_nmi();
bool saved_hsrrs = false;
- /*
- * Avoid crashes in case of nested NMI exceptions. Recoverability
- * is determined by RI and in_nmi
- */
- if (!nested)
- nmi_enter();
+ nmi_enter();
/*
* System reset can interrupt code where HSRRs are live and MSR[RI]=1.
mtspr(SPRN_HSRR1, hsrr1);
}
- if (!nested)
- nmi_exit();
+ nmi_exit();
/* What should we do here? We could issue a shutdown or hard reset. */
}
void machine_check_exception(struct pt_regs *regs)
{
int recover = 0;
- bool nested = in_nmi();
- if (!nested)
- nmi_enter();
+
+ nmi_enter();
__this_cpu_inc(irq_stat.mce_exceptions);
if (check_io_access(regs))
goto bail;
- if (!nested)
- nmi_exit();
+ nmi_exit();
die("Machine check", regs, SIGBUS);
return;
bail:
- if (!nested)
- nmi_exit();
+ nmi_exit();
}
void SMIException(struct pt_regs *regs)
blr
/*
- * invalid clock
+ * syscall fallback
*/
99:
- li r3, EINVAL
- crset so
+ li r0,__NR_clock_getres
+ sc
blr
.cfi_endproc
V_FUNCTION_END(__kernel_clock_getres)
#ifdef CONFIG_PPC64
*(.tramp.ftrace.text);
#endif
+ NOINSTR_TEXT
SCHED_TEXT
CPUIDLE_TEXT
LOCK_TEXT
/*
* Load a PTE into the hash table, if possible.
* The address is in r4, and r3 contains an access flag:
- * _PAGE_RW (0x002) if a write.
+ * _PAGE_RW (0x400) if a write.
* r9 contains the SRR1 value, from which we use the MSR_PR bit.
* SPRG_THREAD contains the physical address of the current task's thread.
*
blt+ 112f /* assume user more likely */
lis r5, (swapper_pg_dir - PAGE_OFFSET)@ha /* if kernel address, use */
addi r5 ,r5 ,(swapper_pg_dir - PAGE_OFFSET)@l /* kernel page table */
- rlwimi r3,r9,32-14,31,31 /* MSR_PR -> _PAGE_USER */
+ rlwimi r3,r9,32-12,29,29 /* MSR_PR -> _PAGE_USER */
112:
#ifndef CONFIG_PTE_64BIT
rlwimi r5,r4,12,20,29 /* insert top 10 bits of address */
#else
rlwimi r8,r4,23,20,28 /* compute pte address */
#endif
- rlwinm r0,r3,6,24,24 /* _PAGE_RW access -> _PAGE_DIRTY */
+ rlwinm r0,r3,32-3,24,24 /* _PAGE_RW access -> _PAGE_DIRTY */
ori r0,r0,_PAGE_ACCESSED|_PAGE_HASHPTE
/*
_GLOBAL(create_hpte)
/* Convert linux-style PTE (r5) to low word of PPC-style PTE (r8) */
+ rlwinm r8,r5,32-9,30,30 /* _PAGE_RW -> PP msb */
rlwinm r0,r5,32-6,30,30 /* _PAGE_DIRTY -> PP msb */
- and r8,r5,r0 /* writable if _RW & _DIRTY */
- rlwimi r5,r5,1,30,30 /* _PAGE_USER -> PP msb */
+ and r8,r8,r0 /* writable if _RW & _DIRTY */
+ rlwimi r5,r5,32-1,30,30 /* _PAGE_USER -> PP msb */
+ rlwimi r5,r5,32-2,31,31 /* _PAGE_USER -> PP lsb */
ori r8,r8,0xe04 /* clear out reserved bits */
andc r8,r5,r8 /* PP = user? (rw&dirty? 1: 3): 0 */
BEGIN_FTR_SECTION
33: lwarx r8,0,r5 /* fetch the pte flags word */
andi. r0,r8,_PAGE_HASHPTE
beq 8f /* done if HASHPTE is already clear */
- rlwinm r8,r8,0,~_PAGE_HASHPTE /* clear HASHPTE bit */
+ rlwinm r8,r8,0,31,29 /* clear HASHPTE bit */
stwcx. r8,0,r5 /* update the pte */
bne- 33b
if (event->attr.type != event->pmu->type)
return -ENOENT;
- if (!capable(CAP_SYS_ADMIN))
+ if (!perfmon_capable())
return -EACCES;
/* Sampling not supported */
if (event->attr.type != event->pmu->type)
return -ENOENT;
- if (!capable(CAP_SYS_ADMIN))
+ if (!perfmon_capable())
return -EACCES;
/* Return if this is a couting event */
select GENERIC_ARCH_TOPOLOGY if SMP
select ARCH_HAS_PTE_SPECIAL
select ARCH_HAS_MMIOWB
- select ARCH_HAS_DEBUG_VIRTUAL
+ select ARCH_HAS_DEBUG_VIRTUAL if MMU
select HAVE_EBPF_JIT if MMU
select EDAC_SUPPORT
select ARCH_HAS_GIGANTIC_PAGE
def_bool y
config SYS_SUPPORTS_HUGETLBFS
+ depends on MMU
def_bool y
config STACKTRACE_SUPPORT
This enables support for SiFive SoC platform hardware.
config SOC_VIRT
- bool "QEMU Virt Machine"
- select POWER_RESET_SYSCON
- select POWER_RESET_SYSCON_POWEROFF
- select GOLDFISH
- select RTC_DRV_GOLDFISH
- select SIFIVE_PLIC
- help
- This enables support for QEMU Virt Machine.
+ bool "QEMU Virt Machine"
+ select POWER_RESET
+ select POWER_RESET_SYSCON
+ select POWER_RESET_SYSCON_POWEROFF
+ select GOLDFISH
+ select RTC_DRV_GOLDFISH if RTC_CLASS
+ select SIFIVE_PLIC
+ help
+ This enables support for QEMU Virt Machine.
config SOC_KENDRYTE
bool "Kendryte K210 SoC"
#ifndef CONFIG_MMU
#define pgprot_noncached(x) (x)
+#define pgprot_writecombine(x) (x)
+#define pgprot_device(x) (x)
#endif /* CONFIG_MMU */
/* Generic IO read/write. These perform native-endian accesses. */
*/
#define mmiowb() __asm__ __volatile__ ("fence o,w" : : : "memory");
+#include <linux/smp.h>
#include <asm-generic/mmiowb.h>
#endif /* _ASM_RISCV_MMIOWB_H */
#include <linux/ptrace.h>
#include <linux/interrupt.h>
+#ifdef CONFIG_RISCV_BASE_PMU
#define RISCV_BASE_COUNTERS 2
/*
* The RISCV_MAX_COUNTERS parameter should be specified.
*/
-#ifdef CONFIG_RISCV_BASE_PMU
#define RISCV_MAX_COUNTERS 2
-#endif
-
-#ifndef RISCV_MAX_COUNTERS
-#error "Please provide a valid RISCV_MAX_COUNTERS for the PMU."
-#endif
/*
* These are the indexes of bits in counteren register *minus* 1,
int irq;
};
+#endif
#ifdef CONFIG_PERF_EVENTS
#define perf_arch_bpf_user_pt_regs(regs) (struct user_regs_struct *)regs
#endif
#else /* CONFIG_MMU */
+#define PAGE_SHARED __pgprot(0)
#define PAGE_KERNEL __pgprot(0)
#define swapper_pg_dir NULL
#define VMALLOC_START 0
#define TASK_SIZE 0xffffffffUL
+static inline void __kernel_map_pages(struct page *page, int numpages, int enable) {}
+
#endif /* !CONFIG_MMU */
#define kern_addr_valid(addr) (1) /* FIXME */
obj-$(CONFIG_FUNCTION_TRACER) += mcount.o ftrace.o
obj-$(CONFIG_DYNAMIC_FTRACE) += mcount-dyn.o
-obj-$(CONFIG_PERF_EVENTS) += perf_event.o
+obj-$(CONFIG_RISCV_BASE_PMU) += perf_event.o
obj-$(CONFIG_PERF_EVENTS) += perf_callchain.o
obj-$(CONFIG_HAVE_PERF_REGS) += perf_regs.o
obj-$(CONFIG_RISCV_SBI) += sbi.o
return riscv_pmu->hw_events[config];
}
-int riscv_map_cache_decode(u64 config, unsigned int *type,
+static int riscv_map_cache_decode(u64 config, unsigned int *type,
unsigned int *op, unsigned int *result)
{
return -ENOENT;
static DEFINE_MUTEX(pmc_reserve_mutex);
-irqreturn_t riscv_base_pmu_handle_irq(int irq_num, void *dev)
+static irqreturn_t riscv_base_pmu_handle_irq(int irq_num, void *dev)
{
return IRQ_NONE;
}
return err;
}
-void release_pmc_hardware(void)
+static void release_pmc_hardware(void)
{
mutex_lock(&pmc_reserve_mutex);
if (riscv_pmu->irq >= 0)
{ /* sentinel value */ }
};
-int __init init_hw_perf_events(void)
+static int __init init_hw_perf_events(void)
{
struct device_node *node = of_find_node_by_type(NULL, "pmu");
const struct of_device_id *of_id;
#include <asm/switch_to.h>
#include <asm/thread_info.h>
-unsigned long gp_in_global __asm__("gp");
+register unsigned long gp_in_global __asm__("gp");
extern asmlinkage void ret_from_fork(void);
extern asmlinkage void ret_from_kernel_thread(void);
#else /* !CONFIG_FRAME_POINTER */
-static void notrace walk_stackframe(struct task_struct *task,
+void notrace walk_stackframe(struct task_struct *task,
struct pt_regs *regs, bool (*fn)(unsigned long, void *), void *arg)
{
unsigned long sp, pc;
memset((void *)empty_zero_page, 0, PAGE_SIZE);
}
-#ifdef CONFIG_DEBUG_VM
+#if defined(CONFIG_MMU) && defined(CONFIG_DEBUG_VM)
static inline void print_mlk(char *name, unsigned long b, unsigned long t)
{
pr_notice("%12s : 0x%08lx - 0x%08lx (%4ld kB)\n", name, b, t,
#include "sha.h"
-static int sha1_init(struct shash_desc *desc)
+static int s390_sha1_init(struct shash_desc *desc)
{
struct s390_sha_ctx *sctx = shash_desc_ctx(desc);
return 0;
}
-static int sha1_export(struct shash_desc *desc, void *out)
+static int s390_sha1_export(struct shash_desc *desc, void *out)
{
struct s390_sha_ctx *sctx = shash_desc_ctx(desc);
struct sha1_state *octx = out;
return 0;
}
-static int sha1_import(struct shash_desc *desc, const void *in)
+static int s390_sha1_import(struct shash_desc *desc, const void *in)
{
struct s390_sha_ctx *sctx = shash_desc_ctx(desc);
const struct sha1_state *ictx = in;
static struct shash_alg alg = {
.digestsize = SHA1_DIGEST_SIZE,
- .init = sha1_init,
+ .init = s390_sha1_init,
.update = s390_sha_update,
.final = s390_sha_final,
- .export = sha1_export,
- .import = sha1_import,
+ .export = s390_sha1_export,
+ .import = s390_sha1_import,
.descsize = sizeof(struct s390_sha_ctx),
.statesize = sizeof(struct sha1_state),
.base = {
#include <linux/slab.h>
#include <asm/pci_insn.h>
+/* I/O size constraints */
+#define ZPCI_MAX_READ_SIZE 8
+#define ZPCI_MAX_WRITE_SIZE 128
+
/* I/O Map */
#define ZPCI_IOMAP_SHIFT 48
#define ZPCI_IOMAP_ADDR_BASE 0x8000000000000000UL
while (n > 0) {
size = zpci_get_max_write_size((u64 __force) src,
- (u64) dst, n, 8);
+ (u64) dst, n,
+ ZPCI_MAX_READ_SIZE);
rc = zpci_read_single(dst, src, size);
if (rc)
break;
while (n > 0) {
size = zpci_get_max_write_size((u64 __force) dst,
- (u64) src, n, 128);
+ (u64) src, n,
+ ZPCI_MAX_WRITE_SIZE);
if (size > 8) /* main path */
rc = zpci_write_block(dst, src, size);
else
buf.mem += crashk_res.start;
buf.memsz = buf.bufsz;
- data->parm->initrd_start = buf.mem;
+ data->parm->initrd_start = data->memsz;
data->parm->initrd_size = buf.memsz;
data->memsz += buf.memsz;
break;
case R_390_64: /* Direct 64 bit. */
case R_390_GLOB_DAT:
+ case R_390_JMP_SLOT:
*(u64 *)loc = val;
break;
case R_390_PC16: /* PC relative 16 bit. */
rste &= ~_SEGMENT_ENTRY_NOEXEC;
/* Set correct table type for 2G hugepages */
- if ((pte_val(*ptep) & _REGION_ENTRY_TYPE_MASK) == _REGION_ENTRY_TYPE_R3)
- rste |= _REGION_ENTRY_TYPE_R3 | _REGION3_ENTRY_LARGE;
- else
+ if ((pte_val(*ptep) & _REGION_ENTRY_TYPE_MASK) == _REGION_ENTRY_TYPE_R3) {
+ if (likely(pte_present(pte)))
+ rste |= _REGION3_ENTRY_LARGE;
+ rste |= _REGION_ENTRY_TYPE_R3;
+ } else if (likely(pte_present(pte)))
rste |= _SEGMENT_ENTRY_LARGE;
+
clear_huge_pte_skeys(mm, rste);
pte_val(*ptep) = rste;
}
#include <linux/mm.h>
#include <linux/errno.h>
#include <linux/pci.h>
+#include <asm/pci_io.h>
+#include <asm/pci_debug.h>
+
+static inline void zpci_err_mmio(u8 cc, u8 status, u64 offset)
+{
+ struct {
+ u64 offset;
+ u8 cc;
+ u8 status;
+ } data = {offset, cc, status};
+
+ zpci_err_hex(&data, sizeof(data));
+}
+
+static inline int __pcistb_mio_inuser(
+ void __iomem *ioaddr, const void __user *src,
+ u64 len, u8 *status)
+{
+ int cc = -ENXIO;
+
+ asm volatile (
+ " sacf 256\n"
+ "0: .insn rsy,0xeb00000000d4,%[len],%[ioaddr],%[src]\n"
+ "1: ipm %[cc]\n"
+ " srl %[cc],28\n"
+ "2: sacf 768\n"
+ EX_TABLE(0b, 2b) EX_TABLE(1b, 2b)
+ : [cc] "+d" (cc), [len] "+d" (len)
+ : [ioaddr] "a" (ioaddr), [src] "Q" (*((u8 __force *)src))
+ : "cc", "memory");
+ *status = len >> 24 & 0xff;
+ return cc;
+}
+
+static inline int __pcistg_mio_inuser(
+ void __iomem *ioaddr, const void __user *src,
+ u64 ulen, u8 *status)
+{
+ register u64 addr asm("2") = (u64 __force) ioaddr;
+ register u64 len asm("3") = ulen;
+ int cc = -ENXIO;
+ u64 val = 0;
+ u64 cnt = ulen;
+ u8 tmp;
+
+ /*
+ * copy 0 < @len <= 8 bytes from @src into the right most bytes of
+ * a register, then store it to PCI at @ioaddr while in secondary
+ * address space. pcistg then uses the user mappings.
+ */
+ asm volatile (
+ " sacf 256\n"
+ "0: llgc %[tmp],0(%[src])\n"
+ " sllg %[val],%[val],8\n"
+ " aghi %[src],1\n"
+ " ogr %[val],%[tmp]\n"
+ " brctg %[cnt],0b\n"
+ "1: .insn rre,0xb9d40000,%[val],%[ioaddr]\n"
+ "2: ipm %[cc]\n"
+ " srl %[cc],28\n"
+ "3: sacf 768\n"
+ EX_TABLE(0b, 3b) EX_TABLE(1b, 3b) EX_TABLE(2b, 3b)
+ :
+ [src] "+a" (src), [cnt] "+d" (cnt),
+ [val] "+d" (val), [tmp] "=d" (tmp),
+ [len] "+d" (len), [cc] "+d" (cc),
+ [ioaddr] "+a" (addr)
+ :: "cc", "memory");
+ *status = len >> 24 & 0xff;
+
+ /* did we read everything from user memory? */
+ if (!cc && cnt != 0)
+ cc = -EFAULT;
+
+ return cc;
+}
+
+static inline int __memcpy_toio_inuser(void __iomem *dst,
+ const void __user *src, size_t n)
+{
+ int size, rc = 0;
+ u8 status = 0;
+ mm_segment_t old_fs;
+
+ if (!src)
+ return -EINVAL;
+
+ old_fs = enable_sacf_uaccess();
+ while (n > 0) {
+ size = zpci_get_max_write_size((u64 __force) dst,
+ (u64 __force) src, n,
+ ZPCI_MAX_WRITE_SIZE);
+ if (size > 8) /* main path */
+ rc = __pcistb_mio_inuser(dst, src, size, &status);
+ else
+ rc = __pcistg_mio_inuser(dst, src, size, &status);
+ if (rc)
+ break;
+ src += size;
+ dst += size;
+ n -= size;
+ }
+ disable_sacf_uaccess(old_fs);
+ if (rc)
+ zpci_err_mmio(rc, status, (__force u64) dst);
+ return rc;
+}
static long get_pfn(unsigned long user_addr, unsigned long access,
unsigned long *pfn)
if (length <= 0 || PAGE_SIZE - (mmio_addr & ~PAGE_MASK) < length)
return -EINVAL;
+
+ /*
+ * Only support read access to MIO capable devices on a MIO enabled
+ * system. Otherwise we would have to check for every address if it is
+ * a special ZPCI_ADDR and we would have to do a get_pfn() which we
+ * don't need for MIO capable devices.
+ */
+ if (static_branch_likely(&have_mio)) {
+ ret = __memcpy_toio_inuser((void __iomem *) mmio_addr,
+ user_buffer,
+ length);
+ return ret;
+ }
+
if (length > 64) {
buf = kmalloc(length, GFP_KERNEL);
if (!buf)
ret = get_pfn(mmio_addr, VM_WRITE, &pfn);
if (ret)
goto out;
- io_addr = (void __iomem *)((pfn << PAGE_SHIFT) | (mmio_addr & ~PAGE_MASK));
+ io_addr = (void __iomem *)((pfn << PAGE_SHIFT) |
+ (mmio_addr & ~PAGE_MASK));
ret = -EFAULT;
if ((unsigned long) io_addr < ZPCI_IOMAP_ADDR_BASE)
return ret;
}
+static inline int __pcilg_mio_inuser(
+ void __user *dst, const void __iomem *ioaddr,
+ u64 ulen, u8 *status)
+{
+ register u64 addr asm("2") = (u64 __force) ioaddr;
+ register u64 len asm("3") = ulen;
+ u64 cnt = ulen;
+ int shift = ulen * 8;
+ int cc = -ENXIO;
+ u64 val, tmp;
+
+ /*
+ * read 0 < @len <= 8 bytes from the PCI memory mapped at @ioaddr (in
+ * user space) into a register using pcilg then store these bytes at
+ * user address @dst
+ */
+ asm volatile (
+ " sacf 256\n"
+ "0: .insn rre,0xb9d60000,%[val],%[ioaddr]\n"
+ "1: ipm %[cc]\n"
+ " srl %[cc],28\n"
+ " ltr %[cc],%[cc]\n"
+ " jne 4f\n"
+ "2: ahi %[shift],-8\n"
+ " srlg %[tmp],%[val],0(%[shift])\n"
+ "3: stc %[tmp],0(%[dst])\n"
+ " aghi %[dst],1\n"
+ " brctg %[cnt],2b\n"
+ "4: sacf 768\n"
+ EX_TABLE(0b, 4b) EX_TABLE(1b, 4b) EX_TABLE(3b, 4b)
+ :
+ [cc] "+d" (cc), [val] "=d" (val), [len] "+d" (len),
+ [dst] "+a" (dst), [cnt] "+d" (cnt), [tmp] "=d" (tmp),
+ [shift] "+d" (shift)
+ :
+ [ioaddr] "a" (addr)
+ : "cc", "memory");
+
+ /* did we write everything to the user space buffer? */
+ if (!cc && cnt != 0)
+ cc = -EFAULT;
+
+ *status = len >> 24 & 0xff;
+ return cc;
+}
+
+static inline int __memcpy_fromio_inuser(void __user *dst,
+ const void __iomem *src,
+ unsigned long n)
+{
+ int size, rc = 0;
+ u8 status;
+ mm_segment_t old_fs;
+
+ old_fs = enable_sacf_uaccess();
+ while (n > 0) {
+ size = zpci_get_max_write_size((u64 __force) src,
+ (u64 __force) dst, n,
+ ZPCI_MAX_READ_SIZE);
+ rc = __pcilg_mio_inuser(dst, src, size, &status);
+ if (rc)
+ break;
+ src += size;
+ dst += size;
+ n -= size;
+ }
+ disable_sacf_uaccess(old_fs);
+ if (rc)
+ zpci_err_mmio(rc, status, (__force u64) dst);
+ return rc;
+}
+
SYSCALL_DEFINE3(s390_pci_mmio_read, unsigned long, mmio_addr,
void __user *, user_buffer, size_t, length)
{
if (length <= 0 || PAGE_SIZE - (mmio_addr & ~PAGE_MASK) < length)
return -EINVAL;
+
+ /*
+ * Only support write access to MIO capable devices on a MIO enabled
+ * system. Otherwise we would have to check for every address if it is
+ * a special ZPCI_ADDR and we would have to do a get_pfn() which we
+ * don't need for MIO capable devices.
+ */
+ if (static_branch_likely(&have_mio)) {
+ ret = __memcpy_fromio_inuser(
+ user_buffer, (const void __iomem *)mmio_addr,
+ length);
+ return ret;
+ }
+
if (length > 64) {
buf = kmalloc(length, GFP_KERNEL);
if (!buf)
return -ENOMEM;
- } else
+ } else {
buf = local_buf;
+ }
ret = get_pfn(mmio_addr, VM_READ, &pfn);
if (ret)
select HAVE_FUNCTION_TRACER
select HAVE_FTRACE_MCOUNT_RECORD
select HAVE_DYNAMIC_FTRACE
- select HAVE_FTRACE_NMI_ENTER if DYNAMIC_FTRACE
select ARCH_WANT_IPC_PARSE_VERSION
select HAVE_FUNCTION_GRAPH_TRACER
select HAVE_ARCH_KGDB
#ifndef __ASM_SH_SOCKIOS_H
#define __ASM_SH_SOCKIOS_H
+#include <linux/time_types.h>
+
/* Socket-level I/O control calls. */
#define FIOGETOWN _IOR('f', 123, int)
#define FIOSETOWN _IOW('f', 124, int)
force_sig(SIGTRAP);
}
+#ifdef CONFIG_DYNAMIC_FTRACE
+extern void arch_ftrace_nmi_enter(void);
+extern void arch_ftrace_nmi_exit(void);
+#else
+static inline void arch_ftrace_nmi_enter(void) { }
+static inline void arch_ftrace_nmi_exit(void) { }
+#endif
+
BUILD_TRAP_HANDLER(nmi)
{
unsigned int cpu = smp_processor_id();
TRAP_HANDLER_DECL;
+ arch_ftrace_nmi_enter();
+
nmi_enter();
nmi_count(cpu)++;
}
nmi_exit();
+
+ arch_ftrace_nmi_exit();
}
#include <linux/init.h>
#include <linux/module.h>
#include <linux/mm.h>
-#include <linux/cryptohash.h>
#include <linux/types.h>
#include <crypto/md5.h>
#include <linux/init.h>
#include <linux/module.h>
#include <linux/mm.h>
-#include <linux/cryptohash.h>
#include <linux/types.h>
#include <crypto/sha.h>
#include <linux/init.h>
#include <linux/module.h>
#include <linux/mm.h>
-#include <linux/cryptohash.h>
#include <linux/types.h>
#include <crypto/sha.h>
#include <linux/init.h>
#include <linux/module.h>
#include <linux/mm.h>
-#include <linux/cryptohash.h>
#include <linux/types.h>
#include <crypto/sha.h>
while (vaddr < srmmu_nocache_end) {
pgd = pgd_offset_k(vaddr);
- p4d = p4d_offset(__nocache_fix(pgd), vaddr);
- pud = pud_offset(__nocache_fix(p4d), vaddr);
- pmd = pmd_offset(__nocache_fix(pgd), vaddr);
+ p4d = p4d_offset(pgd, vaddr);
+ pud = pud_offset(p4d, vaddr);
+ pmd = pmd_offset(__nocache_fix(pud), vaddr);
pte = pte_offset_kernel(__nocache_fix(pmd), vaddr);
pteval = ((paddr >> 4) | SRMMU_ET_PTE | SRMMU_PRIV);
#define TRANS_TAP_LEN strlen(TRANS_TAP)
#define TRANS_GRE "gre"
-#define TRANS_GRE_LEN strlen(TRANS_RAW)
+#define TRANS_GRE_LEN strlen(TRANS_GRE)
#define TRANS_L2TPV3 "l2tpv3"
#define TRANS_L2TPV3_LEN strlen(TRANS_L2TPV3)
/* SPDX-License-Identifier: GPL-2.0 */
#include <asm-generic/xor.h>
-#include <shared/timer-internal.h>
+#include <linux/time-internal.h>
/* pick an arbitrary one - measuring isn't possible with inf-cpu */
#define XOR_SELECT_TEMPLATE(x) \
#include <sysdep/ptrace_user.h>
#include <sysdep/syscalls.h>
#include <linux/time-internal.h>
+#include <asm/unistd.h>
void handle_syscall(struct uml_pt_regs *r)
{
#include <linux/module.h>
#include <linux/sched.h>
#include <linux/string.h>
-#include <linux/cryptohash.h>
#include <linux/delay.h>
#include <linux/in6.h>
#include <linux/syscalls.h>
select ARCH_HAS_KCOV if X86_64
select ARCH_HAS_MEM_ENCRYPT
select ARCH_HAS_MEMBARRIER_SYNC_CORE
+ select ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
select ARCH_HAS_PMEM_API if X86_64
select ARCH_HAS_PTE_DEVMAP if X86_64
select ARCH_HAS_PTE_SPECIAL
Specify the maximum number of NUMA Nodes available on the target
system. Increases memory reserved to accommodate various tables.
-config ARCH_HAVE_MEMORY_PRESENT
- def_bool y
- depends on X86_32 && DISCONTIGMEM
-
config ARCH_FLATMEM_ENABLE
def_bool y
depends on X86_32 && !NUMA
-config ARCH_DISCONTIGMEM_ENABLE
- def_bool n
- depends on NUMA && X86_32
- depends on BROKEN
-
config ARCH_SPARSEMEM_ENABLE
def_bool y
depends on X86_64 || NUMA || X86_32 || X86_32_NON_STANDARD
results are dummy.
config X86_INTEL_MEMORY_PROTECTION_KEYS
- prompt "Intel Memory Protection Keys"
+ prompt "Memory Protection Keys"
def_bool y
# Note: only available in 64-bit mode
- depends on CPU_SUP_INTEL && X86_64
+ depends on X86_64 && (CPU_SUP_INTEL || CPU_SUP_AMD)
select ARCH_USES_HIGH_VMA_FLAGS
select ARCH_HAS_PKEYS
---help---
boot := arch/x86/boot
-BOOT_TARGETS = bzlilo bzdisk fdimage fdimage144 fdimage288 isoimage
+BOOT_TARGETS = bzdisk fdimage fdimage144 fdimage288 isoimage
PHONY += bzImage $(BOOT_TARGETS)
$(BOOT_TARGETS): vmlinux
$(Q)$(MAKE) $(build)=$(boot) $@
-PHONY += install
-install:
+PHONY += install bzlilo
+install bzlilo:
$(Q)$(MAKE) $(build)=$(boot) $@
PHONY += vdso_install
quiet_cmd_cpustr = CPUSTR $@
cmd_cpustr = $(obj)/mkcpustr > $@
-targets += cpustr.h
$(obj)/cpustr.h: $(obj)/mkcpustr FORCE
$(call if_changed,cpustr)
endif
-clean-files += cpustr.h
+targets += cpustr.h
# ---------------------------------------------------------------------------
cmd_genimage = sh $(srctree)/$(src)/genimage.sh $2 $3 $(obj)/bzImage \
$(obj)/mtools.conf '$(image_cmdline)' $(FDINITRD)
+PHONY += bzdisk fdimage fdimage144 fdimage288 isoimage bzlilo install
+
# This requires write access to /dev/fd0
bzdisk: $(obj)/bzImage $(obj)/mtools.conf
$(call cmd,genimage,bzdisk,/dev/fd0)
$(call cmd,genimage,isoimage,$(obj)/image.iso)
@$(kecho) 'Kernel: $(obj)/image.iso is ready'
-bzlilo: $(obj)/bzImage
+bzlilo:
if [ -f $(INSTALL_PATH)/vmlinuz ]; then mv $(INSTALL_PATH)/vmlinuz $(INSTALL_PATH)/vmlinuz.old; fi
if [ -f $(INSTALL_PATH)/System.map ]; then mv $(INSTALL_PATH)/System.map $(INSTALL_PATH)/System.old; fi
cat $(obj)/bzImage > $(INSTALL_PATH)/vmlinuz
*/
#define MAX_ADDR_LEN 19
-static acpi_physical_address get_cmdline_acpi_rsdp(void)
+static unsigned long get_cmdline_acpi_rsdp(void)
{
- acpi_physical_address addr = 0;
+ unsigned long addr = 0;
#ifdef CONFIG_KEXEC
char val[MAX_ADDR_LEN] = { };
if (ret < 0)
return 0;
- if (kstrtoull(val, 16, &addr))
+ if (boot_kstrtoul(val, 16, &addr))
return 0;
#endif
return addr;
* different ideas about whether to trust a command-line parameter.
*/
rsdp = (struct acpi_table_rsdp *)get_cmdline_acpi_rsdp();
-
if (!rsdp)
rsdp = (struct acpi_table_rsdp *)(long)
boot_params->acpi_rsdp_addr;
push %rbx
leaq 1f(%rip), %rbp
- leaq efi_gdt64(%rip), %rbx
- movl %ebx, 2(%rbx) /* Fixup the gdt base address */
movl %ds, %eax
push %rax
movl %r8d, 0xc(%rsp)
movl %r9d, 0x10(%rsp)
- sgdt 0x14(%rsp)
+ leaq 0x14(%rsp), %rbx
+ sgdt (%rbx)
/*
* Switch to gdt with 32-bit segments. This is the firmware GDT
pushq %rax
lretq
-1: lgdt 0x14(%rsp)
- addq $32, %rsp
+1: addq $32, %rsp
movq %rdi, %rax
pop %rbx
SYM_DATA_START(efi32_boot_ds)
.word 0
SYM_DATA_END(efi32_boot_ds)
-
-SYM_DATA_START(efi_gdt64)
- .word efi_gdt64_end - efi_gdt64
- .long 0 /* Filled out by user */
- .word 0
- .quad 0x0000000000000000 /* NULL descriptor */
- .quad 0x00af9a000000ffff /* __KERNEL_CS */
- .quad 0x00cf92000000ffff /* __KERNEL_DS */
- .quad 0x0080890000000000 /* TS descriptor */
- .quad 0x0000000000000000 /* TS continued */
-SYM_DATA_END_LABEL(efi_gdt64, SYM_L_LOCAL, efi_gdt64_end)
* Position Independent Executable (PIE) so that linker won't optimize
* R_386_GOT32X relocation to its fixed symbol address. Older
* linkers generate R_386_32 relocations against locally defined symbols,
- * _bss, _ebss, _got and _egot, in PIE. It isn't wrong, just less
+ * _bss, _ebss, _got, _egot and _end, in PIE. It isn't wrong, just less
* optimal than R_386_RELATIVE. But the x86 kernel fails to properly handle
* R_386_32 relocations when relocating the kernel. To generate
- * R_386_RELATIVE relocations, we mark _bss, _ebss, _got and _egot as
+ * R_386_RELATIVE relocations, we mark _bss, _ebss, _got, _egot and _end as
* hidden:
*/
.hidden _bss
.hidden _ebss
.hidden _got
.hidden _egot
+ .hidden _end
__HEAD
SYM_FUNC_START(startup_32)
.hidden _ebss
.hidden _got
.hidden _egot
+ .hidden _end
__HEAD
.code32
addq %rax, 2(%rax)
lgdt (%rax)
+ /* Reload CS so IRET returns to a CS actually in the GDT */
+ pushq $__KERNEL_CS
+ leaq .Lon_kernel_cs(%rip), %rax
+ pushq %rax
+ lretq
+
+.Lon_kernel_cs:
+
/*
* paging_prepare() sets up the trampoline and checks if we need to
* enable 5-level paging.
_data = . ;
*(.data)
*(.data.*)
+ *(.bss.efistub)
_edata = . ;
}
. = ALIGN(L1_CACHE_BYTES);
#endif
. = ALIGN(PAGE_SIZE); /* keep ZO size page aligned */
_end = .;
+
+ DISCARDS
}
* @endp: A pointer to the end of the parsed string will be placed here
* @base: The number base to use
*/
-
unsigned long long simple_strtoull(const char *cp, char **endp, unsigned int base)
{
unsigned long long result = 0;
s++;
return _kstrtoull(s, base, res);
}
+
+static int _kstrtoul(const char *s, unsigned int base, unsigned long *res)
+{
+ unsigned long long tmp;
+ int rv;
+
+ rv = kstrtoull(s, base, &tmp);
+ if (rv < 0)
+ return rv;
+ if (tmp != (unsigned long)tmp)
+ return -ERANGE;
+ *res = tmp;
+ return 0;
+}
+
+/**
+ * kstrtoul - convert a string to an unsigned long
+ * @s: The start of the string. The string must be null-terminated, and may also
+ * include a single newline before its terminating null. The first character
+ * may also be a plus sign, but not a minus sign.
+ * @base: The number base to use. The maximum supported base is 16. If base is
+ * given as 0, then the base of the string is automatically detected with the
+ * conventional semantics - If it begins with 0x the number will be parsed as a
+ * hexadecimal (case insensitive), if it otherwise begins with 0, it will be
+ * parsed as an octal number. Otherwise it will be parsed as a decimal.
+ * @res: Where to write the result of the conversion on success.
+ *
+ * Returns 0 on success, -ERANGE on overflow and -EINVAL on parsing error.
+ * Used as a replacement for the simple_strtoull.
+ */
+int boot_kstrtoul(const char *s, unsigned int base, unsigned long *res)
+{
+ /*
+ * We want to shortcut function call, but
+ * __builtin_types_compatible_p(unsigned long, unsigned long long) = 0.
+ */
+ if (sizeof(unsigned long) == sizeof(unsigned long long) &&
+ __alignof__(unsigned long) == __alignof__(unsigned long long))
+ return kstrtoull(s, base, (unsigned long long *)res);
+ else
+ return _kstrtoul(s, base, res);
+}
unsigned int base);
int kstrtoull(const char *s, unsigned int base, unsigned long long *res);
+int boot_kstrtoul(const char *s, unsigned int base, unsigned long *res);
#endif /* BOOT_STRING_H */
#define PECOFF_COMPAT_RESERVE 0x0
#endif
-unsigned long efi32_stub_entry;
-unsigned long efi64_stub_entry;
-unsigned long efi_pe_entry;
-unsigned long efi32_pe_entry;
-unsigned long kernel_info;
-unsigned long startup_64;
-unsigned long _ehead;
-unsigned long _end;
+static unsigned long efi32_stub_entry;
+static unsigned long efi64_stub_entry;
+static unsigned long efi_pe_entry;
+static unsigned long efi32_pe_entry;
+static unsigned long kernel_info;
+static unsigned long startup_64;
+static unsigned long _ehead;
+static unsigned long _end;
/*----------------------------------------------------------------------*/
pxor INC, STATE4
movdqu IV, 0x30(OUTP)
- CALL_NOSPEC %r11
+ CALL_NOSPEC r11
movdqu 0x00(OUTP), INC
pxor INC, STATE1
_aesni_gf128mul_x_ble()
movups IV, (IVP)
- CALL_NOSPEC %r11
+ CALL_NOSPEC r11
movdqu 0x40(OUTP), INC
pxor INC, STATE1
vpxor 14 * 16(%rax), %xmm15, %xmm14;
vpxor 15 * 16(%rax), %xmm15, %xmm15;
- CALL_NOSPEC %r9;
+ CALL_NOSPEC r9;
addq $(16 * 16), %rsp;
vpxor 14 * 32(%rax), %ymm15, %ymm14;
vpxor 15 * 32(%rax), %ymm15, %ymm15;
- CALL_NOSPEC %r9;
+ CALL_NOSPEC r9;
addq $(16 * 32), %rsp;
.text
SYM_FUNC_START(crc_pcl)
-#define bufp %rdi
+#define bufp rdi
#define bufp_dw %edi
#define bufp_w %di
#define bufp_b %dil
## 1) ALIGN:
################################################################
- mov bufp, bufptmp # rdi = *buf
- neg bufp
- and $7, bufp # calculate the unalignment amount of
+ mov %bufp, bufptmp # rdi = *buf
+ neg %bufp
+ and $7, %bufp # calculate the unalignment amount of
# the address
je proc_block # Skip if aligned
do_align:
#### Calculate CRC of unaligned bytes of the buffer (if any)
movq (bufptmp), tmp # load a quadward from the buffer
- add bufp, bufptmp # align buffer pointer for quadword
+ add %bufp, bufptmp # align buffer pointer for quadword
# processing
- sub bufp, len # update buffer length
+ sub %bufp, len # update buffer length
align_loop:
crc32b %bl, crc_init_dw # compute crc32 of 1-byte
shr $8, tmp # get next byte
- dec bufp
+ dec %bufp
jne align_loop
proc_block:
xor crc2, crc2
## branch into array
- lea jump_table(%rip), bufp
- movzxw (bufp, %rax, 2), len
- lea crc_array(%rip), bufp
- lea (bufp, len, 1), bufp
+ lea jump_table(%rip), %bufp
+ movzxw (%bufp, %rax, 2), len
+ lea crc_array(%rip), %bufp
+ lea (%bufp, len, 1), %bufp
JMP_NOSPEC bufp
################################################################
## 4) Combine three results:
################################################################
- lea (K_table-8)(%rip), bufp # first entry is for idx 1
+ lea (K_table-8)(%rip), %bufp # first entry is for idx 1
shlq $3, %rax # rax *= 8
- pmovzxdq (bufp,%rax), %xmm0 # 2 consts: K1:K2
+ pmovzxdq (%bufp,%rax), %xmm0 # 2 consts: K1:K2
leal (%eax,%eax,2), %eax # rax *= 3 (total *24)
subq %rax, tmp # tmp -= rax*24
#include <linux/init.h>
#include <linux/module.h>
#include <linux/mm.h>
-#include <linux/cryptohash.h>
#include <linux/types.h>
#include <crypto/sha.h>
#include <crypto/sha1_base.h>
#include <linux/init.h>
#include <linux/module.h>
#include <linux/mm.h>
-#include <linux/cryptohash.h>
#include <linux/types.h>
#include <crypto/sha.h>
#include <crypto/sha256_base.h>
#include <linux/init.h>
#include <linux/module.h>
#include <linux/mm.h>
-#include <linux/cryptohash.h>
#include <linux/string.h>
#include <linux/types.h>
#include <crypto/sha.h>
/* kernel thread */
1: movl %edi, %eax
- CALL_NOSPEC %ebx
+ CALL_NOSPEC ebx
/*
* A kernel thread is allowed to return here after successfully
* calling do_execve(). Exit to userspace to complete the execve()
TRACE_IRQS_OFF
movl %esp, %eax # pt_regs pointer
- CALL_NOSPEC %edi
+ CALL_NOSPEC edi
jmp ret_from_exception
SYM_CODE_END(common_exception_read_cr2)
TRACE_IRQS_OFF
movl %esp, %eax # pt_regs pointer
- CALL_NOSPEC %edi
+ CALL_NOSPEC edi
jmp ret_from_exception
SYM_CODE_END(common_exception)
/* kernel thread */
UNWIND_HINT_EMPTY
movq %r12, %rdi
- CALL_NOSPEC %rbx
+ CALL_NOSPEC rbx
/*
* A kernel thread is allowed to return here after successfully
* calling do_execve(). Exit to userspace to complete the execve()
available on NehalemEX and more modern processors.
config PERF_EVENTS_INTEL_RAPL
- tristate "Intel rapl performance events"
- depends on PERF_EVENTS && CPU_SUP_INTEL && PCI
+ tristate "Intel/AMD rapl performance events"
+ depends on PERF_EVENTS && (CPU_SUP_INTEL || CPU_SUP_AMD) && PCI
default y
---help---
- Include support for Intel rapl performance events for power
+ Include support for Intel and AMD rapl performance events for power
monitoring on modern processors.
config PERF_EVENTS_INTEL_CSTATE
# SPDX-License-Identifier: GPL-2.0-only
obj-y += core.o probe.o
+obj-$(PERF_EVENTS_INTEL_RAPL) += rapl.o
obj-y += amd/
obj-$(CONFIG_X86_LOCAL_APIC) += msr.o
obj-$(CONFIG_CPU_SUP_INTEL) += intel/
+obj-$(CONFIG_CPU_SUP_CENTAUR) += zhaoxin/
+obj-$(CONFIG_CPU_SUP_ZHAOXIN) += zhaoxin/
err = amd_pmu_init();
x86_pmu.name = "HYGON";
break;
+ case X86_VENDOR_ZHAOXIN:
+ case X86_VENDOR_CENTAUR:
+ err = zhaoxin_pmu_init();
+ break;
default:
err = -ENOTSUPP;
}
obj-$(CONFIG_CPU_SUP_INTEL) += core.o bts.o
obj-$(CONFIG_CPU_SUP_INTEL) += ds.o knc.o
obj-$(CONFIG_CPU_SUP_INTEL) += lbr.o p4.o p6.o pt.o
-obj-$(CONFIG_PERF_EVENTS_INTEL_RAPL) += intel-rapl-perf.o
-intel-rapl-perf-objs := rapl.o
obj-$(CONFIG_PERF_EVENTS_INTEL_UNCORE) += intel-uncore.o
intel-uncore-objs := uncore.o uncore_nhmex.o uncore_snb.o uncore_snbep.o
obj-$(CONFIG_PERF_EVENTS_INTEL_CSTATE) += intel-cstate.o
local_t head;
unsigned long end;
void **data_pages;
- struct bts_phys buf[0];
+ struct bts_phys buf[];
};
static struct pmu bts_pmu;
static struct extra_reg intel_tnt_extra_regs[] __read_mostly = {
/* must define OFFCORE_RSP_X first, see intel_fixup_er() */
- INTEL_UEVENT_EXTRA_REG(0x01b7, MSR_OFFCORE_RSP_0, 0xffffff9fffull, RSP_0),
- INTEL_UEVENT_EXTRA_REG(0x02b7, MSR_OFFCORE_RSP_1, 0xffffff9fffull, RSP_1),
+ INTEL_UEVENT_EXTRA_REG(0x01b7, MSR_OFFCORE_RSP_0, 0x800ff0ffffff9fffull, RSP_0),
+ INTEL_UEVENT_EXTRA_REG(0x02b7, MSR_OFFCORE_RSP_1, 0xff0ffffff9fffull, RSP_1),
EVENT_EXTRA_END
};
pt_pmu.vmx = true;
}
- attrs = NULL;
-
for (i = 0; i < PT_CPUID_LEAVES; i++) {
cpuid_count(20, i,
&pt_pmu.caps[CPUID_EAX + i*PT_CPUID_REGS_NUM],
struct list_head list;
struct list_head active_list;
void __iomem *io_addr;
- struct intel_uncore_extra_reg shared_regs[0];
+ struct intel_uncore_extra_reg shared_regs[];
};
/* CFL uncore 8th cbox MSRs */
/* PMI handler bits */
unsigned int late_ack :1,
+ enabled_ack :1,
counter_freezing :1;
/*
* sysfs attrs
return 0;
}
#endif /* CONFIG_CPU_SUP_INTEL */
+
+#if ((defined CONFIG_CPU_SUP_CENTAUR) || (defined CONFIG_CPU_SUP_ZHAOXIN))
+int zhaoxin_pmu_init(void);
+#else
+static inline int zhaoxin_pmu_init(void)
+{
+ return 0;
+}
+#endif /*CONFIG_CPU_SUP_CENTAUR or CONFIG_CPU_SUP_ZHAOXIN*/
return 0;
}
+/*
+ * Accepts msr[] array with non populated entries as long as either
+ * msr[i].msr is 0 or msr[i].grp is NULL. Note that the default sysfs
+ * visibility is visible when group->is_visible callback is set.
+ */
unsigned long
perf_msr_probe(struct perf_msr *msr, int cnt, bool zero, void *data)
{
if (!msr[bit].no_check) {
struct attribute_group *grp = msr[bit].grp;
+ /* skip entry with no group */
+ if (!grp)
+ continue;
+
grp->is_visible = not_visible;
+ /* skip unpopulated entry */
+ if (!msr[bit].msr)
+ continue;
+
if (msr[bit].test && !msr[bit].test(bit, data))
continue;
/* Virt sucks; you cannot tell if a R/O MSR is present :/ */
// SPDX-License-Identifier: GPL-2.0-only
/*
- * Support Intel RAPL energy consumption counters
+ * Support Intel/AMD RAPL energy consumption counters
* Copyright (C) 2013 Google, Inc., Stephane Eranian
*
* Intel RAPL interface is specified in the IA-32 Manual Vol3b
* section 14.7.1 (September 2013)
*
+ * AMD RAPL interface for Fam17h is described in the public PPR:
+ * https://bugzilla.kernel.org/show_bug.cgi?id=206537
+ *
* RAPL provides more controls than just reporting energy consumption
* however here we only expose the 3 energy consumption free running
* counters (pp0, pkg, dram).
#include <linux/nospec.h>
#include <asm/cpu_device_id.h>
#include <asm/intel-family.h>
-#include "../perf_event.h"
-#include "../probe.h"
+#include "perf_event.h"
+#include "probe.h"
MODULE_LICENSE("GPL");
};
struct rapl_model {
+ struct perf_msr *rapl_msrs;
unsigned long events;
+ unsigned int msr_power_unit;
bool apply_quirk;
};
static cpumask_t rapl_cpu_mask;
static unsigned int rapl_cntr_mask;
static u64 rapl_timer_ms;
-static struct perf_msr rapl_msrs[];
+static struct perf_msr *rapl_msrs;
static inline struct rapl_pmu *cpu_to_rapl_pmu(unsigned int cpu)
{
NULL,
};
+static umode_t
+rapl_not_visible(struct kobject *kobj, struct attribute *attr, int i)
+{
+ return 0;
+}
+
static struct attribute_group rapl_events_cores_group = {
.name = "events",
.attrs = rapl_events_cores,
+ .is_visible = rapl_not_visible,
};
static struct attribute *rapl_events_pkg[] = {
static struct attribute_group rapl_events_pkg_group = {
.name = "events",
.attrs = rapl_events_pkg,
+ .is_visible = rapl_not_visible,
};
static struct attribute *rapl_events_ram[] = {
static struct attribute_group rapl_events_ram_group = {
.name = "events",
.attrs = rapl_events_ram,
+ .is_visible = rapl_not_visible,
};
static struct attribute *rapl_events_gpu[] = {
static struct attribute_group rapl_events_gpu_group = {
.name = "events",
.attrs = rapl_events_gpu,
+ .is_visible = rapl_not_visible,
};
static struct attribute *rapl_events_psys[] = {
static struct attribute_group rapl_events_psys_group = {
.name = "events",
.attrs = rapl_events_psys,
+ .is_visible = rapl_not_visible,
};
static bool test_msr(int idx, void *data)
return test_bit(idx, (unsigned long *) data);
}
-static struct perf_msr rapl_msrs[] = {
+static struct perf_msr intel_rapl_msrs[] = {
[PERF_RAPL_PP0] = { MSR_PP0_ENERGY_STATUS, &rapl_events_cores_group, test_msr },
[PERF_RAPL_PKG] = { MSR_PKG_ENERGY_STATUS, &rapl_events_pkg_group, test_msr },
[PERF_RAPL_RAM] = { MSR_DRAM_ENERGY_STATUS, &rapl_events_ram_group, test_msr },
[PERF_RAPL_PSYS] = { MSR_PLATFORM_ENERGY_STATUS, &rapl_events_psys_group, test_msr },
};
+/*
+ * Force to PERF_RAPL_MAX size due to:
+ * - perf_msr_probe(PERF_RAPL_MAX)
+ * - want to use same event codes across both architectures
+ */
+static struct perf_msr amd_rapl_msrs[PERF_RAPL_MAX] = {
+ [PERF_RAPL_PKG] = { MSR_AMD_PKG_ENERGY_STATUS, &rapl_events_pkg_group, test_msr },
+};
+
+
static int rapl_cpu_offline(unsigned int cpu)
{
struct rapl_pmu *pmu = cpu_to_rapl_pmu(cpu);
return 0;
}
-static int rapl_check_hw_unit(bool apply_quirk)
+static int rapl_check_hw_unit(struct rapl_model *rm)
{
u64 msr_rapl_power_unit_bits;
int i;
/* protect rdmsrl() to handle virtualization */
- if (rdmsrl_safe(MSR_RAPL_POWER_UNIT, &msr_rapl_power_unit_bits))
+ if (rdmsrl_safe(rm->msr_power_unit, &msr_rapl_power_unit_bits))
return -1;
for (i = 0; i < NR_RAPL_DOMAINS; i++)
rapl_hw_unit[i] = (msr_rapl_power_unit_bits >> 8) & 0x1FULL;
* "Intel Xeon Processor E5-1600 and E5-2600 v3 Product Families, V2
* of 2. Datasheet, September 2014, Reference Number: 330784-001 "
*/
- if (apply_quirk)
+ if (rm->apply_quirk)
rapl_hw_unit[PERF_RAPL_RAM] = 16;
/*
BIT(PERF_RAPL_PKG) |
BIT(PERF_RAPL_PP1),
.apply_quirk = false,
+ .msr_power_unit = MSR_RAPL_POWER_UNIT,
+ .rapl_msrs = intel_rapl_msrs,
};
static struct rapl_model model_snbep = {
BIT(PERF_RAPL_PKG) |
BIT(PERF_RAPL_RAM),
.apply_quirk = false,
+ .msr_power_unit = MSR_RAPL_POWER_UNIT,
+ .rapl_msrs = intel_rapl_msrs,
};
static struct rapl_model model_hsw = {
BIT(PERF_RAPL_RAM) |
BIT(PERF_RAPL_PP1),
.apply_quirk = false,
+ .msr_power_unit = MSR_RAPL_POWER_UNIT,
+ .rapl_msrs = intel_rapl_msrs,
};
static struct rapl_model model_hsx = {
BIT(PERF_RAPL_PKG) |
BIT(PERF_RAPL_RAM),
.apply_quirk = true,
+ .msr_power_unit = MSR_RAPL_POWER_UNIT,
+ .rapl_msrs = intel_rapl_msrs,
};
static struct rapl_model model_knl = {
.events = BIT(PERF_RAPL_PKG) |
BIT(PERF_RAPL_RAM),
.apply_quirk = true,
+ .msr_power_unit = MSR_RAPL_POWER_UNIT,
+ .rapl_msrs = intel_rapl_msrs,
};
static struct rapl_model model_skl = {
BIT(PERF_RAPL_PP1) |
BIT(PERF_RAPL_PSYS),
.apply_quirk = false,
+ .msr_power_unit = MSR_RAPL_POWER_UNIT,
+ .rapl_msrs = intel_rapl_msrs,
+};
+
+static struct rapl_model model_amd_fam17h = {
+ .events = BIT(PERF_RAPL_PKG),
+ .apply_quirk = false,
+ .msr_power_unit = MSR_AMD_RAPL_POWER_UNIT,
+ .rapl_msrs = amd_rapl_msrs,
};
static const struct x86_cpu_id rapl_model_match[] __initconst = {
X86_MATCH_INTEL_FAM6_MODEL(ATOM_GOLDMONT_PLUS, &model_hsw),
X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_L, &model_skl),
X86_MATCH_INTEL_FAM6_MODEL(ICELAKE, &model_skl),
+ X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_D, &model_hsx),
+ X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_X, &model_hsx),
X86_MATCH_INTEL_FAM6_MODEL(COMETLAKE_L, &model_skl),
X86_MATCH_INTEL_FAM6_MODEL(COMETLAKE, &model_skl),
+ X86_MATCH_VENDOR_FAM(AMD, 0x17, &model_amd_fam17h),
{},
};
MODULE_DEVICE_TABLE(x86cpu, rapl_model_match);
return -ENODEV;
rm = (struct rapl_model *) id->driver_data;
+
+ rapl_msrs = rm->rapl_msrs;
+
rapl_cntr_mask = perf_msr_probe(rapl_msrs, PERF_RAPL_MAX,
false, (void *) &rm->events);
- ret = rapl_check_hw_unit(rm->apply_quirk);
+ ret = rapl_check_hw_unit(rm);
if (ret)
return ret;
--- /dev/null
+# SPDX-License-Identifier: GPL-2.0
+obj-y += core.o
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Zhoaxin PMU; like Intel Architectural PerfMon-v2
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/stddef.h>
+#include <linux/types.h>
+#include <linux/init.h>
+#include <linux/slab.h>
+#include <linux/export.h>
+#include <linux/nmi.h>
+
+#include <asm/cpufeature.h>
+#include <asm/hardirq.h>
+#include <asm/apic.h>
+
+#include "../perf_event.h"
+
+/*
+ * Zhaoxin PerfMon, used on zxc and later.
+ */
+static u64 zx_pmon_event_map[PERF_COUNT_HW_MAX] __read_mostly = {
+
+ [PERF_COUNT_HW_CPU_CYCLES] = 0x0082,
+ [PERF_COUNT_HW_INSTRUCTIONS] = 0x00c0,
+ [PERF_COUNT_HW_CACHE_REFERENCES] = 0x0515,
+ [PERF_COUNT_HW_CACHE_MISSES] = 0x051a,
+ [PERF_COUNT_HW_BUS_CYCLES] = 0x0083,
+};
+
+static struct event_constraint zxc_event_constraints[] __read_mostly = {
+
+ FIXED_EVENT_CONSTRAINT(0x0082, 1), /* unhalted core clock cycles */
+ EVENT_CONSTRAINT_END
+};
+
+static struct event_constraint zxd_event_constraints[] __read_mostly = {
+
+ FIXED_EVENT_CONSTRAINT(0x00c0, 0), /* retired instructions */
+ FIXED_EVENT_CONSTRAINT(0x0082, 1), /* unhalted core clock cycles */
+ FIXED_EVENT_CONSTRAINT(0x0083, 2), /* unhalted bus clock cycles */
+ EVENT_CONSTRAINT_END
+};
+
+static __initconst const u64 zxd_hw_cache_event_ids
+ [PERF_COUNT_HW_CACHE_MAX]
+ [PERF_COUNT_HW_CACHE_OP_MAX]
+ [PERF_COUNT_HW_CACHE_RESULT_MAX] = {
+[C(L1D)] = {
+ [C(OP_READ)] = {
+ [C(RESULT_ACCESS)] = 0x0042,
+ [C(RESULT_MISS)] = 0x0538,
+ },
+ [C(OP_WRITE)] = {
+ [C(RESULT_ACCESS)] = 0x0043,
+ [C(RESULT_MISS)] = 0x0562,
+ },
+ [C(OP_PREFETCH)] = {
+ [C(RESULT_ACCESS)] = -1,
+ [C(RESULT_MISS)] = -1,
+ },
+},
+[C(L1I)] = {
+ [C(OP_READ)] = {
+ [C(RESULT_ACCESS)] = 0x0300,
+ [C(RESULT_MISS)] = 0x0301,
+ },
+ [C(OP_WRITE)] = {
+ [C(RESULT_ACCESS)] = -1,
+ [C(RESULT_MISS)] = -1,
+ },
+ [C(OP_PREFETCH)] = {
+ [C(RESULT_ACCESS)] = 0x030a,
+ [C(RESULT_MISS)] = 0x030b,
+ },
+},
+[C(LL)] = {
+ [C(OP_READ)] = {
+ [C(RESULT_ACCESS)] = -1,
+ [C(RESULT_MISS)] = -1,
+ },
+ [C(OP_WRITE)] = {
+ [C(RESULT_ACCESS)] = -1,
+ [C(RESULT_MISS)] = -1,
+ },
+ [C(OP_PREFETCH)] = {
+ [C(RESULT_ACCESS)] = -1,
+ [C(RESULT_MISS)] = -1,
+ },
+},
+[C(DTLB)] = {
+ [C(OP_READ)] = {
+ [C(RESULT_ACCESS)] = 0x0042,
+ [C(RESULT_MISS)] = 0x052c,
+ },
+ [C(OP_WRITE)] = {
+ [C(RESULT_ACCESS)] = 0x0043,
+ [C(RESULT_MISS)] = 0x0530,
+ },
+ [C(OP_PREFETCH)] = {
+ [C(RESULT_ACCESS)] = 0x0564,
+ [C(RESULT_MISS)] = 0x0565,
+ },
+},
+[C(ITLB)] = {
+ [C(OP_READ)] = {
+ [C(RESULT_ACCESS)] = 0x00c0,
+ [C(RESULT_MISS)] = 0x0534,
+ },
+ [C(OP_WRITE)] = {
+ [C(RESULT_ACCESS)] = -1,
+ [C(RESULT_MISS)] = -1,
+ },
+ [C(OP_PREFETCH)] = {
+ [C(RESULT_ACCESS)] = -1,
+ [C(RESULT_MISS)] = -1,
+ },
+},
+[C(BPU)] = {
+ [C(OP_READ)] = {
+ [C(RESULT_ACCESS)] = 0x0700,
+ [C(RESULT_MISS)] = 0x0709,
+ },
+ [C(OP_WRITE)] = {
+ [C(RESULT_ACCESS)] = -1,
+ [C(RESULT_MISS)] = -1,
+ },
+ [C(OP_PREFETCH)] = {
+ [C(RESULT_ACCESS)] = -1,
+ [C(RESULT_MISS)] = -1,
+ },
+},
+[C(NODE)] = {
+ [C(OP_READ)] = {
+ [C(RESULT_ACCESS)] = -1,
+ [C(RESULT_MISS)] = -1,
+ },
+ [C(OP_WRITE)] = {
+ [C(RESULT_ACCESS)] = -1,
+ [C(RESULT_MISS)] = -1,
+ },
+ [C(OP_PREFETCH)] = {
+ [C(RESULT_ACCESS)] = -1,
+ [C(RESULT_MISS)] = -1,
+ },
+},
+};
+
+static __initconst const u64 zxe_hw_cache_event_ids
+ [PERF_COUNT_HW_CACHE_MAX]
+ [PERF_COUNT_HW_CACHE_OP_MAX]
+ [PERF_COUNT_HW_CACHE_RESULT_MAX] = {
+[C(L1D)] = {
+ [C(OP_READ)] = {
+ [C(RESULT_ACCESS)] = 0x0568,
+ [C(RESULT_MISS)] = 0x054b,
+ },
+ [C(OP_WRITE)] = {
+ [C(RESULT_ACCESS)] = 0x0669,
+ [C(RESULT_MISS)] = 0x0562,
+ },
+ [C(OP_PREFETCH)] = {
+ [C(RESULT_ACCESS)] = -1,
+ [C(RESULT_MISS)] = -1,
+ },
+},
+[C(L1I)] = {
+ [C(OP_READ)] = {
+ [C(RESULT_ACCESS)] = 0x0300,
+ [C(RESULT_MISS)] = 0x0301,
+ },
+ [C(OP_WRITE)] = {
+ [C(RESULT_ACCESS)] = -1,
+ [C(RESULT_MISS)] = -1,
+ },
+ [C(OP_PREFETCH)] = {
+ [C(RESULT_ACCESS)] = 0x030a,
+ [C(RESULT_MISS)] = 0x030b,
+ },
+},
+[C(LL)] = {
+ [C(OP_READ)] = {
+ [C(RESULT_ACCESS)] = 0x0,
+ [C(RESULT_MISS)] = 0x0,
+ },
+ [C(OP_WRITE)] = {
+ [C(RESULT_ACCESS)] = 0x0,
+ [C(RESULT_MISS)] = 0x0,
+ },
+ [C(OP_PREFETCH)] = {
+ [C(RESULT_ACCESS)] = 0x0,
+ [C(RESULT_MISS)] = 0x0,
+ },
+},
+[C(DTLB)] = {
+ [C(OP_READ)] = {
+ [C(RESULT_ACCESS)] = 0x0568,
+ [C(RESULT_MISS)] = 0x052c,
+ },
+ [C(OP_WRITE)] = {
+ [C(RESULT_ACCESS)] = 0x0669,
+ [C(RESULT_MISS)] = 0x0530,
+ },
+ [C(OP_PREFETCH)] = {
+ [C(RESULT_ACCESS)] = 0x0564,
+ [C(RESULT_MISS)] = 0x0565,
+ },
+},
+[C(ITLB)] = {
+ [C(OP_READ)] = {
+ [C(RESULT_ACCESS)] = 0x00c0,
+ [C(RESULT_MISS)] = 0x0534,
+ },
+ [C(OP_WRITE)] = {
+ [C(RESULT_ACCESS)] = -1,
+ [C(RESULT_MISS)] = -1,
+ },
+ [C(OP_PREFETCH)] = {
+ [C(RESULT_ACCESS)] = -1,
+ [C(RESULT_MISS)] = -1,
+ },
+},
+[C(BPU)] = {
+ [C(OP_READ)] = {
+ [C(RESULT_ACCESS)] = 0x0028,
+ [C(RESULT_MISS)] = 0x0029,
+ },
+ [C(OP_WRITE)] = {
+ [C(RESULT_ACCESS)] = -1,
+ [C(RESULT_MISS)] = -1,
+ },
+ [C(OP_PREFETCH)] = {
+ [C(RESULT_ACCESS)] = -1,
+ [C(RESULT_MISS)] = -1,
+ },
+},
+[C(NODE)] = {
+ [C(OP_READ)] = {
+ [C(RESULT_ACCESS)] = -1,
+ [C(RESULT_MISS)] = -1,
+ },
+ [C(OP_WRITE)] = {
+ [C(RESULT_ACCESS)] = -1,
+ [C(RESULT_MISS)] = -1,
+ },
+ [C(OP_PREFETCH)] = {
+ [C(RESULT_ACCESS)] = -1,
+ [C(RESULT_MISS)] = -1,
+ },
+},
+};
+
+static void zhaoxin_pmu_disable_all(void)
+{
+ wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, 0);
+}
+
+static void zhaoxin_pmu_enable_all(int added)
+{
+ wrmsrl(MSR_CORE_PERF_GLOBAL_CTRL, x86_pmu.intel_ctrl);
+}
+
+static inline u64 zhaoxin_pmu_get_status(void)
+{
+ u64 status;
+
+ rdmsrl(MSR_CORE_PERF_GLOBAL_STATUS, status);
+
+ return status;
+}
+
+static inline void zhaoxin_pmu_ack_status(u64 ack)
+{
+ wrmsrl(MSR_CORE_PERF_GLOBAL_OVF_CTRL, ack);
+}
+
+static inline void zxc_pmu_ack_status(u64 ack)
+{
+ /*
+ * ZXC needs global control enabled in order to clear status bits.
+ */
+ zhaoxin_pmu_enable_all(0);
+ zhaoxin_pmu_ack_status(ack);
+ zhaoxin_pmu_disable_all();
+}
+
+static void zhaoxin_pmu_disable_fixed(struct hw_perf_event *hwc)
+{
+ int idx = hwc->idx - INTEL_PMC_IDX_FIXED;
+ u64 ctrl_val, mask;
+
+ mask = 0xfULL << (idx * 4);
+
+ rdmsrl(hwc->config_base, ctrl_val);
+ ctrl_val &= ~mask;
+ wrmsrl(hwc->config_base, ctrl_val);
+}
+
+static void zhaoxin_pmu_disable_event(struct perf_event *event)
+{
+ struct hw_perf_event *hwc = &event->hw;
+
+ if (unlikely(hwc->config_base == MSR_ARCH_PERFMON_FIXED_CTR_CTRL)) {
+ zhaoxin_pmu_disable_fixed(hwc);
+ return;
+ }
+
+ x86_pmu_disable_event(event);
+}
+
+static void zhaoxin_pmu_enable_fixed(struct hw_perf_event *hwc)
+{
+ int idx = hwc->idx - INTEL_PMC_IDX_FIXED;
+ u64 ctrl_val, bits, mask;
+
+ /*
+ * Enable IRQ generation (0x8),
+ * and enable ring-3 counting (0x2) and ring-0 counting (0x1)
+ * if requested:
+ */
+ bits = 0x8ULL;
+ if (hwc->config & ARCH_PERFMON_EVENTSEL_USR)
+ bits |= 0x2;
+ if (hwc->config & ARCH_PERFMON_EVENTSEL_OS)
+ bits |= 0x1;
+
+ bits <<= (idx * 4);
+ mask = 0xfULL << (idx * 4);
+
+ rdmsrl(hwc->config_base, ctrl_val);
+ ctrl_val &= ~mask;
+ ctrl_val |= bits;
+ wrmsrl(hwc->config_base, ctrl_val);
+}
+
+static void zhaoxin_pmu_enable_event(struct perf_event *event)
+{
+ struct hw_perf_event *hwc = &event->hw;
+
+ if (unlikely(hwc->config_base == MSR_ARCH_PERFMON_FIXED_CTR_CTRL)) {
+ zhaoxin_pmu_enable_fixed(hwc);
+ return;
+ }
+
+ __x86_pmu_enable_event(hwc, ARCH_PERFMON_EVENTSEL_ENABLE);
+}
+
+/*
+ * This handler is triggered by the local APIC, so the APIC IRQ handling
+ * rules apply:
+ */
+static int zhaoxin_pmu_handle_irq(struct pt_regs *regs)
+{
+ struct perf_sample_data data;
+ struct cpu_hw_events *cpuc;
+ int handled = 0;
+ u64 status;
+ int bit;
+
+ cpuc = this_cpu_ptr(&cpu_hw_events);
+ apic_write(APIC_LVTPC, APIC_DM_NMI);
+ zhaoxin_pmu_disable_all();
+ status = zhaoxin_pmu_get_status();
+ if (!status)
+ goto done;
+
+again:
+ if (x86_pmu.enabled_ack)
+ zxc_pmu_ack_status(status);
+ else
+ zhaoxin_pmu_ack_status(status);
+
+ inc_irq_stat(apic_perf_irqs);
+
+ /*
+ * CondChgd bit 63 doesn't mean any overflow status. Ignore
+ * and clear the bit.
+ */
+ if (__test_and_clear_bit(63, (unsigned long *)&status)) {
+ if (!status)
+ goto done;
+ }
+
+ for_each_set_bit(bit, (unsigned long *)&status, X86_PMC_IDX_MAX) {
+ struct perf_event *event = cpuc->events[bit];
+
+ handled++;
+
+ if (!test_bit(bit, cpuc->active_mask))
+ continue;
+
+ x86_perf_event_update(event);
+ perf_sample_data_init(&data, 0, event->hw.last_period);
+
+ if (!x86_perf_event_set_period(event))
+ continue;
+
+ if (perf_event_overflow(event, &data, regs))
+ x86_pmu_stop(event, 0);
+ }
+
+ /*
+ * Repeat if there is more work to be done:
+ */
+ status = zhaoxin_pmu_get_status();
+ if (status)
+ goto again;
+
+done:
+ zhaoxin_pmu_enable_all(0);
+ return handled;
+}
+
+static u64 zhaoxin_pmu_event_map(int hw_event)
+{
+ return zx_pmon_event_map[hw_event];
+}
+
+static struct event_constraint *
+zhaoxin_get_event_constraints(struct cpu_hw_events *cpuc, int idx,
+ struct perf_event *event)
+{
+ struct event_constraint *c;
+
+ if (x86_pmu.event_constraints) {
+ for_each_event_constraint(c, x86_pmu.event_constraints) {
+ if ((event->hw.config & c->cmask) == c->code)
+ return c;
+ }
+ }
+
+ return &unconstrained;
+}
+
+PMU_FORMAT_ATTR(event, "config:0-7");
+PMU_FORMAT_ATTR(umask, "config:8-15");
+PMU_FORMAT_ATTR(edge, "config:18");
+PMU_FORMAT_ATTR(inv, "config:23");
+PMU_FORMAT_ATTR(cmask, "config:24-31");
+
+static struct attribute *zx_arch_formats_attr[] = {
+ &format_attr_event.attr,
+ &format_attr_umask.attr,
+ &format_attr_edge.attr,
+ &format_attr_inv.attr,
+ &format_attr_cmask.attr,
+ NULL,
+};
+
+static ssize_t zhaoxin_event_sysfs_show(char *page, u64 config)
+{
+ u64 event = (config & ARCH_PERFMON_EVENTSEL_EVENT);
+
+ return x86_event_sysfs_show(page, config, event);
+}
+
+static const struct x86_pmu zhaoxin_pmu __initconst = {
+ .name = "zhaoxin",
+ .handle_irq = zhaoxin_pmu_handle_irq,
+ .disable_all = zhaoxin_pmu_disable_all,
+ .enable_all = zhaoxin_pmu_enable_all,
+ .enable = zhaoxin_pmu_enable_event,
+ .disable = zhaoxin_pmu_disable_event,
+ .hw_config = x86_pmu_hw_config,
+ .schedule_events = x86_schedule_events,
+ .eventsel = MSR_ARCH_PERFMON_EVENTSEL0,
+ .perfctr = MSR_ARCH_PERFMON_PERFCTR0,
+ .event_map = zhaoxin_pmu_event_map,
+ .max_events = ARRAY_SIZE(zx_pmon_event_map),
+ .apic = 1,
+ /*
+ * For zxd/zxe, read/write operation for PMCx MSR is 48 bits.
+ */
+ .max_period = (1ULL << 47) - 1,
+ .get_event_constraints = zhaoxin_get_event_constraints,
+
+ .format_attrs = zx_arch_formats_attr,
+ .events_sysfs_show = zhaoxin_event_sysfs_show,
+};
+
+static const struct { int id; char *name; } zx_arch_events_map[] __initconst = {
+ { PERF_COUNT_HW_CPU_CYCLES, "cpu cycles" },
+ { PERF_COUNT_HW_INSTRUCTIONS, "instructions" },
+ { PERF_COUNT_HW_BUS_CYCLES, "bus cycles" },
+ { PERF_COUNT_HW_CACHE_REFERENCES, "cache references" },
+ { PERF_COUNT_HW_CACHE_MISSES, "cache misses" },
+ { PERF_COUNT_HW_BRANCH_INSTRUCTIONS, "branch instructions" },
+ { PERF_COUNT_HW_BRANCH_MISSES, "branch misses" },
+};
+
+static __init void zhaoxin_arch_events_quirk(void)
+{
+ int bit;
+
+ /* disable event that reported as not presend by cpuid */
+ for_each_set_bit(bit, x86_pmu.events_mask, ARRAY_SIZE(zx_arch_events_map)) {
+ zx_pmon_event_map[zx_arch_events_map[bit].id] = 0;
+ pr_warn("CPUID marked event: \'%s\' unavailable\n",
+ zx_arch_events_map[bit].name);
+ }
+}
+
+__init int zhaoxin_pmu_init(void)
+{
+ union cpuid10_edx edx;
+ union cpuid10_eax eax;
+ union cpuid10_ebx ebx;
+ struct event_constraint *c;
+ unsigned int unused;
+ int version;
+
+ pr_info("Welcome to zhaoxin pmu!\n");
+
+ /*
+ * Check whether the Architectural PerfMon supports
+ * hw_event or not.
+ */
+ cpuid(10, &eax.full, &ebx.full, &unused, &edx.full);
+
+ if (eax.split.mask_length < ARCH_PERFMON_EVENTS_COUNT - 1)
+ return -ENODEV;
+
+ version = eax.split.version_id;
+ if (version != 2)
+ return -ENODEV;
+
+ x86_pmu = zhaoxin_pmu;
+ pr_info("Version check pass!\n");
+
+ x86_pmu.version = version;
+ x86_pmu.num_counters = eax.split.num_counters;
+ x86_pmu.cntval_bits = eax.split.bit_width;
+ x86_pmu.cntval_mask = (1ULL << eax.split.bit_width) - 1;
+ x86_pmu.events_maskl = ebx.full;
+ x86_pmu.events_mask_len = eax.split.mask_length;
+
+ x86_pmu.num_counters_fixed = edx.split.num_counters_fixed;
+ x86_add_quirk(zhaoxin_arch_events_quirk);
+
+ switch (boot_cpu_data.x86) {
+ case 0x06:
+ if (boot_cpu_data.x86_model == 0x0f || boot_cpu_data.x86_model == 0x19) {
+
+ x86_pmu.max_period = x86_pmu.cntval_mask >> 1;
+
+ /* Clearing status works only if the global control is enable on zxc. */
+ x86_pmu.enabled_ack = 1;
+
+ x86_pmu.event_constraints = zxc_event_constraints;
+ zx_pmon_event_map[PERF_COUNT_HW_INSTRUCTIONS] = 0;
+ zx_pmon_event_map[PERF_COUNT_HW_CACHE_REFERENCES] = 0;
+ zx_pmon_event_map[PERF_COUNT_HW_CACHE_MISSES] = 0;
+ zx_pmon_event_map[PERF_COUNT_HW_BUS_CYCLES] = 0;
+
+ pr_cont("ZXC events, ");
+ break;
+ }
+ return -ENODEV;
+
+ case 0x07:
+ zx_pmon_event_map[PERF_COUNT_HW_STALLED_CYCLES_FRONTEND] =
+ X86_CONFIG(.event = 0x01, .umask = 0x01, .inv = 0x01, .cmask = 0x01);
+
+ zx_pmon_event_map[PERF_COUNT_HW_STALLED_CYCLES_BACKEND] =
+ X86_CONFIG(.event = 0x0f, .umask = 0x04, .inv = 0, .cmask = 0);
+
+ switch (boot_cpu_data.x86_model) {
+ case 0x1b:
+ memcpy(hw_cache_event_ids, zxd_hw_cache_event_ids,
+ sizeof(hw_cache_event_ids));
+
+ x86_pmu.event_constraints = zxd_event_constraints;
+
+ zx_pmon_event_map[PERF_COUNT_HW_BRANCH_INSTRUCTIONS] = 0x0700;
+ zx_pmon_event_map[PERF_COUNT_HW_BRANCH_MISSES] = 0x0709;
+
+ pr_cont("ZXD events, ");
+ break;
+ case 0x3b:
+ memcpy(hw_cache_event_ids, zxe_hw_cache_event_ids,
+ sizeof(hw_cache_event_ids));
+
+ x86_pmu.event_constraints = zxd_event_constraints;
+
+ zx_pmon_event_map[PERF_COUNT_HW_BRANCH_INSTRUCTIONS] = 0x0028;
+ zx_pmon_event_map[PERF_COUNT_HW_BRANCH_MISSES] = 0x0029;
+
+ pr_cont("ZXE events, ");
+ break;
+ default:
+ return -ENODEV;
+ }
+ break;
+
+ default:
+ return -ENODEV;
+ }
+
+ x86_pmu.intel_ctrl = (1 << (x86_pmu.num_counters)) - 1;
+ x86_pmu.intel_ctrl |= ((1LL << x86_pmu.num_counters_fixed)-1) << INTEL_PMC_IDX_FIXED;
+
+ if (x86_pmu.event_constraints) {
+ for_each_event_constraint(c, x86_pmu.event_constraints) {
+ c->idxmsk64 |= (1ULL << x86_pmu.num_counters) - 1;
+ c->weight += x86_pmu.num_counters;
+ }
+ }
+
+ return 0;
+}
+
rdmsrl(HV_X64_MSR_REENLIGHTENMENT_CONTROL, *((u64 *)&re_ctrl));
if (re_ctrl.target_vp == hv_vp_index[cpu]) {
- /* Reassign to some other online CPU */
+ /*
+ * Reassign reenlightenment notifications to some other online
+ * CPU or just disable the feature if there are no online CPUs
+ * left (happens on hibernation).
+ */
new_cpu = cpumask_any_but(cpu_online_mask, cpu);
- re_ctrl.target_vp = hv_vp_index[new_cpu];
+ if (new_cpu < nr_cpu_ids)
+ re_ctrl.target_vp = hv_vp_index[new_cpu];
+ else
+ re_ctrl.enabled = 0;
+
wrmsrl(HV_X64_MSR_REENLIGHTENMENT_CONTROL, *((u64 *)&re_ctrl));
}
hv_hypercall_pg = hv_hypercall_pg_saved;
hv_hypercall_pg_saved = NULL;
+
+ /*
+ * Reenlightenment notifications are disabled by hv_cpu_die(0),
+ * reenable them here if hv_reenlightenment_cb was previously set.
+ */
+ if (hv_reenlightenment_cb)
+ set_hv_tscchange_cb(hv_reenlightenment_cb);
}
/* Note: when the ops are called, only CPU0 is online and IRQs are disabled. */
// SPDX-License-Identifier: GPL-2.0
#include <asm/unistd_32.h>
+#include <asm/audit.h>
unsigned ia32_dir_class[] = {
#include <asm-generic/audit_dir_write.h>
--- /dev/null
+#ifdef CONFIG_64BIT
+GEN(rax)
+GEN(rbx)
+GEN(rcx)
+GEN(rdx)
+GEN(rsi)
+GEN(rdi)
+GEN(rbp)
+GEN(r8)
+GEN(r9)
+GEN(r10)
+GEN(r11)
+GEN(r12)
+GEN(r13)
+GEN(r14)
+GEN(r15)
+#else
+GEN(eax)
+GEN(ebx)
+GEN(ecx)
+GEN(edx)
+GEN(esi)
+GEN(edi)
+GEN(ebp)
+#endif
#define APBT_MIN_FREQ 1000000
#define APBT_MMAP_SIZE 1024
-#define APBT_DEV_USED 1
-
extern void apbt_time_init(void);
-extern unsigned long apbt_quick_calibrate(void);
-extern int arch_setup_apbt_irqs(int irq, int trigger, int mask, int cpu);
extern void apbt_setup_secondary_clock(void);
extern struct sfi_timer_table_entry *sfi_get_mtmr(int hint);
#else /* CONFIG_APB_TIMER */
-static inline unsigned long apbt_quick_calibrate(void) {return 0; }
static inline void apbt_time_init(void) { }
#endif
#define RDRAND_RETRY_LOOPS 10
-#define RDRAND_INT ".byte 0x0f,0xc7,0xf0"
-#define RDSEED_INT ".byte 0x0f,0xc7,0xf8"
-#ifdef CONFIG_X86_64
-# define RDRAND_LONG ".byte 0x48,0x0f,0xc7,0xf0"
-# define RDSEED_LONG ".byte 0x48,0x0f,0xc7,0xf8"
-#else
-# define RDRAND_LONG RDRAND_INT
-# define RDSEED_LONG RDSEED_INT
-#endif
-
/* Unconditional execution of RDRAND and RDSEED */
static inline bool __must_check rdrand_long(unsigned long *v)
bool ok;
unsigned int retry = RDRAND_RETRY_LOOPS;
do {
- asm volatile(RDRAND_LONG
+ asm volatile("rdrand %[out]"
CC_SET(c)
- : CC_OUT(c) (ok), "=a" (*v));
+ : CC_OUT(c) (ok), [out] "=r" (*v));
if (ok)
return true;
} while (--retry);
bool ok;
unsigned int retry = RDRAND_RETRY_LOOPS;
do {
- asm volatile(RDRAND_INT
+ asm volatile("rdrand %[out]"
CC_SET(c)
- : CC_OUT(c) (ok), "=a" (*v));
+ : CC_OUT(c) (ok), [out] "=r" (*v));
if (ok)
return true;
} while (--retry);
static inline bool __must_check rdseed_long(unsigned long *v)
{
bool ok;
- asm volatile(RDSEED_LONG
+ asm volatile("rdseed %[out]"
CC_SET(c)
- : CC_OUT(c) (ok), "=a" (*v));
+ : CC_OUT(c) (ok), [out] "=r" (*v));
return ok;
}
static inline bool __must_check rdseed_int(unsigned int *v)
{
bool ok;
- asm volatile(RDSEED_INT
+ asm volatile("rdseed %[out]"
CC_SET(c)
- : CC_OUT(c) (ok), "=a" (*v));
+ : CC_OUT(c) (ok), [out] "=r" (*v));
return ok;
}
#endif
#ifdef CONFIG_RETPOLINE
-#ifdef CONFIG_X86_32
-#define INDIRECT_THUNK(reg) extern asmlinkage void __x86_indirect_thunk_e ## reg(void);
-#else
-#define INDIRECT_THUNK(reg) extern asmlinkage void __x86_indirect_thunk_r ## reg(void);
-INDIRECT_THUNK(8)
-INDIRECT_THUNK(9)
-INDIRECT_THUNK(10)
-INDIRECT_THUNK(11)
-INDIRECT_THUNK(12)
-INDIRECT_THUNK(13)
-INDIRECT_THUNK(14)
-INDIRECT_THUNK(15)
-#endif
-INDIRECT_THUNK(ax)
-INDIRECT_THUNK(bx)
-INDIRECT_THUNK(cx)
-INDIRECT_THUNK(dx)
-INDIRECT_THUNK(si)
-INDIRECT_THUNK(di)
-INDIRECT_THUNK(bp)
+
+#define DECL_INDIRECT_THUNK(reg) \
+ extern asmlinkage void __x86_indirect_thunk_ ## reg (void);
+
+#define DECL_RETPOLINE(reg) \
+ extern asmlinkage void __x86_retpoline_ ## reg (void);
+
+#undef GEN
+#define GEN(reg) DECL_INDIRECT_THUNK(reg)
+#include <asm/GEN-for-each-reg.h>
+
+#undef GEN
+#define GEN(reg) DECL_RETPOLINE(reg)
+#include <asm/GEN-for-each-reg.h>
+
#endif /* CONFIG_RETPOLINE */
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _ASM_X86_AUDIT_H
+#define _ASM_X86_AUDIT_H
+
+int ia32_classify_syscall(unsigned int syscall);
+
+#endif /* _ASM_X86_AUDIT_H */
arch_set_bit(long nr, volatile unsigned long *addr)
{
if (__builtin_constant_p(nr)) {
- asm volatile(LOCK_PREFIX "orb %1,%0"
+ asm volatile(LOCK_PREFIX "orb %b1,%0"
: CONST_MASK_ADDR(nr, addr)
- : "iq" (CONST_MASK(nr) & 0xff)
+ : "iq" (CONST_MASK(nr))
: "memory");
} else {
asm volatile(LOCK_PREFIX __ASM_SIZE(bts) " %1,%0"
arch_clear_bit(long nr, volatile unsigned long *addr)
{
if (__builtin_constant_p(nr)) {
- asm volatile(LOCK_PREFIX "andb %1,%0"
+ asm volatile(LOCK_PREFIX "andb %b1,%0"
: CONST_MASK_ADDR(nr, addr)
- : "iq" (CONST_MASK(nr) ^ 0xff));
+ : "iq" (~CONST_MASK(nr)));
} else {
asm volatile(LOCK_PREFIX __ASM_SIZE(btr) " %1,%0"
: : RLONG_ADDR(addr), "Ir" (nr) : "memory");
arch_change_bit(long nr, volatile unsigned long *addr)
{
if (__builtin_constant_p(nr)) {
- asm volatile(LOCK_PREFIX "xorb %1,%0"
+ asm volatile(LOCK_PREFIX "xorb %b1,%0"
: CONST_MASK_ADDR(nr, addr)
- : "iq" ((u8)CONST_MASK(nr)));
+ : "iq" (CONST_MASK(nr)));
} else {
asm volatile(LOCK_PREFIX __ASM_SIZE(btc) " %1,%0"
: : RLONG_ADDR(addr), "Ir" (nr) : "memory");
#define X86_CENTAUR_FAM6_C7_D 0xd
#define X86_CENTAUR_FAM6_NANO 0xf
+#define X86_STEPPINGS(mins, maxs) GENMASK(maxs, mins)
/**
- * X86_MATCH_VENDOR_FAM_MODEL_FEATURE - Base macro for CPU matching
+ * X86_MATCH_VENDOR_FAM_MODEL_STEPPINGS_FEATURE - Base macro for CPU matching
* @_vendor: The vendor name, e.g. INTEL, AMD, HYGON, ..., ANY
* The name is expanded to X86_VENDOR_@_vendor
* @_family: The family number or X86_FAMILY_ANY
* @_model: The model number, model constant or X86_MODEL_ANY
+ * @_steppings: Bitmask for steppings, stepping constant or X86_STEPPING_ANY
* @_feature: A X86_FEATURE bit or X86_FEATURE_ANY
* @_data: Driver specific data or NULL. The internal storage
* format is unsigned long. The supplied value, pointer
* into another macro at the usage site for good reasons, then please
* start this local macro with X86_MATCH to allow easy grepping.
*/
-#define X86_MATCH_VENDOR_FAM_MODEL_FEATURE(_vendor, _family, _model, \
- _feature, _data) { \
+#define X86_MATCH_VENDOR_FAM_MODEL_STEPPINGS_FEATURE(_vendor, _family, _model, \
+ _steppings, _feature, _data) { \
.vendor = X86_VENDOR_##_vendor, \
.family = _family, \
.model = _model, \
+ .steppings = _steppings, \
.feature = _feature, \
.driver_data = (unsigned long) _data \
}
+/**
+ * X86_MATCH_VENDOR_FAM_MODEL_FEATURE - Macro for CPU matching
+ * @_vendor: The vendor name, e.g. INTEL, AMD, HYGON, ..., ANY
+ * The name is expanded to X86_VENDOR_@_vendor
+ * @_family: The family number or X86_FAMILY_ANY
+ * @_model: The model number, model constant or X86_MODEL_ANY
+ * @_feature: A X86_FEATURE bit or X86_FEATURE_ANY
+ * @_data: Driver specific data or NULL. The internal storage
+ * format is unsigned long. The supplied value, pointer
+ * etc. is casted to unsigned long internally.
+ *
+ * The steppings arguments of X86_MATCH_VENDOR_FAM_MODEL_STEPPINGS_FEATURE() is
+ * set to wildcards.
+ */
+#define X86_MATCH_VENDOR_FAM_MODEL_FEATURE(vendor, family, model, feature, data) \
+ X86_MATCH_VENDOR_FAM_MODEL_STEPPINGS_FEATURE(vendor, family, model, \
+ X86_STEPPING_ANY, feature, data)
+
/**
* X86_MATCH_VENDOR_FAM_FEATURE - Macro for matching vendor, family and CPU feature
* @vendor: The vendor name, e.g. INTEL, AMD, HYGON, ..., ANY
#define X86_MATCH_INTEL_FAM6_MODEL(model, data) \
X86_MATCH_VENDOR_FAM_MODEL(INTEL, 6, INTEL_FAM6_##model, data)
+#define X86_MATCH_INTEL_FAM6_MODEL_STEPPINGS(model, steppings, data) \
+ X86_MATCH_VENDOR_FAM_MODEL_STEPPINGS_FEATURE(INTEL, 6, INTEL_FAM6_##model, \
+ steppings, X86_FEATURE_ANY, data)
+
/*
* Match specific microcode revisions.
*
#define _ASM_X86_DEVICE_H
struct dev_archdata {
-#if defined(CONFIG_INTEL_IOMMU) || defined(CONFIG_AMD_IOMMU)
+#ifdef CONFIG_IOMMU_API
void *iommu; /* hook for IOMMU specific extension */
#endif
};
#define MAX_DMA_PFN ((16UL * 1024 * 1024) >> PAGE_SHIFT)
/* 4GB broken PCI/AGP hardware bus master zone */
-#define MAX_DMA32_PFN ((4UL * 1024 * 1024 * 1024) >> PAGE_SHIFT)
+#define MAX_DMA32_PFN (1UL << (32 - PAGE_SHIFT))
#ifdef CONFIG_X86_32
/* The maximum address that we can perform a DMA transfer to on this platform */
#include <asm/nospec-branch.h>
#include <asm/mmu_context.h>
#include <linux/build_bug.h>
+#include <linux/kernel.h>
extern unsigned long efi_fw_vendor, efi_config_table;
/* arch specific definitions used by the stub code */
-__attribute_const__ bool efi_is_64bit(void);
+#ifdef CONFIG_EFI_MIXED
+
+#define ARCH_HAS_EFISTUB_WRAPPERS
+
+static inline bool efi_is_64bit(void)
+{
+ extern const bool efi_is64;
+
+ return efi_is64;
+}
static inline bool efi_is_native(void)
{
if (!IS_ENABLED(CONFIG_X86_64))
return true;
- if (!IS_ENABLED(CONFIG_EFI_MIXED))
- return true;
return efi_is_64bit();
}
#define __efi64_argmap_allocate_pool(type, size, buffer) \
((type), (size), efi64_zero_upper(buffer))
+#define __efi64_argmap_create_event(type, tpl, f, c, event) \
+ ((type), (tpl), (f), (c), efi64_zero_upper(event))
+
+#define __efi64_argmap_set_timer(event, type, time) \
+ ((event), (type), lower_32_bits(time), upper_32_bits(time))
+
+#define __efi64_argmap_wait_for_event(num, event, index) \
+ ((num), (event), efi64_zero_upper(index))
+
#define __efi64_argmap_handle_protocol(handle, protocol, interface) \
((handle), (protocol), efi64_zero_upper(interface))
#define __efi64_argmap_load_file(protocol, path, policy, bufsize, buf) \
((protocol), (path), (policy), efi64_zero_upper(bufsize), (buf))
+/* Graphics Output Protocol */
+#define __efi64_argmap_query_mode(gop, mode, size, info) \
+ ((gop), (mode), efi64_zero_upper(size), efi64_zero_upper(info))
+
/*
* The macros below handle the plumbing for the argument mapping. To add a
* mapping for a specific EFI method, simply define a macro
#define efi_bs_call(func, ...) \
(efi_is_native() \
- ? efi_system_table()->boottime->func(__VA_ARGS__) \
- : __efi64_thunk_map(efi_table_attr(efi_system_table(), \
- boottime), func, __VA_ARGS__))
+ ? efi_system_table->boottime->func(__VA_ARGS__) \
+ : __efi64_thunk_map(efi_table_attr(efi_system_table, \
+ boottime), \
+ func, __VA_ARGS__))
#define efi_rt_call(func, ...) \
(efi_is_native() \
- ? efi_system_table()->runtime->func(__VA_ARGS__) \
- : __efi64_thunk_map(efi_table_attr(efi_system_table(), \
- runtime), func, __VA_ARGS__))
+ ? efi_system_table->runtime->func(__VA_ARGS__) \
+ : __efi64_thunk_map(efi_table_attr(efi_system_table, \
+ runtime), \
+ func, __VA_ARGS__))
+
+#else /* CONFIG_EFI_MIXED */
+
+static inline bool efi_is_64bit(void)
+{
+ return IS_ENABLED(CONFIG_X86_64);
+}
+
+#endif /* CONFIG_EFI_MIXED */
extern bool efi_reboot_required(void);
extern bool efi_is_table_address(unsigned long phys_addr);
#ifndef __ASSEMBLY__
+#if defined(CONFIG_FUNCTION_TRACER) && defined(CONFIG_DYNAMIC_FTRACE)
+extern void set_ftrace_ops_ro(void);
+#else
+static inline void set_ftrace_ops_ro(void) { }
+#endif
+
#define ARCH_HAS_SYSCALL_MATCH_SYM_NAME
static inline bool arch_syscall_match_sym_name(const char *sym, const char *name)
{
* stale TLB entries and, especially if we're flushing global
* mappings, we don't want the compiler to reorder any subsequent
* memory accesses before the TLB flush.
- *
- * The hex opcode is invpcid (%ecx), %eax in 32-bit mode and
- * invpcid (%rcx), %rax in long mode.
*/
- asm volatile (".byte 0x66, 0x0f, 0x38, 0x82, 0x01"
- : : "m" (desc), "a" (type), "c" (&desc) : "memory");
+ asm volatile("invpcid %[desc], %[type]"
+ :: [desc] "m" (desc), [type] "r" (type) : "memory");
}
#define INVPCID_TYPE_INDIV_ADDR 0
#ifdef CONFIG_X86_IOPL_IOPERM
void io_bitmap_share(struct task_struct *tsk);
-void io_bitmap_exit(void);
+void io_bitmap_exit(struct task_struct *tsk);
void native_tss_update_io_bitmap(void);
#else
static inline void io_bitmap_share(struct task_struct *tsk) { }
-static inline void io_bitmap_exit(void) { }
+static inline void io_bitmap_exit(struct task_struct *tsk) { }
static inline void tss_update_io_bitmap(void) { }
#endif
unsigned long cr4;
unsigned long cr4_guest_owned_bits;
unsigned long cr8;
+ u32 host_pkru;
u32 pkru;
u32 hflags;
u64 efer;
void (*set_idt)(struct kvm_vcpu *vcpu, struct desc_ptr *dt);
void (*get_gdt)(struct kvm_vcpu *vcpu, struct desc_ptr *dt);
void (*set_gdt)(struct kvm_vcpu *vcpu, struct desc_ptr *dt);
- u64 (*get_dr6)(struct kvm_vcpu *vcpu);
- void (*set_dr6)(struct kvm_vcpu *vcpu, unsigned long value);
void (*sync_dirty_debug_regs)(struct kvm_vcpu *vcpu);
void (*set_dr7)(struct kvm_vcpu *vcpu, unsigned long value);
void (*cache_reg)(struct kvm_vcpu *vcpu, enum kvm_reg reg);
void kvm_queue_exception(struct kvm_vcpu *vcpu, unsigned nr);
void kvm_queue_exception_e(struct kvm_vcpu *vcpu, unsigned nr, u32 error_code);
+void kvm_queue_exception_p(struct kvm_vcpu *vcpu, unsigned nr, unsigned long payload);
void kvm_requeue_exception(struct kvm_vcpu *vcpu, unsigned nr);
void kvm_requeue_exception_e(struct kvm_vcpu *vcpu, unsigned nr, u32 error_code);
void kvm_inject_page_fault(struct kvm_vcpu *vcpu, struct x86_exception *fault);
#define NODE_DATA(nid) (node_data[nid])
#endif /* CONFIG_NUMA */
-#ifdef CONFIG_DISCONTIGMEM
-
-/*
- * generic node memory support, the following assumptions apply:
- *
- * 1) memory comes in 64Mb contiguous chunks which are either present or not
- * 2) we will not have more than 64Gb in total
- *
- * for now assume that 64Gb is max amount of RAM for whole system
- * 64Gb / 4096bytes/page = 16777216 pages
- */
-#define MAX_NR_PAGES 16777216
-#define MAX_SECTIONS 1024
-#define PAGES_PER_SECTION (MAX_NR_PAGES/MAX_SECTIONS)
-
-extern s8 physnode_map[];
-
-static inline int pfn_to_nid(unsigned long pfn)
-{
-#ifdef CONFIG_NUMA
- return((int) physnode_map[(pfn) / PAGES_PER_SECTION]);
-#else
- return 0;
-#endif
-}
-
-static inline int pfn_valid(int pfn)
-{
- int nid = pfn_to_nid(pfn);
-
- if (nid >= 0)
- return (pfn < node_end_pfn(nid));
- return 0;
-}
-
-#define early_pfn_valid(pfn) pfn_valid((pfn))
-
-#endif /* CONFIG_DISCONTIGMEM */
-
#endif /* _ASM_X86_MMZONE_32_H */
#define MSR_PP1_ENERGY_STATUS 0x00000641
#define MSR_PP1_POLICY 0x00000642
+#define MSR_AMD_PKG_ENERGY_STATUS 0xc001029b
+#define MSR_AMD_RAPL_POWER_UNIT 0xc0010299
+
/* Config TDP MSRs */
#define MSR_CONFIG_TDP_NOMINAL 0x00000648
#define MSR_CONFIG_TDP_LEVEL_1 0x00000649
#define _ASM_X86_NOSPEC_BRANCH_H_
#include <linux/static_key.h>
+#include <linux/frame.h>
#include <asm/alternative.h>
#include <asm/alternative-asm.h>
#include <asm/cpufeatures.h>
#include <asm/msr-index.h>
-
-/*
- * This should be used immediately before a retpoline alternative. It tells
- * objtool where the retpolines are so that it can make sense of the control
- * flow by just reading the original instruction(s) and ignoring the
- * alternatives.
- */
-#define ANNOTATE_NOSPEC_ALTERNATIVE \
- ANNOTATE_IGNORE_ALTERNATIVE
+#include <asm/unwind_hints.h>
/*
* Fill the CPU return stack buffer.
#define __FILL_RETURN_BUFFER(reg, nr, sp) \
mov $(nr/2), reg; \
771: \
+ ANNOTATE_INTRA_FUNCTION_CALL; \
call 772f; \
773: /* speculation trap */ \
+ UNWIND_HINT_EMPTY; \
pause; \
lfence; \
jmp 773b; \
772: \
+ ANNOTATE_INTRA_FUNCTION_CALL; \
call 774f; \
775: /* speculation trap */ \
+ UNWIND_HINT_EMPTY; \
pause; \
lfence; \
jmp 775b; \
774: \
+ add $(BITS_PER_LONG/8) * 2, sp; \
dec reg; \
- jnz 771b; \
- add $(BITS_PER_LONG/8) * nr, sp;
+ jnz 771b;
#ifdef __ASSEMBLY__
.popsection
.endm
-/*
- * These are the bare retpoline primitives for indirect jmp and call.
- * Do not use these directly; they only exist to make the ALTERNATIVE
- * invocation below less ugly.
- */
-.macro RETPOLINE_JMP reg:req
- call .Ldo_rop_\@
-.Lspec_trap_\@:
- pause
- lfence
- jmp .Lspec_trap_\@
-.Ldo_rop_\@:
- mov \reg, (%_ASM_SP)
- ret
-.endm
-
-/*
- * This is a wrapper around RETPOLINE_JMP so the called function in reg
- * returns to the instruction after the macro.
- */
-.macro RETPOLINE_CALL reg:req
- jmp .Ldo_call_\@
-.Ldo_retpoline_jmp_\@:
- RETPOLINE_JMP \reg
-.Ldo_call_\@:
- call .Ldo_retpoline_jmp_\@
-.endm
-
/*
* JMP_NOSPEC and CALL_NOSPEC macros can be used instead of a simple
* indirect jmp/call which may be susceptible to the Spectre variant 2
*/
.macro JMP_NOSPEC reg:req
#ifdef CONFIG_RETPOLINE
- ANNOTATE_NOSPEC_ALTERNATIVE
- ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; jmp *\reg), \
- __stringify(RETPOLINE_JMP \reg), X86_FEATURE_RETPOLINE, \
- __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; jmp *\reg), X86_FEATURE_RETPOLINE_AMD
+ ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; jmp *%\reg), \
+ __stringify(jmp __x86_retpoline_\reg), X86_FEATURE_RETPOLINE, \
+ __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; jmp *%\reg), X86_FEATURE_RETPOLINE_AMD
#else
- jmp *\reg
+ jmp *%\reg
#endif
.endm
.macro CALL_NOSPEC reg:req
#ifdef CONFIG_RETPOLINE
- ANNOTATE_NOSPEC_ALTERNATIVE
- ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; call *\reg), \
- __stringify(RETPOLINE_CALL \reg), X86_FEATURE_RETPOLINE,\
- __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; call *\reg), X86_FEATURE_RETPOLINE_AMD
+ ALTERNATIVE_2 __stringify(ANNOTATE_RETPOLINE_SAFE; call *%\reg), \
+ __stringify(call __x86_retpoline_\reg), X86_FEATURE_RETPOLINE, \
+ __stringify(lfence; ANNOTATE_RETPOLINE_SAFE; call *%\reg), X86_FEATURE_RETPOLINE_AMD
#else
- call *\reg
+ call *%\reg
#endif
.endm
*/
.macro FILL_RETURN_BUFFER reg:req nr:req ftr:req
#ifdef CONFIG_RETPOLINE
- ANNOTATE_NOSPEC_ALTERNATIVE
- ALTERNATIVE "jmp .Lskip_rsb_\@", \
- __stringify(__FILL_RETURN_BUFFER(\reg,\nr,%_ASM_SP)) \
- \ftr
+ ALTERNATIVE "jmp .Lskip_rsb_\@", "", \ftr
+ __FILL_RETURN_BUFFER(\reg,\nr,%_ASM_SP)
.Lskip_rsb_\@:
#endif
.endm
* which is ensured when CONFIG_RETPOLINE is defined.
*/
# define CALL_NOSPEC \
- ANNOTATE_NOSPEC_ALTERNATIVE \
ALTERNATIVE_2( \
ANNOTATE_RETPOLINE_SAFE \
"call *%[thunk_target]\n", \
- "call __x86_indirect_thunk_%V[thunk_target]\n", \
+ "call __x86_retpoline_%V[thunk_target]\n", \
X86_FEATURE_RETPOLINE, \
"lfence;\n" \
ANNOTATE_RETPOLINE_SAFE \
"call *%[thunk_target]\n", \
X86_FEATURE_RETPOLINE_AMD)
+
# define THUNK_TARGET(addr) [thunk_target] "r" (addr)
#else /* CONFIG_X86_32 */
* here, anyway.
*/
# define CALL_NOSPEC \
- ANNOTATE_NOSPEC_ALTERNATIVE \
ALTERNATIVE_2( \
ANNOTATE_RETPOLINE_SAFE \
"call *%[thunk_target]\n", \
#define ORC_TYPE_CALL 0
#define ORC_TYPE_REGS 1
#define ORC_TYPE_REGS_IRET 2
-#define UNWIND_HINT_TYPE_SAVE 3
-#define UNWIND_HINT_TYPE_RESTORE 4
+#define UNWIND_HINT_TYPE_RET_OFFSET 3
#ifndef __ASSEMBLY__
/*
#endif /* !__ASSEMBLY__ */
/*
- * kern_addr_valid() is (1) for FLATMEM and (0) for
- * SPARSEMEM and DISCONTIGMEM
+ * kern_addr_valid() is (1) for FLATMEM and (0) for SPARSEMEM
*/
#ifdef CONFIG_FLATMEM
#define kern_addr_valid(addr) (1)
/* in KB - valid for CPUS which support this call: */
unsigned int x86_cache_size;
int x86_cache_alignment; /* In bytes */
- /* Cache QoS architectural values: */
+ /* Cache QoS architectural values, valid only on the BSP: */
int x86_cache_max_rmid; /* max index */
int x86_cache_occ_scale; /* scale to bytes */
+ int x86_cache_mbm_width_offset;
int x86_power;
unsigned long loops_per_jiffy;
/* cpuid returned max cores value: */
unsigned int tmp;
asm volatile (
- UNWIND_HINT_SAVE
"mov %%ss, %0\n\t"
"pushq %q0\n\t"
"pushq %%rsp\n\t"
"pushq %q0\n\t"
"pushq $1f\n\t"
"iretq\n\t"
- UNWIND_HINT_RESTORE
"1:"
: "=&r" (tmp), ASM_CALL_CONSTRAINT : : "cc", "memory");
#endif
/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef _ASM_X86_RESCTRL_SCHED_H
-#define _ASM_X86_RESCTRL_SCHED_H
+#ifndef _ASM_X86_RESCTRL_H
+#define _ASM_X86_RESCTRL_H
#ifdef CONFIG_X86_CPU_RESCTRL
__resctrl_sched_in();
}
+void resctrl_cpu_detect(struct cpuinfo_x86 *c);
+
#else
static inline void resctrl_sched_in(void) {}
+static inline void resctrl_cpu_detect(struct cpuinfo_x86 *c) {}
#endif /* CONFIG_X86_CPU_RESCTRL */
-#endif /* _ASM_X86_RESCTRL_SCHED_H */
+#endif /* _ASM_X86_RESCTRL_H */
{
unsigned long flags;
- asm volatile (ALTERNATIVE("", "pushf; pop %0; " __ASM_CLAC,
- X86_FEATURE_SMAP)
+ asm volatile ("# smap_save\n\t"
+ ALTERNATIVE("jmp 1f", "", X86_FEATURE_SMAP)
+ "pushf; pop %0; " __ASM_CLAC "\n\t"
+ "1:"
: "=rm" (flags) : : "memory", "cc");
return flags;
static __always_inline void smap_restore(unsigned long flags)
{
- asm volatile (ALTERNATIVE("", "push %0; popf", X86_FEATURE_SMAP)
+ asm volatile ("# smap_restore\n\t"
+ ALTERNATIVE("jmp 1f", "", X86_FEATURE_SMAP)
+ "push %0; popf\n\t"
+ "1:"
: : "g" (flags) : "memory", "cc");
}
#define _ASM_X86_SPINLOCK_TYPES_H
#include <linux/types.h>
-
-#ifdef CONFIG_PARAVIRT_SPINLOCKS
-#define __TICKET_LOCK_INC 2
-#define TICKET_SLOWPATH_FLAG ((__ticket_t)1)
-#else
-#define __TICKET_LOCK_INC 1
-#define TICKET_SLOWPATH_FLAG ((__ticket_t)0)
-#endif
-
-#if (CONFIG_NR_CPUS < (256 / __TICKET_LOCK_INC))
-typedef u8 __ticket_t;
-typedef u16 __ticketpair_t;
-#else
-typedef u16 __ticket_t;
-typedef u32 __ticketpair_t;
-#endif
-
-#define TICKET_LOCK_INC ((__ticket_t)__TICKET_LOCK_INC)
-
-#define TICKET_SHIFT (sizeof(__ticket_t) * 8)
-
#include <asm-generic/qspinlock_types.h>
-
#include <asm-generic/qrwlock_types.h>
#endif /* _ASM_X86_SPINLOCK_TYPES_H */
/*
* Initialize the stackprotector canary value.
*
- * NOTE: this must only be called from functions that never return,
+ * NOTE: this must only be called from functions that never return
* and it must always be inlined.
+ *
+ * In addition, it should be called from a compilation unit for which
+ * stack protector is disabled. Alternatively, the caller should not end
+ * with a function call which gets tail-call optimized as that would
+ * lead to checking a modified canary value.
*/
static __always_inline void boot_init_stack_canary(void)
{
void smp_error_interrupt(struct pt_regs *regs);
asmlinkage void smp_irq_move_cleanup_interrupt(void);
-extern void ist_enter(struct pt_regs *regs);
-extern void ist_exit(struct pt_regs *regs);
-extern void ist_begin_non_atomic(struct pt_regs *regs);
-extern void ist_end_non_atomic(void);
-
#ifdef CONFIG_VMAP_STACK
void __noreturn handle_stack_overflow(const char *message,
struct pt_regs *regs,
UNWIND_HINT sp_offset=\sp_offset
.endm
-.macro UNWIND_HINT_SAVE
- UNWIND_HINT type=UNWIND_HINT_TYPE_SAVE
-.endm
-
-.macro UNWIND_HINT_RESTORE
- UNWIND_HINT type=UNWIND_HINT_TYPE_RESTORE
+/*
+ * RET_OFFSET: Used on instructions that terminate a function; mostly RETURN
+ * and sibling calls. On these, sp_offset denotes the expected offset from
+ * initial_func_cfi.
+ */
+.macro UNWIND_HINT_RET_OFFSET sp_offset=8
+ UNWIND_HINT type=UNWIND_HINT_TYPE_RET_OFFSET sp_offset=\sp_offset
.endm
-#else /* !__ASSEMBLY__ */
-
-#define UNWIND_HINT(sp_reg, sp_offset, type, end) \
- "987: \n\t" \
- ".pushsection .discard.unwind_hints\n\t" \
- /* struct unwind_hint */ \
- ".long 987b - .\n\t" \
- ".short " __stringify(sp_offset) "\n\t" \
- ".byte " __stringify(sp_reg) "\n\t" \
- ".byte " __stringify(type) "\n\t" \
- ".byte " __stringify(end) "\n\t" \
- ".balign 4 \n\t" \
- ".popsection\n\t"
-
-#define UNWIND_HINT_SAVE UNWIND_HINT(0, 0, UNWIND_HINT_TYPE_SAVE, 0)
-
-#define UNWIND_HINT_RESTORE UNWIND_HINT(0, 0, UNWIND_HINT_TYPE_RESTORE, 0)
-
#endif /* __ASSEMBLY__ */
#endif /* _ASM_X86_UNWIND_HINTS_H */
#ifndef _UAPI_ASM_X86_UNISTD_H
#define _UAPI_ASM_X86_UNISTD_H
-/* x32 syscall flag bit */
-#define __X32_SYSCALL_BIT 0x40000000UL
+/*
+ * x32 syscall flag bit. Some user programs expect syscall NR macros
+ * and __X32_SYSCALL_BIT to have type int, even though syscall numbers
+ * are, for practical purposes, unsigned long.
+ *
+ * Fortunately, expressions like (nr & ~__X32_SYSCALL_BIT) do the right
+ * thing regardless.
+ */
+#define __X32_SYSCALL_BIT 0x40000000
#ifndef __KERNEL__
# ifdef __i386__
apb_timer_block_enabled = 0;
panic("failed to enable APB timer\n");
}
-
-/* called before apb_timer_enable, use early map */
-unsigned long apbt_quick_calibrate(void)
-{
- int i, scale;
- u64 old, new;
- u64 t1, t2;
- unsigned long khz = 0;
- u32 loop, shift;
-
- apbt_set_mapping();
- dw_apb_clocksource_start(clocksource_apbt);
-
- /* check if the timer can count down, otherwise return */
- old = dw_apb_clocksource_read(clocksource_apbt);
- i = 10000;
- while (--i) {
- if (old != dw_apb_clocksource_read(clocksource_apbt))
- break;
- }
- if (!i)
- goto failed;
-
- /* count 16 ms */
- loop = (apbt_freq / 1000) << 4;
-
- /* restart the timer to ensure it won't get to 0 in the calibration */
- dw_apb_clocksource_start(clocksource_apbt);
-
- old = dw_apb_clocksource_read(clocksource_apbt);
- old += loop;
-
- t1 = rdtsc();
-
- do {
- new = dw_apb_clocksource_read(clocksource_apbt);
- } while (new < old);
-
- t2 = rdtsc();
-
- shift = 5;
- if (unlikely(loop >> shift == 0)) {
- printk(KERN_INFO
- "APBT TSC calibration failed, not enough resolution\n");
- return 0;
- }
- scale = (int)div_u64((t2 - t1), loop >> shift);
- khz = (scale * (apbt_freq / 1000)) >> shift;
- printk(KERN_INFO "TSC freq calculated by APB timer is %lu khz\n", khz);
- return khz;
-failed:
- return 0;
-}
};
static DEFINE_PER_CPU(struct clock_event_device, lapic_events);
-static __init u32 hsx_deadline_rev(void)
-{
- switch (boot_cpu_data.x86_stepping) {
- case 0x02: return 0x3a; /* EP */
- case 0x04: return 0x0f; /* EX */
- }
-
- return ~0U;
-}
-
-static __init u32 bdx_deadline_rev(void)
-{
- switch (boot_cpu_data.x86_stepping) {
- case 0x02: return 0x00000011;
- case 0x03: return 0x0700000e;
- case 0x04: return 0x0f00000c;
- case 0x05: return 0x0e000003;
- }
-
- return ~0U;
-}
-
-static __init u32 skx_deadline_rev(void)
-{
- switch (boot_cpu_data.x86_stepping) {
- case 0x03: return 0x01000136;
- case 0x04: return 0x02000014;
- }
+static const struct x86_cpu_id deadline_match[] __initconst = {
+ X86_MATCH_INTEL_FAM6_MODEL_STEPPINGS(HASWELL_X, X86_STEPPINGS(0x2, 0x2), 0x3a), /* EP */
+ X86_MATCH_INTEL_FAM6_MODEL_STEPPINGS(HASWELL_X, X86_STEPPINGS(0x4, 0x4), 0x0f), /* EX */
- if (boot_cpu_data.x86_stepping > 4)
- return 0;
+ X86_MATCH_INTEL_FAM6_MODEL( BROADWELL_X, 0x0b000020),
- return ~0U;
-}
+ X86_MATCH_INTEL_FAM6_MODEL_STEPPINGS(BROADWELL_D, X86_STEPPINGS(0x2, 0x2), 0x00000011),
+ X86_MATCH_INTEL_FAM6_MODEL_STEPPINGS(BROADWELL_D, X86_STEPPINGS(0x3, 0x3), 0x0700000e),
+ X86_MATCH_INTEL_FAM6_MODEL_STEPPINGS(BROADWELL_D, X86_STEPPINGS(0x4, 0x4), 0x0f00000c),
+ X86_MATCH_INTEL_FAM6_MODEL_STEPPINGS(BROADWELL_D, X86_STEPPINGS(0x5, 0x5), 0x0e000003),
-static const struct x86_cpu_id deadline_match[] __initconst = {
- X86_MATCH_INTEL_FAM6_MODEL( HASWELL_X, &hsx_deadline_rev),
- X86_MATCH_INTEL_FAM6_MODEL( BROADWELL_X, 0x0b000020),
- X86_MATCH_INTEL_FAM6_MODEL( BROADWELL_D, &bdx_deadline_rev),
- X86_MATCH_INTEL_FAM6_MODEL( SKYLAKE_X, &skx_deadline_rev),
+ X86_MATCH_INTEL_FAM6_MODEL_STEPPINGS(SKYLAKE_X, X86_STEPPINGS(0x3, 0x3), 0x01000136),
+ X86_MATCH_INTEL_FAM6_MODEL_STEPPINGS(SKYLAKE_X, X86_STEPPINGS(0x4, 0x4), 0x02000014),
+ X86_MATCH_INTEL_FAM6_MODEL_STEPPINGS(SKYLAKE_X, X86_STEPPINGS(0x5, 0xf), 0),
X86_MATCH_INTEL_FAM6_MODEL( HASWELL, 0x22),
X86_MATCH_INTEL_FAM6_MODEL( HASWELL_L, 0x20),
if (!m)
return true;
- /*
- * Function pointers will have the MSB set due to address layout,
- * immediate revisions will not.
- */
- if ((long)m->driver_data < 0)
- rev = ((u32 (*)(void))(m->driver_data))();
- else
- rev = (u32)m->driver_data;
+ rev = (u32)m->driver_data;
if (boot_cpu_data.microcode >= rev)
return true;
return irq >= 0 && irq < nr_legacy_irqs();
}
-/*
- * Initialize all legacy IRQs and all pins on the first IOAPIC
- * if we have legacy interrupt controller. Kernel boot option "pirq="
- * may rely on non-legacy pins on the first IOAPIC.
- */
-static inline int mp_init_irq_at_boot(int ioapic, int irq)
-{
- if (!nr_legacy_irqs())
- return 0;
-
- return ioapic == 0 || mp_is_legacy_irq(irq);
-}
-
static inline struct irq_domain *mp_ioapic_irqdomain(int ioapic)
{
return ioapics[ioapic].irqdomain;
#include <linux/types.h>
#include <linux/audit.h>
#include <asm/unistd.h>
+#include <asm/audit.h>
static unsigned dir_class[] = {
#include <asm-generic/audit_dir_write.h>
int audit_classify_syscall(int abi, unsigned syscall)
{
#ifdef CONFIG_IA32_EMULATION
- extern int ia32_classify_syscall(unsigned);
if (abi == AUDIT_ARCH_I386)
return ia32_classify_syscall(syscall);
#endif
#include <asm/pci-direct.h>
#include <asm/delay.h>
#include <asm/debugreg.h>
+#include <asm/resctrl.h>
#ifdef CONFIG_X86_64
# include <asm/mmconfig.h>
x86_amd_ls_cfg_ssbd_mask = 1ULL << bit;
}
}
+
+ resctrl_cpu_detect(c);
}
static void early_detect_mem_encrypt(struct cpuinfo_x86 *c)
/* #1054: Instructions Retired Performance Counter May Be Inaccurate */
static const int amd_erratum_1054[] =
- AMD_OSVW_ERRATUM(0, AMD_MODEL_RANGE(0x17, 0, 0, 0x2f, 0xf));
-
+ AMD_LEGACY_ERRATUM(AMD_MODEL_RANGE(0x17, 0, 0, 0x2f, 0xf));
static bool cpu_has_amd_erratum(struct cpuinfo_x86 *cpu, const int *erratum)
{
}
}
-static void init_cqm(struct cpuinfo_x86 *c)
-{
- if (!cpu_has(c, X86_FEATURE_CQM_LLC)) {
- c->x86_cache_max_rmid = -1;
- c->x86_cache_occ_scale = -1;
- return;
- }
-
- /* will be overridden if occupancy monitoring exists */
- c->x86_cache_max_rmid = cpuid_ebx(0xf);
-
- if (cpu_has(c, X86_FEATURE_CQM_OCCUP_LLC) ||
- cpu_has(c, X86_FEATURE_CQM_MBM_TOTAL) ||
- cpu_has(c, X86_FEATURE_CQM_MBM_LOCAL)) {
- u32 eax, ebx, ecx, edx;
-
- /* QoS sub-leaf, EAX=0Fh, ECX=1 */
- cpuid_count(0xf, 1, &eax, &ebx, &ecx, &edx);
-
- c->x86_cache_max_rmid = ecx;
- c->x86_cache_occ_scale = ebx;
- }
-}
-
void get_cpu_cap(struct cpuinfo_x86 *c)
{
u32 eax, ebx, ecx, edx;
init_scattered_cpuid_features(c);
init_speculation_control(c);
- init_cqm(c);
/*
* Clear/Set all flags overridden by options, after probe.
#endif
}
-static void x86_init_cache_qos(struct cpuinfo_x86 *c)
-{
- /*
- * The heavy lifting of max_rmid and cache_occ_scale are handled
- * in get_cpu_cap(). Here we just set the max_rmid for the boot_cpu
- * in case CQM bits really aren't there in this CPU.
- */
- if (c != &boot_cpu_data) {
- boot_cpu_data.x86_cache_max_rmid =
- min(boot_cpu_data.x86_cache_max_rmid,
- c->x86_cache_max_rmid);
- }
-}
-
/*
* Validate that ACPI/mptables have the same information about the
* effective APIC id and update the package map.
#endif
x86_init_rdrand(c);
- x86_init_cache_qos(c);
setup_pku(c);
/*
#include <asm/cpu_device_id.h>
#include <asm/cmdline.h>
#include <asm/traps.h>
+#include <asm/resctrl.h>
#ifdef CONFIG_X86_64
#include <linux/topology.h>
detect_ht_early(c);
}
+static void bsp_init_intel(struct cpuinfo_x86 *c)
+{
+ resctrl_cpu_detect(c);
+}
+
#ifdef CONFIG_X86_32
/*
* Early probe support logic for ppro memory erratum #50
#endif
.c_detect_tlb = intel_detect_tlb,
.c_early_init = early_init_intel,
+ .c_bsp_init = bsp_init_intel,
.c_init = init_intel,
.c_x86_vendor = X86_VENDOR_INTEL,
};
const struct x86_cpu_id *m;
struct cpuinfo_x86 *c = &boot_cpu_data;
- for (m = match; m->vendor | m->family | m->model | m->feature; m++) {
+ for (m = match;
+ m->vendor | m->family | m->model | m->steppings | m->feature;
+ m++) {
if (m->vendor != X86_VENDOR_ANY && c->x86_vendor != m->vendor)
continue;
if (m->family != X86_FAMILY_ANY && c->x86 != m->family)
continue;
if (m->model != X86_MODEL_ANY && c->x86_model != m->model)
continue;
+ if (m->steppings != X86_STEPPING_ANY &&
+ !(BIT(c->x86_stepping) & m->steppings))
+ continue;
if (m->feature != X86_FEATURE_ANY && !cpu_has(c, m->feature))
continue;
return m;
#include <linux/export.h>
#include <linux/jump_label.h>
#include <linux/set_memory.h>
+#include <linux/task_work.h>
+#include <linux/hardirq.h>
#include <asm/intel-family.h>
#include <asm/processor.h>
}
}
-static int do_memory_failure(struct mce *m)
-{
- int flags = MF_ACTION_REQUIRED;
- int ret;
-
- pr_err("Uncorrected hardware memory error in user-access at %llx", m->addr);
- if (!(m->mcgstatus & MCG_STATUS_RIPV))
- flags |= MF_MUST_KILL;
- ret = memory_failure(m->addr >> PAGE_SHIFT, flags);
- if (ret)
- pr_err("Memory error not recovered");
- else
- set_mce_nospec(m->addr >> PAGE_SHIFT);
- return ret;
-}
-
-
/*
* Cases where we avoid rendezvous handler timeout:
* 1) If this CPU is offline.
*m = *final;
}
+static void kill_me_now(struct callback_head *ch)
+{
+ force_sig(SIGBUS);
+}
+
+static void kill_me_maybe(struct callback_head *cb)
+{
+ struct task_struct *p = container_of(cb, struct task_struct, mce_kill_me);
+ int flags = MF_ACTION_REQUIRED;
+
+ pr_err("Uncorrected hardware memory error in user-access at %llx", p->mce_addr);
+ if (!(p->mce_status & MCG_STATUS_RIPV))
+ flags |= MF_MUST_KILL;
+
+ if (!memory_failure(p->mce_addr >> PAGE_SHIFT, flags)) {
+ set_mce_nospec(p->mce_addr >> PAGE_SHIFT);
+ return;
+ }
+
+ pr_err("Memory error not recovered");
+ kill_me_now(cb);
+}
+
/*
* The actual machine check handler. This only handles real
* exceptions when something got corrupted coming in through int 18.
* backing the user stack, tracing that reads the user stack will cause
* potentially infinite recursion.
*/
-void notrace do_machine_check(struct pt_regs *regs, long error_code)
+void noinstr do_machine_check(struct pt_regs *regs, long error_code)
{
DECLARE_BITMAP(valid_banks, MAX_NR_BANKS);
DECLARE_BITMAP(toclear, MAX_NR_BANKS);
if (__mc_check_crashing_cpu(cpu))
return;
- ist_enter(regs);
+ nmi_enter();
this_cpu_inc(mce_exception_count);
/* Fault was in user mode and we need to take some action */
if ((m.cs & 3) == 3) {
- ist_begin_non_atomic(regs);
- local_irq_enable();
-
- if (kill_it || do_memory_failure(&m))
- force_sig(SIGBUS);
- local_irq_disable();
- ist_end_non_atomic();
+ /* If this triggers there is no way to recover. Die hard. */
+ BUG_ON(!on_thread_stack() || !user_mode(regs));
+
+ current->mce_addr = m.addr;
+ current->mce_status = m.mcgstatus;
+ current->mce_kill_me.func = kill_me_maybe;
+ if (kill_it)
+ current->mce_kill_me.func = kill_me_now;
+ task_work_add(current, ¤t->mce_kill_me, true);
} else {
if (!fixup_exception(regs, X86_TRAP_MC, error_code, 0))
mce_panic("Failed kernel mode recovery", &m, msg);
}
out_ist:
- ist_exit(regs);
+ nmi_exit();
}
EXPORT_SYMBOL_GPL(do_machine_check);
-NOKPROBE_SYMBOL(do_machine_check);
#ifndef CONFIG_MEMORY_FAILURE
int memory_failure(unsigned long pfn, int flags)
#include <linux/kernel.h>
#include <linux/types.h>
#include <linux/smp.h>
+#include <linux/hardirq.h>
#include <asm/processor.h>
#include <asm/traps.h>
{
u32 loaddr, hi, lotype;
- ist_enter(regs);
+ nmi_enter();
rdmsr(MSR_IA32_P5_MC_ADDR, loaddr, hi);
rdmsr(MSR_IA32_P5_MC_TYPE, lotype, hi);
add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);
- ist_exit(regs);
+ nmi_exit();
}
/* Set up machine check reporting for processors with Intel style MCE: */
#include <linux/interrupt.h>
#include <linux/kernel.h>
#include <linux/types.h>
+#include <linux/hardirq.h>
#include <asm/processor.h>
#include <asm/traps.h>
/* Machine check handler for WinChip C6: */
static void winchip_machine_check(struct pt_regs *regs, long error_code)
{
- ist_enter(regs);
+ nmi_enter();
pr_emerg("CPU0: Machine Check Exception.\n");
add_taint(TAINT_MACHINE_CHECK, LOCKDEP_NOW_UNRELIABLE);
- ist_exit(regs);
+ nmi_exit();
}
/* Set up machine check reporting on the Winchip C6 series */
/*
* Returns:
* < 0 - on error
- * 0 - no update done
- * 1 - microcode was updated
+ * 0 - success (no update done or microcode was updated)
*/
static int __reload_late(void *info)
{
else
goto wait_for_siblings;
- if (err > UCODE_NFOUND) {
- pr_warn("Error reloading microcode on CPU %d\n", cpu);
+ if (err >= UCODE_NFOUND) {
+ if (err == UCODE_ERROR)
+ pr_warn("Error reloading microcode on CPU %d\n", cpu);
+
ret = -1;
- } else if (err == UCODE_UPDATED || err == UCODE_OK) {
- ret = 1;
}
wait_for_siblings:
atomic_set(&late_cpus_out, 0);
ret = stop_machine_cpuslocked(__reload_late, NULL, cpu_online_mask);
- if (ret > 0)
+ if (ret == 0)
microcode_check();
pr_info("Reload completed, microcode revision: 0x%x\n", boot_cpu_data.microcode);
put:
put_online_cpus();
- if (ret >= 0)
+ if (ret == 0)
ret = size;
return ret;
case 15:
return msr - MSR_P4_BPU_PERFCTR0;
}
+ fallthrough;
+ case X86_VENDOR_ZHAOXIN:
+ case X86_VENDOR_CENTAUR:
+ return msr - MSR_ARCH_PERFMON_PERFCTR0;
}
return 0;
}
case 15:
return msr - MSR_P4_BSU_ESCR0;
}
+ fallthrough;
+ case X86_VENDOR_ZHAOXIN:
+ case X86_VENDOR_CENTAUR:
+ return msr - MSR_ARCH_PERFMON_EVENTSEL0;
}
return 0;
#include <linux/cpuhotplug.h>
#include <asm/intel-family.h>
-#include <asm/resctrl_sched.h>
+#include <asm/resctrl.h>
#include "internal.h"
/* Mutex to protect rdtgroup access. */
static enum cpuhp_state rdt_online;
+/* Runs once on the BSP during boot. */
+void resctrl_cpu_detect(struct cpuinfo_x86 *c)
+{
+ if (!cpu_has(c, X86_FEATURE_CQM_LLC)) {
+ c->x86_cache_max_rmid = -1;
+ c->x86_cache_occ_scale = -1;
+ c->x86_cache_mbm_width_offset = -1;
+ return;
+ }
+
+ /* will be overridden if occupancy monitoring exists */
+ c->x86_cache_max_rmid = cpuid_ebx(0xf);
+
+ if (cpu_has(c, X86_FEATURE_CQM_OCCUP_LLC) ||
+ cpu_has(c, X86_FEATURE_CQM_MBM_TOTAL) ||
+ cpu_has(c, X86_FEATURE_CQM_MBM_LOCAL)) {
+ u32 eax, ebx, ecx, edx;
+
+ /* QoS sub-leaf, EAX=0Fh, ECX=1 */
+ cpuid_count(0xf, 1, &eax, &ebx, &ecx, &edx);
+
+ c->x86_cache_max_rmid = ecx;
+ c->x86_cache_occ_scale = ebx;
+ if (c->x86_vendor == X86_VENDOR_INTEL)
+ c->x86_cache_mbm_width_offset = eax & 0xff;
+ else
+ c->x86_cache_mbm_width_offset = -1;
+ }
+}
+
static int __init resctrl_late_init(void)
{
struct rdt_resource *r;
return ret;
}
-void mon_event_read(struct rmid_read *rr, struct rdt_domain *d,
- struct rdtgroup *rdtgrp, int evtid, int first)
+void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
+ struct rdt_domain *d, struct rdtgroup *rdtgrp,
+ int evtid, int first)
{
/*
* setup the parameters to send to the IPI to read the data.
*/
rr->rgrp = rdtgrp;
rr->evtid = evtid;
+ rr->r = r;
rr->d = d;
rr->val = 0;
rr->first = first;
goto out;
}
- mon_event_read(&rr, d, rdtgrp, evtid, false);
+ mon_event_read(&rr, r, d, rdtgrp, evtid, false);
if (rr.val & RMID_VAL_ERROR)
seq_puts(m, "Error\n");
#define CQM_LIMBOCHECK_INTERVAL 1000
-#define MBM_CNTR_WIDTH 24
+#define MBM_CNTR_WIDTH_BASE 24
#define MBM_OVERFLOW_INTERVAL 1000
#define MAX_MBA_BW 100u
#define MBA_IS_LINEAR 0x4
#define RMID_VAL_ERROR BIT_ULL(63)
#define RMID_VAL_UNAVAIL BIT_ULL(62)
+/*
+ * With the above fields in use 62 bits remain in MSR_IA32_QM_CTR for
+ * data to be returned. The counter width is discovered from the hardware
+ * as an offset from MBM_CNTR_WIDTH_BASE.
+ */
+#define MBM_CNTR_WIDTH_OFFSET_MAX (62 - MBM_CNTR_WIDTH_BASE)
struct rdt_fs_context {
struct rmid_read {
struct rdtgroup *rgrp;
+ struct rdt_resource *r;
struct rdt_domain *d;
int evtid;
bool first;
struct list_head evt_list;
int num_rmid;
unsigned int mon_scale;
+ unsigned int mbm_width;
unsigned long fflags;
};
unsigned int dom_id);
void mkdir_mondata_subdir_allrdtgrp(struct rdt_resource *r,
struct rdt_domain *d);
-void mon_event_read(struct rmid_read *rr, struct rdt_domain *d,
- struct rdtgroup *rdtgrp, int evtid, int first);
+void mon_event_read(struct rmid_read *rr, struct rdt_resource *r,
+ struct rdt_domain *d, struct rdtgroup *rdtgrp,
+ int evtid, int first);
void mbm_setup_overflow_handler(struct rdt_domain *dom,
unsigned long delay_ms);
void mbm_handle_overflow(struct work_struct *work);
list_add_tail(&entry->list, &rmid_free_lru);
}
-static u64 mbm_overflow_count(u64 prev_msr, u64 cur_msr)
+static u64 mbm_overflow_count(u64 prev_msr, u64 cur_msr, unsigned int width)
{
- u64 shift = 64 - MBM_CNTR_WIDTH, chunks;
+ u64 shift = 64 - width, chunks;
chunks = (cur_msr << shift) - (prev_msr << shift);
return chunks >>= shift;
return 0;
}
- chunks = mbm_overflow_count(m->prev_msr, tval);
+ chunks = mbm_overflow_count(m->prev_msr, tval, rr->r->mbm_width);
m->chunks += chunks;
m->prev_msr = tval;
if (tval & (RMID_VAL_ERROR | RMID_VAL_UNAVAIL))
return;
- chunks = mbm_overflow_count(m->prev_bw_msr, tval);
+ chunks = mbm_overflow_count(m->prev_bw_msr, tval, rr->r->mbm_width);
m->chunks_bw += chunks;
m->chunks = m->chunks_bw;
cur_bw = (chunks * r->mon_scale) >> 20;
}
}
-static void mbm_update(struct rdt_domain *d, int rmid)
+static void mbm_update(struct rdt_resource *r, struct rdt_domain *d, int rmid)
{
struct rmid_read rr;
rr.first = false;
+ rr.r = r;
rr.d = d;
/*
struct rdtgroup *prgrp, *crgrp;
int cpu = smp_processor_id();
struct list_head *head;
+ struct rdt_resource *r;
struct rdt_domain *d;
mutex_lock(&rdtgroup_mutex);
if (!static_branch_likely(&rdt_mon_enable_key))
goto out_unlock;
- d = get_domain_from_cpu(cpu, &rdt_resources_all[RDT_RESOURCE_L3]);
+ r = &rdt_resources_all[RDT_RESOURCE_L3];
+
+ d = get_domain_from_cpu(cpu, r);
if (!d)
goto out_unlock;
list_for_each_entry(prgrp, &rdt_all_groups, rdtgroup_list) {
- mbm_update(d, prgrp->mon.rmid);
+ mbm_update(r, d, prgrp->mon.rmid);
head = &prgrp->mon.crdtgrp_list;
list_for_each_entry(crgrp, head, mon.crdtgrp_list)
- mbm_update(d, crgrp->mon.rmid);
+ mbm_update(r, d, crgrp->mon.rmid);
if (is_mba_sc(NULL))
update_mba_bw(prgrp, d);
int rdt_get_mon_l3_config(struct rdt_resource *r)
{
+ unsigned int mbm_offset = boot_cpu_data.x86_cache_mbm_width_offset;
unsigned int cl_size = boot_cpu_data.x86_cache_size;
int ret;
r->mon_scale = boot_cpu_data.x86_cache_occ_scale;
r->num_rmid = boot_cpu_data.x86_cache_max_rmid + 1;
+ r->mbm_width = MBM_CNTR_WIDTH_BASE;
+
+ if (mbm_offset > 0 && mbm_offset <= MBM_CNTR_WIDTH_OFFSET_MAX)
+ r->mbm_width += mbm_offset;
+ else if (mbm_offset > MBM_CNTR_WIDTH_OFFSET_MAX)
+ pr_warn("Ignoring impossible MBM counter offset\n");
/*
* A reasonable upper limit on the max threshold is the number
#include <asm/cacheflush.h>
#include <asm/intel-family.h>
-#include <asm/resctrl_sched.h>
+#include <asm/resctrl.h>
#include <asm/perf_event.h>
#include "../../events/perf_event.h" /* For X86_CONFIG() */
#include <uapi/linux/magic.h>
-#include <asm/resctrl_sched.h>
+#include <asm/resctrl.h>
#include "internal.h"
DEFINE_STATIC_KEY_FALSE(rdt_enable_key);
goto out_destroy;
if (is_mbm_event(mevt->evtid))
- mon_event_read(&rr, d, prgrp, mevt->evtid, true);
+ mon_event_read(&rr, r, d, prgrp, mevt->evtid, true);
}
kernfs_activate(kn);
return 0;
return -EINVAL;
if (!strncmp(p, "exactmap", 8)) {
-#ifdef CONFIG_CRASH_DUMP
- /*
- * If we are doing a crash dump, we still need to know
- * the real memory size before the original memory map is
- * reset.
- */
- saved_max_pfn = e820__end_of_ram_pfn();
-#endif
e820_table->nr_entries = 0;
userdef = 1;
return 0;
#include <xen/hvc-console.h>
#include <asm/pci-direct.h>
#include <asm/fixmap.h>
-#include <asm/intel-mid.h>
#include <asm/pgtable.h>
#include <linux/usb/ehci_def.h>
#include <linux/usb/xhci-dbgp.h>
-#include <linux/efi.h>
-#include <asm/efi.h>
#include <asm/pci_x86.h>
/* Simple VGA output */
return true;
}
-/*
- * This is similar to user_regset_copyout(), but will not add offset to
- * the source data pointer or increment pos, count, kbuf, and ubuf.
- */
-static inline void
-__copy_xstate_to_kernel(void *kbuf, const void *data,
- unsigned int offset, unsigned int size, unsigned int size_total)
+static void fill_gap(unsigned to, void **kbuf, unsigned *pos, unsigned *count)
{
- if (offset < size_total) {
- unsigned int copy = min(size, size_total - offset);
+ if (*pos < to) {
+ unsigned size = to - *pos;
+
+ if (size > *count)
+ size = *count;
+ memcpy(*kbuf, (void *)&init_fpstate.xsave + *pos, size);
+ *kbuf += size;
+ *pos += size;
+ *count -= size;
+ }
+}
- memcpy(kbuf + offset, data, copy);
+static void copy_part(unsigned offset, unsigned size, void *from,
+ void **kbuf, unsigned *pos, unsigned *count)
+{
+ fill_gap(offset, kbuf, pos, count);
+ if (size > *count)
+ size = *count;
+ if (size) {
+ memcpy(*kbuf, from, size);
+ *kbuf += size;
+ *pos += size;
+ *count -= size;
}
}
*/
int copy_xstate_to_kernel(void *kbuf, struct xregs_state *xsave, unsigned int offset_start, unsigned int size_total)
{
- unsigned int offset, size;
struct xstate_header header;
+ const unsigned off_mxcsr = offsetof(struct fxregs_state, mxcsr);
+ unsigned count = size_total;
int i;
/*
header.xfeatures = xsave->header.xfeatures;
header.xfeatures &= xfeatures_mask_user();
+ if (header.xfeatures & XFEATURE_MASK_FP)
+ copy_part(0, off_mxcsr,
+ &xsave->i387, &kbuf, &offset_start, &count);
+ if (header.xfeatures & (XFEATURE_MASK_SSE | XFEATURE_MASK_YMM))
+ copy_part(off_mxcsr, MXCSR_AND_FLAGS_SIZE,
+ &xsave->i387.mxcsr, &kbuf, &offset_start, &count);
+ if (header.xfeatures & XFEATURE_MASK_FP)
+ copy_part(offsetof(struct fxregs_state, st_space), 128,
+ &xsave->i387.st_space, &kbuf, &offset_start, &count);
+ if (header.xfeatures & XFEATURE_MASK_SSE)
+ copy_part(xstate_offsets[XFEATURE_MASK_SSE], 256,
+ &xsave->i387.xmm_space, &kbuf, &offset_start, &count);
+ /*
+ * Fill xsave->i387.sw_reserved value for ptrace frame:
+ */
+ copy_part(offsetof(struct fxregs_state, sw_reserved), 48,
+ xstate_fx_sw_bytes, &kbuf, &offset_start, &count);
/*
* Copy xregs_state->header:
*/
- offset = offsetof(struct xregs_state, header);
- size = sizeof(header);
-
- __copy_xstate_to_kernel(kbuf, &header, offset, size, size_total);
+ copy_part(offsetof(struct xregs_state, header), sizeof(header),
+ &header, &kbuf, &offset_start, &count);
- for (i = 0; i < XFEATURE_MAX; i++) {
+ for (i = FIRST_EXTENDED_XFEATURE; i < XFEATURE_MAX; i++) {
/*
* Copy only in-use xstates:
*/
if ((header.xfeatures >> i) & 1) {
void *src = __raw_xsave_addr(xsave, i);
- offset = xstate_offsets[i];
- size = xstate_sizes[i];
-
- /* The next component has to fit fully into the output buffer: */
- if (offset + size > size_total)
- break;
-
- __copy_xstate_to_kernel(kbuf, src, offset, size, size_total);
+ copy_part(xstate_offsets[i], xstate_sizes[i],
+ src, &kbuf, &offset_start, &count);
}
}
-
- if (xfeatures_mxcsr_quirk(header.xfeatures)) {
- offset = offsetof(struct fxregs_state, mxcsr);
- size = MXCSR_AND_FLAGS_SIZE;
- __copy_xstate_to_kernel(kbuf, &xsave->i387.mxcsr, offset, size, size_total);
- }
-
- /*
- * Fill xsave->i387.sw_reserved value for ptrace frame:
- */
- offset = offsetof(struct fxregs_state, sw_reserved);
- size = sizeof(xstate_fx_sw_bytes);
-
- __copy_xstate_to_kernel(kbuf, xstate_fx_sw_bytes, offset, size, size_total);
+ fill_gap(size_total, &kbuf, &offset_start, &count);
return 0;
}
/* Defined as markers to the end of the ftrace default trampolines */
extern void ftrace_regs_caller_end(void);
-extern void ftrace_epilogue(void);
+extern void ftrace_regs_caller_ret(void);
+extern void ftrace_caller_end(void);
extern void ftrace_caller_op_ptr(void);
extern void ftrace_regs_caller_op_ptr(void);
call_offset = (unsigned long)ftrace_regs_call;
} else {
start_offset = (unsigned long)ftrace_caller;
- end_offset = (unsigned long)ftrace_epilogue;
+ end_offset = (unsigned long)ftrace_caller_end;
op_offset = (unsigned long)ftrace_caller_op_ptr;
call_offset = (unsigned long)ftrace_call;
}
if (WARN_ON(ret < 0))
goto fail;
+ if (ops->flags & FTRACE_OPS_FL_SAVE_REGS) {
+ ip = trampoline + (ftrace_regs_caller_ret - ftrace_regs_caller);
+ ret = probe_kernel_read(ip, (void *)retq, RET_SIZE);
+ if (WARN_ON(ret < 0))
+ goto fail;
+ }
+
/*
* The address of the ftrace_ops that is used for this trampoline
* is stored at the end of the trampoline. This will be used to
set_vm_flush_reset_perms(trampoline);
- set_memory_ro((unsigned long)trampoline, npages);
+ if (likely(system_state != SYSTEM_BOOTING))
+ set_memory_ro((unsigned long)trampoline, npages);
set_memory_x((unsigned long)trampoline, npages);
return (unsigned long)trampoline;
fail:
return 0;
}
+void set_ftrace_ops_ro(void)
+{
+ struct ftrace_ops *ops;
+ unsigned long start_offset;
+ unsigned long end_offset;
+ unsigned long npages;
+ unsigned long size;
+
+ do_for_each_ftrace_op(ops, ftrace_ops_list) {
+ if (!(ops->flags & FTRACE_OPS_FL_ALLOC_TRAMP))
+ continue;
+
+ if (ops->flags & FTRACE_OPS_FL_SAVE_REGS) {
+ start_offset = (unsigned long)ftrace_regs_caller;
+ end_offset = (unsigned long)ftrace_regs_caller_end;
+ } else {
+ start_offset = (unsigned long)ftrace_caller;
+ end_offset = (unsigned long)ftrace_caller_end;
+ }
+ size = end_offset - start_offset;
+ size = size + RET_SIZE + sizeof(void *);
+ npages = DIV_ROUND_UP(size, PAGE_SIZE);
+ set_memory_ro((unsigned long)ops->trampoline, npages);
+ } while_for_each_ftrace_op(ops);
+}
+
static unsigned long calc_trampoline_call_offset(bool save_regs)
{
unsigned long start_offset;
movl %eax, %ecx
popl %edx
popl %eax
- JMP_NOSPEC %ecx
+ JMP_NOSPEC ecx
#endif
#endif /* CONFIG_FRAME_POINTER */
/* Size of stack used to save mcount regs in save_mcount_regs */
-#define MCOUNT_REG_SIZE (SS+8 + MCOUNT_FRAME_SIZE)
+#define MCOUNT_REG_SIZE (FRAME_SIZE + MCOUNT_FRAME_SIZE)
/*
* gcc -pg option adds a call to 'mcount' in most functions.
/*
* We add enough stack to save all regs.
*/
- subq $(MCOUNT_REG_SIZE - MCOUNT_FRAME_SIZE), %rsp
+ subq $(FRAME_SIZE), %rsp
movq %rax, RAX(%rsp)
movq %rcx, RCX(%rsp)
movq %rdx, RDX(%rsp)
* think twice before adding any new code or changing the
* layout here.
*/
-SYM_INNER_LABEL(ftrace_epilogue, SYM_L_GLOBAL)
+SYM_INNER_LABEL(ftrace_caller_end, SYM_L_GLOBAL)
+ jmp ftrace_epilogue
+SYM_FUNC_END(ftrace_caller);
+
+SYM_FUNC_START(ftrace_epilogue)
#ifdef CONFIG_FUNCTION_GRAPH_TRACER
SYM_INNER_LABEL(ftrace_graph_call, SYM_L_GLOBAL)
jmp ftrace_stub
*/
SYM_INNER_LABEL_ALIGN(ftrace_stub, SYM_L_WEAK)
retq
-SYM_FUNC_END(ftrace_caller)
+SYM_FUNC_END(ftrace_epilogue)
SYM_FUNC_START(ftrace_regs_caller)
/* Save the current flags before any operations that can change them */
pushfq
- UNWIND_HINT_SAVE
-
/* added 8 bytes to save flags */
save_mcount_regs 8
/* save_mcount_regs fills in first two parameters */
movq ORIG_RAX(%rsp), %rax
movq %rax, MCOUNT_REG_SIZE-8(%rsp)
- /* If ORIG_RAX is anything but zero, make this a call to that */
+ /*
+ * If ORIG_RAX is anything but zero, make this a call to that.
+ * See arch_ftrace_set_direct_caller().
+ */
movq ORIG_RAX(%rsp), %rax
- cmpq $0, %rax
- je 1f
+ testq %rax, %rax
+ jz 1f
/* Swap the flags with orig_rax */
movq MCOUNT_REG_SIZE(%rsp), %rdi
movq %rax, MCOUNT_REG_SIZE(%rsp)
restore_mcount_regs 8
+ /* Restore flags */
+ popfq
- jmp 2f
+SYM_INNER_LABEL(ftrace_regs_caller_ret, SYM_L_GLOBAL);
+ UNWIND_HINT_RET_OFFSET
+ jmp ftrace_epilogue
1: restore_mcount_regs
-
-
-2:
- /*
- * The stack layout is nondetermistic here, depending on which path was
- * taken. This confuses objtool and ORC, rightfully so. For now,
- * pretend the stack always looks like the non-direct case.
- */
- UNWIND_HINT_RESTORE
-
/* Restore flags */
popfq
* to the return.
*/
SYM_INNER_LABEL(ftrace_regs_caller_end, SYM_L_GLOBAL)
-
jmp ftrace_epilogue
SYM_FUNC_END(ftrace_regs_caller)
* function tracing is enabled.
*/
movq ftrace_trace_function, %r8
- CALL_NOSPEC %r8
+ CALL_NOSPEC r8
restore_mcount_regs
jmp fgraph_trace
movq 8(%rsp), %rdx
movq (%rsp), %rax
addq $24, %rsp
- JMP_NOSPEC %rdi
+ JMP_NOSPEC rdi
SYM_CODE_END(return_to_handler)
#endif
set_tsk_thread_flag(tsk, TIF_IO_BITMAP);
}
-static void task_update_io_bitmap(void)
+static void task_update_io_bitmap(struct task_struct *tsk)
{
- struct thread_struct *t = ¤t->thread;
+ struct thread_struct *t = &tsk->thread;
if (t->iopl_emul == 3 || t->io_bitmap) {
/* TSS update is handled on exit to user space */
- set_thread_flag(TIF_IO_BITMAP);
+ set_tsk_thread_flag(tsk, TIF_IO_BITMAP);
} else {
- clear_thread_flag(TIF_IO_BITMAP);
+ clear_tsk_thread_flag(tsk, TIF_IO_BITMAP);
/* Invalidate TSS */
preempt_disable();
tss_update_io_bitmap();
}
}
-void io_bitmap_exit(void)
+void io_bitmap_exit(struct task_struct *tsk)
{
- struct io_bitmap *iobm = current->thread.io_bitmap;
+ struct io_bitmap *iobm = tsk->thread.io_bitmap;
- current->thread.io_bitmap = NULL;
- task_update_io_bitmap();
+ tsk->thread.io_bitmap = NULL;
+ task_update_io_bitmap(tsk);
if (iobm && refcount_dec_and_test(&iobm->refcnt))
kfree(iobm);
}
if (!iobm)
return -ENOMEM;
refcount_set(&iobm->refcnt, 1);
- io_bitmap_exit();
+ io_bitmap_exit(current);
}
/*
}
/* All permissions dropped? */
if (max_long == UINT_MAX) {
- io_bitmap_exit();
+ io_bitmap_exit(current);
return 0;
}
}
t->iopl_emul = level;
- task_update_io_bitmap();
+ task_update_io_bitmap(current);
return 0;
}
#include <linux/atomic.h>
#include <linux/sched/clock.h>
-#if defined(CONFIG_EDAC)
-#include <linux/edac.h>
-#endif
-
#include <asm/cpu_entry_area.h>
#include <asm/traps.h>
#include <asm/mach_traps.h>
}
/*
- * Free current thread data structures etc..
+ * Free thread data structures etc..
*/
void exit_thread(struct task_struct *tsk)
{
struct fpu *fpu = &t->fpu;
if (test_thread_flag(TIF_IO_BITMAP))
- io_bitmap_exit();
+ io_bitmap_exit(tsk);
free_vm86(t);
#include <asm/debugreg.h>
#include <asm/switch_to.h>
#include <asm/vm86.h>
-#include <asm/resctrl_sched.h>
+#include <asm/resctrl.h>
#include <asm/proto.h>
#include "process.h"
#include <asm/switch_to.h>
#include <asm/xen/hypervisor.h>
#include <asm/vdso.h>
-#include <asm/resctrl_sched.h>
+#include <asm/resctrl.h>
#include <asm/unistd.h>
#include <asm/fsgsbase.h>
#ifdef CONFIG_IA32_EMULATION
ramdisk_image |= (u64)boot_params.ext_ramdisk_image << 32;
+ if (ramdisk_image == 0)
+ ramdisk_image = phys_initrd_start;
+
return ramdisk_image;
}
static u64 __init get_ramdisk_size(void)
ramdisk_size |= (u64)boot_params.ext_ramdisk_size << 32;
+ if (ramdisk_size == 0)
+ ramdisk_size = phys_initrd_size;
+
return ramdisk_size;
}
wmb();
cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
+
+ /*
+ * Prevent tail call to cpu_startup_entry() because the stack protector
+ * guard has been changed a couple of function calls up, in
+ * boot_init_stack_canary() and must not be checked before tail calling
+ * another function.
+ */
+ prevent_tail_call_optimization();
}
/**
speculative_store_bypass_ht_init();
}
-void arch_enable_nonboot_cpus_begin(void)
+void arch_thaw_secondary_cpus_begin(void)
{
set_mtrr_aps_delayed_init();
}
-void arch_enable_nonboot_cpus_end(void)
+void arch_thaw_secondary_cpus_end(void)
{
mtrr_aps_init();
}
#include <asm/cpu_device_id.h>
#include <asm/intel-family.h>
-#define ICPU(model) \
- {X86_VENDOR_INTEL, 6, model, X86_FEATURE_APERFMPERF, 0}
+#define X86_MATCH(model) \
+ X86_MATCH_VENDOR_FAM_MODEL_FEATURE(INTEL, 6, \
+ INTEL_FAM6_##model, X86_FEATURE_APERFMPERF, NULL)
static const struct x86_cpu_id has_knl_turbo_ratio_limits[] = {
- ICPU(INTEL_FAM6_XEON_PHI_KNL),
- ICPU(INTEL_FAM6_XEON_PHI_KNM),
+ X86_MATCH(XEON_PHI_KNL),
+ X86_MATCH(XEON_PHI_KNM),
{}
};
static const struct x86_cpu_id has_skx_turbo_ratio_limits[] = {
- ICPU(INTEL_FAM6_SKYLAKE_X),
+ X86_MATCH(SKYLAKE_X),
{}
};
static const struct x86_cpu_id has_glm_turbo_ratio_limits[] = {
- ICPU(INTEL_FAM6_ATOM_GOLDMONT),
- ICPU(INTEL_FAM6_ATOM_GOLDMONT_D),
- ICPU(INTEL_FAM6_ATOM_GOLDMONT_PLUS),
+ X86_MATCH(ATOM_GOLDMONT),
+ X86_MATCH(ATOM_GOLDMONT_D),
+ X86_MATCH(ATOM_GOLDMONT_PLUS),
{}
};
#include "../realmode/rm/wakeup.h"
/* Global pointer to shared data; NULL means no measured launch. */
-struct tboot *tboot __read_mostly;
-EXPORT_SYMBOL(tboot);
+static struct tboot *tboot __read_mostly;
/* timeout for APs (in secs) to enter wait-for-SIPI state during shutdown */
#define AP_WAIT_TIMEOUT 1
static u8 tboot_uuid[16] __initdata = TBOOT_UUID;
+bool tboot_enabled(void)
+{
+ return tboot != NULL;
+}
+
void __init tboot_probe(void)
{
/* Look for valid page-aligned address for shared page. */
#include <linux/mm.h>
#include <linux/smp.h>
#include <linux/io.h>
+#include <linux/hardirq.h>
+#include <linux/atomic.h>
+
#include <asm/stacktrace.h>
#include <asm/processor.h>
#include <asm/debugreg.h>
-#include <linux/atomic.h>
#include <asm/text-patching.h>
#include <asm/ftrace.h>
#include <asm/traps.h>
local_irq_disable();
}
-/*
- * In IST context, we explicitly disable preemption. This serves two
- * purposes: it makes it much less likely that we would accidentally
- * schedule in IST context and it will force a warning if we somehow
- * manage to schedule by accident.
- */
-void ist_enter(struct pt_regs *regs)
-{
- if (user_mode(regs)) {
- RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU");
- } else {
- /*
- * We might have interrupted pretty much anything. In
- * fact, if we're a machine check, we can even interrupt
- * NMI processing. We don't want in_nmi() to return true,
- * but we need to notify RCU.
- */
- rcu_nmi_enter();
- }
-
- preempt_disable();
-
- /* This code is a bit fragile. Test it. */
- RCU_LOCKDEP_WARN(!rcu_is_watching(), "ist_enter didn't work");
-}
-NOKPROBE_SYMBOL(ist_enter);
-
-void ist_exit(struct pt_regs *regs)
-{
- preempt_enable_no_resched();
-
- if (!user_mode(regs))
- rcu_nmi_exit();
-}
-
-/**
- * ist_begin_non_atomic() - begin a non-atomic section in an IST exception
- * @regs: regs passed to the IST exception handler
- *
- * IST exception handlers normally cannot schedule. As a special
- * exception, if the exception interrupted userspace code (i.e.
- * user_mode(regs) would return true) and the exception was not
- * a double fault, it can be safe to schedule. ist_begin_non_atomic()
- * begins a non-atomic section within an ist_enter()/ist_exit() region.
- * Callers are responsible for enabling interrupts themselves inside
- * the non-atomic section, and callers must call ist_end_non_atomic()
- * before ist_exit().
- */
-void ist_begin_non_atomic(struct pt_regs *regs)
-{
- BUG_ON(!user_mode(regs));
-
- /*
- * Sanity check: we need to be on the normal thread stack. This
- * will catch asm bugs and any attempt to use ist_preempt_enable
- * from double_fault.
- */
- BUG_ON(!on_thread_stack());
-
- preempt_enable_no_resched();
-}
-
-/**
- * ist_end_non_atomic() - begin a non-atomic section in an IST exception
- *
- * Ends a non-atomic section started with ist_begin_non_atomic().
- */
-void ist_end_non_atomic(void)
-{
- preempt_disable();
-}
-
int is_valid_bugaddr(unsigned long addr)
{
unsigned short ud;
* The net result is that our #GP handler will think that we
* entered from usermode with the bad user context.
*
- * No need for ist_enter here because we don't use RCU.
+ * No need for nmi_enter() here because we don't use RCU.
*/
if (((long)regs->sp >> P4D_SHIFT) == ESPFIX_PGD_ENTRY &&
regs->cs == __KERNEL_CS &&
}
#endif
- ist_enter(regs);
+ nmi_enter();
notify_die(DIE_TRAP, str, regs, error_code, X86_TRAP_DF, SIGSEGV);
tsk->thread.error_code = error_code;
return;
/*
- * Unlike any other non-IST entry, we can be called from a kprobe in
- * non-CONTEXT_KERNEL kernel mode or even during context tracking
- * state changes. Make sure that we wake up RCU even if we're coming
- * from kernel code.
- *
- * This means that we can't schedule even if we came from a
- * preemptible kernel context. That's okay.
+ * Unlike any other non-IST entry, we can be called from pretty much
+ * any location in the kernel through kprobes -- text_poke() will most
+ * likely be handled by poke_int3_handler() above. This means this
+ * handler is effectively NMI-like.
*/
- if (!user_mode(regs)) {
- rcu_nmi_enter();
- preempt_disable();
- }
- RCU_LOCKDEP_WARN(!rcu_is_watching(), "entry code didn't wake RCU");
+ if (!user_mode(regs))
+ nmi_enter();
#ifdef CONFIG_KGDB_LOW_LEVEL_TRAP
if (kgdb_ll_trap(DIE_INT3, "int3", regs, error_code, X86_TRAP_BP,
cond_local_irq_disable(regs);
exit:
- if (!user_mode(regs)) {
- preempt_enable_no_resched();
- rcu_nmi_exit();
- }
+ if (!user_mode(regs))
+ nmi_exit();
}
NOKPROBE_SYMBOL(do_int3);
unsigned long dr6;
int si_code;
- ist_enter(regs);
+ nmi_enter();
get_debugreg(dr6, 6);
/*
debug_stack_usage_dec();
exit:
- ist_exit(regs);
+ nmi_exit();
}
NOKPROBE_SYMBOL(do_debug);
unsigned long *unwind_get_return_address_ptr(struct unwind_state *state)
{
+ struct task_struct *task = state->task;
+
if (unwind_done(state))
return NULL;
if (state->regs)
return &state->regs->ip;
+ if (task != current && state->sp == task->thread.sp) {
+ struct inactive_task_frame *frame = (void *)task->thread.sp;
+ return &frame->ret_addr;
+ }
+
if (state->sp)
return (unsigned long *)state->sp - 1;
void __unwind_start(struct unwind_state *state, struct task_struct *task,
struct pt_regs *regs, unsigned long *first_frame)
{
- if (!orc_init)
- goto done;
-
memset(state, 0, sizeof(*state));
state->task = task;
+ if (!orc_init)
+ goto err;
+
/*
* Refuse to unwind the stack of a task while it's executing on another
* CPU. This check is racy, but that's ok: the unwinder has other
* checks to prevent it from going off the rails.
*/
if (task_on_another_cpu(task))
- goto done;
+ goto err;
if (regs) {
if (user_mode(regs))
- goto done;
+ goto the_end;
state->ip = regs->ip;
state->sp = regs->sp;
* generate some kind of backtrace if this happens.
*/
void *next_page = (void *)PAGE_ALIGN((unsigned long)state->sp);
+ state->error = true;
if (get_stack_info(next_page, state->task, &state->stack_info,
&state->stack_mask))
return;
return;
-done:
+err:
+ state->error = true;
+the_end:
state->stack_info.type = STACK_TYPE_UNKNOWN;
- return;
}
EXPORT_SYMBOL_GPL(__unwind_start);
*/
kvm_make_vcpus_request_mask(kvm,
KVM_REQ_TLB_FLUSH | KVM_REQUEST_NO_WAKEUP,
- vcpu_mask, &hv_vcpu->tlb_flush);
+ NULL, vcpu_mask, &hv_vcpu->tlb_flush);
ret_success:
/* We always do full TLB flush, set rep_done = rep_cnt. */
#include <linux/kernel.h>
#include <asm/msr-index.h>
+#include <asm/debugreg.h>
#include "kvm_emulate.h"
#include "trace.h"
svm->vmcb->save.rsp = nested_vmcb->save.rsp;
svm->vmcb->save.rip = nested_vmcb->save.rip;
svm->vmcb->save.dr7 = nested_vmcb->save.dr7;
- svm->vmcb->save.dr6 = nested_vmcb->save.dr6;
+ svm->vcpu.arch.dr6 = nested_vmcb->save.dr6;
svm->vmcb->save.cpl = nested_vmcb->save.cpl;
svm->nested.vmcb_msrpm = nested_vmcb->control.msrpm_base_pa & ~0x0fffULL;
nested_vmcb->save.rsp = vmcb->save.rsp;
nested_vmcb->save.rax = vmcb->save.rax;
nested_vmcb->save.dr7 = vmcb->save.dr7;
- nested_vmcb->save.dr6 = vmcb->save.dr6;
+ nested_vmcb->save.dr6 = svm->vcpu.arch.dr6;
nested_vmcb->save.cpl = vmcb->save.cpl;
nested_vmcb->control.int_ctl = vmcb->control.int_ctl;
/* DB exceptions for our internal use must not cause vmexit */
static int nested_svm_intercept_db(struct vcpu_svm *svm)
{
- unsigned long dr6;
+ unsigned long dr6 = svm->vmcb->save.dr6;
+
+ /* Always catch it and pass it to userspace if debugging. */
+ if (svm->vcpu.guest_debug &
+ (KVM_GUESTDBG_SINGLESTEP | KVM_GUESTDBG_USE_HW_BP))
+ return NESTED_EXIT_HOST;
/* if we're not singlestepping, it's not ours */
if (!svm->nmi_singlestep)
- return NESTED_EXIT_DONE;
+ goto reflected_db;
/* if it's not a singlestep exception, it's not ours */
- if (kvm_get_dr(&svm->vcpu, 6, &dr6))
- return NESTED_EXIT_DONE;
if (!(dr6 & DR6_BS))
- return NESTED_EXIT_DONE;
+ goto reflected_db;
/* if the guest is singlestepping, it should get the vmexit */
if (svm->nmi_singlestep_guest_rflags & X86_EFLAGS_TF) {
disable_nmi_singlestep(svm);
- return NESTED_EXIT_DONE;
+ goto reflected_db;
}
/* it's ours, the nested hypervisor must not see this one */
return NESTED_EXIT_HOST;
+
+reflected_db:
+ /*
+ * Synchronize guest DR6 here just like in kvm_deliver_exception_payload;
+ * it will be moved into the nested VMCB by nested_svm_vmexit. Once
+ * exceptions will be moved to svm_check_nested_events, all this stuff
+ * will just go away and we could just return NESTED_EXIT_HOST
+ * unconditionally. db_interception will queue the exception, which
+ * will be processed by svm_check_nested_events if a nested vmexit is
+ * required, and we will just use kvm_deliver_exception_payload to copy
+ * the payload to DR6 before vmexit.
+ */
+ WARN_ON(svm->vcpu.arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT);
+ svm->vcpu.arch.dr6 &= ~(DR_TRAP_BITS | DR6_RTM);
+ svm->vcpu.arch.dr6 |= dr6 & ~DR6_FIXED_1;
+ return NESTED_EXIT_DONE;
}
static int nested_svm_intercept_ioio(struct vcpu_svm *svm)
if (svm->nested.intercept_exceptions & excp_bits) {
if (exit_code == SVM_EXIT_EXCP_BASE + DB_VECTOR)
vmexit = nested_svm_intercept_db(svm);
+ else if (exit_code == SVM_EXIT_EXCP_BASE + BP_VECTOR &&
+ svm->vcpu.guest_debug & KVM_GUESTDBG_USE_SW_BP)
+ vmexit = NESTED_EXIT_HOST;
else
vmexit = NESTED_EXIT_DONE;
}
mark_dirty(svm->vmcb, VMCB_ASID);
}
-static u64 svm_get_dr6(struct kvm_vcpu *vcpu)
+static void svm_set_dr6(struct vcpu_svm *svm, unsigned long value)
{
- return to_svm(vcpu)->vmcb->save.dr6;
-}
-
-static void svm_set_dr6(struct kvm_vcpu *vcpu, unsigned long value)
-{
- struct vcpu_svm *svm = to_svm(vcpu);
+ struct vmcb *vmcb = svm->vmcb;
- svm->vmcb->save.dr6 = value;
- mark_dirty(svm->vmcb, VMCB_DR);
+ if (unlikely(value != vmcb->save.dr6)) {
+ vmcb->save.dr6 = value;
+ mark_dirty(vmcb, VMCB_DR);
+ }
}
static void svm_sync_dirty_debug_regs(struct kvm_vcpu *vcpu)
get_debugreg(vcpu->arch.db[1], 1);
get_debugreg(vcpu->arch.db[2], 2);
get_debugreg(vcpu->arch.db[3], 3);
- vcpu->arch.dr6 = svm_get_dr6(vcpu);
+ /*
+ * We cannot reset svm->vmcb->save.dr6 to DR6_FIXED_1|DR6_RTM here,
+ * because db_interception might need it. We can do it before vmentry.
+ */
+ vcpu->arch.dr6 = svm->vmcb->save.dr6;
vcpu->arch.dr7 = svm->vmcb->save.dr7;
-
vcpu->arch.switch_db_regs &= ~KVM_DEBUGREG_WONT_EXIT;
set_dr_intercepts(svm);
}
if (!(svm->vcpu.guest_debug &
(KVM_GUESTDBG_SINGLESTEP | KVM_GUESTDBG_USE_HW_BP)) &&
!svm->nmi_singlestep) {
- kvm_queue_exception(&svm->vcpu, DB_VECTOR);
+ u32 payload = (svm->vmcb->save.dr6 ^ DR6_RTM) & ~DR6_FIXED_1;
+ kvm_queue_exception_p(&svm->vcpu, DB_VECTOR, payload);
return 1;
}
svm->vmcb->save.cr2 = vcpu->arch.cr2;
+ /*
+ * Run with all-zero DR6 unless needed, so that we can get the exact cause
+ * of a #DB.
+ */
+ if (unlikely(svm->vcpu.arch.switch_db_regs & KVM_DEBUGREG_WONT_EXIT))
+ svm_set_dr6(svm, vcpu->arch.dr6);
+ else
+ svm_set_dr6(svm, DR6_FIXED_1 | DR6_RTM);
+
clgi();
kvm_load_guest_xsave_state(vcpu);
.set_idt = svm_set_idt,
.get_gdt = svm_get_gdt,
.set_gdt = svm_set_gdt,
- .get_dr6 = svm_get_dr6,
- .set_dr6 = svm_set_dr6,
.set_dr7 = svm_set_dr7,
.sync_dirty_debug_regs = svm_sync_dirty_debug_regs,
.cache_reg = svm_cache_reg,
vmx_vcpu_pi_load(vcpu, cpu);
- vmx->host_pkru = read_pkru();
vmx->host_debugctlmsr = get_debugctlmsr();
}
dr6 = vmcs_readl(EXIT_QUALIFICATION);
if (!(vcpu->guest_debug &
(KVM_GUESTDBG_SINGLESTEP | KVM_GUESTDBG_USE_HW_BP))) {
- vcpu->arch.dr6 &= ~DR_TRAP_BITS;
- vcpu->arch.dr6 |= dr6 | DR6_RTM;
if (is_icebp(intr_info))
WARN_ON(!skip_emulated_instruction(vcpu));
- kvm_queue_exception(vcpu, DB_VECTOR);
+ kvm_queue_exception_p(vcpu, DB_VECTOR, dr6);
return 1;
}
- kvm_run->debug.arch.dr6 = dr6 | DR6_FIXED_1;
+ kvm_run->debug.arch.dr6 = dr6 | DR6_FIXED_1 | DR6_RTM;
kvm_run->debug.arch.dr7 = vmcs_readl(GUEST_DR7);
/* fall through */
case BP_VECTOR:
* guest debugging itself.
*/
if (vcpu->guest_debug & KVM_GUESTDBG_USE_HW_BP) {
- vcpu->run->debug.arch.dr6 = vcpu->arch.dr6;
+ vcpu->run->debug.arch.dr6 = DR6_BD | DR6_RTM | DR6_FIXED_1;
vcpu->run->debug.arch.dr7 = dr7;
vcpu->run->debug.arch.pc = kvm_get_linear_rip(vcpu);
vcpu->run->debug.arch.exception = DB_VECTOR;
vcpu->run->exit_reason = KVM_EXIT_DEBUG;
return 0;
} else {
- vcpu->arch.dr6 &= ~DR_TRAP_BITS;
- vcpu->arch.dr6 |= DR6_BD | DR6_RTM;
- kvm_queue_exception(vcpu, DB_VECTOR);
+ kvm_queue_exception_p(vcpu, DB_VECTOR, DR6_BD);
return 1;
}
}
return kvm_skip_emulated_instruction(vcpu);
}
-static u64 vmx_get_dr6(struct kvm_vcpu *vcpu)
-{
- return vcpu->arch.dr6;
-}
-
-static void vmx_set_dr6(struct kvm_vcpu *vcpu, unsigned long val)
-{
-}
-
static void vmx_sync_dirty_debug_regs(struct kvm_vcpu *vcpu)
{
get_debugreg(vcpu->arch.db[0], 0);
kvm_load_guest_xsave_state(vcpu);
- if (static_cpu_has(X86_FEATURE_PKU) &&
- kvm_read_cr4_bits(vcpu, X86_CR4_PKE) &&
- vcpu->arch.pkru != vmx->host_pkru)
- __write_pkru(vcpu->arch.pkru);
-
pt_guest_enter(vmx);
if (vcpu_to_pmu(vcpu)->version)
pt_guest_exit(vmx);
- /*
- * eager fpu is enabled if PKEY is supported and CR4 is switched
- * back on host, so it is safe to read guest PKRU from current
- * XSAVE.
- */
- if (static_cpu_has(X86_FEATURE_PKU) &&
- kvm_read_cr4_bits(vcpu, X86_CR4_PKE)) {
- vcpu->arch.pkru = rdpkru();
- if (vcpu->arch.pkru != vmx->host_pkru)
- __write_pkru(vmx->host_pkru);
- }
-
kvm_load_host_xsave_state(vcpu);
vmx->nested.nested_run_pending = 0;
.set_idt = vmx_set_idt,
.get_gdt = vmx_get_gdt,
.set_gdt = vmx_set_gdt,
- .get_dr6 = vmx_get_dr6,
- .set_dr6 = vmx_set_dr6,
.set_dr7 = vmx_set_dr7,
.sync_dirty_debug_regs = vmx_sync_dirty_debug_regs,
.cache_reg = vmx_cache_reg,
}
EXPORT_SYMBOL_GPL(kvm_requeue_exception);
-static void kvm_queue_exception_p(struct kvm_vcpu *vcpu, unsigned nr,
- unsigned long payload)
+void kvm_queue_exception_p(struct kvm_vcpu *vcpu, unsigned nr,
+ unsigned long payload)
{
kvm_multiple_exception(vcpu, nr, false, 0, true, payload, false);
}
+EXPORT_SYMBOL_GPL(kvm_queue_exception_p);
static void kvm_queue_exception_e_p(struct kvm_vcpu *vcpu, unsigned nr,
u32 error_code, unsigned long payload)
vcpu->arch.ia32_xss != host_xss)
wrmsrl(MSR_IA32_XSS, vcpu->arch.ia32_xss);
}
+
+ if (static_cpu_has(X86_FEATURE_PKU) &&
+ (kvm_read_cr4_bits(vcpu, X86_CR4_PKE) ||
+ (vcpu->arch.xcr0 & XFEATURE_MASK_PKRU)) &&
+ vcpu->arch.pkru != vcpu->arch.host_pkru)
+ __write_pkru(vcpu->arch.pkru);
}
EXPORT_SYMBOL_GPL(kvm_load_guest_xsave_state);
void kvm_load_host_xsave_state(struct kvm_vcpu *vcpu)
{
+ if (static_cpu_has(X86_FEATURE_PKU) &&
+ (kvm_read_cr4_bits(vcpu, X86_CR4_PKE) ||
+ (vcpu->arch.xcr0 & XFEATURE_MASK_PKRU))) {
+ vcpu->arch.pkru = rdpkru();
+ if (vcpu->arch.pkru != vcpu->arch.host_pkru)
+ __write_pkru(vcpu->arch.host_pkru);
+ }
+
if (kvm_read_cr4_bits(vcpu, X86_CR4_OSXSAVE)) {
if (vcpu->arch.xcr0 != host_xcr0)
}
}
-static void kvm_update_dr6(struct kvm_vcpu *vcpu)
-{
- if (!(vcpu->guest_debug & KVM_GUESTDBG_USE_HW_BP))
- kvm_x86_ops.set_dr6(vcpu, vcpu->arch.dr6);
-}
-
static void kvm_update_dr7(struct kvm_vcpu *vcpu)
{
unsigned long dr7;
if (val & 0xffffffff00000000ULL)
return -1; /* #GP */
vcpu->arch.dr6 = (val & DR6_VOLATILE) | kvm_dr6_fixed(vcpu);
- kvm_update_dr6(vcpu);
break;
case 5:
/* fall through */
case 4:
/* fall through */
case 6:
- if (vcpu->guest_debug & KVM_GUESTDBG_USE_HW_BP)
- *val = vcpu->arch.dr6;
- else
- *val = kvm_x86_ops.get_dr6(vcpu);
+ *val = vcpu->arch.dr6;
break;
case 5:
/* fall through */
kvm_x86_ops.vcpu_load(vcpu, cpu);
+ /* Save host pkru register if supported */
+ vcpu->arch.host_pkru = read_pkru();
+
/* Apply any externally detected TSC adjustments (due to suspend) */
if (unlikely(vcpu->arch.tsc_offset_adjustment)) {
adjust_tsc_offset_host(vcpu, vcpu->arch.tsc_offset_adjustment);
unsigned bank_num = mcg_cap & 0xff, bank;
r = -EINVAL;
- if (!bank_num || bank_num >= KVM_MAX_MCE_BANKS)
+ if (!bank_num || bank_num > KVM_MAX_MCE_BANKS)
goto out;
if (mcg_cap & ~(kvm_mce_cap_supported | 0xff | 0xff0000))
goto out;
memcpy(vcpu->arch.db, dbgregs->db, sizeof(vcpu->arch.db));
kvm_update_dr0123(vcpu);
vcpu->arch.dr6 = dbgregs->dr6;
- kvm_update_dr6(vcpu);
vcpu->arch.dr7 = dbgregs->dr7;
kvm_update_dr7(vcpu);
if (vcpu->guest_debug & KVM_GUESTDBG_SINGLESTEP) {
kvm_run->debug.arch.dr6 = DR6_BS | DR6_FIXED_1 | DR6_RTM;
- kvm_run->debug.arch.pc = vcpu->arch.singlestep_rip;
+ kvm_run->debug.arch.pc = kvm_get_linear_rip(vcpu);
kvm_run->debug.arch.exception = DB_VECTOR;
kvm_run->exit_reason = KVM_EXIT_DEBUG;
return 0;
vcpu->arch.db);
if (dr6 != 0) {
- vcpu->arch.dr6 &= ~DR_TRAP_BITS;
- vcpu->arch.dr6 |= dr6 | DR6_RTM;
- kvm_queue_exception(vcpu, DB_VECTOR);
+ kvm_queue_exception_p(vcpu, DB_VECTOR, dr6);
*r = 1;
return true;
}
zalloc_cpumask_var(&cpus, GFP_ATOMIC);
kvm_make_vcpus_request_mask(kvm, KVM_REQ_SCAN_IOAPIC,
- vcpu_bitmap, cpus);
+ NULL, vcpu_bitmap, cpus);
free_cpumask_var(cpus);
}
*/
void kvm_request_apicv_update(struct kvm *kvm, bool activate, ulong bit)
{
+ struct kvm_vcpu *except;
unsigned long old, new, expected;
if (!kvm_x86_ops.check_apicv_inhibit_reasons ||
trace_kvm_apicv_update_request(activate, bit);
if (kvm_x86_ops.pre_update_apicv_exec_ctrl)
kvm_x86_ops.pre_update_apicv_exec_ctrl(kvm, activate);
- kvm_make_all_cpus_request(kvm, KVM_REQ_APICV_UPDATE);
+
+ /*
+ * Sending request to update APICV for all other vcpus,
+ * while update the calling vcpu immediately instead of
+ * waiting for another #VMEXIT to handle the request.
+ */
+ except = kvm_get_running_vcpu();
+ kvm_make_all_cpus_request_except(kvm, KVM_REQ_APICV_UPDATE,
+ except);
+ if (except)
+ kvm_vcpu_update_apicv(except);
}
EXPORT_SYMBOL_GPL(kvm_request_apicv_update);
WARN_ON(vcpu->guest_debug & KVM_GUESTDBG_USE_HW_BP);
kvm_x86_ops.sync_dirty_debug_regs(vcpu);
kvm_update_dr0123(vcpu);
- kvm_update_dr6(vcpu);
kvm_update_dr7(vcpu);
vcpu->arch.switch_db_regs &= ~KVM_DEBUGREG_RELOAD;
}
memset(vcpu->arch.db, 0, sizeof(vcpu->arch.db));
kvm_update_dr0123(vcpu);
vcpu->arch.dr6 = DR6_INIT;
- kvm_update_dr6(vcpu);
vcpu->arch.dr7 = DR7_FIXED_1;
kvm_update_dr7(vcpu);
negl %ebx
lea 45f(%ebx,%ebx,2), %ebx
testl %esi, %esi
- JMP_NOSPEC %ebx
+ JMP_NOSPEC ebx
# Handle 2-byte-aligned regions
20: addw (%esi), %ax
andl $-32,%edx
lea 3f(%ebx,%ebx), %ebx
testl %esi, %esi
- JMP_NOSPEC %ebx
+ JMP_NOSPEC ebx
1: addl $64,%esi
addl $64,%edi
SRC(movb -32(%edx),%bl) ; SRC(movb (%edx),%bl)
#include <asm/alternative-asm.h>
#include <asm/export.h>
#include <asm/nospec-branch.h>
+#include <asm/unwind_hints.h>
+#include <asm/frame.h>
.macro THUNK reg
.section .text.__x86.indirect_thunk
+ .align 32
SYM_FUNC_START(__x86_indirect_thunk_\reg)
- CFI_STARTPROC
- JMP_NOSPEC %\reg
- CFI_ENDPROC
+ JMP_NOSPEC \reg
SYM_FUNC_END(__x86_indirect_thunk_\reg)
+
+SYM_FUNC_START_NOALIGN(__x86_retpoline_\reg)
+ ANNOTATE_INTRA_FUNCTION_CALL
+ call .Ldo_rop_\@
+.Lspec_trap_\@:
+ UNWIND_HINT_EMPTY
+ pause
+ lfence
+ jmp .Lspec_trap_\@
+.Ldo_rop_\@:
+ mov %\reg, (%_ASM_SP)
+ UNWIND_HINT_RET_OFFSET
+ ret
+SYM_FUNC_END(__x86_retpoline_\reg)
+
.endm
/*
* only see one instance of "__x86_indirect_thunk_\reg" rather
* than one per register with the correct names. So we do it
* the simple and nasty way...
+ *
+ * Worse, you can only have a single EXPORT_SYMBOL per line,
+ * and CPP can't insert newlines, so we have to repeat everything
+ * at least twice.
*/
-#define __EXPORT_THUNK(sym) _ASM_NOKPROBE(sym); EXPORT_SYMBOL(sym)
-#define EXPORT_THUNK(reg) __EXPORT_THUNK(__x86_indirect_thunk_ ## reg)
-#define GENERATE_THUNK(reg) THUNK reg ; EXPORT_THUNK(reg)
-
-GENERATE_THUNK(_ASM_AX)
-GENERATE_THUNK(_ASM_BX)
-GENERATE_THUNK(_ASM_CX)
-GENERATE_THUNK(_ASM_DX)
-GENERATE_THUNK(_ASM_SI)
-GENERATE_THUNK(_ASM_DI)
-GENERATE_THUNK(_ASM_BP)
-#ifdef CONFIG_64BIT
-GENERATE_THUNK(r8)
-GENERATE_THUNK(r9)
-GENERATE_THUNK(r10)
-GENERATE_THUNK(r11)
-GENERATE_THUNK(r12)
-GENERATE_THUNK(r13)
-GENERATE_THUNK(r14)
-GENERATE_THUNK(r15)
-#endif
+
+#define __EXPORT_THUNK(sym) _ASM_NOKPROBE(sym); EXPORT_SYMBOL(sym)
+#define EXPORT_THUNK(reg) __EXPORT_THUNK(__x86_indirect_thunk_ ## reg)
+#define EXPORT_RETPOLINE(reg) __EXPORT_THUNK(__x86_retpoline_ ## reg)
+
+#undef GEN
+#define GEN(reg) THUNK reg
+#include <asm/GEN-for-each-reg.h>
+
+#undef GEN
+#define GEN(reg) EXPORT_THUNK(reg)
+#include <asm/GEN-for-each-reg.h>
+
+#undef GEN
+#define GEN(reg) EXPORT_RETPOLINE(reg)
+#include <asm/GEN-for-each-reg.h>
} else {
pfn = pgt_buf_end;
pgt_buf_end += num;
- printk(KERN_DEBUG "BRK [%#010lx, %#010lx] PGTABLE\n",
- pfn << PAGE_SHIFT, (pgt_buf_end << PAGE_SHIFT) - 1);
}
for (i = 0; i < num; i++) {
#include <asm/init.h>
#include <asm/uv/uv.h>
#include <asm/setup.h>
+#include <asm/ftrace.h>
#include "mm_internal.h"
all_end = roundup((unsigned long)_brk_end, PMD_SIZE);
set_memory_nx(text_end, (all_end - text_end) >> PAGE_SHIFT);
+ set_ftrace_ops_ro();
+
#ifdef CONFIG_CPA_DEBUG
printk(KERN_INFO "Testing CPA: undo %lx-%lx\n", start, end);
set_memory_rw(start, (end-start) >> PAGE_SHIFT);
#include <linux/sched/signal.h>
#include <linux/sched/mm.h>
#include <linux/compat.h>
+#include <linux/elf-randomize.h>
#include <asm/elf.h>
+#include <asm/io.h>
#include "physaddr.h"
int cpu;
int err;
- if (downed_cpus == NULL &&
+ if (!cpumask_available(downed_cpus) &&
!alloc_cpumask_var(&downed_cpus, GFP_KERNEL)) {
pr_notice("Failed to allocate mask\n");
goto out;
int cpu;
int err;
- if (downed_cpus == NULL || cpumask_weight(downed_cpus) == 0)
+ if (!cpumask_available(downed_cpus) || cpumask_weight(downed_cpus) == 0)
return;
pr_notice("Re-enabling CPUs...\n");
for_each_cpu(cpu, downed_cpus) {
#include "numa_internal.h"
-#ifdef CONFIG_DISCONTIGMEM
-/*
- * 4) physnode_map - the mapping between a pfn and owning node
- * physnode_map keeps track of the physical memory layout of a generic
- * numa node on a 64Mb break (each element of the array will
- * represent 64Mb of memory and will be marked by the node id. so,
- * if the first gig is on node 0, and the second gig is on node 1
- * physnode_map will contain:
- *
- * physnode_map[0-15] = 0;
- * physnode_map[16-31] = 1;
- * physnode_map[32- ] = -1;
- */
-s8 physnode_map[MAX_SECTIONS] __read_mostly = { [0 ... (MAX_SECTIONS - 1)] = -1};
-EXPORT_SYMBOL(physnode_map);
-
-void memory_present(int nid, unsigned long start, unsigned long end)
-{
- unsigned long pfn;
-
- printk(KERN_INFO "Node: %d, start_pfn: %lx, end_pfn: %lx\n",
- nid, start, end);
- printk(KERN_DEBUG " Setting physnode_map array to node %d for pfns:\n", nid);
- printk(KERN_DEBUG " ");
- start = round_down(start, PAGES_PER_SECTION);
- end = round_up(end, PAGES_PER_SECTION);
- for (pfn = start; pfn < end; pfn += PAGES_PER_SECTION) {
- physnode_map[pfn / PAGES_PER_SECTION] = nid;
- printk(KERN_CONT "%lx ", pfn);
- }
- printk(KERN_CONT "\n");
-}
-#endif
-
extern unsigned long highend_pfn, highstart_pfn;
void __init initmem_init(void)
unsigned long efi_fw_vendor, efi_config_table;
static const efi_config_table_type_t arch_tables[] __initconst = {
- {EFI_PROPERTIES_TABLE_GUID, "PROP", &prop_phys},
- {UGA_IO_PROTOCOL_GUID, "UGA", &uga_phys},
+ {EFI_PROPERTIES_TABLE_GUID, &prop_phys, "PROP" },
+ {UGA_IO_PROTOCOL_GUID, &uga_phys, "UGA" },
#ifdef CONFIG_X86_UV
- {UV_SYSTEM_TABLE_GUID, "UVsystab", &uv_systab_phys},
+ {UV_SYSTEM_TABLE_GUID, &uv_systab_phys, "UVsystab" },
#endif
- {NULL_GUID, NULL, NULL},
+ {},
};
static const unsigned long * const efi_tables[] = {
mov %r8, %r9
mov %rcx, %r8
mov %rsi, %rcx
- CALL_NOSPEC %rdi
+ CALL_NOSPEC rdi
leave
ret
SYM_FUNC_END(__efi_call)
if (ret)
return ret;
smp_ops.play_dead = resume_play_dead;
- ret = disable_nonboot_cpus();
+ ret = freeze_secondary_cpus(0);
smp_ops.play_dead = play_dead;
return ret;
}
.fw_vendor = EFI_INVALID_TABLE_ADDR, /* Initialized later. */
.fw_revision = 0, /* Initialized later. */
.con_in_handle = EFI_INVALID_TABLE_ADDR, /* Not used under Xen. */
- .con_in = EFI_INVALID_TABLE_ADDR, /* Not used under Xen. */
+ .con_in = NULL, /* Not used under Xen. */
.con_out_handle = EFI_INVALID_TABLE_ADDR, /* Not used under Xen. */
.con_out = NULL, /* Not used under Xen. */
.stderr_handle = EFI_INVALID_TABLE_ADDR, /* Not used under Xen. */
cpu_bringup();
boot_init_stack_canary();
cpu_startup_entry(CPUHP_AP_ONLINE_IDLE);
+ prevent_tail_call_optimization();
}
void xen_smp_intr_free_pv(unsigned int cpu)
}
/*
- * Non-mq queues do not honor REQ_NOWAIT, so complete a bio
- * with BLK_STS_AGAIN status in order to catch -EAGAIN and
- * to give a chance to the caller to repeat request gracefully.
+ * For a REQ_NOWAIT based request, return -EOPNOTSUPP
+ * if queue is not a request based queue.
*/
- if ((bio->bi_opf & REQ_NOWAIT) && !queue_is_mq(q)) {
- status = BLK_STS_AGAIN;
- goto end_io;
- }
+ if ((bio->bi_opf & REQ_NOWAIT) && !queue_is_mq(q))
+ goto not_supported;
if (should_fail_bio(bio))
goto end_io;
config CRYPTO_CTR
tristate "CTR support"
select CRYPTO_SKCIPHER
- select CRYPTO_SEQIV
select CRYPTO_MANAGER
help
CTR: Counter mode
config CRYPTO_DRBG_CTR
bool "Enable CTR DRBG"
select CRYPTO_AES
- depends on CRYPTO_CTR
+ select CRYPTO_CTR
help
Enable the CTR DRBG variant as defined in NIST SP800-90A.
err = wait_for_completion_killable(&larval->completion);
WARN_ON(err);
if (!err)
- crypto_probing_notify(CRYPTO_MSG_ALG_LOADED, larval);
+ crypto_notify(CRYPTO_MSG_ALG_LOADED, larval);
out:
crypto_larval_kill(&larval->alg);
static struct crypto_alg *crypto_spawn_alg(struct crypto_spawn *spawn)
{
- struct crypto_alg *alg;
+ struct crypto_alg *alg = ERR_PTR(-EAGAIN);
+ struct crypto_alg *target;
+ bool shoot = false;
down_read(&crypto_alg_sem);
- alg = spawn->alg;
- if (!spawn->dead && !crypto_mod_get(alg)) {
- alg->cra_flags |= CRYPTO_ALG_DYING;
- alg = NULL;
+ if (!spawn->dead) {
+ alg = spawn->alg;
+ if (!crypto_mod_get(alg)) {
+ target = crypto_alg_get(alg);
+ shoot = true;
+ alg = ERR_PTR(-EAGAIN);
+ }
}
up_read(&crypto_alg_sem);
- return alg ?: ERR_PTR(-EAGAIN);
+ if (shoot) {
+ crypto_shoot_alg(target);
+ crypto_alg_put(target);
+ }
+
+ return alg;
}
struct crypto_tfm *crypto_spawn_tfm(struct crypto_spawn *spawn, u32 type,
}
EXPORT_SYMBOL_GPL(crypto_enqueue_request);
+void crypto_enqueue_request_head(struct crypto_queue *queue,
+ struct crypto_async_request *request)
+{
+ queue->qlen++;
+ list_add(&request->list, &queue->list);
+}
+EXPORT_SYMBOL_GPL(crypto_enqueue_request_head);
+
struct crypto_async_request *crypto_dequeue_request(struct crypto_queue *queue)
{
struct list_head *request;
struct sock *sk = sock->sk;
struct alg_sock *ask = alg_sk(sk);
struct rng_ctx *ctx = ask->private;
- int err = -EFAULT;
+ int err;
int genlen = 0;
u8 result[MAXSIZE];
return len;
}
-static void crypto_shoot_alg(struct crypto_alg *alg)
+void crypto_shoot_alg(struct crypto_alg *alg)
{
down_write(&crypto_alg_sem);
alg->cra_flags |= CRYPTO_ALG_DYING;
up_write(&crypto_alg_sem);
}
+EXPORT_SYMBOL_GPL(crypto_shoot_alg);
struct crypto_tfm *__crypto_alloc_tfm(struct crypto_alg *alg, u32 type,
u32 mask)
ROUND(9);
ROUND(10);
ROUND(11);
-
+#ifdef CONFIG_CC_IS_CLANG
+#pragma nounroll /* https://bugs.llvm.org/show_bug.cgi?id=45803 */
+#endif
for (i = 0; i < 8; ++i)
S->h[i] = S->h[i] ^ v[i] ^ v[i + 8];
}
* @err: error number
*/
static void crypto_finalize_request(struct crypto_engine *engine,
- struct crypto_async_request *req, int err)
+ struct crypto_async_request *req, int err)
{
unsigned long flags;
- bool finalize_cur_req = false;
+ bool finalize_req = false;
int ret;
struct crypto_engine_ctx *enginectx;
- spin_lock_irqsave(&engine->queue_lock, flags);
- if (engine->cur_req == req)
- finalize_cur_req = true;
- spin_unlock_irqrestore(&engine->queue_lock, flags);
+ /*
+ * If hardware cannot enqueue more requests
+ * and retry mechanism is not supported
+ * make sure we are completing the current request
+ */
+ if (!engine->retry_support) {
+ spin_lock_irqsave(&engine->queue_lock, flags);
+ if (engine->cur_req == req) {
+ finalize_req = true;
+ engine->cur_req = NULL;
+ }
+ spin_unlock_irqrestore(&engine->queue_lock, flags);
+ }
- if (finalize_cur_req) {
+ if (finalize_req || engine->retry_support) {
enginectx = crypto_tfm_ctx(req->tfm);
- if (engine->cur_req_prepared &&
+ if (enginectx->op.prepare_request &&
enginectx->op.unprepare_request) {
ret = enginectx->op.unprepare_request(engine, req);
if (ret)
dev_err(engine->dev, "failed to unprepare request\n");
}
- spin_lock_irqsave(&engine->queue_lock, flags);
- engine->cur_req = NULL;
- engine->cur_req_prepared = false;
- spin_unlock_irqrestore(&engine->queue_lock, flags);
}
-
req->complete(req, err);
kthread_queue_work(engine->kworker, &engine->pump_requests);
spin_lock_irqsave(&engine->queue_lock, flags);
/* Make sure we are not already running a request */
- if (engine->cur_req)
+ if (!engine->retry_support && engine->cur_req)
goto out;
/* If another context is idling then defer */
goto out;
}
+start_request:
/* Get the fist request from the engine queue to handle */
backlog = crypto_get_backlog(&engine->queue);
async_req = crypto_dequeue_request(&engine->queue);
if (!async_req)
goto out;
- engine->cur_req = async_req;
+ /*
+ * If hardware doesn't support the retry mechanism,
+ * keep track of the request we are processing now.
+ * We'll need it on completion (crypto_finalize_request).
+ */
+ if (!engine->retry_support)
+ engine->cur_req = async_req;
+
if (backlog)
backlog->complete(backlog, -EINPROGRESS);
ret = engine->prepare_crypt_hardware(engine);
if (ret) {
dev_err(engine->dev, "failed to prepare crypt hardware\n");
- goto req_err;
+ goto req_err_2;
}
}
if (ret) {
dev_err(engine->dev, "failed to prepare request: %d\n",
ret);
- goto req_err;
+ goto req_err_2;
}
- engine->cur_req_prepared = true;
}
if (!enginectx->op.do_one_request) {
dev_err(engine->dev, "failed to do request\n");
ret = -EINVAL;
- goto req_err;
+ goto req_err_1;
}
+
ret = enginectx->op.do_one_request(engine, async_req);
- if (ret) {
- dev_err(engine->dev, "Failed to do one request from queue: %d\n", ret);
- goto req_err;
+
+ /* Request unsuccessfully executed by hardware */
+ if (ret < 0) {
+ /*
+ * If hardware queue is full (-ENOSPC), requeue request
+ * regardless of backlog flag.
+ * Otherwise, unprepare and complete the request.
+ */
+ if (!engine->retry_support ||
+ (ret != -ENOSPC)) {
+ dev_err(engine->dev,
+ "Failed to do one request from queue: %d\n",
+ ret);
+ goto req_err_1;
+ }
+ /*
+ * If retry mechanism is supported,
+ * unprepare current request and
+ * enqueue it back into crypto-engine queue.
+ */
+ if (enginectx->op.unprepare_request) {
+ ret = enginectx->op.unprepare_request(engine,
+ async_req);
+ if (ret)
+ dev_err(engine->dev,
+ "failed to unprepare request\n");
+ }
+ spin_lock_irqsave(&engine->queue_lock, flags);
+ /*
+ * If hardware was unable to execute request, enqueue it
+ * back in front of crypto-engine queue, to keep the order
+ * of requests.
+ */
+ crypto_enqueue_request_head(&engine->queue, async_req);
+
+ kthread_queue_work(engine->kworker, &engine->pump_requests);
+ goto out;
}
- return;
-req_err:
- crypto_finalize_request(engine, async_req, ret);
+ goto retry;
+
+req_err_1:
+ if (enginectx->op.unprepare_request) {
+ ret = enginectx->op.unprepare_request(engine, async_req);
+ if (ret)
+ dev_err(engine->dev, "failed to unprepare request\n");
+ }
+
+req_err_2:
+ async_req->complete(async_req, ret);
+
+retry:
+ /* If retry mechanism is supported, send new requests to engine */
+ if (engine->retry_support) {
+ spin_lock_irqsave(&engine->queue_lock, flags);
+ goto start_request;
+ }
return;
out:
spin_unlock_irqrestore(&engine->queue_lock, flags);
+
+ /*
+ * Batch requests is possible only if
+ * hardware can enqueue multiple requests
+ */
+ if (engine->do_batch_requests) {
+ ret = engine->do_batch_requests(engine);
+ if (ret)
+ dev_err(engine->dev, "failed to do batch requests: %d\n",
+ ret);
+ }
+
+ return;
}
static void crypto_pump_work(struct kthread_work *work)
EXPORT_SYMBOL_GPL(crypto_engine_stop);
/**
- * crypto_engine_alloc_init - allocate crypto hardware engine structure and
- * initialize it.
+ * crypto_engine_alloc_init_and_set - allocate crypto hardware engine structure
+ * and initialize it by setting the maximum number of entries in the software
+ * crypto-engine queue.
* @dev: the device attached with one hardware engine
+ * @retry_support: whether hardware has support for retry mechanism
+ * @cbk_do_batch: pointer to a callback function to be invoked when executing a
+ * a batch of requests.
+ * This has the form:
+ * callback(struct crypto_engine *engine)
+ * where:
+ * @engine: the crypto engine structure.
* @rt: whether this queue is set to run as a realtime task
+ * @qlen: maximum size of the crypto-engine queue
*
* This must be called from context that can sleep.
* Return: the crypto engine structure on success, else NULL.
*/
-struct crypto_engine *crypto_engine_alloc_init(struct device *dev, bool rt)
+struct crypto_engine *crypto_engine_alloc_init_and_set(struct device *dev,
+ bool retry_support,
+ int (*cbk_do_batch)(struct crypto_engine *engine),
+ bool rt, int qlen)
{
struct sched_param param = { .sched_priority = MAX_RT_PRIO / 2 };
struct crypto_engine *engine;
engine->running = false;
engine->busy = false;
engine->idling = false;
- engine->cur_req_prepared = false;
+ engine->retry_support = retry_support;
engine->priv_data = dev;
+ /*
+ * Batch requests is possible only if
+ * hardware has support for retry mechanism.
+ */
+ engine->do_batch_requests = retry_support ? cbk_do_batch : NULL;
+
snprintf(engine->name, sizeof(engine->name),
"%s-engine", dev_name(dev));
- crypto_init_queue(&engine->queue, CRYPTO_ENGINE_MAX_QLEN);
+ crypto_init_queue(&engine->queue, qlen);
spin_lock_init(&engine->queue_lock);
engine->kworker = kthread_create_worker(0, "%s", engine->name);
return engine;
}
+EXPORT_SYMBOL_GPL(crypto_engine_alloc_init_and_set);
+
+/**
+ * crypto_engine_alloc_init - allocate crypto hardware engine structure and
+ * initialize it.
+ * @dev: the device attached with one hardware engine
+ * @rt: whether this queue is set to run as a realtime task
+ *
+ * This must be called from context that can sleep.
+ * Return: the crypto engine structure on success, else NULL.
+ */
+struct crypto_engine *crypto_engine_alloc_init(struct device *dev, bool rt)
+{
+ return crypto_engine_alloc_init_and_set(dev, false, NULL, rt,
+ CRYPTO_ENGINE_MAX_QLEN);
+}
EXPORT_SYMBOL_GPL(crypto_engine_alloc_init);
/**
if (ret)
goto unlock;
- /* If nonblocking pool is initialized, deactivate Jitter RNG */
- crypto_free_rng(drbg->jent);
- drbg->jent = NULL;
-
/* Set seeded to false so that if __drbg_seed fails the
* next generate call will trigger a reseed.
*/
entropylen);
if (ret) {
pr_devel("DRBG: jent failed with %d\n", ret);
- goto out;
+
+ /*
+ * Do not treat the transient failure of the
+ * Jitter RNG as an error that needs to be
+ * reported. The combined number of the
+ * maximum reseed threshold times the maximum
+ * number of Jitter RNG transient errors is
+ * less than the reseed threshold required by
+ * SP800-90A allowing us to treat the
+ * transient errors as such.
+ *
+ * However, we mandate that at least the first
+ * seeding operation must succeed with the
+ * Jitter RNG.
+ */
+ if (!reseed || ret != -EAGAIN)
+ goto out;
}
drbg_string_fill(&data1, entropy, entropylen * 2);
if (IS_ENABLED(CONFIG_CRYPTO_FIPS)) {
drbg->prev = kzalloc(drbg_sec_strength(drbg->core->flags),
GFP_KERNEL);
- if (!drbg->prev)
+ if (!drbg->prev) {
+ ret = -ENOMEM;
goto fini;
+ }
drbg->fips_primed = false;
}
if (list_empty(&drbg->test_data.list))
return 0;
+ drbg->jent = crypto_alloc_rng("jitterentropy_rng", 0, 0);
+
INIT_WORK(&drbg->seed_work, drbg_async_seed);
drbg->random_ready.owner = THIS_MODULE;
return err;
}
- drbg->jent = crypto_alloc_rng("jitterentropy_rng", 0, 0);
-
/*
* Require frequent reseeds until the seed source is fully
* initialized.
const u8 *key, unsigned int keylen)
{
struct essiv_tfm_ctx *tctx = crypto_skcipher_ctx(tfm);
- SHASH_DESC_ON_STACK(desc, tctx->hash);
u8 salt[HASH_MAX_DIGESTSIZE];
int err;
if (err)
return err;
- desc->tfm = tctx->hash;
- err = crypto_shash_digest(desc, key, keylen, salt);
+ err = crypto_shash_tfm_digest(tctx->hash, key, keylen, salt);
if (err)
return err;
void crypto_remove_spawns(struct crypto_alg *alg, struct list_head *list,
struct crypto_alg *nalg);
void crypto_remove_final(struct list_head *list);
+void crypto_shoot_alg(struct crypto_alg *alg);
struct crypto_tfm *__crypto_alloc_tfm(struct crypto_alg *alg, u32 type,
u32 mask);
void *crypto_create_tfm(struct crypto_alg *alg,
struct jitterentropy {
spinlock_t jent_lock;
struct rand_data *entropy_collector;
+ unsigned int reset_cnt;
};
static int jent_kcapi_init(struct crypto_tfm *tfm)
int ret = 0;
spin_lock(&rng->jent_lock);
+
+ /* Return a permanent error in case we had too many resets in a row. */
+ if (rng->reset_cnt > (1<<10)) {
+ ret = -EFAULT;
+ goto out;
+ }
+
ret = jent_read_entropy(rng->entropy_collector, rdata, dlen);
+
+ /* Reset RNG in case of health failures */
+ if (ret < -1) {
+ pr_warn_ratelimited("Reset Jitter RNG due to health test failure: %s failure\n",
+ (ret == -2) ? "Repetition Count Test" :
+ "Adaptive Proportion Test");
+
+ rng->reset_cnt++;
+
+ ret = -EAGAIN;
+ } else {
+ rng->reset_cnt = 0;
+
+ /* Convert the Jitter RNG error into a usable error code */
+ if (ret == -1)
+ ret = -EINVAL;
+ }
+
+out:
spin_unlock(&rng->jent_lock);
return ret;
* Non-physical true random number generator based on timing jitter --
* Jitter RNG standalone code.
*
- * Copyright Stephan Mueller <smueller@chronox.de>, 2015 - 2019
+ * Copyright Stephan Mueller <smueller@chronox.de>, 2015 - 2020
*
* Design
* ======
/*
* This Jitterentropy RNG is based on the jitterentropy library
- * version 2.1.2 provided at http://www.chronox.de/jent.html
+ * version 2.2.0 provided at http://www.chronox.de/jent.html
*/
#ifdef __OPTIMIZE__
unsigned int memblocksize; /* Size of one memory block in bytes */
unsigned int memaccessloops; /* Number of memory accesses per random
* bit generation */
+
+ /* Repetition Count Test */
+ int rct_count; /* Number of stuck values */
+
+ /* Adaptive Proportion Test for a significance level of 2^-30 */
+#define JENT_APT_CUTOFF 325 /* Taken from SP800-90B sec 4.4.2 */
+#define JENT_APT_WINDOW_SIZE 512 /* Data window size */
+ /* LSB of time stamp to process */
+#define JENT_APT_LSB 16
+#define JENT_APT_WORD_MASK (JENT_APT_LSB - 1)
+ unsigned int apt_observations; /* Number of collected observations */
+ unsigned int apt_count; /* APT counter */
+ unsigned int apt_base; /* APT base reference */
+ unsigned int apt_base_set:1; /* APT base reference set? */
+
+ unsigned int health_failure:1; /* Permanent health failure */
};
/* Flags that can be used to initialize the RNG */
* variations (2nd derivation of time is
* zero). */
#define JENT_ESTUCK 8 /* Too many stuck results during init. */
+#define JENT_EHEALTH 9 /* Health test failed during initialization */
+#define JENT_ERCT 10 /* RCT failed during initialization */
+
+#include "jitterentropy.h"
/***************************************************************************
- * Helper functions
+ * Adaptive Proportion Test
+ *
+ * This test complies with SP800-90B section 4.4.2.
***************************************************************************/
-#include "jitterentropy.h"
+/**
+ * Reset the APT counter
+ *
+ * @ec [in] Reference to entropy collector
+ */
+static void jent_apt_reset(struct rand_data *ec, unsigned int delta_masked)
+{
+ /* Reset APT counter */
+ ec->apt_count = 0;
+ ec->apt_base = delta_masked;
+ ec->apt_observations = 0;
+}
+
+/**
+ * Insert a new entropy event into APT
+ *
+ * @ec [in] Reference to entropy collector
+ * @delta_masked [in] Masked time delta to process
+ */
+static void jent_apt_insert(struct rand_data *ec, unsigned int delta_masked)
+{
+ /* Initialize the base reference */
+ if (!ec->apt_base_set) {
+ ec->apt_base = delta_masked;
+ ec->apt_base_set = 1;
+ return;
+ }
+
+ if (delta_masked == ec->apt_base) {
+ ec->apt_count++;
+
+ if (ec->apt_count >= JENT_APT_CUTOFF)
+ ec->health_failure = 1;
+ }
+
+ ec->apt_observations++;
+
+ if (ec->apt_observations >= JENT_APT_WINDOW_SIZE)
+ jent_apt_reset(ec, delta_masked);
+}
+
+/***************************************************************************
+ * Stuck Test and its use as Repetition Count Test
+ *
+ * The Jitter RNG uses an enhanced version of the Repetition Count Test
+ * (RCT) specified in SP800-90B section 4.4.1. Instead of counting identical
+ * back-to-back values, the input to the RCT is the counting of the stuck
+ * values during the generation of one Jitter RNG output block.
+ *
+ * The RCT is applied with an alpha of 2^{-30} compliant to FIPS 140-2 IG 9.8.
+ *
+ * During the counting operation, the Jitter RNG always calculates the RCT
+ * cut-off value of C. If that value exceeds the allowed cut-off value,
+ * the Jitter RNG output block will be calculated completely but discarded at
+ * the end. The caller of the Jitter RNG is informed with an error code.
+ ***************************************************************************/
+
+/**
+ * Repetition Count Test as defined in SP800-90B section 4.4.1
+ *
+ * @ec [in] Reference to entropy collector
+ * @stuck [in] Indicator whether the value is stuck
+ */
+static void jent_rct_insert(struct rand_data *ec, int stuck)
+{
+ /*
+ * If we have a count less than zero, a previous RCT round identified
+ * a failure. We will not overwrite it.
+ */
+ if (ec->rct_count < 0)
+ return;
+
+ if (stuck) {
+ ec->rct_count++;
+
+ /*
+ * The cutoff value is based on the following consideration:
+ * alpha = 2^-30 as recommended in FIPS 140-2 IG 9.8.
+ * In addition, we require an entropy value H of 1/OSR as this
+ * is the minimum entropy required to provide full entropy.
+ * Note, we collect 64 * OSR deltas for inserting them into
+ * the entropy pool which should then have (close to) 64 bits
+ * of entropy.
+ *
+ * Note, ec->rct_count (which equals to value B in the pseudo
+ * code of SP800-90B section 4.4.1) starts with zero. Hence
+ * we need to subtract one from the cutoff value as calculated
+ * following SP800-90B.
+ */
+ if ((unsigned int)ec->rct_count >= (31 * ec->osr)) {
+ ec->rct_count = -1;
+ ec->health_failure = 1;
+ }
+ } else {
+ ec->rct_count = 0;
+ }
+}
+
+/**
+ * Is there an RCT health test failure?
+ *
+ * @ec [in] Reference to entropy collector
+ *
+ * @return
+ * 0 No health test failure
+ * 1 Permanent health test failure
+ */
+static int jent_rct_failure(struct rand_data *ec)
+{
+ if (ec->rct_count < 0)
+ return 1;
+ return 0;
+}
+
+static inline __u64 jent_delta(__u64 prev, __u64 next)
+{
+#define JENT_UINT64_MAX (__u64)(~((__u64) 0))
+ return (prev < next) ? (next - prev) :
+ (JENT_UINT64_MAX - prev + 1 + next);
+}
+
+/**
+ * Stuck test by checking the:
+ * 1st derivative of the jitter measurement (time delta)
+ * 2nd derivative of the jitter measurement (delta of time deltas)
+ * 3rd derivative of the jitter measurement (delta of delta of time deltas)
+ *
+ * All values must always be non-zero.
+ *
+ * @ec [in] Reference to entropy collector
+ * @current_delta [in] Jitter time delta
+ *
+ * @return
+ * 0 jitter measurement not stuck (good bit)
+ * 1 jitter measurement stuck (reject bit)
+ */
+static int jent_stuck(struct rand_data *ec, __u64 current_delta)
+{
+ __u64 delta2 = jent_delta(ec->last_delta, current_delta);
+ __u64 delta3 = jent_delta(ec->last_delta2, delta2);
+ unsigned int delta_masked = current_delta & JENT_APT_WORD_MASK;
+
+ ec->last_delta = current_delta;
+ ec->last_delta2 = delta2;
+
+ /*
+ * Insert the result of the comparison of two back-to-back time
+ * deltas.
+ */
+ jent_apt_insert(ec, delta_masked);
+
+ if (!current_delta || !delta2 || !delta3) {
+ /* RCT with a stuck bit */
+ jent_rct_insert(ec, 1);
+ return 1;
+ }
+
+ /* RCT with a non-stuck bit */
+ jent_rct_insert(ec, 0);
+
+ return 0;
+}
+
+/**
+ * Report any health test failures
+ *
+ * @ec [in] Reference to entropy collector
+ *
+ * @return
+ * 0 No health test failure
+ * 1 Permanent health test failure
+ */
+static int jent_health_failure(struct rand_data *ec)
+{
+ /* Test is only enabled in FIPS mode */
+ if (!jent_fips_enabled())
+ return 0;
+
+ return ec->health_failure;
+}
+
+/***************************************************************************
+ * Noise sources
+ ***************************************************************************/
/**
* Update of the loop count used for the next round of
return (shuffle + (1<<min));
}
-/***************************************************************************
- * Noise sources
- ***************************************************************************/
-
/**
* CPU Jitter noise source -- this is the noise source based on the CPU
* execution time jitter
* the CPU execution time jitter. Any change to the loop in this function
* implies that careful retesting must be done.
*
- * Input:
- * @ec entropy collector struct
- * @time time stamp to be injected
- * @loop_cnt if a value not equal to 0 is set, use the given value as number of
- * loops to perform the folding
+ * @ec [in] entropy collector struct
+ * @time [in] time stamp to be injected
+ * @loop_cnt [in] if a value not equal to 0 is set, use the given value as
+ * number of loops to perform the folding
+ * @stuck [in] Is the time stamp identified as stuck?
*
* Output:
* updated ec->data
*
* @return Number of loops the folding operation is performed
*/
-static __u64 jent_lfsr_time(struct rand_data *ec, __u64 time, __u64 loop_cnt)
+static void jent_lfsr_time(struct rand_data *ec, __u64 time, __u64 loop_cnt,
+ int stuck)
{
unsigned int i;
__u64 j = 0;
new ^= tmp;
}
}
- ec->data = new;
- return fold_loop_cnt;
+ /*
+ * If the time stamp is stuck, do not finally insert the value into
+ * the entropy pool. Although this operation should not do any harm
+ * even when the time stamp has no entropy, SP800-90B requires that
+ * any conditioning operation (SP800-90B considers the LFSR to be a
+ * conditioning operation) to have an identical amount of input
+ * data according to section 3.1.5.
+ */
+ if (!stuck)
+ ec->data = new;
}
/**
* to reliably access either L3 or memory, the ec->mem memory must be quite
* large which is usually not desirable.
*
- * Input:
- * @ec Reference to the entropy collector with the memory access data -- if
- * the reference to the memory block to be accessed is NULL, this noise
- * source is disabled
- * @loop_cnt if a value not equal to 0 is set, use the given value as number of
- * loops to perform the folding
- *
- * @return Number of memory access operations
+ * @ec [in] Reference to the entropy collector with the memory access data -- if
+ * the reference to the memory block to be accessed is NULL, this noise
+ * source is disabled
+ * @loop_cnt [in] if a value not equal to 0 is set, use the given value
+ * number of loops to perform the LFSR
*/
-static unsigned int jent_memaccess(struct rand_data *ec, __u64 loop_cnt)
+static void jent_memaccess(struct rand_data *ec, __u64 loop_cnt)
{
unsigned int wrap = 0;
__u64 i = 0;
jent_loop_shuffle(ec, MAX_ACC_LOOP_BIT, MIN_ACC_LOOP_BIT);
if (NULL == ec || NULL == ec->mem)
- return 0;
+ return;
wrap = ec->memblocksize * ec->memblocks;
/*
ec->memlocation = ec->memlocation + ec->memblocksize - 1;
ec->memlocation = ec->memlocation % wrap;
}
- return i;
}
/***************************************************************************
* Start of entropy processing logic
***************************************************************************/
-
-/**
- * Stuck test by checking the:
- * 1st derivation of the jitter measurement (time delta)
- * 2nd derivation of the jitter measurement (delta of time deltas)
- * 3rd derivation of the jitter measurement (delta of delta of time deltas)
- *
- * All values must always be non-zero.
- *
- * Input:
- * @ec Reference to entropy collector
- * @current_delta Jitter time delta
- *
- * @return
- * 0 jitter measurement not stuck (good bit)
- * 1 jitter measurement stuck (reject bit)
- */
-static int jent_stuck(struct rand_data *ec, __u64 current_delta)
-{
- __s64 delta2 = ec->last_delta - current_delta;
- __s64 delta3 = delta2 - ec->last_delta2;
-
- ec->last_delta = current_delta;
- ec->last_delta2 = delta2;
-
- if (!current_delta || !delta2 || !delta3)
- return 1;
-
- return 0;
-}
-
/**
* This is the heart of the entropy generation: calculate time deltas and
* use the CPU jitter in the time deltas. The jitter is injected into the
* of this function! This can be done by calling this function
* and not using its result.
*
- * Input:
- * @entropy_collector Reference to entropy collector
+ * @ec [in] Reference to entropy collector
*
* @return result of stuck test
*/
{
__u64 time = 0;
__u64 current_delta = 0;
+ int stuck;
/* Invoke one noise source before time measurement to add variations */
jent_memaccess(ec, 0);
* invocation to measure the timing variations
*/
jent_get_nstime(&time);
- current_delta = time - ec->prev_time;
+ current_delta = jent_delta(ec->prev_time, time);
ec->prev_time = time;
+ /* Check whether we have a stuck measurement. */
+ stuck = jent_stuck(ec, current_delta);
+
/* Now call the next noise sources which also injects the data */
- jent_lfsr_time(ec, current_delta, 0);
+ jent_lfsr_time(ec, current_delta, 0, stuck);
- /* Check whether we have a stuck measurement. */
- return jent_stuck(ec, current_delta);
+ return stuck;
}
/**
* Generator of one 64 bit random number
* Function fills rand_data->data
*
- * Input:
- * @ec Reference to entropy collector
+ * @ec [in] Reference to entropy collector
*/
static void jent_gen_entropy(struct rand_data *ec)
{
}
}
-/**
- * The continuous test required by FIPS 140-2 -- the function automatically
- * primes the test if needed.
- *
- * Return:
- * returns normally if FIPS test passed
- * panics the kernel if FIPS test failed
- */
-static void jent_fips_test(struct rand_data *ec)
-{
- if (!jent_fips_enabled())
- return;
-
- /* prime the FIPS test */
- if (!ec->old_data) {
- ec->old_data = ec->data;
- jent_gen_entropy(ec);
- }
-
- if (ec->data == ec->old_data)
- jent_panic("jitterentropy: Duplicate output detected\n");
-
- ec->old_data = ec->data;
-}
-
/**
* Entry function: Obtain entropy for the caller.
*
* This function truncates the last 64 bit entropy value output to the exact
* size specified by the caller.
*
- * Input:
- * @ec Reference to entropy collector
- * @data pointer to buffer for storing random data -- buffer must already
- * exist
- * @len size of the buffer, specifying also the requested number of random
- * in bytes
+ * @ec [in] Reference to entropy collector
+ * @data [in] pointer to buffer for storing random data -- buffer must already
+ * exist
+ * @len [in] size of the buffer, specifying also the requested number of random
+ * in bytes
*
* @return 0 when request is fulfilled or an error
*
* The following error codes can occur:
* -1 entropy_collector is NULL
+ * -2 RCT failed
+ * -3 APT test failed
*/
int jent_read_entropy(struct rand_data *ec, unsigned char *data,
unsigned int len)
unsigned int tocopy;
jent_gen_entropy(ec);
- jent_fips_test(ec);
+
+ if (jent_health_failure(ec)) {
+ int ret;
+
+ if (jent_rct_failure(ec))
+ ret = -2;
+ else
+ ret = -3;
+
+ /*
+ * Re-initialize the noise source
+ *
+ * If the health test fails, the Jitter RNG remains
+ * in failure state and will return a health failure
+ * during next invocation.
+ */
+ if (jent_entropy_init())
+ return ret;
+
+ /* Set APT to initial state */
+ jent_apt_reset(ec, 0);
+ ec->apt_base_set = 0;
+
+ /* Set RCT to initial state */
+ ec->rct_count = 0;
+
+ /* Re-enable Jitter RNG */
+ ec->health_failure = 0;
+
+ /*
+ * Return the health test failure status to the
+ * caller as the generated value is not appropriate.
+ */
+ return ret;
+ }
+
if ((DATA_SIZE_BITS / 8) < len)
tocopy = (DATA_SIZE_BITS / 8);
else
int i;
__u64 delta_sum = 0;
__u64 old_delta = 0;
+ unsigned int nonstuck = 0;
int time_backwards = 0;
int count_mod = 0;
int count_stuck = 0;
struct rand_data ec = { 0 };
+ /* Required for RCT */
+ ec.osr = 1;
+
/* We could perform statistical tests here, but the problem is
* that we only have a few loop counts to do testing. These
* loop counts may show some slight skew and we produce
/*
* TESTLOOPCOUNT needs some loops to identify edge systems. 100 is
* definitely too little.
+ *
+ * SP800-90B requires at least 1024 initial test cycles.
*/
-#define TESTLOOPCOUNT 300
+#define TESTLOOPCOUNT 1024
#define CLEARCACHE 100
for (i = 0; (TESTLOOPCOUNT + CLEARCACHE) > i; i++) {
__u64 time = 0;
/* Invoke core entropy collection logic */
jent_get_nstime(&time);
ec.prev_time = time;
- jent_lfsr_time(&ec, time, 0);
+ jent_lfsr_time(&ec, time, 0, 0);
jent_get_nstime(&time2);
/* test whether timer works */
if (!time || !time2)
return JENT_ENOTIME;
- delta = time2 - time;
+ delta = jent_delta(time, time2);
/*
* test whether timer is fine grained enough to provide
* delta even when called shortly after each other -- this
if (stuck)
count_stuck++;
+ else {
+ nonstuck++;
+
+ /*
+ * Ensure that the APT succeeded.
+ *
+ * With the check below that count_stuck must be less
+ * than 10% of the overall generated raw entropy values
+ * it is guaranteed that the APT is invoked at
+ * floor((TESTLOOPCOUNT * 0.9) / 64) == 14 times.
+ */
+ if ((nonstuck % JENT_APT_WINDOW_SIZE) == 0) {
+ jent_apt_reset(&ec,
+ delta & JENT_APT_WORD_MASK);
+ if (jent_health_failure(&ec))
+ return JENT_EHEALTH;
+ }
+ }
+
+ /* Validate RCT */
+ if (jent_rct_failure(&ec))
+ return JENT_ERCT;
/* test whether we have an increasing timer */
if (!(time2 > time))
crypto_free_skcipher(ctx->child);
}
-static void free_inst(struct skcipher_instance *inst)
+static void crypto_lrw_free(struct skcipher_instance *inst)
{
crypto_drop_skcipher(skcipher_instance_ctx(inst));
kfree(inst);
inst->alg.encrypt = encrypt;
inst->alg.decrypt = decrypt;
- inst->free = free_inst;
+ inst->free = crypto_lrw_free;
err = skcipher_register_instance(tmpl, inst);
if (err) {
err_free_inst:
- free_inst(inst);
+ crypto_lrw_free(inst);
}
return err;
}
#include <linux/init.h>
#include <linux/module.h>
#include <linux/mm.h>
-#include <linux/cryptohash.h>
#include <linux/types.h>
#include <crypto/sha.h>
#include <crypto/sha1_base.h>
static void sha1_generic_block_fn(struct sha1_state *sst, u8 const *src,
int blocks)
{
- u32 temp[SHA_WORKSPACE_WORDS];
+ u32 temp[SHA1_WORKSPACE_WORDS];
while (blocks--) {
- sha_transform(sst->state, src, temp);
+ sha1_transform(sst->state, src, temp);
src += SHA1_BLOCK_SIZE;
}
memzero_explicit(temp, sizeof(temp));
static int crypto_sha256_init(struct shash_desc *desc)
{
- return sha256_init(shash_desc_ctx(desc));
+ sha256_init(shash_desc_ctx(desc));
+ return 0;
}
static int crypto_sha224_init(struct shash_desc *desc)
{
- return sha224_init(shash_desc_ctx(desc));
+ sha224_init(shash_desc_ctx(desc));
+ return 0;
}
int crypto_sha256_update(struct shash_desc *desc, const u8 *data,
unsigned int len)
{
- return sha256_update(shash_desc_ctx(desc), data, len);
+ sha256_update(shash_desc_ctx(desc), data, len);
+ return 0;
}
EXPORT_SYMBOL(crypto_sha256_update);
static int crypto_sha256_final(struct shash_desc *desc, u8 *out)
{
if (crypto_shash_digestsize(desc->tfm) == SHA224_DIGEST_SIZE)
- return sha224_final(shash_desc_ctx(desc), out);
+ sha224_final(shash_desc_ctx(desc), out);
else
- return sha256_final(shash_desc_ctx(desc), out);
+ sha256_final(shash_desc_ctx(desc), out);
+ return 0;
}
int crypto_sha256_finup(struct shash_desc *desc, const u8 *data,
}
EXPORT_SYMBOL_GPL(crypto_shash_digest);
+int crypto_shash_tfm_digest(struct crypto_shash *tfm, const u8 *data,
+ unsigned int len, u8 *out)
+{
+ SHASH_DESC_ON_STACK(desc, tfm);
+ int err;
+
+ desc->tfm = tfm;
+
+ err = crypto_shash_digest(desc, data, len, out);
+
+ shash_desc_zero(desc);
+
+ return err;
+}
+EXPORT_SYMBOL_GPL(crypto_shash_tfm_digest);
+
static int shash_default_export(struct shash_desc *desc, void *out)
{
memcpy(out, shash_desc_ctx(desc), crypto_shash_descsize(desc->tfm));
crypto_free_cipher(ctx->tweak);
}
-static void free_inst(struct skcipher_instance *inst)
+static void crypto_xts_free(struct skcipher_instance *inst)
{
crypto_drop_skcipher(skcipher_instance_ctx(inst));
kfree(inst);
inst->alg.encrypt = encrypt;
inst->alg.decrypt = decrypt;
- inst->free = free_inst;
+ inst->free = crypto_xts_free;
err = skcipher_register_instance(tmpl, inst);
if (err) {
err_free_inst:
- free_inst(inst);
+ crypto_xts_free(inst);
}
return err;
}
acpi_set_gpe_wake_mask(NULL, first_ec->gpe, action);
}
-bool acpi_ec_other_gpes_active(void)
-{
- return acpi_any_gpe_status_set(first_ec ? first_ec->gpe : U32_MAX);
-}
-
bool acpi_ec_dispatch_gpe(void)
{
u32 ret;
if (!first_ec)
+ return acpi_any_gpe_status_set(U32_MAX);
+
+ /*
+ * Report wakeup if the status bit is set for any enabled GPE other
+ * than the EC one.
+ */
+ if (acpi_any_gpe_status_set(first_ec->gpe))
+ return true;
+
+ if (ec_no_wakeup)
return false;
+ /*
+ * Dispatch the EC GPE in-band, but do not report wakeup in any case
+ * to allow the caller to process events properly after that.
+ */
ret = acpi_dispatch_gpe(NULL, first_ec->gpe);
if (ret == ACPI_INTERRUPT_HANDLED) {
pm_pr_dbg("EC GPE dispatched\n");
- return true;
+
+ /* Flush the event and query workqueues. */
+ acpi_ec_flush_work();
}
+
return false;
}
#endif /* CONFIG_PM_SLEEP */
#ifdef CONFIG_PM_SLEEP
void acpi_ec_flush_work(void);
-bool acpi_ec_other_gpes_active(void);
bool acpi_ec_dispatch_gpe(void);
#endif
return 0;
}
-static void acpi_s2idle_sync(void)
-{
- /* The EC driver uses special workqueues that need to be flushed. */
- acpi_ec_flush_work();
- acpi_os_wait_events_complete(); /* synchronize Notify handling */
-}
-
static bool acpi_s2idle_wake(void)
{
if (!acpi_sci_irq_valid())
if (acpi_check_wakeup_handlers())
return true;
- /*
- * If the status bit is set for any enabled GPE other than the
- * EC one, the wakeup is regarded as a genuine one.
- */
- if (acpi_ec_other_gpes_active())
+ /* Check non-EC GPE wakeups and dispatch the EC GPE. */
+ if (acpi_ec_dispatch_gpe())
return true;
/*
- * If the EC GPE status bit has not been set, the wakeup is
- * regarded as a spurious one.
- */
- if (!acpi_ec_dispatch_gpe())
- return false;
-
- /*
- * Cancel the wakeup and process all pending events in case
+ * Cancel the SCI wakeup and process all pending events in case
* there are any wakeup ones in there.
*
* Note that if any non-EC GPEs are active at this point, the
* should be missed by canceling the wakeup here.
*/
pm_system_cancel_wakeup();
-
- acpi_s2idle_sync();
+ acpi_os_wait_events_complete();
/*
* The SCI is in the "suspended" state now and it cannot produce
* of GPEs.
*/
acpi_os_wait_events_complete(); /* synchronize GPE processing */
- acpi_s2idle_sync();
+ acpi_ec_flush_work(); /* flush the EC driver's workqueues */
+ acpi_os_wait_events_complete(); /* synchronize Notify handling */
s2idle_wakeup = false;
link->flags |= DL_FLAG_STATELESS;
goto reorder;
} else {
+ link->flags |= DL_FLAG_STATELESS;
goto out;
}
}
flags & DL_FLAG_PM_RUNTIME)
pm_runtime_resume(supplier);
+ list_add_tail_rcu(&link->s_node, &supplier->links.consumers);
+ list_add_tail_rcu(&link->c_node, &consumer->links.suppliers);
+
if (flags & DL_FLAG_SYNC_STATE_ONLY) {
dev_dbg(consumer,
"Linked as a sync state only consumer to %s\n",
dev_name(supplier));
goto out;
}
+
reorder:
/*
* Move the consumer and all of the devices depending on it to the end
*/
device_reorder_to_tail(consumer, NULL);
- list_add_tail_rcu(&link->s_node, &supplier->links.consumers);
- list_add_tail_rcu(&link->c_node, &consumer->links.suppliers);
-
dev_dbg(consumer, "Linked as a consumer to %s\n", dev_name(supplier));
- out:
+out:
device_pm_unlock();
device_links_write_unlock();
list_add_tail(&sup->links.defer_sync, &deferred_sync);
}
+static void device_link_drop_managed(struct device_link *link)
+{
+ link->flags &= ~DL_FLAG_MANAGED;
+ WRITE_ONCE(link->status, DL_STATE_NONE);
+ kref_put(&link->kref, __device_link_del);
+}
+
/**
* device_links_driver_bound - Update device links after probing its driver.
* @dev: Device to update the links for.
*/
void device_links_driver_bound(struct device *dev)
{
- struct device_link *link;
+ struct device_link *link, *ln;
LIST_HEAD(sync_list);
/*
else
__device_links_queue_sync_state(dev, &sync_list);
- list_for_each_entry(link, &dev->links.suppliers, c_node) {
+ list_for_each_entry_safe(link, ln, &dev->links.suppliers, c_node) {
+ struct device *supplier;
+
if (!(link->flags & DL_FLAG_MANAGED))
continue;
- WARN_ON(link->status != DL_STATE_CONSUMER_PROBE);
- WRITE_ONCE(link->status, DL_STATE_ACTIVE);
+ supplier = link->supplier;
+ if (link->flags & DL_FLAG_SYNC_STATE_ONLY) {
+ /*
+ * When DL_FLAG_SYNC_STATE_ONLY is set, it means no
+ * other DL_MANAGED_LINK_FLAGS have been set. So, it's
+ * save to drop the managed link completely.
+ */
+ device_link_drop_managed(link);
+ } else {
+ WARN_ON(link->status != DL_STATE_CONSUMER_PROBE);
+ WRITE_ONCE(link->status, DL_STATE_ACTIVE);
+ }
+ /*
+ * This needs to be done even for the deleted
+ * DL_FLAG_SYNC_STATE_ONLY device link in case it was the last
+ * device link that was preventing the supplier from getting a
+ * sync_state() call.
+ */
if (defer_sync_state_count)
- __device_links_supplier_defer_sync(link->supplier);
+ __device_links_supplier_defer_sync(supplier);
else
- __device_links_queue_sync_state(link->supplier,
- &sync_list);
+ __device_links_queue_sync_state(supplier, &sync_list);
}
dev->links.status = DL_DEV_DRIVER_BOUND;
device_links_flush_sync_list(&sync_list, dev);
}
-static void device_link_drop_managed(struct device_link *link)
-{
- link->flags &= ~DL_FLAG_MANAGED;
- WRITE_ONCE(link->status, DL_STATE_NONE);
- kref_put(&link->kref, __device_link_del);
-}
-
/**
* __device_links_no_driver - Update links of a device without a driver.
* @dev: Device without a drvier.
if (*ppos < 0 || !count)
return -EINVAL;
+ if (count > (PAGE_SIZE << (MAX_ORDER - 1)))
+ count = PAGE_SIZE << (MAX_ORDER - 1);
+
buf = kmalloc(count, GFP_KERNEL);
if (!buf)
return -ENOMEM;
if (*ppos < 0 || !count)
return -EINVAL;
+ if (count > (PAGE_SIZE << (MAX_ORDER - 1)))
+ count = PAGE_SIZE << (MAX_ORDER - 1);
+
buf = kmalloc(count, GFP_KERNEL);
if (!buf)
return -ENOMEM;
.max_raw_write = I2C_SMBUS_BLOCK_MAX,
};
+static int regmap_i2c_smbus_i2c_write_reg16(void *context, const void *data,
+ size_t count)
+{
+ struct device *dev = context;
+ struct i2c_client *i2c = to_i2c_client(dev);
+
+ if (count < 2)
+ return -EINVAL;
+
+ count--;
+ return i2c_smbus_write_i2c_block_data(i2c, ((u8 *)data)[0], count,
+ (u8 *)data + 1);
+}
+
+static int regmap_i2c_smbus_i2c_read_reg16(void *context, const void *reg,
+ size_t reg_size, void *val,
+ size_t val_size)
+{
+ struct device *dev = context;
+ struct i2c_client *i2c = to_i2c_client(dev);
+ int ret, count, len = val_size;
+
+ if (reg_size != 2)
+ return -EINVAL;
+
+ ret = i2c_smbus_write_byte_data(i2c, ((u16 *)reg)[0] & 0xff,
+ ((u16 *)reg)[0] >> 8);
+ if (ret < 0)
+ return ret;
+
+ count = 0;
+ do {
+ /* Current Address Read */
+ ret = i2c_smbus_read_byte(i2c);
+ if (ret < 0)
+ break;
+
+ *((u8 *)val++) = ret;
+ count++;
+ len--;
+ } while (len > 0);
+
+ if (count == val_size)
+ return 0;
+ else if (ret < 0)
+ return ret;
+ else
+ return -EIO;
+}
+
+static const struct regmap_bus regmap_i2c_smbus_i2c_block_reg16 = {
+ .write = regmap_i2c_smbus_i2c_write_reg16,
+ .read = regmap_i2c_smbus_i2c_read_reg16,
+ .max_raw_read = I2C_SMBUS_BLOCK_MAX,
+ .max_raw_write = I2C_SMBUS_BLOCK_MAX,
+};
+
static const struct regmap_bus *regmap_get_i2c_bus(struct i2c_client *i2c,
const struct regmap_config *config)
{
i2c_check_functionality(i2c->adapter,
I2C_FUNC_SMBUS_I2C_BLOCK))
return ®map_i2c_smbus_i2c_block;
+ else if (config->val_bits == 8 && config->reg_bits == 16 &&
+ i2c_check_functionality(i2c->adapter,
+ I2C_FUNC_SMBUS_I2C_BLOCK))
+ return ®map_i2c_smbus_i2c_block_reg16;
else if (config->val_bits == 16 && config->reg_bits == 8 &&
i2c_check_functionality(i2c->adapter,
I2C_FUNC_SMBUS_WORD_DATA))
};
/**
- * regmap_add_irq_chip() - Use standard regmap IRQ controller handling
+ * regmap_add_irq_chip_np() - Use standard regmap IRQ controller handling
*
+ * @np: The device_node where the IRQ domain should be added to.
* @map: The regmap for the device.
* @irq: The IRQ the device uses to signal interrupts.
* @irq_flags: The IRQF_ flags to use for the primary interrupt.
* register cache. The chip driver is responsible for restoring the
* register values used by the IRQ controller over suspend and resume.
*/
-int regmap_add_irq_chip(struct regmap *map, int irq, int irq_flags,
- int irq_base, const struct regmap_irq_chip *chip,
- struct regmap_irq_chip_data **data)
+int regmap_add_irq_chip_np(struct device_node *np, struct regmap *map, int irq,
+ int irq_flags, int irq_base,
+ const struct regmap_irq_chip *chip,
+ struct regmap_irq_chip_data **data)
{
struct regmap_irq_chip_data *d;
int i;
}
if (irq_base)
- d->domain = irq_domain_add_legacy(map->dev->of_node,
- chip->num_irqs, irq_base, 0,
- ®map_domain_ops, d);
+ d->domain = irq_domain_add_legacy(np, chip->num_irqs, irq_base,
+ 0, ®map_domain_ops, d);
else
- d->domain = irq_domain_add_linear(map->dev->of_node,
- chip->num_irqs,
+ d->domain = irq_domain_add_linear(np, chip->num_irqs,
®map_domain_ops, d);
if (!d->domain) {
dev_err(map->dev, "Failed to create IRQ domain\n");
kfree(d);
return ret;
}
+EXPORT_SYMBOL_GPL(regmap_add_irq_chip_np);
+
+/**
+ * regmap_add_irq_chip() - Use standard regmap IRQ controller handling
+ *
+ * @map: The regmap for the device.
+ * @irq: The IRQ the device uses to signal interrupts.
+ * @irq_flags: The IRQF_ flags to use for the primary interrupt.
+ * @irq_base: Allocate at specific IRQ number if irq_base > 0.
+ * @chip: Configuration for the interrupt controller.
+ * @data: Runtime data structure for the controller, allocated on success.
+ *
+ * Returns 0 on success or an errno on failure.
+ *
+ * This is the same as regmap_add_irq_chip_np, except that the device
+ * node of the regmap is used.
+ */
+int regmap_add_irq_chip(struct regmap *map, int irq, int irq_flags,
+ int irq_base, const struct regmap_irq_chip *chip,
+ struct regmap_irq_chip_data **data)
+{
+ return regmap_add_irq_chip_np(map->dev->of_node, map, irq, irq_flags,
+ irq_base, chip, data);
+}
EXPORT_SYMBOL_GPL(regmap_add_irq_chip);
/**
}
/**
- * devm_regmap_add_irq_chip() - Resource manager regmap_add_irq_chip()
+ * devm_regmap_add_irq_chip_np() - Resource manager regmap_add_irq_chip_np()
*
* @dev: The device pointer on which irq_chip belongs to.
+ * @np: The device_node where the IRQ domain should be added to.
* @map: The regmap for the device.
* @irq: The IRQ the device uses to signal interrupts
* @irq_flags: The IRQF_ flags to use for the primary interrupt.
* The ®map_irq_chip_data will be automatically released when the device is
* unbound.
*/
-int devm_regmap_add_irq_chip(struct device *dev, struct regmap *map, int irq,
- int irq_flags, int irq_base,
- const struct regmap_irq_chip *chip,
- struct regmap_irq_chip_data **data)
+int devm_regmap_add_irq_chip_np(struct device *dev, struct device_node *np,
+ struct regmap *map, int irq, int irq_flags,
+ int irq_base,
+ const struct regmap_irq_chip *chip,
+ struct regmap_irq_chip_data **data)
{
struct regmap_irq_chip_data **ptr, *d;
int ret;
if (!ptr)
return -ENOMEM;
- ret = regmap_add_irq_chip(map, irq, irq_flags, irq_base,
- chip, &d);
+ ret = regmap_add_irq_chip_np(np, map, irq, irq_flags, irq_base,
+ chip, &d);
if (ret < 0) {
devres_free(ptr);
return ret;
*data = d;
return 0;
}
+EXPORT_SYMBOL_GPL(devm_regmap_add_irq_chip_np);
+
+/**
+ * devm_regmap_add_irq_chip() - Resource manager regmap_add_irq_chip()
+ *
+ * @dev: The device pointer on which irq_chip belongs to.
+ * @map: The regmap for the device.
+ * @irq: The IRQ the device uses to signal interrupts
+ * @irq_flags: The IRQF_ flags to use for the primary interrupt.
+ * @irq_base: Allocate at specific IRQ number if irq_base > 0.
+ * @chip: Configuration for the interrupt controller.
+ * @data: Runtime data structure for the controller, allocated on success
+ *
+ * Returns 0 on success or an errno on failure.
+ *
+ * The ®map_irq_chip_data will be automatically released when the device is
+ * unbound.
+ */
+int devm_regmap_add_irq_chip(struct device *dev, struct regmap *map, int irq,
+ int irq_flags, int irq_base,
+ const struct regmap_irq_chip *chip,
+ struct regmap_irq_chip_data **data)
+{
+ return devm_regmap_add_irq_chip_np(dev, map->dev->of_node, map, irq,
+ irq_flags, irq_base, chip, data);
+}
EXPORT_SYMBOL_GPL(devm_regmap_add_irq_chip);
/**
} else if (!bus->read || !bus->write) {
map->reg_read = _regmap_bus_reg_read;
map->reg_write = _regmap_bus_reg_write;
+ map->reg_update_bits = bus->reg_update_bits;
map->defer_caching = false;
goto skip_format_initialization;
}
EXPORT_SYMBOL_GPL(regmap_update_bits_base);
+/**
+ * regmap_test_bits() - Check if all specified bits are set in a register.
+ *
+ * @map: Register map to operate on
+ * @reg: Register to read from
+ * @bits: Bits to test
+ *
+ * Returns -1 if the underlying regmap_read() fails, 0 if at least one of the
+ * tested bits is not set and 1 if all tested bits are set.
+ */
+int regmap_test_bits(struct regmap *map, unsigned int reg, unsigned int bits)
+{
+ unsigned int val, ret;
+
+ ret = regmap_read(map, reg, &val);
+ if (ret)
+ return ret;
+
+ return (val & bits) == bits;
+}
+EXPORT_SYMBOL_GPL(regmap_test_bits);
+
void regmap_async_complete_cb(struct regmap_async *async, int ret)
{
struct regmap *map = async->map;
{
if (nullb->dev->discard == false)
return;
+
+ if (nullb->dev->zoned) {
+ nullb->dev->discard = false;
+ pr_info("discard option is ignored in zoned mode\n");
+ return;
+ }
+
nullb->q->limits.discard_granularity = nullb->dev->blocksize;
nullb->q->limits.discard_alignment = nullb->dev->blocksize;
blk_queue_max_discard_sectors(nullb->q, UINT_MAX >> 9);
pr_err("zone_size must be power-of-two\n");
return -EINVAL;
}
+ if (dev->zone_size > dev->size) {
+ pr_err("Zone size larger than device capacity\n");
+ return -EINVAL;
+ }
dev->zone_size_sects = dev->zone_size << ZONE_SIZE_SHIFT;
dev->nr_zones = dev_size >>
if (!IS_ERR_OR_NULL(zstrm->tfm))
crypto_free_comp(zstrm->tfm);
free_pages((unsigned long)zstrm->buffer, 1);
- kfree(zstrm);
+ zstrm->tfm = NULL;
+ zstrm->buffer = NULL;
}
/*
- * allocate new zcomp_strm structure with ->tfm initialized by
- * backend, return NULL on error
+ * Initialize zcomp_strm structure with ->tfm initialized by backend, and
+ * ->buffer. Return a negative value on error.
*/
-static struct zcomp_strm *zcomp_strm_alloc(struct zcomp *comp)
+static int zcomp_strm_init(struct zcomp_strm *zstrm, struct zcomp *comp)
{
- struct zcomp_strm *zstrm = kmalloc(sizeof(*zstrm), GFP_KERNEL);
- if (!zstrm)
- return NULL;
-
zstrm->tfm = crypto_alloc_comp(comp->name, 0, 0);
/*
* allocate 2 pages. 1 for compressed data, plus 1 extra for the
zstrm->buffer = (void *)__get_free_pages(GFP_KERNEL | __GFP_ZERO, 1);
if (IS_ERR_OR_NULL(zstrm->tfm) || !zstrm->buffer) {
zcomp_strm_free(zstrm);
- zstrm = NULL;
+ return -ENOMEM;
}
- return zstrm;
+ return 0;
}
bool zcomp_available_algorithm(const char *comp)
struct zcomp_strm *zcomp_stream_get(struct zcomp *comp)
{
- return *get_cpu_ptr(comp->stream);
+ local_lock(&comp->stream->lock);
+ return this_cpu_ptr(comp->stream);
}
void zcomp_stream_put(struct zcomp *comp)
{
- put_cpu_ptr(comp->stream);
+ local_unlock(&comp->stream->lock);
}
int zcomp_compress(struct zcomp_strm *zstrm,
{
struct zcomp *comp = hlist_entry(node, struct zcomp, node);
struct zcomp_strm *zstrm;
+ int ret;
- if (WARN_ON(*per_cpu_ptr(comp->stream, cpu)))
- return 0;
+ zstrm = per_cpu_ptr(comp->stream, cpu);
+ local_lock_init(&zstrm->lock);
- zstrm = zcomp_strm_alloc(comp);
- if (IS_ERR_OR_NULL(zstrm)) {
+ ret = zcomp_strm_init(zstrm, comp);
+ if (ret)
pr_err("Can't allocate a compression stream\n");
- return -ENOMEM;
- }
- *per_cpu_ptr(comp->stream, cpu) = zstrm;
- return 0;
+ return ret;
}
int zcomp_cpu_dead(unsigned int cpu, struct hlist_node *node)
struct zcomp *comp = hlist_entry(node, struct zcomp, node);
struct zcomp_strm *zstrm;
- zstrm = *per_cpu_ptr(comp->stream, cpu);
- if (!IS_ERR_OR_NULL(zstrm))
- zcomp_strm_free(zstrm);
- *per_cpu_ptr(comp->stream, cpu) = NULL;
+ zstrm = per_cpu_ptr(comp->stream, cpu);
+ zcomp_strm_free(zstrm);
return 0;
}
{
int ret;
- comp->stream = alloc_percpu(struct zcomp_strm *);
+ comp->stream = alloc_percpu(struct zcomp_strm);
if (!comp->stream)
return -ENOMEM;
#ifndef _ZCOMP_H_
#define _ZCOMP_H_
+#include <linux/local_lock.h>
struct zcomp_strm {
+ /* The members ->buffer and ->tfm are protected by ->lock. */
+ local_lock_t lock;
/* compression/decompression buffer */
void *buffer;
struct crypto_comp *tfm;
/* dynamic per-device compression frontend */
struct zcomp {
- struct zcomp_strm * __percpu *stream;
+ struct zcomp_strm __percpu *stream;
const char *name;
struct hlist_node node;
};
}
/* Setup cmd context */
+ ret = -ENOMEM;
mhi_ctxt->cmd_ctxt = mhi_alloc_coherent(mhi_cntrl,
sizeof(*mhi_ctxt->cmd_ctxt) *
NR_OF_CMD_RINGS,
}
}
+ ret = -EINVAL;
if (dl_chan) {
/*
* If channel supports LPM notifications then status_cb should
help
This option enables Keystone's hardware random generator.
+config HW_RANDOM_CCTRNG
+ tristate "Arm CryptoCell True Random Number Generator support"
+ depends on HAS_IOMEM && OF
+ help
+ Say 'Y' to enable the True Random Number Generator driver for the
+ Arm TrustZone CryptoCell family of processors.
+ Currently the CryptoCell 713 and 703 are supported.
+ The driver is supported only in SoC where Trusted Execution
+ Environment is not used.
+ Choose 'M' to compile this driver as a module. The module
+ will be called cctrng.
+ If unsure, say 'N'.
+
endif # HW_RANDOM
config UML_RANDOM
obj-$(CONFIG_HW_RANDOM_KEYSTONE) += ks-sa-rng.o
obj-$(CONFIG_HW_RANDOM_OPTEE) += optee-rng.o
obj-$(CONFIG_HW_RANDOM_NPCM) += npcm-rng.o
+obj-$(CONFIG_HW_RANDOM_CCTRNG) += cctrng.o
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0
+/* Copyright (C) 2019-2020 ARM Limited or its affiliates. */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/clk.h>
+#include <linux/hw_random.h>
+#include <linux/io.h>
+#include <linux/platform_device.h>
+#include <linux/pm_runtime.h>
+#include <linux/interrupt.h>
+#include <linux/irqreturn.h>
+#include <linux/workqueue.h>
+#include <linux/circ_buf.h>
+#include <linux/completion.h>
+#include <linux/of.h>
+#include <linux/bitfield.h>
+#include <linux/fips.h>
+
+#include "cctrng.h"
+
+#define CC_REG_LOW(name) (name ## _BIT_SHIFT)
+#define CC_REG_HIGH(name) (CC_REG_LOW(name) + name ## _BIT_SIZE - 1)
+#define CC_GENMASK(name) GENMASK(CC_REG_HIGH(name), CC_REG_LOW(name))
+
+#define CC_REG_FLD_GET(reg_name, fld_name, reg_val) \
+ (FIELD_GET(CC_GENMASK(CC_ ## reg_name ## _ ## fld_name), reg_val))
+
+#define CC_HW_RESET_LOOP_COUNT 10
+#define CC_TRNG_SUSPEND_TIMEOUT 3000
+
+/* data circular buffer in words must be:
+ * - of a power-of-2 size (limitation of circ_buf.h macros)
+ * - at least 6, the size generated in the EHR according to HW implementation
+ */
+#define CCTRNG_DATA_BUF_WORDS 32
+
+/* The timeout for the TRNG operation should be calculated with the formula:
+ * Timeout = EHR_NUM * VN_COEFF * EHR_LENGTH * SAMPLE_CNT * SCALE_VALUE
+ * while:
+ * - SAMPLE_CNT is input value from the characterisation process
+ * - all the rest are constants
+ */
+#define EHR_NUM 1
+#define VN_COEFF 4
+#define EHR_LENGTH CC_TRNG_EHR_IN_BITS
+#define SCALE_VALUE 2
+#define CCTRNG_TIMEOUT(smpl_cnt) \
+ (EHR_NUM * VN_COEFF * EHR_LENGTH * smpl_cnt * SCALE_VALUE)
+
+struct cctrng_drvdata {
+ struct platform_device *pdev;
+ void __iomem *cc_base;
+ struct clk *clk;
+ struct hwrng rng;
+ u32 active_rosc;
+ /* Sampling interval for each ring oscillator:
+ * count of ring oscillator cycles between consecutive bits sampling.
+ * Value of 0 indicates non-valid rosc
+ */
+ u32 smpl_ratio[CC_TRNG_NUM_OF_ROSCS];
+
+ u32 data_buf[CCTRNG_DATA_BUF_WORDS];
+ struct circ_buf circ;
+ struct work_struct compwork;
+ struct work_struct startwork;
+
+ /* pending_hw - 1 when HW is pending, 0 when it is idle */
+ atomic_t pending_hw;
+
+ /* protects against multiple concurrent consumers of data_buf */
+ spinlock_t read_lock;
+};
+
+
+/* functions for write/read CC registers */
+static inline void cc_iowrite(struct cctrng_drvdata *drvdata, u32 reg, u32 val)
+{
+ iowrite32(val, (drvdata->cc_base + reg));
+}
+static inline u32 cc_ioread(struct cctrng_drvdata *drvdata, u32 reg)
+{
+ return ioread32(drvdata->cc_base + reg);
+}
+
+
+static int cc_trng_pm_get(struct device *dev)
+{
+ int rc = 0;
+
+ rc = pm_runtime_get_sync(dev);
+
+ /* pm_runtime_get_sync() can return 1 as a valid return code */
+ return (rc == 1 ? 0 : rc);
+}
+
+static void cc_trng_pm_put_suspend(struct device *dev)
+{
+ int rc = 0;
+
+ pm_runtime_mark_last_busy(dev);
+ rc = pm_runtime_put_autosuspend(dev);
+ if (rc)
+ dev_err(dev, "pm_runtime_put_autosuspend returned %x\n", rc);
+}
+
+static int cc_trng_pm_init(struct cctrng_drvdata *drvdata)
+{
+ struct device *dev = &(drvdata->pdev->dev);
+
+ /* must be before the enabling to avoid redundant suspending */
+ pm_runtime_set_autosuspend_delay(dev, CC_TRNG_SUSPEND_TIMEOUT);
+ pm_runtime_use_autosuspend(dev);
+ /* set us as active - note we won't do PM ops until cc_trng_pm_go()! */
+ return pm_runtime_set_active(dev);
+}
+
+static void cc_trng_pm_go(struct cctrng_drvdata *drvdata)
+{
+ struct device *dev = &(drvdata->pdev->dev);
+
+ /* enable the PM module*/
+ pm_runtime_enable(dev);
+}
+
+static void cc_trng_pm_fini(struct cctrng_drvdata *drvdata)
+{
+ struct device *dev = &(drvdata->pdev->dev);
+
+ pm_runtime_disable(dev);
+}
+
+
+static inline int cc_trng_parse_sampling_ratio(struct cctrng_drvdata *drvdata)
+{
+ struct device *dev = &(drvdata->pdev->dev);
+ struct device_node *np = drvdata->pdev->dev.of_node;
+ int rc;
+ int i;
+ /* ret will be set to 0 if at least one rosc has (sampling ratio > 0) */
+ int ret = -EINVAL;
+
+ rc = of_property_read_u32_array(np, "arm,rosc-ratio",
+ drvdata->smpl_ratio,
+ CC_TRNG_NUM_OF_ROSCS);
+ if (rc) {
+ /* arm,rosc-ratio was not found in device tree */
+ return rc;
+ }
+
+ /* verify that at least one rosc has (sampling ratio > 0) */
+ for (i = 0; i < CC_TRNG_NUM_OF_ROSCS; ++i) {
+ dev_dbg(dev, "rosc %d sampling ratio %u",
+ i, drvdata->smpl_ratio[i]);
+
+ if (drvdata->smpl_ratio[i] > 0)
+ ret = 0;
+ }
+
+ return ret;
+}
+
+static int cc_trng_change_rosc(struct cctrng_drvdata *drvdata)
+{
+ struct device *dev = &(drvdata->pdev->dev);
+
+ dev_dbg(dev, "cctrng change rosc (was %d)\n", drvdata->active_rosc);
+ drvdata->active_rosc += 1;
+
+ while (drvdata->active_rosc < CC_TRNG_NUM_OF_ROSCS) {
+ if (drvdata->smpl_ratio[drvdata->active_rosc] > 0)
+ return 0;
+
+ drvdata->active_rosc += 1;
+ }
+ return -EINVAL;
+}
+
+
+static void cc_trng_enable_rnd_source(struct cctrng_drvdata *drvdata)
+{
+ u32 max_cycles;
+
+ /* Set watchdog threshold to maximal allowed time (in CPU cycles) */
+ max_cycles = CCTRNG_TIMEOUT(drvdata->smpl_ratio[drvdata->active_rosc]);
+ cc_iowrite(drvdata, CC_RNG_WATCHDOG_VAL_REG_OFFSET, max_cycles);
+
+ /* enable the RND source */
+ cc_iowrite(drvdata, CC_RND_SOURCE_ENABLE_REG_OFFSET, 0x1);
+
+ /* unmask RNG interrupts */
+ cc_iowrite(drvdata, CC_RNG_IMR_REG_OFFSET, (u32)~CC_RNG_INT_MASK);
+}
+
+
+/* increase circular data buffer index (head/tail) */
+static inline void circ_idx_inc(int *idx, int bytes)
+{
+ *idx += (bytes + 3) >> 2;
+ *idx &= (CCTRNG_DATA_BUF_WORDS - 1);
+}
+
+static inline size_t circ_buf_space(struct cctrng_drvdata *drvdata)
+{
+ return CIRC_SPACE(drvdata->circ.head,
+ drvdata->circ.tail, CCTRNG_DATA_BUF_WORDS);
+
+}
+
+static int cctrng_read(struct hwrng *rng, void *data, size_t max, bool wait)
+{
+ /* current implementation ignores "wait" */
+
+ struct cctrng_drvdata *drvdata = (struct cctrng_drvdata *)rng->priv;
+ struct device *dev = &(drvdata->pdev->dev);
+ u32 *buf = (u32 *)drvdata->circ.buf;
+ size_t copied = 0;
+ size_t cnt_w;
+ size_t size;
+ size_t left;
+
+ if (!spin_trylock(&drvdata->read_lock)) {
+ /* concurrent consumers from data_buf cannot be served */
+ dev_dbg_ratelimited(dev, "unable to hold lock\n");
+ return 0;
+ }
+
+ /* copy till end of data buffer (without wrap back) */
+ cnt_w = CIRC_CNT_TO_END(drvdata->circ.head,
+ drvdata->circ.tail, CCTRNG_DATA_BUF_WORDS);
+ size = min((cnt_w<<2), max);
+ memcpy(data, &(buf[drvdata->circ.tail]), size);
+ copied = size;
+ circ_idx_inc(&drvdata->circ.tail, size);
+ /* copy rest of data in data buffer */
+ left = max - copied;
+ if (left > 0) {
+ cnt_w = CIRC_CNT(drvdata->circ.head,
+ drvdata->circ.tail, CCTRNG_DATA_BUF_WORDS);
+ size = min((cnt_w<<2), left);
+ memcpy(data, &(buf[drvdata->circ.tail]), size);
+ copied += size;
+ circ_idx_inc(&drvdata->circ.tail, size);
+ }
+
+ spin_unlock(&drvdata->read_lock);
+
+ if (circ_buf_space(drvdata) >= CC_TRNG_EHR_IN_WORDS) {
+ if (atomic_cmpxchg(&drvdata->pending_hw, 0, 1) == 0) {
+ /* re-check space in buffer to avoid potential race */
+ if (circ_buf_space(drvdata) >= CC_TRNG_EHR_IN_WORDS) {
+ /* increment device's usage counter */
+ int rc = cc_trng_pm_get(dev);
+
+ if (rc) {
+ dev_err(dev,
+ "cc_trng_pm_get returned %x\n",
+ rc);
+ return rc;
+ }
+
+ /* schedule execution of deferred work handler
+ * for filling of data buffer
+ */
+ schedule_work(&drvdata->startwork);
+ } else {
+ atomic_set(&drvdata->pending_hw, 0);
+ }
+ }
+ }
+
+ return copied;
+}
+
+static void cc_trng_hw_trigger(struct cctrng_drvdata *drvdata)
+{
+ u32 tmp_smpl_cnt = 0;
+ struct device *dev = &(drvdata->pdev->dev);
+
+ dev_dbg(dev, "cctrng hw trigger.\n");
+
+ /* enable the HW RND clock */
+ cc_iowrite(drvdata, CC_RNG_CLK_ENABLE_REG_OFFSET, 0x1);
+
+ /* do software reset */
+ cc_iowrite(drvdata, CC_RNG_SW_RESET_REG_OFFSET, 0x1);
+ /* in order to verify that the reset has completed,
+ * the sample count need to be verified
+ */
+ do {
+ /* enable the HW RND clock */
+ cc_iowrite(drvdata, CC_RNG_CLK_ENABLE_REG_OFFSET, 0x1);
+
+ /* set sampling ratio (rng_clocks) between consecutive bits */
+ cc_iowrite(drvdata, CC_SAMPLE_CNT1_REG_OFFSET,
+ drvdata->smpl_ratio[drvdata->active_rosc]);
+
+ /* read the sampling ratio */
+ tmp_smpl_cnt = cc_ioread(drvdata, CC_SAMPLE_CNT1_REG_OFFSET);
+
+ } while (tmp_smpl_cnt != drvdata->smpl_ratio[drvdata->active_rosc]);
+
+ /* disable the RND source for setting new parameters in HW */
+ cc_iowrite(drvdata, CC_RND_SOURCE_ENABLE_REG_OFFSET, 0);
+
+ cc_iowrite(drvdata, CC_RNG_ICR_REG_OFFSET, 0xFFFFFFFF);
+
+ cc_iowrite(drvdata, CC_TRNG_CONFIG_REG_OFFSET, drvdata->active_rosc);
+
+ /* Debug Control register: set to 0 - no bypasses */
+ cc_iowrite(drvdata, CC_TRNG_DEBUG_CONTROL_REG_OFFSET, 0);
+
+ cc_trng_enable_rnd_source(drvdata);
+}
+
+static void cc_trng_compwork_handler(struct work_struct *w)
+{
+ u32 isr = 0;
+ u32 ehr_valid = 0;
+ struct cctrng_drvdata *drvdata =
+ container_of(w, struct cctrng_drvdata, compwork);
+ struct device *dev = &(drvdata->pdev->dev);
+ int i;
+
+ /* stop DMA and the RNG source */
+ cc_iowrite(drvdata, CC_RNG_DMA_ENABLE_REG_OFFSET, 0);
+ cc_iowrite(drvdata, CC_RND_SOURCE_ENABLE_REG_OFFSET, 0);
+
+ /* read RNG_ISR and check for errors */
+ isr = cc_ioread(drvdata, CC_RNG_ISR_REG_OFFSET);
+ ehr_valid = CC_REG_FLD_GET(RNG_ISR, EHR_VALID, isr);
+ dev_dbg(dev, "Got RNG_ISR=0x%08X (EHR_VALID=%u)\n", isr, ehr_valid);
+
+ if (fips_enabled && CC_REG_FLD_GET(RNG_ISR, CRNGT_ERR, isr)) {
+ fips_fail_notify();
+ /* FIPS error is fatal */
+ panic("Got HW CRNGT error while fips is enabled!\n");
+ }
+
+ /* Clear all pending RNG interrupts */
+ cc_iowrite(drvdata, CC_RNG_ICR_REG_OFFSET, isr);
+
+
+ if (!ehr_valid) {
+ /* in case of AUTOCORR/TIMEOUT error, try the next ROSC */
+ if (CC_REG_FLD_GET(RNG_ISR, AUTOCORR_ERR, isr) ||
+ CC_REG_FLD_GET(RNG_ISR, WATCHDOG, isr)) {
+ dev_dbg(dev, "cctrng autocorr/timeout error.\n");
+ goto next_rosc;
+ }
+
+ /* in case of VN error, ignore it */
+ }
+
+ /* read EHR data from registers */
+ for (i = 0; i < CC_TRNG_EHR_IN_WORDS; i++) {
+ /* calc word ptr in data_buf */
+ u32 *buf = (u32 *)drvdata->circ.buf;
+
+ buf[drvdata->circ.head] = cc_ioread(drvdata,
+ CC_EHR_DATA_0_REG_OFFSET + (i*sizeof(u32)));
+
+ /* EHR_DATA registers are cleared on read. In case 0 value was
+ * returned, restart the entropy collection.
+ */
+ if (buf[drvdata->circ.head] == 0) {
+ dev_dbg(dev, "Got 0 value in EHR. active_rosc %u\n",
+ drvdata->active_rosc);
+ goto next_rosc;
+ }
+
+ circ_idx_inc(&drvdata->circ.head, 1<<2);
+ }
+
+ atomic_set(&drvdata->pending_hw, 0);
+
+ /* continue to fill data buffer if needed */
+ if (circ_buf_space(drvdata) >= CC_TRNG_EHR_IN_WORDS) {
+ if (atomic_cmpxchg(&drvdata->pending_hw, 0, 1) == 0) {
+ /* Re-enable rnd source */
+ cc_trng_enable_rnd_source(drvdata);
+ return;
+ }
+ }
+
+ cc_trng_pm_put_suspend(dev);
+
+ dev_dbg(dev, "compwork handler done\n");
+ return;
+
+next_rosc:
+ if ((circ_buf_space(drvdata) >= CC_TRNG_EHR_IN_WORDS) &&
+ (cc_trng_change_rosc(drvdata) == 0)) {
+ /* trigger trng hw with next rosc */
+ cc_trng_hw_trigger(drvdata);
+ } else {
+ atomic_set(&drvdata->pending_hw, 0);
+ cc_trng_pm_put_suspend(dev);
+ }
+}
+
+static irqreturn_t cc_isr(int irq, void *dev_id)
+{
+ struct cctrng_drvdata *drvdata = (struct cctrng_drvdata *)dev_id;
+ struct device *dev = &(drvdata->pdev->dev);
+ u32 irr;
+
+ /* if driver suspended return, probably shared interrupt */
+ if (pm_runtime_suspended(dev))
+ return IRQ_NONE;
+
+ /* read the interrupt status */
+ irr = cc_ioread(drvdata, CC_HOST_RGF_IRR_REG_OFFSET);
+ dev_dbg(dev, "Got IRR=0x%08X\n", irr);
+
+ if (irr == 0) /* Probably shared interrupt line */
+ return IRQ_NONE;
+
+ /* clear interrupt - must be before processing events */
+ cc_iowrite(drvdata, CC_HOST_RGF_ICR_REG_OFFSET, irr);
+
+ /* RNG interrupt - most probable */
+ if (irr & CC_HOST_RNG_IRQ_MASK) {
+ /* Mask RNG interrupts - will be unmasked in deferred work */
+ cc_iowrite(drvdata, CC_RNG_IMR_REG_OFFSET, 0xFFFFFFFF);
+
+ /* We clear RNG interrupt here,
+ * to avoid it from firing as we'll unmask RNG interrupts.
+ */
+ cc_iowrite(drvdata, CC_HOST_RGF_ICR_REG_OFFSET,
+ CC_HOST_RNG_IRQ_MASK);
+
+ irr &= ~CC_HOST_RNG_IRQ_MASK;
+
+ /* schedule execution of deferred work handler */
+ schedule_work(&drvdata->compwork);
+ }
+
+ if (irr) {
+ dev_dbg_ratelimited(dev,
+ "IRR includes unknown cause bits (0x%08X)\n",
+ irr);
+ /* Just warning */
+ }
+
+ return IRQ_HANDLED;
+}
+
+static void cc_trng_startwork_handler(struct work_struct *w)
+{
+ struct cctrng_drvdata *drvdata =
+ container_of(w, struct cctrng_drvdata, startwork);
+
+ drvdata->active_rosc = 0;
+ cc_trng_hw_trigger(drvdata);
+}
+
+
+static int cc_trng_clk_init(struct cctrng_drvdata *drvdata)
+{
+ struct clk *clk;
+ struct device *dev = &(drvdata->pdev->dev);
+ int rc = 0;
+
+ clk = devm_clk_get_optional(dev, NULL);
+ if (IS_ERR(clk)) {
+ if (PTR_ERR(clk) != -EPROBE_DEFER)
+ dev_err(dev, "Error getting clock: %pe\n", clk);
+ return PTR_ERR(clk);
+ }
+ drvdata->clk = clk;
+
+ rc = clk_prepare_enable(drvdata->clk);
+ if (rc) {
+ dev_err(dev, "Failed to enable clock\n");
+ return rc;
+ }
+
+ return 0;
+}
+
+static void cc_trng_clk_fini(struct cctrng_drvdata *drvdata)
+{
+ clk_disable_unprepare(drvdata->clk);
+}
+
+
+static int cctrng_probe(struct platform_device *pdev)
+{
+ struct resource *req_mem_cc_regs = NULL;
+ struct cctrng_drvdata *drvdata;
+ struct device *dev = &pdev->dev;
+ int rc = 0;
+ u32 val;
+ int irq;
+
+ drvdata = devm_kzalloc(dev, sizeof(*drvdata), GFP_KERNEL);
+ if (!drvdata)
+ return -ENOMEM;
+
+ drvdata->rng.name = devm_kstrdup(dev, dev_name(dev), GFP_KERNEL);
+ if (!drvdata->rng.name)
+ return -ENOMEM;
+
+ drvdata->rng.read = cctrng_read;
+ drvdata->rng.priv = (unsigned long)drvdata;
+ drvdata->rng.quality = CC_TRNG_QUALITY;
+
+ platform_set_drvdata(pdev, drvdata);
+ drvdata->pdev = pdev;
+
+ drvdata->circ.buf = (char *)drvdata->data_buf;
+
+ /* Get device resources */
+ /* First CC registers space */
+ req_mem_cc_regs = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+ /* Map registers space */
+ drvdata->cc_base = devm_ioremap_resource(dev, req_mem_cc_regs);
+ if (IS_ERR(drvdata->cc_base)) {
+ dev_err(dev, "Failed to ioremap registers");
+ return PTR_ERR(drvdata->cc_base);
+ }
+
+ dev_dbg(dev, "Got MEM resource (%s): %pR\n", req_mem_cc_regs->name,
+ req_mem_cc_regs);
+ dev_dbg(dev, "CC registers mapped from %pa to 0x%p\n",
+ &req_mem_cc_regs->start, drvdata->cc_base);
+
+ /* Then IRQ */
+ irq = platform_get_irq(pdev, 0);
+ if (irq < 0) {
+ dev_err(dev, "Failed getting IRQ resource\n");
+ return irq;
+ }
+
+ /* parse sampling rate from device tree */
+ rc = cc_trng_parse_sampling_ratio(drvdata);
+ if (rc) {
+ dev_err(dev, "Failed to get legal sampling ratio for rosc\n");
+ return rc;
+ }
+
+ rc = cc_trng_clk_init(drvdata);
+ if (rc) {
+ dev_err(dev, "cc_trng_clk_init failed\n");
+ return rc;
+ }
+
+ INIT_WORK(&drvdata->compwork, cc_trng_compwork_handler);
+ INIT_WORK(&drvdata->startwork, cc_trng_startwork_handler);
+ spin_lock_init(&drvdata->read_lock);
+
+ /* register the driver isr function */
+ rc = devm_request_irq(dev, irq, cc_isr, IRQF_SHARED, "cctrng", drvdata);
+ if (rc) {
+ dev_err(dev, "Could not register to interrupt %d\n", irq);
+ goto post_clk_err;
+ }
+ dev_dbg(dev, "Registered to IRQ: %d\n", irq);
+
+ /* Clear all pending interrupts */
+ val = cc_ioread(drvdata, CC_HOST_RGF_IRR_REG_OFFSET);
+ dev_dbg(dev, "IRR=0x%08X\n", val);
+ cc_iowrite(drvdata, CC_HOST_RGF_ICR_REG_OFFSET, val);
+
+ /* unmask HOST RNG interrupt */
+ cc_iowrite(drvdata, CC_HOST_RGF_IMR_REG_OFFSET,
+ cc_ioread(drvdata, CC_HOST_RGF_IMR_REG_OFFSET) &
+ ~CC_HOST_RNG_IRQ_MASK);
+
+ /* init PM */
+ rc = cc_trng_pm_init(drvdata);
+ if (rc) {
+ dev_err(dev, "cc_trng_pm_init failed\n");
+ goto post_clk_err;
+ }
+
+ /* increment device's usage counter */
+ rc = cc_trng_pm_get(dev);
+ if (rc) {
+ dev_err(dev, "cc_trng_pm_get returned %x\n", rc);
+ goto post_pm_err;
+ }
+
+ /* set pending_hw to verify that HW won't be triggered from read */
+ atomic_set(&drvdata->pending_hw, 1);
+
+ /* registration of the hwrng device */
+ rc = hwrng_register(&drvdata->rng);
+ if (rc) {
+ dev_err(dev, "Could not register hwrng device.\n");
+ goto post_pm_err;
+ }
+
+ /* trigger HW to start generate data */
+ drvdata->active_rosc = 0;
+ cc_trng_hw_trigger(drvdata);
+
+ /* All set, we can allow auto-suspend */
+ cc_trng_pm_go(drvdata);
+
+ dev_info(dev, "ARM cctrng device initialized\n");
+
+ return 0;
+
+post_pm_err:
+ cc_trng_pm_fini(drvdata);
+
+post_clk_err:
+ cc_trng_clk_fini(drvdata);
+
+ return rc;
+}
+
+static int cctrng_remove(struct platform_device *pdev)
+{
+ struct cctrng_drvdata *drvdata = platform_get_drvdata(pdev);
+ struct device *dev = &pdev->dev;
+
+ dev_dbg(dev, "Releasing cctrng resources...\n");
+
+ hwrng_unregister(&drvdata->rng);
+
+ cc_trng_pm_fini(drvdata);
+
+ cc_trng_clk_fini(drvdata);
+
+ dev_info(dev, "ARM cctrng device terminated\n");
+
+ return 0;
+}
+
+static int __maybe_unused cctrng_suspend(struct device *dev)
+{
+ struct cctrng_drvdata *drvdata = dev_get_drvdata(dev);
+
+ dev_dbg(dev, "set HOST_POWER_DOWN_EN\n");
+ cc_iowrite(drvdata, CC_HOST_POWER_DOWN_EN_REG_OFFSET,
+ POWER_DOWN_ENABLE);
+
+ clk_disable_unprepare(drvdata->clk);
+
+ return 0;
+}
+
+static bool cctrng_wait_for_reset_completion(struct cctrng_drvdata *drvdata)
+{
+ unsigned int val;
+ unsigned int i;
+
+ for (i = 0; i < CC_HW_RESET_LOOP_COUNT; i++) {
+ /* in cc7x3 NVM_IS_IDLE indicates that CC reset is
+ * completed and device is fully functional
+ */
+ val = cc_ioread(drvdata, CC_NVM_IS_IDLE_REG_OFFSET);
+ if (val & BIT(CC_NVM_IS_IDLE_VALUE_BIT_SHIFT)) {
+ /* hw indicate reset completed */
+ return true;
+ }
+ /* allow scheduling other process on the processor */
+ schedule();
+ }
+ /* reset not completed */
+ return false;
+}
+
+static int __maybe_unused cctrng_resume(struct device *dev)
+{
+ struct cctrng_drvdata *drvdata = dev_get_drvdata(dev);
+ int rc;
+
+ dev_dbg(dev, "unset HOST_POWER_DOWN_EN\n");
+ /* Enables the device source clk */
+ rc = clk_prepare_enable(drvdata->clk);
+ if (rc) {
+ dev_err(dev, "failed getting clock back on. We're toast.\n");
+ return rc;
+ }
+
+ /* wait for Cryptocell reset completion */
+ if (!cctrng_wait_for_reset_completion(drvdata)) {
+ dev_err(dev, "Cryptocell reset not completed");
+ return -EBUSY;
+ }
+
+ /* unmask HOST RNG interrupt */
+ cc_iowrite(drvdata, CC_HOST_RGF_IMR_REG_OFFSET,
+ cc_ioread(drvdata, CC_HOST_RGF_IMR_REG_OFFSET) &
+ ~CC_HOST_RNG_IRQ_MASK);
+
+ cc_iowrite(drvdata, CC_HOST_POWER_DOWN_EN_REG_OFFSET,
+ POWER_DOWN_DISABLE);
+
+ return 0;
+}
+
+static UNIVERSAL_DEV_PM_OPS(cctrng_pm, cctrng_suspend, cctrng_resume, NULL);
+
+static const struct of_device_id arm_cctrng_dt_match[] = {
+ { .compatible = "arm,cryptocell-713-trng", },
+ { .compatible = "arm,cryptocell-703-trng", },
+ {},
+};
+MODULE_DEVICE_TABLE(of, arm_cctrng_dt_match);
+
+static struct platform_driver cctrng_driver = {
+ .driver = {
+ .name = "cctrng",
+ .of_match_table = arm_cctrng_dt_match,
+ .pm = &cctrng_pm,
+ },
+ .probe = cctrng_probe,
+ .remove = cctrng_remove,
+};
+
+static int __init cctrng_mod_init(void)
+{
+ /* Compile time assertion checks */
+ BUILD_BUG_ON(CCTRNG_DATA_BUF_WORDS < 6);
+ BUILD_BUG_ON((CCTRNG_DATA_BUF_WORDS & (CCTRNG_DATA_BUF_WORDS-1)) != 0);
+
+ return platform_driver_register(&cctrng_driver);
+}
+module_init(cctrng_mod_init);
+
+static void __exit cctrng_mod_exit(void)
+{
+ platform_driver_unregister(&cctrng_driver);
+}
+module_exit(cctrng_mod_exit);
+
+/* Module description */
+MODULE_DESCRIPTION("ARM CryptoCell TRNG Driver");
+MODULE_AUTHOR("ARM");
+MODULE_LICENSE("GPL v2");
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (C) 2019-2020 ARM Limited or its affiliates. */
+
+#include <linux/bitops.h>
+
+#define POWER_DOWN_ENABLE 0x01
+#define POWER_DOWN_DISABLE 0x00
+
+/* hwrng quality: bits of true entropy per 1024 bits of input */
+#define CC_TRNG_QUALITY 1024
+
+/* CryptoCell TRNG HW definitions */
+#define CC_TRNG_NUM_OF_ROSCS 4
+/* The number of words generated in the entropy holding register (EHR)
+ * 6 words (192 bit) according to HW implementation
+ */
+#define CC_TRNG_EHR_IN_WORDS 6
+#define CC_TRNG_EHR_IN_BITS (CC_TRNG_EHR_IN_WORDS * BITS_PER_TYPE(u32))
+
+#define CC_HOST_RNG_IRQ_MASK BIT(CC_HOST_RGF_IRR_RNG_INT_BIT_SHIFT)
+
+/* RNG interrupt mask */
+#define CC_RNG_INT_MASK (BIT(CC_RNG_IMR_EHR_VALID_INT_MASK_BIT_SHIFT) | \
+ BIT(CC_RNG_IMR_AUTOCORR_ERR_INT_MASK_BIT_SHIFT) | \
+ BIT(CC_RNG_IMR_CRNGT_ERR_INT_MASK_BIT_SHIFT) | \
+ BIT(CC_RNG_IMR_VN_ERR_INT_MASK_BIT_SHIFT) | \
+ BIT(CC_RNG_IMR_WATCHDOG_INT_MASK_BIT_SHIFT))
+
+// --------------------------------------
+// BLOCK: RNG
+// --------------------------------------
+#define CC_RNG_IMR_REG_OFFSET 0x0100UL
+#define CC_RNG_IMR_EHR_VALID_INT_MASK_BIT_SHIFT 0x0UL
+#define CC_RNG_IMR_AUTOCORR_ERR_INT_MASK_BIT_SHIFT 0x1UL
+#define CC_RNG_IMR_CRNGT_ERR_INT_MASK_BIT_SHIFT 0x2UL
+#define CC_RNG_IMR_VN_ERR_INT_MASK_BIT_SHIFT 0x3UL
+#define CC_RNG_IMR_WATCHDOG_INT_MASK_BIT_SHIFT 0x4UL
+#define CC_RNG_ISR_REG_OFFSET 0x0104UL
+#define CC_RNG_ISR_EHR_VALID_BIT_SHIFT 0x0UL
+#define CC_RNG_ISR_EHR_VALID_BIT_SIZE 0x1UL
+#define CC_RNG_ISR_AUTOCORR_ERR_BIT_SHIFT 0x1UL
+#define CC_RNG_ISR_AUTOCORR_ERR_BIT_SIZE 0x1UL
+#define CC_RNG_ISR_CRNGT_ERR_BIT_SHIFT 0x2UL
+#define CC_RNG_ISR_CRNGT_ERR_BIT_SIZE 0x1UL
+#define CC_RNG_ISR_WATCHDOG_BIT_SHIFT 0x4UL
+#define CC_RNG_ISR_WATCHDOG_BIT_SIZE 0x1UL
+#define CC_RNG_ICR_REG_OFFSET 0x0108UL
+#define CC_TRNG_CONFIG_REG_OFFSET 0x010CUL
+#define CC_EHR_DATA_0_REG_OFFSET 0x0114UL
+#define CC_RND_SOURCE_ENABLE_REG_OFFSET 0x012CUL
+#define CC_SAMPLE_CNT1_REG_OFFSET 0x0130UL
+#define CC_TRNG_DEBUG_CONTROL_REG_OFFSET 0x0138UL
+#define CC_RNG_SW_RESET_REG_OFFSET 0x0140UL
+#define CC_RNG_CLK_ENABLE_REG_OFFSET 0x01C4UL
+#define CC_RNG_DMA_ENABLE_REG_OFFSET 0x01C8UL
+#define CC_RNG_WATCHDOG_VAL_REG_OFFSET 0x01D8UL
+// --------------------------------------
+// BLOCK: SEC_HOST_RGF
+// --------------------------------------
+#define CC_HOST_RGF_IRR_REG_OFFSET 0x0A00UL
+#define CC_HOST_RGF_IRR_RNG_INT_BIT_SHIFT 0xAUL
+#define CC_HOST_RGF_IMR_REG_OFFSET 0x0A04UL
+#define CC_HOST_RGF_ICR_REG_OFFSET 0x0A08UL
+
+#define CC_HOST_POWER_DOWN_EN_REG_OFFSET 0x0A78UL
+
+// --------------------------------------
+// BLOCK: NVM
+// --------------------------------------
+#define CC_NVM_IS_IDLE_REG_OFFSET 0x0F10UL
+#define CC_NVM_IS_IDLE_VALUE_BIT_SHIFT 0x0UL
+#define CC_NVM_IS_IDLE_VALUE_BIT_SIZE 0x1UL
if (of_device_is_compatible(dev->of_node, "ti,omap4-rng") ||
of_device_is_compatible(dev->of_node, "inside-secure,safexcel-eip76")) {
irq = platform_get_irq(pdev, 0);
- if (irq < 0) {
- dev_err(dev, "%s: error getting IRQ resource - %d\n",
- __func__, irq);
+ if (irq < 0)
return irq;
- }
err = devm_request_irq(dev, irq, omap4_rng_irq,
IRQF_TRIGGER_NONE, dev_name(dev), priv);
return -ENODEV;
/* Open session with hwrng Trusted App */
- memcpy(sess_arg.uuid, rng_device->id.uuid.b, TEE_IOCTL_UUID_LEN);
+ export_uuid(sess_arg.uuid, &rng_device->id.uuid);
sess_arg.clnt_login = TEE_IOCTL_LOGIN_PUBLIC;
sess_arg.num_params = 0;
return PTR_ERR(ctx->csr_base);
rc = platform_get_irq(pdev, 0);
- if (rc < 0) {
- dev_err(&pdev->dev, "No IRQ resource\n");
+ if (rc < 0)
return rc;
- }
ctx->irq = rc;
dev_dbg(&pdev->dev, "APM X-Gene RNG BASE %p ALARM IRQ %d",
if (adev->type != &i2c_adapter_type)
return 0;
- addr_info->added_client = i2c_new_device(to_i2c_adapter(adev),
- &addr_info->binfo);
+ addr_info->added_client = i2c_new_client_device(to_i2c_adapter(adev),
+ &addr_info->binfo);
if (!addr_info->adapter_name)
return 1; /* Only try the first I2C adapter by default. */
#include <linux/spinlock.h>
#include <linux/kthread.h>
#include <linux/percpu.h>
-#include <linux/cryptohash.h>
#include <linux/fips.h>
#include <linux/ptrace.h>
#include <linux/workqueue.h>
#include <linux/completion.h>
#include <linux/uuid.h>
#include <crypto/chacha.h>
+#include <crypto/sha.h>
#include <asm/processor.h>
#include <linux/uaccess.h>
__u32 w[5];
unsigned long l[LONGS(20)];
} hash;
- __u32 workspace[SHA_WORKSPACE_WORDS];
+ __u32 workspace[SHA1_WORKSPACE_WORDS];
unsigned long flags;
/*
* If we have an architectural hardware random number
* generator, use it for SHA's initial vector
*/
- sha_init(hash.w);
+ sha1_init(hash.w);
for (i = 0; i < LONGS(20); i++) {
unsigned long v;
if (!arch_get_random_long(&v))
/* Generate a hash across the pool, 16 words (512 bits) at a time */
spin_lock_irqsave(&r->lock, flags);
for (i = 0; i < r->poolinfo->poolwords; i += 16)
- sha_transform(hash.w, (__u8 *)(r->pool + i), workspace);
+ sha1_transform(hash.w, (__u8 *)(r->pool + i), workspace);
/*
* We mix the hash back into the pool to prevent backtracking
int i;
event_header = addr;
- size = sizeof(struct tcg_pcr_event) - sizeof(event_header->event)
- + event_header->event_size;
+ size = struct_size(event_header, event, event_header->event_size);
if (*pos == 0) {
if (addr + size < limit) {
event_header = log->bios_event_log;
if (v == SEQ_START_TOKEN) {
- event_size = sizeof(struct tcg_pcr_event) -
- sizeof(event_header->event) + event_header->event_size;
+ event_size = struct_size(event_header, event,
+ event_header->event_size);
marker = event_header;
} else {
event = v;
size_t size;
if (v == SEQ_START_TOKEN) {
- size = sizeof(struct tcg_pcr_event) -
- sizeof(event_header->event) + event_header->event_size;
-
+ size = struct_size(event_header, event,
+ event_header->event_size);
temp_ptr = event_header;
if (size > 0)
/* Open a session with fTPM TA */
memset(&sess_arg, 0, sizeof(sess_arg));
- memcpy(sess_arg.uuid, ftpm_ta_uuid.b, TEE_IOCTL_UUID_LEN);
+ export_uuid(sess_arg.uuid, &ftpm_ta_uuid);
sess_arg.clnt_login = TEE_IOCTL_LOGIN_PUBLIC;
sess_arg.num_params = 0;
out:
clk_pm_runtime_put(core);
unlock:
+ if (ret)
+ hlist_del_init(&core->child_node);
+
clk_prepare_unlock();
if (!ret)
config SM_GCC_8250
tristate "SM8250 Global Clock Controller"
+ select QCOM_GDSC
help
Support for the global clock controller on SM8250 devices.
Say Y if you want to use peripheral devices such as UART,
.clkr.hw.init = &(struct clk_init_data){
.name = "gpll0_out_even",
.parent_data = &(const struct clk_parent_data){
- .fw_name = "bi_tcxo",
- .name = "bi_tcxo",
+ .hw = &gpll0.clkr.hw,
},
.num_parents = 1,
.ops = &clk_trion_pll_postdiv_ops,
PNAME(mux_i2s2_p) = { "i2s2_src", "i2s2_frac", "xin12m" };
PNAME(mux_sclk_spdif_p) = { "sclk_spdif_src", "spdif_frac", "xin12m" };
-PNAME(mux_aclk_gpu_pre_p) = { "cpll_gpu", "gpll_gpu", "hdmiphy_gpu", "usb480m_gpu" };
-
PNAME(mux_uart0_p) = { "uart0_src", "uart0_frac", "xin24m" };
PNAME(mux_uart1_p) = { "uart1_src", "uart1_frac", "xin24m" };
PNAME(mux_uart2_p) = { "uart2_src", "uart2_frac", "xin24m" };
RK2928_CLKSEL_CON(24), 6, 10, DFLAGS,
RK2928_CLKGATE_CON(2), 8, GFLAGS),
- GATE(0, "cpll_gpu", "cpll", 0,
- RK2928_CLKGATE_CON(3), 13, GFLAGS),
- GATE(0, "gpll_gpu", "gpll", 0,
- RK2928_CLKGATE_CON(3), 13, GFLAGS),
- GATE(0, "hdmiphy_gpu", "hdmiphy", 0,
- RK2928_CLKGATE_CON(3), 13, GFLAGS),
- GATE(0, "usb480m_gpu", "usb480m", 0,
+ COMPOSITE(0, "aclk_gpu_pre", mux_pll_src_4plls_p, 0,
+ RK2928_CLKSEL_CON(34), 5, 2, MFLAGS, 0, 5, DFLAGS,
RK2928_CLKGATE_CON(3), 13, GFLAGS),
- COMPOSITE_NOGATE(0, "aclk_gpu_pre", mux_aclk_gpu_pre_p, 0,
- RK2928_CLKSEL_CON(34), 5, 2, MFLAGS, 0, 5, DFLAGS),
COMPOSITE(SCLK_SPI0, "sclk_spi0", mux_pll_src_2plls_p, 0,
RK2928_CLKSEL_CON(25), 8, 1, MFLAGS, 0, 7, DFLAGS,
GATE(0, "pclk_peri_noc", "pclk_peri", CLK_IGNORE_UNUSED, RK2928_CLKGATE_CON(12), 2, GFLAGS),
/* PD_GPU */
- GATE(ACLK_GPU, "aclk_gpu", "aclk_gpu_pre", 0, RK2928_CLKGATE_CON(13), 14, GFLAGS),
- GATE(0, "aclk_gpu_noc", "aclk_gpu_pre", 0, RK2928_CLKGATE_CON(13), 15, GFLAGS),
+ GATE(ACLK_GPU, "aclk_gpu", "aclk_gpu_pre", 0, RK2928_CLKGATE_CON(7), 14, GFLAGS),
+ GATE(0, "aclk_gpu_noc", "aclk_gpu_pre", 0, RK2928_CLKGATE_CON(7), 15, GFLAGS),
/* PD_BUS */
GATE(0, "sclk_initmem_mbist", "aclk_cpu", 0, RK2928_CLKGATE_CON(8), 1, GFLAGS),
{ TEGRA124_CLK_UARTB, TEGRA124_CLK_PLL_P, 408000000, 0 },
{ TEGRA124_CLK_UARTC, TEGRA124_CLK_PLL_P, 408000000, 0 },
{ TEGRA124_CLK_UARTD, TEGRA124_CLK_PLL_P, 408000000, 0 },
- { TEGRA124_CLK_PLL_A, TEGRA124_CLK_CLK_MAX, 564480000, 0 },
+ { TEGRA124_CLK_PLL_A, TEGRA124_CLK_CLK_MAX, 282240000, 0 },
{ TEGRA124_CLK_PLL_A_OUT0, TEGRA124_CLK_CLK_MAX, 11289600, 0 },
{ TEGRA124_CLK_I2S0, TEGRA124_CLK_PLL_A_OUT0, 11289600, 0 },
{ TEGRA124_CLK_I2S1, TEGRA124_CLK_PLL_A_OUT0, 11289600, 0 },
};
static const struct omap_clkctrl_reg_data am3_l4_rtc_clkctrl_regs[] __initconst = {
- { AM3_L4_RTC_RTC_CLKCTRL, NULL, CLKF_SW_SUP, "clk_32768_ck" },
+ { AM3_L4_RTC_RTC_CLKCTRL, NULL, CLKF_SW_SUP, "clk-24mhz-clkctrl:0000:0" },
{ 0 },
};
return entry->clk;
}
+/* Get clkctrl clock base name based on clkctrl_name or dts node */
+static const char * __init clkctrl_get_clock_name(struct device_node *np,
+ const char *clkctrl_name,
+ int offset, int index,
+ bool legacy_naming)
+{
+ char *clock_name;
+
+ /* l4per-clkctrl:1234:0 style naming based on clkctrl_name */
+ if (clkctrl_name && !legacy_naming) {
+ clock_name = kasprintf(GFP_KERNEL, "%s-clkctrl:%04x:%d",
+ clkctrl_name, offset, index);
+ strreplace(clock_name, '_', '-');
+
+ return clock_name;
+ }
+
+ /* l4per:1234:0 old style naming based on clkctrl_name */
+ if (clkctrl_name)
+ return kasprintf(GFP_KERNEL, "%s_cm:clk:%04x:%d",
+ clkctrl_name, offset, index);
+
+ /* l4per_cm:1234:0 old style naming based on parent node name */
+ if (legacy_naming)
+ return kasprintf(GFP_KERNEL, "%pOFn:clk:%04x:%d",
+ np->parent, offset, index);
+
+ /* l4per-clkctrl:1234:0 style naming based on node name */
+ return kasprintf(GFP_KERNEL, "%pOFn:%04x:%d", np, offset, index);
+}
+
static int __init
_ti_clkctrl_clk_register(struct omap_clkctrl_provider *provider,
struct device_node *node, struct clk_hw *clk_hw,
u16 offset, u8 bit, const char * const *parents,
- int num_parents, const struct clk_ops *ops)
+ int num_parents, const struct clk_ops *ops,
+ const char *clkctrl_name)
{
struct clk_init_data init = { NULL };
struct clk *clk;
struct omap_clkctrl_clk *clkctrl_clk;
int ret = 0;
- if (ti_clk_get_features()->flags & TI_CLK_CLKCTRL_COMPAT)
- init.name = kasprintf(GFP_KERNEL, "%pOFn:%pOFn:%04x:%d",
- node->parent, node, offset,
- bit);
- else
- init.name = kasprintf(GFP_KERNEL, "%pOFn:%04x:%d", node,
- offset, bit);
+ init.name = clkctrl_get_clock_name(node, clkctrl_name, offset, bit,
+ ti_clk_get_features()->flags &
+ TI_CLK_CLKCTRL_COMPAT);
+
clkctrl_clk = kzalloc(sizeof(*clkctrl_clk), GFP_KERNEL);
if (!init.name || !clkctrl_clk) {
ret = -ENOMEM;
_ti_clkctrl_setup_gate(struct omap_clkctrl_provider *provider,
struct device_node *node, u16 offset,
const struct omap_clkctrl_bit_data *data,
- void __iomem *reg)
+ void __iomem *reg, const char *clkctrl_name)
{
struct clk_hw_omap *clk_hw;
if (_ti_clkctrl_clk_register(provider, node, &clk_hw->hw, offset,
data->bit, data->parents, 1,
- &omap_gate_clk_ops))
+ &omap_gate_clk_ops, clkctrl_name))
kfree(clk_hw);
}
_ti_clkctrl_setup_mux(struct omap_clkctrl_provider *provider,
struct device_node *node, u16 offset,
const struct omap_clkctrl_bit_data *data,
- void __iomem *reg)
+ void __iomem *reg, const char *clkctrl_name)
{
struct clk_omap_mux *mux;
int num_parents = 0;
if (_ti_clkctrl_clk_register(provider, node, &mux->hw, offset,
data->bit, data->parents, num_parents,
- &ti_clk_mux_ops))
+ &ti_clk_mux_ops, clkctrl_name))
kfree(mux);
}
_ti_clkctrl_setup_div(struct omap_clkctrl_provider *provider,
struct device_node *node, u16 offset,
const struct omap_clkctrl_bit_data *data,
- void __iomem *reg)
+ void __iomem *reg, const char *clkctrl_name)
{
struct clk_omap_divider *div;
const struct omap_clkctrl_div_data *div_data = data->data;
if (_ti_clkctrl_clk_register(provider, node, &div->hw, offset,
data->bit, data->parents, 1,
- &ti_clk_divider_ops))
+ &ti_clk_divider_ops, clkctrl_name))
kfree(div);
}
_ti_clkctrl_setup_subclks(struct omap_clkctrl_provider *provider,
struct device_node *node,
const struct omap_clkctrl_reg_data *data,
- void __iomem *reg)
+ void __iomem *reg, const char *clkctrl_name)
{
const struct omap_clkctrl_bit_data *bits = data->bit_data;
switch (bits->type) {
case TI_CLK_GATE:
_ti_clkctrl_setup_gate(provider, node, data->offset,
- bits, reg);
+ bits, reg, clkctrl_name);
break;
case TI_CLK_DIVIDER:
_ti_clkctrl_setup_div(provider, node, data->offset,
- bits, reg);
+ bits, reg, clkctrl_name);
break;
case TI_CLK_MUX:
_ti_clkctrl_setup_mux(provider, node, data->offset,
- bits, reg);
+ bits, reg, clkctrl_name);
break;
default:
return name;
}
}
- of_node_put(np);
return NULL;
}
-/* Get clkctrl clock base name based on clkctrl_name or dts node */
-static const char * __init clkctrl_get_clock_name(struct device_node *np,
- const char *clkctrl_name,
- int offset, int index,
- bool legacy_naming)
-{
- char *clock_name;
-
- /* l4per-clkctrl:1234:0 style naming based on clkctrl_name */
- if (clkctrl_name && !legacy_naming) {
- clock_name = kasprintf(GFP_KERNEL, "%s-clkctrl:%04x:%d",
- clkctrl_name, offset, index);
- strreplace(clock_name, '_', '-');
-
- return clock_name;
- }
-
- /* l4per:1234:0 old style naming based on clkctrl_name */
- if (clkctrl_name)
- return kasprintf(GFP_KERNEL, "%s_cm:clk:%04x:%d",
- clkctrl_name, offset, index);
-
- /* l4per_cm:1234:0 old style naming based on parent node name */
- if (legacy_naming)
- return kasprintf(GFP_KERNEL, "%pOFn:clk:%04x:%d",
- np->parent, offset, index);
-
- /* l4per-clkctrl:1234:0 style naming based on node name */
- return kasprintf(GFP_KERNEL, "%pOFn:%04x:%d", np, offset, index);
-}
-
static void __init _ti_omap4_clkctrl_setup(struct device_node *node)
{
struct omap_clkctrl_provider *provider;
hw->enable_reg.ptr = provider->base + reg_data->offset;
_ti_clkctrl_setup_subclks(provider, node, reg_data,
- hw->enable_reg.ptr);
+ hw->enable_reg.ptr, clkctrl_name);
if (reg_data->flags & CLKF_SW_SUP)
hw->enable_bit = MODULEMODE_SWCTRL;
return -ENODEV;
}
+ of_property_read_string(np, "clock-output-names", &name);
parent_name = of_clk_get_parent_name(np, 0);
clk = icst_clk_setup(NULL, desc, name, parent_name, map,
ICST_INTEGRATOR_IM_PD1);
#include <linux/pid_namespace.h>
#include <linux/cn_proc.h>
+#include <linux/local_lock.h>
/*
* Size of a cn_msg followed by a proc_event structure. Since the
static atomic_t proc_event_num_listeners = ATOMIC_INIT(0);
static struct cb_id cn_proc_event_id = { CN_IDX_PROC, CN_VAL_PROC };
-/* proc_event_counts is used as the sequence number of the netlink message */
-static DEFINE_PER_CPU(__u32, proc_event_counts) = { 0 };
+/* local_event.count is used as the sequence number of the netlink message */
+struct local_event {
+ local_lock_t lock;
+ __u32 count;
+};
+static DEFINE_PER_CPU(struct local_event, local_event) = {
+ .lock = INIT_LOCAL_LOCK(lock),
+};
static inline void send_msg(struct cn_msg *msg)
{
- preempt_disable();
+ local_lock(&local_event.lock);
- msg->seq = __this_cpu_inc_return(proc_event_counts) - 1;
+ msg->seq = __this_cpu_inc_return(local_event.count) - 1;
((struct proc_event *)msg->data)->cpu = smp_processor_id();
/*
- * Preemption remains disabled during send to ensure the messages are
- * ordered according to their sequence numbers.
+ * local_lock() disables preemption during send to ensure the messages
+ * are ordered according to their sequence numbers.
*
* If cn_netlink_send() fails, the data is not sent.
*/
cn_netlink_send(msg, 0, CN_IDX_PROC, GFP_NOWAIT);
- preempt_enable();
+ local_unlock(&local_event.lock);
}
void proc_fork_connector(struct task_struct *task)
return err;
irq = platform_get_irq(pdev, 0);
- if (irq < 0) {
- dev_err(ss->dev, "Cannot get SecuritySystem IRQ\n");
+ if (irq < 0)
return irq;
- }
ss->reset = devm_reset_control_get(&pdev->dev, NULL);
if (IS_ERR(ss->reset)) {
mc->irqs = devm_kcalloc(mc->dev, MAXFLOW, sizeof(int), GFP_KERNEL);
for (i = 0; i < MAXFLOW; i++) {
mc->irqs[i] = platform_get_irq(pdev, i);
- if (mc->irqs[i] < 0) {
- dev_err(mc->dev, "Cannot get IRQ for flow %d\n", i);
+ if (mc->irqs[i] < 0)
return mc->irqs[i];
- }
err = devm_request_irq(&pdev->dev, mc->irqs[i], meson_irq_handler, 0,
"gxl-crypto", mc);
#include <linux/of_device.h>
#include <linux/delay.h>
#include <linux/crypto.h>
-#include <linux/cryptohash.h>
#include <crypto/scatterwalk.h>
#include <crypto/algapi.h>
#include <crypto/sha.h>
blocksize = crypto_tfm_alg_blocksize(crypto_ahash_tfm(tfm));
if (keylen > blocksize) {
- SHASH_DESC_ON_STACK(hdesc, tfm_ctx->child_hash);
-
- hdesc->tfm = tfm_ctx->child_hash;
-
tfm_ctx->hmac_key_length = blocksize;
- ret = crypto_shash_digest(hdesc, key, keylen,
- tfm_ctx->hmac_key);
+
+ ret = crypto_shash_tfm_digest(tfm_ctx->child_hash, key, keylen,
+ tfm_ctx->hmac_key);
if (ret)
return ret;
-
} else {
memcpy(tfm_ctx->hmac_key, key, keylen);
tfm_ctx->hmac_key_length = keylen;
container_of(areq, struct skcipher_request, base);
struct iproc_ctx_s *ctx = rctx->ctx;
struct spu_cipher_parms cipher_parms;
- int err = 0;
- unsigned int chunksize = 0; /* Num bytes of request to submit */
- int remaining = 0; /* Bytes of request still to process */
+ int err;
+ unsigned int chunksize; /* Num bytes of request to submit */
+ int remaining; /* Bytes of request still to process */
int chunk_start; /* Beginning of data for current SPU msg */
/* IV or ctr value to use in this SPU msg */
/* number of bytes still to be hashed in this req */
unsigned int nbytes_to_hash = 0;
- int err = 0;
+ int err;
unsigned int chunksize = 0; /* length of hash carry + new data */
/*
* length of new data, not from hash carry, to be submitted in
struct spu_hw *spu = &iproc_priv.spu;
struct brcm_message *mssg = msg;
struct iproc_reqctx_s *rctx;
- int err = 0;
+ int err;
rctx = mssg->ctx;
if (unlikely(!rctx)) {
struct iproc_reqctx_s *rctx = ahash_request_ctx(req);
struct crypto_ahash *tfm = crypto_ahash_reqtfm(req);
struct iproc_ctx_s *ctx = crypto_ahash_ctx(tfm);
- int err = 0;
+ int err;
const char *alg_name;
flow_log("ahash_enqueue() nbytes:%u\n", req->nbytes);
static int ahash_digest(struct ahash_request *req)
{
- int err = 0;
+ int err;
flow_log("ahash_digest() nbytes:%u\n", req->nbytes);
for (i = 0; i < iproc_priv.spu.num_chan; i++) {
iproc_priv.mbox[i] = mbox_request_channel(mcl, i);
if (IS_ERR(iproc_priv.mbox[i])) {
- err = (int)PTR_ERR(iproc_priv.mbox[i]);
+ err = PTR_ERR(iproc_priv.mbox[i]);
dev_err(dev,
"Mbox channel %d request failed with err %d",
i, err);
matched_spu_type = of_device_get_match_data(dev);
if (!matched_spu_type) {
- dev_err(&pdev->dev, "Failed to match device\n");
+ dev_err(dev, "Failed to match device\n");
return -ENODEV;
}
spu->spu_type = matched_spu_type->type;
spu->spu_subtype = matched_spu_type->subtype;
- i = 0;
for (i = 0; (i < MAX_SPUS) && ((spu_ctrl_regs =
platform_get_resource(pdev, IORESOURCE_MEM, i)) != NULL); i++) {
spu->reg_vbase[i] = devm_ioremap_resource(dev, spu_ctrl_regs);
if (IS_ERR(spu->reg_vbase[i])) {
err = PTR_ERR(spu->reg_vbase[i]);
- dev_err(&pdev->dev, "Failed to map registers: %d\n",
+ dev_err(dev, "Failed to map registers: %d\n",
err);
spu->reg_vbase[i] = NULL;
return err;
{
struct device *dev = &pdev->dev;
struct spu_hw *spu = &iproc_priv.spu;
- int err = 0;
+ int err;
iproc_priv.pdev = pdev;
platform_set_drvdata(iproc_priv.pdev,
if (err < 0)
goto failure;
- err = spu_mb_init(&pdev->dev);
+ err = spu_mb_init(dev);
if (err < 0)
goto failure;
else if (spu->spu_type == SPU_TYPE_SPU2)
iproc_priv.bcm_hdr_len = 0;
- spu_functions_register(&pdev->dev, spu->spu_type, spu->spu_subtype);
+ spu_functions_register(dev, spu->spu_type, spu->spu_subtype);
spu_counters_init();
}
/**
- * nitrox_bist_check - Check NITORX BIST registers status
+ * nitrox_bist_check - Check NITROX BIST registers status
* @ndev: NITROX device
*/
static int nitrox_bist_check(struct nitrox_device *ndev)
config CRYPTO_DEV_SP_CCP
bool "Cryptographic Coprocessor device"
default y
- depends on CRYPTO_DEV_CCP_DD
+ depends on CRYPTO_DEV_CCP_DD && DMADEVICES
select HW_RANDOM
select DMA_ENGINE
- select DMADEVICES
select CRYPTO_SHA1
select CRYPTO_SHA256
help
{
struct ccp_ctx *ctx = crypto_tfm_ctx(crypto_ahash_tfm(tfm));
struct crypto_shash *shash = ctx->u.sha.hmac_tfm;
-
- SHASH_DESC_ON_STACK(sdesc, shash);
-
unsigned int block_size = crypto_shash_blocksize(shash);
unsigned int digest_size = crypto_shash_digestsize(shash);
int i, ret;
if (key_len > block_size) {
/* Must hash the input key */
- sdesc->tfm = shash;
-
- ret = crypto_shash_digest(sdesc, key, key_len,
- ctx->u.sha.key);
+ ret = crypto_shash_tfm_digest(shash, key, key_len,
+ ctx->u.sha.key);
if (ret)
return -EINVAL;
#include <linux/hw_random.h>
#include <linux/ccp.h>
#include <linux/firmware.h>
+#include <linux/gfp.h>
#include <asm/smp.h>
static bool psp_dead;
static int psp_timeout;
+/* Trusted Memory Region (TMR):
+ * The TMR is a 1MB area that must be 1MB aligned. Use the page allocator
+ * to allocate the memory, which will return aligned memory for the specified
+ * allocation order.
+ */
+#define SEV_ES_TMR_SIZE (1024 * 1024)
+static void *sev_es_tmr;
+
static inline bool sev_version_greater_or_equal(u8 maj, u8 min)
{
struct sev_device *sev = psp_master->sev_data;
if (sev->state == SEV_STATE_INIT)
return 0;
+ if (sev_es_tmr) {
+ u64 tmr_pa;
+
+ /*
+ * Do not include the encryption mask on the physical
+ * address of the TMR (firmware should clear it anyway).
+ */
+ tmr_pa = __pa(sev_es_tmr);
+
+ sev->init_cmd_buf.flags |= SEV_INIT_FLAGS_SEV_ES;
+ sev->init_cmd_buf.tmr_address = tmr_pa;
+ sev->init_cmd_buf.tmr_len = SEV_ES_TMR_SIZE;
+ }
+
rc = __sev_do_cmd_locked(SEV_CMD_INIT, &sev->init_cmd_buf, error);
if (rc)
return rc;
void sev_pci_init(void)
{
struct sev_device *sev = psp_master->sev_data;
+ struct page *tmr_page;
int error, rc;
if (!sev)
sev_update_firmware(sev->dev) == 0)
sev_get_api_version();
+ /* Obtain the TMR memory area for SEV-ES use */
+ tmr_page = alloc_pages(GFP_KERNEL, get_order(SEV_ES_TMR_SIZE));
+ if (tmr_page) {
+ sev_es_tmr = page_address(tmr_page);
+ } else {
+ sev_es_tmr = NULL;
+ dev_warn(sev->dev,
+ "SEV: TMR allocation failed, SEV-ES support unavailable\n");
+ }
+
/* Initialize the platform */
rc = sev_platform_init(&error);
if (rc && (error == SEV_RET_SECURE_DATA_INVALID)) {
return;
sev_platform_shutdown(NULL);
+
+ if (sev_es_tmr) {
+ /* The TMR area was encrypted, flush it from the cache */
+ wbinvd_on_all_cpus();
+
+ free_pages((unsigned long)sev_es_tmr,
+ get_order(SEV_ES_TMR_SIZE));
+ sev_es_tmr = NULL;
+ }
}
int key_len = keylen >> 1;
int err;
- SHASH_DESC_ON_STACK(desc, ctx_p->shash_tfm);
-
- desc->tfm = ctx_p->shash_tfm;
-
- err = crypto_shash_digest(desc, ctx_p->user.key, key_len,
- ctx_p->user.key + key_len);
+ err = crypto_shash_tfm_digest(ctx_p->shash_tfm,
+ ctx_p->user.key, key_len,
+ ctx_p->user.key + key_len);
if (err) {
dev_err(dev, "Failed to hash ESSIV key.\n");
return err;
{ .name = "VERSION" }, /* Must be 1st */
};
-static struct debugfs_reg32 pid_cid_regs[] = {
+static const struct debugfs_reg32 pid_cid_regs[] = {
CC_DEBUG_REG(PERIPHERAL_ID_0),
CC_DEBUG_REG(PERIPHERAL_ID_1),
CC_DEBUG_REG(PERIPHERAL_ID_2),
CC_DEBUG_REG(COMPONENT_ID_3),
};
-static struct debugfs_reg32 debug_regs[] = {
+static const struct debugfs_reg32 debug_regs[] = {
CC_DEBUG_REG(HOST_IRR),
CC_DEBUG_REG(HOST_POWER_DOWN_EN),
CC_DEBUG_REG(AXIM_MON_ERR),
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/crypto.h>
-#include <linux/cryptohash.h>
#include <linux/skbuff.h>
#include <linux/rtnetlink.h>
#include <linux/highmem.h>
struct uld_ctx *u_ctx = ULD_CTX(h_ctx(rtfm));
struct chcr_context *ctx = h_ctx(rtfm);
u8 bs = crypto_tfm_alg_blocksize(crypto_ahash_tfm(rtfm));
- int error = -EINVAL;
+ int error;
unsigned int cpu;
cpu = get_cpu();
#include <linux/kernel.h>
#include <linux/module.h>
#include <linux/crypto.h>
-#include <linux/cryptohash.h>
#include <linux/skbuff.h>
#include <linux/rtnetlink.h>
#include <linux/highmem.h>
make_tx_data_wr(sk, skb, immdlen, len,
credits_needed, completion);
tp->snd_nxt += len;
- tp->lsndtime = tcp_time_stamp(tp);
+ tp->lsndtime = tcp_jiffies32;
if (completion)
ULP_SKB_CB(skb)->flags &= ~ULPCB_FLAG_NEED_HDR;
} else {
depends on PCI && PCI_MSI
depends on UACCE || UACCE=n
depends on ARM64 || (COMPILE_TEST && 64BIT)
+ depends on ACPI
help
Support for HiSilicon SEC Engine of version 2 in crypto subsystem.
It provides AES, SM4, and 3DES algorithms with ECB
depends on ARM64 || COMPILE_TEST
depends on PCI && PCI_MSI
depends on UACCE || UACCE=n
+ depends on ACPI
help
HiSilicon accelerator engines use a common queue management
interface. Specific engine driver may use this module.
depends on ARM64 || (COMPILE_TEST && 64BIT)
depends on !CPU_BIG_ENDIAN || COMPILE_TEST
depends on UACCE || UACCE=n
+ depends on ACPI
select CRYPTO_DEV_HISI_QM
help
Support for HiSilicon ZIP Driver
depends on PCI && PCI_MSI
depends on UACCE || UACCE=n
depends on ARM64 || (COMPILE_TEST && 64BIT)
+ depends on ACPI
select CRYPTO_DEV_HISI_QM
select CRYPTO_DH
select CRYPTO_RSA
HPRE_DEBUG_FILE_NUM,
};
+enum hpre_dfx_dbgfs_file {
+ HPRE_SEND_CNT,
+ HPRE_RECV_CNT,
+ HPRE_SEND_FAIL_CNT,
+ HPRE_SEND_BUSY_CNT,
+ HPRE_OVER_THRHLD_CNT,
+ HPRE_OVERTIME_THRHLD,
+ HPRE_INVALID_REQ_CNT,
+ HPRE_DFX_FILE_NUM
+};
+
#define HPRE_DEBUGFS_FILE_NUM (HPRE_DEBUG_FILE_NUM + HPRE_CLUSTERS_NUM - 1)
struct hpre_debugfs_file {
struct hpre_debug *debug;
};
+struct hpre_dfx {
+ atomic64_t value;
+ enum hpre_dfx_dbgfs_file type;
+};
+
/*
* One HPRE controller has one PF and multiple VFs, some global configurations
* which PF has need this structure.
*/
struct hpre_debug {
struct dentry *debug_root;
+ struct hpre_dfx dfx[HPRE_DFX_FILE_NUM];
struct hpre_debugfs_file files[HPRE_DEBUGFS_FILE_NUM];
};
struct hpre {
struct hisi_qm qm;
struct hpre_debug debug;
- u32 num_vfs;
unsigned long status;
};
#include <linux/dma-mapping.h>
#include <linux/fips.h>
#include <linux/module.h>
+#include <linux/time.h>
#include "hpre.h"
struct hpre_ctx;
#define HPRE_SQE_DONE_SHIFT 30
#define HPRE_DH_MAX_P_SZ 512
+#define HPRE_DFX_SEC_TO_US 1000000
+#define HPRE_DFX_US_TO_NS 1000
+
typedef void (*hpre_cb)(struct hpre_ctx *ctx, void *sqe);
struct hpre_rsa_ctx {
struct hpre_ctx {
struct hisi_qp *qp;
struct hpre_asym_request **req_list;
+ struct hpre *hpre;
spinlock_t req_lock;
unsigned int key_sz;
bool crt_g2_mode;
int err;
int req_id;
hpre_cb cb;
+ struct timespec64 req_time;
};
static DEFINE_MUTEX(hpre_alg_lock);
static int hpre_add_req_to_ctx(struct hpre_asym_request *hpre_req)
{
struct hpre_ctx *ctx;
+ struct hpre_dfx *dfx;
int id;
ctx = hpre_req->ctx;
ctx->req_list[id] = hpre_req;
hpre_req->req_id = id;
+ dfx = ctx->hpre->debug.dfx;
+ if (atomic64_read(&dfx[HPRE_OVERTIME_THRHLD].value))
+ ktime_get_ts64(&hpre_req->req_time);
+
return id;
}
static int hpre_ctx_set(struct hpre_ctx *ctx, struct hisi_qp *qp, int qlen)
{
+ struct hpre *hpre;
+
if (!ctx || !qp || qlen < 0)
return -EINVAL;
spin_lock_init(&ctx->req_lock);
ctx->qp = qp;
+ hpre = container_of(ctx->qp->qm, struct hpre, qm);
+ ctx->hpre = hpre;
ctx->req_list = kcalloc(qlen, sizeof(void *), GFP_KERNEL);
if (!ctx->req_list)
return -ENOMEM;
ctx->key_sz = 0;
}
+static bool hpre_is_bd_timeout(struct hpre_asym_request *req,
+ u64 overtime_thrhld)
+{
+ struct timespec64 reply_time;
+ u64 time_use_us;
+
+ ktime_get_ts64(&reply_time);
+ time_use_us = (reply_time.tv_sec - req->req_time.tv_sec) *
+ HPRE_DFX_SEC_TO_US +
+ (reply_time.tv_nsec - req->req_time.tv_nsec) /
+ HPRE_DFX_US_TO_NS;
+
+ if (time_use_us <= overtime_thrhld)
+ return false;
+
+ return true;
+}
+
static void hpre_dh_cb(struct hpre_ctx *ctx, void *resp)
{
+ struct hpre_dfx *dfx = ctx->hpre->debug.dfx;
struct hpre_asym_request *req;
struct kpp_request *areq;
+ u64 overtime_thrhld;
int ret;
ret = hpre_alg_res_post_hf(ctx, resp, (void **)&req);
areq = req->areq.dh;
areq->dst_len = ctx->key_sz;
+
+ overtime_thrhld = atomic64_read(&dfx[HPRE_OVERTIME_THRHLD].value);
+ if (overtime_thrhld && hpre_is_bd_timeout(req, overtime_thrhld))
+ atomic64_inc(&dfx[HPRE_OVER_THRHLD_CNT].value);
+
hpre_hw_data_clr_all(ctx, req, areq->dst, areq->src);
kpp_request_complete(areq, ret);
+ atomic64_inc(&dfx[HPRE_RECV_CNT].value);
}
static void hpre_rsa_cb(struct hpre_ctx *ctx, void *resp)
{
+ struct hpre_dfx *dfx = ctx->hpre->debug.dfx;
struct hpre_asym_request *req;
struct akcipher_request *areq;
+ u64 overtime_thrhld;
int ret;
ret = hpre_alg_res_post_hf(ctx, resp, (void **)&req);
+
+ overtime_thrhld = atomic64_read(&dfx[HPRE_OVERTIME_THRHLD].value);
+ if (overtime_thrhld && hpre_is_bd_timeout(req, overtime_thrhld))
+ atomic64_inc(&dfx[HPRE_OVER_THRHLD_CNT].value);
+
areq = req->areq.rsa;
areq->dst_len = ctx->key_sz;
hpre_hw_data_clr_all(ctx, req, areq->dst, areq->src);
akcipher_request_complete(areq, ret);
+ atomic64_inc(&dfx[HPRE_RECV_CNT].value);
}
static void hpre_alg_cb(struct hisi_qp *qp, void *resp)
{
struct hpre_ctx *ctx = qp->qp_ctx;
+ struct hpre_dfx *dfx = ctx->hpre->debug.dfx;
struct hpre_sqe *sqe = resp;
+ struct hpre_asym_request *req = ctx->req_list[le16_to_cpu(sqe->tag)];
- ctx->req_list[le16_to_cpu(sqe->tag)]->cb(ctx, resp);
+
+ if (unlikely(!req)) {
+ atomic64_inc(&dfx[HPRE_INVALID_REQ_CNT].value);
+ return;
+ }
+
+ req->cb(ctx, resp);
}
static int hpre_ctx_init(struct hpre_ctx *ctx)
return 0;
}
+static int hpre_send(struct hpre_ctx *ctx, struct hpre_sqe *msg)
+{
+ struct hpre_dfx *dfx = ctx->hpre->debug.dfx;
+ int ctr = 0;
+ int ret;
+
+ do {
+ atomic64_inc(&dfx[HPRE_SEND_CNT].value);
+ ret = hisi_qp_send(ctx->qp, msg);
+ if (ret != -EBUSY)
+ break;
+ atomic64_inc(&dfx[HPRE_SEND_BUSY_CNT].value);
+ } while (ctr++ < HPRE_TRY_SEND_TIMES);
+
+ if (likely(!ret))
+ return ret;
+
+ if (ret != -EBUSY)
+ atomic64_inc(&dfx[HPRE_SEND_FAIL_CNT].value);
+
+ return ret;
+}
+
#ifdef CONFIG_CRYPTO_DH
static int hpre_dh_compute_value(struct kpp_request *req)
{
void *tmp = kpp_request_ctx(req);
struct hpre_asym_request *hpre_req = PTR_ALIGN(tmp, HPRE_ALIGN_SZ);
struct hpre_sqe *msg = &hpre_req->req;
- int ctr = 0;
int ret;
ret = hpre_msg_request_set(ctx, req, false);
msg->dw0 = cpu_to_le32(le32_to_cpu(msg->dw0) | HPRE_ALG_DH_G2);
else
msg->dw0 = cpu_to_le32(le32_to_cpu(msg->dw0) | HPRE_ALG_DH);
- do {
- ret = hisi_qp_send(ctx->qp, msg);
- } while (ret == -EBUSY && ctr++ < HPRE_TRY_SEND_TIMES);
/* success */
+ ret = hpre_send(ctx, msg);
if (likely(!ret))
return -EINPROGRESS;
void *tmp = akcipher_request_ctx(req);
struct hpre_asym_request *hpre_req = PTR_ALIGN(tmp, HPRE_ALIGN_SZ);
struct hpre_sqe *msg = &hpre_req->req;
- int ctr = 0;
int ret;
/* For 512 and 1536 bits key size, use soft tfm instead */
if (unlikely(ret))
goto clear_all;
- do {
- ret = hisi_qp_send(ctx->qp, msg);
- } while (ret == -EBUSY && ctr++ < HPRE_TRY_SEND_TIMES);
-
/* success */
+ ret = hpre_send(ctx, msg);
if (likely(!ret))
return -EINPROGRESS;
void *tmp = akcipher_request_ctx(req);
struct hpre_asym_request *hpre_req = PTR_ALIGN(tmp, HPRE_ALIGN_SZ);
struct hpre_sqe *msg = &hpre_req->req;
- int ctr = 0;
int ret;
/* For 512 and 1536 bits key size, use soft tfm instead */
if (unlikely(ret))
goto clear_all;
- do {
- ret = hisi_qp_send(ctx->qp, msg);
- } while (ret == -EBUSY && ctr++ < HPRE_TRY_SEND_TIMES);
-
/* success */
+ ret = hpre_send(ctx, msg);
if (likely(!ret))
return -EINPROGRESS;
#define HPRE_HAC_ECC2_CNT 0x301a08
#define HPRE_HAC_INT_STATUS 0x301800
#define HPRE_HAC_SOURCE_INT 0x301600
-#define MASTER_GLOBAL_CTRL_SHUTDOWN 1
-#define MASTER_TRANS_RETURN_RW 3
-#define HPRE_MASTER_TRANS_RETURN 0x300150
-#define HPRE_MASTER_GLOBAL_CTRL 0x300000
#define HPRE_CLSTR_ADDR_INTRVL 0x1000
#define HPRE_CLUSTER_INQURY 0x100
#define HPRE_CLSTR_ADDR_INQRY_RSLT 0x104
#define HPRE_BD_USR_MASK 0x3
#define HPRE_CLUSTER_CORE_MASK 0xf
+#define HPRE_AM_OOO_SHUTDOWN_ENB 0x301044
+#define HPRE_AM_OOO_SHUTDOWN_ENABLE BIT(0)
+#define HPRE_WR_MSI_PORT BIT(2)
+
+#define HPRE_CORE_ECC_2BIT_ERR BIT(1)
+#define HPRE_OOO_ECC_2BIT_ERR BIT(5)
+
#define HPRE_VIA_MSI_DSM 1
+#define HPRE_SQE_MASK_OFFSET 8
+#define HPRE_SQE_MASK_LEN 24
static struct hisi_qm_list hpre_devices;
static const char hpre_name[] = "hisi_hpre";
HPRE_CLSTR_BASE + HPRE_CLUSTER3 * HPRE_CLSTR_ADDR_INTRVL,
};
-static struct debugfs_reg32 hpre_cluster_dfx_regs[] = {
+static const struct debugfs_reg32 hpre_cluster_dfx_regs[] = {
{"CORES_EN_STATUS ", HPRE_CORE_EN_OFFSET},
{"CORES_INI_CFG ", HPRE_CORE_INI_CFG_OFFSET},
{"CORES_INI_STATUS ", HPRE_CORE_INI_STATUS_OFFSET},
{"CORES_IS_SCHD ", HPRE_CORE_IS_SCHD_OFFSET},
};
-static struct debugfs_reg32 hpre_com_dfx_regs[] = {
+static const struct debugfs_reg32 hpre_com_dfx_regs[] = {
{"READ_CLR_EN ", HPRE_CTRL_CNT_CLR_CE},
{"AXQOS ", HPRE_VFG_AXQOS},
{"AWUSR_CFG ", HPRE_AWUSR_FP_CFG},
{"INT_STATUS ", HPRE_INT_STATUS},
};
-static int hpre_pf_q_num_set(const char *val, const struct kernel_param *kp)
-{
- struct pci_dev *pdev;
- u32 n, q_num;
- u8 rev_id;
- int ret;
-
- if (!val)
- return -EINVAL;
-
- pdev = pci_get_device(PCI_VENDOR_ID_HUAWEI, HPRE_PCI_DEVICE_ID, NULL);
- if (!pdev) {
- q_num = HPRE_QUEUE_NUM_V2;
- pr_info("No device found currently, suppose queue number is %d\n",
- q_num);
- } else {
- rev_id = pdev->revision;
- if (rev_id != QM_HW_V2)
- return -EINVAL;
-
- q_num = HPRE_QUEUE_NUM_V2;
- }
-
- ret = kstrtou32(val, 10, &n);
- if (ret != 0 || n == 0 || n > q_num)
- return -EINVAL;
+static const char *hpre_dfx_files[HPRE_DFX_FILE_NUM] = {
+ "send_cnt",
+ "recv_cnt",
+ "send_fail_cnt",
+ "send_busy_cnt",
+ "over_thrhld_cnt",
+ "overtime_thrhld",
+ "invalid_req_cnt"
+};
- return param_set_int(val, kp);
+static int pf_q_num_set(const char *val, const struct kernel_param *kp)
+{
+ return q_num_set(val, kp, HPRE_PCI_DEVICE_ID);
}
static const struct kernel_param_ops hpre_pf_q_num_ops = {
- .set = hpre_pf_q_num_set,
+ .set = pf_q_num_set,
.get = param_get_int,
};
-static u32 hpre_pf_q_num = HPRE_PF_DEF_Q_NUM;
-module_param_cb(hpre_pf_q_num, &hpre_pf_q_num_ops, &hpre_pf_q_num, 0444);
-MODULE_PARM_DESC(hpre_pf_q_num, "Number of queues in PF of CS(1-1024)");
+static u32 pf_q_num = HPRE_PF_DEF_Q_NUM;
+module_param_cb(pf_q_num, &hpre_pf_q_num_ops, &pf_q_num, 0444);
+MODULE_PARM_DESC(pf_q_num, "Number of queues in PF of CS(1-1024)");
+
+static const struct kernel_param_ops vfs_num_ops = {
+ .set = vfs_num_set,
+ .get = param_get_int,
+};
+
+static u32 vfs_num;
+module_param_cb(vfs_num, &vfs_num_ops, &vfs_num, 0444);
+MODULE_PARM_DESC(vfs_num, "Number of VFs to enable(1-63), 0(default)");
struct hisi_qp *hpre_create_qp(void)
{
return 0;
}
-static int hpre_set_user_domain_and_cache(struct hpre *hpre)
+static int hpre_set_user_domain_and_cache(struct hisi_qm *qm)
{
- struct hisi_qm *qm = &hpre->qm;
struct device *dev = &qm->pdev->dev;
unsigned long offset;
int ret, i;
static void hpre_hw_error_disable(struct hisi_qm *qm)
{
+ u32 val;
+
/* disable hpre hw error interrupts */
writel(HPRE_CORE_INT_DISABLE, qm->io_base + HPRE_INT_MASK);
+
+ /* disable HPRE block master OOO when m-bit error occur */
+ val = readl(qm->io_base + HPRE_AM_OOO_SHUTDOWN_ENB);
+ val &= ~HPRE_AM_OOO_SHUTDOWN_ENABLE;
+ writel(val, qm->io_base + HPRE_AM_OOO_SHUTDOWN_ENB);
}
static void hpre_hw_error_enable(struct hisi_qm *qm)
{
+ u32 val;
+
+ /* clear HPRE hw error source if having */
+ writel(HPRE_CORE_INT_DISABLE, qm->io_base + HPRE_HAC_SOURCE_INT);
+
/* enable hpre hw error interrupts */
writel(HPRE_CORE_INT_ENABLE, qm->io_base + HPRE_INT_MASK);
writel(HPRE_HAC_RAS_CE_ENABLE, qm->io_base + HPRE_RAS_CE_ENB);
writel(HPRE_HAC_RAS_NFE_ENABLE, qm->io_base + HPRE_RAS_NFE_ENB);
writel(HPRE_HAC_RAS_FE_ENABLE, qm->io_base + HPRE_RAS_FE_ENB);
+
+ /* enable HPRE block master OOO when m-bit error occur */
+ val = readl(qm->io_base + HPRE_AM_OOO_SHUTDOWN_ENB);
+ val |= HPRE_AM_OOO_SHUTDOWN_ENABLE;
+ writel(val, qm->io_base + HPRE_AM_OOO_SHUTDOWN_ENB);
}
static inline struct hisi_qm *hpre_file_to_qm(struct hpre_debugfs_file *file)
static int hpre_current_qm_write(struct hpre_debugfs_file *file, u32 val)
{
struct hisi_qm *qm = hpre_file_to_qm(file);
- struct hpre_debug *debug = file->debug;
- struct hpre *hpre = container_of(debug, struct hpre, debug);
- u32 num_vfs = hpre->num_vfs;
+ u32 num_vfs = qm->vfs_num;
u32 vfq_num, tmp;
.write = hpre_ctrl_debug_write,
};
+static int hpre_debugfs_atomic64_get(void *data, u64 *val)
+{
+ struct hpre_dfx *dfx_item = data;
+
+ *val = atomic64_read(&dfx_item->value);
+
+ return 0;
+}
+
+static int hpre_debugfs_atomic64_set(void *data, u64 val)
+{
+ struct hpre_dfx *dfx_item = data;
+ struct hpre_dfx *hpre_dfx = dfx_item - HPRE_OVERTIME_THRHLD;
+
+ if (val)
+ return -EINVAL;
+
+ if (dfx_item->type == HPRE_OVERTIME_THRHLD)
+ atomic64_set(&hpre_dfx[HPRE_OVER_THRHLD_CNT].value, 0);
+ atomic64_set(&dfx_item->value, val);
+
+ return 0;
+}
+
+DEFINE_DEBUGFS_ATTRIBUTE(hpre_atomic64_ops, hpre_debugfs_atomic64_get,
+ hpre_debugfs_atomic64_set, "%llu\n");
+
static int hpre_create_debugfs_file(struct hpre_debug *dbg, struct dentry *dir,
enum hpre_ctrl_dbgfs_file type, int indx)
{
return hpre_cluster_debugfs_init(debug);
}
+static void hpre_dfx_debug_init(struct hpre_debug *debug)
+{
+ struct hpre *hpre = container_of(debug, struct hpre, debug);
+ struct hpre_dfx *dfx = hpre->debug.dfx;
+ struct hisi_qm *qm = &hpre->qm;
+ struct dentry *parent;
+ int i;
+
+ parent = debugfs_create_dir("hpre_dfx", qm->debug.debug_root);
+ for (i = 0; i < HPRE_DFX_FILE_NUM; i++) {
+ dfx[i].type = i;
+ debugfs_create_file(hpre_dfx_files[i], 0644, parent, &dfx[i],
+ &hpre_atomic64_ops);
+ }
+}
+
static int hpre_debugfs_init(struct hpre *hpre)
{
struct hisi_qm *qm = &hpre->qm;
dir = debugfs_create_dir(dev_name(dev), hpre_debugfs_root);
qm->debug.debug_root = dir;
+ qm->debug.sqe_mask_offset = HPRE_SQE_MASK_OFFSET;
+ qm->debug.sqe_mask_len = HPRE_SQE_MASK_LEN;
ret = hisi_qm_debug_init(qm);
if (ret)
if (ret)
goto failed_to_create;
}
+
+ hpre_dfx_debug_init(&hpre->debug);
+
return 0;
failed_to_create:
debugfs_remove_recursive(qm->debug.debug_root);
}
-static int hpre_qm_pre_init(struct hisi_qm *qm, struct pci_dev *pdev)
+static int hpre_qm_init(struct hisi_qm *qm, struct pci_dev *pdev)
{
- enum qm_hw_ver rev_id;
-
- rev_id = hisi_qm_get_hw_version(pdev);
- if (rev_id < 0)
- return -ENODEV;
-
- if (rev_id == QM_HW_V1) {
+ if (pdev->revision == QM_HW_V1) {
pci_warn(pdev, "HPRE version 1 is not supported!\n");
return -EINVAL;
}
qm->pdev = pdev;
- qm->ver = rev_id;
+ qm->ver = pdev->revision;
qm->sqe_size = HPRE_SQE_SIZE;
qm->dev_name = hpre_name;
+
qm->fun_type = (pdev->device == HPRE_PCI_DEVICE_ID) ?
- QM_HW_PF : QM_HW_VF;
- if (pdev->is_physfn) {
+ QM_HW_PF : QM_HW_VF;
+ if (qm->fun_type == QM_HW_PF) {
qm->qp_base = HPRE_PF_DEF_Q_BASE;
- qm->qp_num = hpre_pf_q_num;
+ qm->qp_num = pf_q_num;
+ qm->qm_list = &hpre_devices;
}
- qm->use_dma_api = true;
- return 0;
+ return hisi_qm_init(qm);
}
static void hpre_log_hw_error(struct hisi_qm *qm, u32 err_sts)
err->msg, err->int_msk);
err++;
}
-
- writel(err_sts, qm->io_base + HPRE_HAC_SOURCE_INT);
}
static u32 hpre_get_hw_err_status(struct hisi_qm *qm)
return readl(qm->io_base + HPRE_HAC_INT_STATUS);
}
+static void hpre_clear_hw_err_status(struct hisi_qm *qm, u32 err_sts)
+{
+ writel(err_sts, qm->io_base + HPRE_HAC_SOURCE_INT);
+}
+
+static void hpre_open_axi_master_ooo(struct hisi_qm *qm)
+{
+ u32 value;
+
+ value = readl(qm->io_base + HPRE_AM_OOO_SHUTDOWN_ENB);
+ writel(value & ~HPRE_AM_OOO_SHUTDOWN_ENABLE,
+ HPRE_ADDR(qm, HPRE_AM_OOO_SHUTDOWN_ENB));
+ writel(value | HPRE_AM_OOO_SHUTDOWN_ENABLE,
+ HPRE_ADDR(qm, HPRE_AM_OOO_SHUTDOWN_ENB));
+}
+
static const struct hisi_qm_err_ini hpre_err_ini = {
+ .hw_init = hpre_set_user_domain_and_cache,
.hw_err_enable = hpre_hw_error_enable,
.hw_err_disable = hpre_hw_error_disable,
.get_dev_hw_err_status = hpre_get_hw_err_status,
+ .clear_dev_hw_err_status = hpre_clear_hw_err_status,
.log_dev_hw_err = hpre_log_hw_error,
+ .open_axi_master_ooo = hpre_open_axi_master_ooo,
.err_info = {
.ce = QM_BASE_CE,
.nfe = QM_BASE_NFE | QM_ACC_DO_TASK_TIMEOUT,
.fe = 0,
- .msi = QM_DB_RANDOM_INVALID,
+ .ecc_2bits_mask = HPRE_CORE_ECC_2BIT_ERR |
+ HPRE_OOO_ECC_2BIT_ERR,
+ .msi_wr_port = HPRE_WR_MSI_PORT,
+ .acpi_rst = "HRST",
}
};
qm->ctrl_qp_num = HPRE_QUEUE_NUM_V2;
- ret = hpre_set_user_domain_and_cache(hpre);
+ ret = hpre_set_user_domain_and_cache(qm);
if (ret)
return ret;
return 0;
}
+static int hpre_probe_init(struct hpre *hpre)
+{
+ struct hisi_qm *qm = &hpre->qm;
+ int ret;
+
+ if (qm->fun_type == QM_HW_PF) {
+ ret = hpre_pf_probe_init(hpre);
+ if (ret)
+ return ret;
+ }
+
+ return 0;
+}
+
static int hpre_probe(struct pci_dev *pdev, const struct pci_device_id *id)
{
struct hisi_qm *qm;
if (!hpre)
return -ENOMEM;
- pci_set_drvdata(pdev, hpre);
-
qm = &hpre->qm;
- ret = hpre_qm_pre_init(qm, pdev);
- if (ret)
- return ret;
-
- ret = hisi_qm_init(qm);
- if (ret)
+ ret = hpre_qm_init(qm, pdev);
+ if (ret) {
+ pci_err(pdev, "Failed to init HPRE QM (%d)!\n", ret);
return ret;
+ }
- if (pdev->is_physfn) {
- ret = hpre_pf_probe_init(hpre);
- if (ret)
- goto err_with_qm_init;
- } else if (qm->fun_type == QM_HW_VF && qm->ver == QM_HW_V2) {
- /* v2 starts to support get vft by mailbox */
- ret = hisi_qm_get_vft(qm, &qm->qp_base, &qm->qp_num);
- if (ret)
- goto err_with_qm_init;
+ ret = hpre_probe_init(hpre);
+ if (ret) {
+ pci_err(pdev, "Failed to probe (%d)!\n", ret);
+ goto err_with_qm_init;
}
ret = hisi_qm_start(qm);
pci_err(pdev, "fail to register algs to crypto!\n");
goto err_with_qm_start;
}
+
+ if (qm->fun_type == QM_HW_PF && vfs_num) {
+ ret = hisi_qm_sriov_enable(pdev, vfs_num);
+ if (ret < 0)
+ goto err_with_crypto_register;
+ }
+
return 0;
+err_with_crypto_register:
+ hpre_algs_unregister();
+
err_with_qm_start:
hisi_qm_del_from_list(qm, &hpre_devices);
hisi_qm_stop(qm);
return ret;
}
-static int hpre_vf_q_assign(struct hpre *hpre, int num_vfs)
-{
- struct hisi_qm *qm = &hpre->qm;
- u32 qp_num = qm->qp_num;
- int q_num, remain_q_num, i;
- u32 q_base = qp_num;
- int ret;
-
- if (!num_vfs)
- return -EINVAL;
-
- remain_q_num = qm->ctrl_qp_num - qp_num;
-
- /* If remaining queues are not enough, return error. */
- if (remain_q_num < num_vfs)
- return -EINVAL;
-
- q_num = remain_q_num / num_vfs;
- for (i = 1; i <= num_vfs; i++) {
- if (i == num_vfs)
- q_num += remain_q_num % num_vfs;
- ret = hisi_qm_set_vft(qm, i, q_base, (u32)q_num);
- if (ret)
- return ret;
- q_base += q_num;
- }
-
- return 0;
-}
-
-static int hpre_clear_vft_config(struct hpre *hpre)
-{
- struct hisi_qm *qm = &hpre->qm;
- u32 num_vfs = hpre->num_vfs;
- int ret;
- u32 i;
-
- for (i = 1; i <= num_vfs; i++) {
- ret = hisi_qm_set_vft(qm, i, 0, 0);
- if (ret)
- return ret;
- }
- hpre->num_vfs = 0;
-
- return 0;
-}
-
-static int hpre_sriov_enable(struct pci_dev *pdev, int max_vfs)
-{
- struct hpre *hpre = pci_get_drvdata(pdev);
- int pre_existing_vfs, num_vfs, ret;
-
- pre_existing_vfs = pci_num_vf(pdev);
- if (pre_existing_vfs) {
- pci_err(pdev,
- "Can't enable VF. Please disable pre-enabled VFs!\n");
- return 0;
- }
-
- num_vfs = min_t(int, max_vfs, HPRE_VF_NUM);
- ret = hpre_vf_q_assign(hpre, num_vfs);
- if (ret) {
- pci_err(pdev, "Can't assign queues for VF!\n");
- return ret;
- }
-
- hpre->num_vfs = num_vfs;
-
- ret = pci_enable_sriov(pdev, num_vfs);
- if (ret) {
- pci_err(pdev, "Can't enable VF!\n");
- hpre_clear_vft_config(hpre);
- return ret;
- }
-
- return num_vfs;
-}
-
-static int hpre_sriov_disable(struct pci_dev *pdev)
-{
- struct hpre *hpre = pci_get_drvdata(pdev);
-
- if (pci_vfs_assigned(pdev)) {
- pci_err(pdev, "Failed to disable VFs while VFs are assigned!\n");
- return -EPERM;
- }
-
- /* remove in hpre_pci_driver will be called to free VF resources */
- pci_disable_sriov(pdev);
-
- return hpre_clear_vft_config(hpre);
-}
-
-static int hpre_sriov_configure(struct pci_dev *pdev, int num_vfs)
-{
- if (num_vfs)
- return hpre_sriov_enable(pdev, num_vfs);
- else
- return hpre_sriov_disable(pdev);
-}
-
static void hpre_remove(struct pci_dev *pdev)
{
struct hpre *hpre = pci_get_drvdata(pdev);
hpre_algs_unregister();
hisi_qm_del_from_list(qm, &hpre_devices);
- if (qm->fun_type == QM_HW_PF && hpre->num_vfs != 0) {
- ret = hpre_sriov_disable(pdev);
+ if (qm->fun_type == QM_HW_PF && qm->vfs_num) {
+ ret = hisi_qm_sriov_disable(pdev);
if (ret) {
pci_err(pdev, "Disable SRIOV fail!\n");
return;
static const struct pci_error_handlers hpre_err_handler = {
.error_detected = hisi_qm_dev_err_detected,
+ .slot_reset = hisi_qm_dev_slot_reset,
+ .reset_prepare = hisi_qm_reset_prepare,
+ .reset_done = hisi_qm_reset_done,
};
static struct pci_driver hpre_pci_driver = {
.id_table = hpre_dev_ids,
.probe = hpre_probe,
.remove = hpre_remove,
- .sriov_configure = hpre_sriov_configure,
+ .sriov_configure = hisi_qm_sriov_configure,
.err_handler = &hpre_err_handler,
};
// SPDX-License-Identifier: GPL-2.0
/* Copyright (c) 2019 HiSilicon Limited. */
#include <asm/page.h>
+#include <linux/acpi.h>
+#include <linux/aer.h>
#include <linux/bitmap.h>
#include <linux/debugfs.h>
#include <linux/dma-mapping.h>
+#include <linux/idr.h>
#include <linux/io.h>
#include <linux/irqreturn.h>
#include <linux/log2.h>
#define QM_SQ_TYPE_SHIFT 8
#define QM_SQ_TYPE_MASK GENMASK(3, 0)
+#define QM_SQ_TAIL_IDX(sqc) ((le16_to_cpu((sqc)->w11) >> 6) & 0x1)
/* cqc shift */
#define QM_CQ_HOP_NUM_SHIFT 0
#define QM_CQE_PHASE(cqe) (le16_to_cpu((cqe)->w7) & 0x1)
#define QM_QC_CQE_SIZE 4
+#define QM_CQ_TAIL_IDX(cqc) ((le16_to_cpu((cqc)->w11) >> 6) & 0x1)
/* eqc shift */
#define QM_EQE_AEQE_SIZE (2UL << 12)
#define QM_DFX_CNT_CLR_CE 0x100118
#define QM_ABNORMAL_INT_SOURCE 0x100000
+#define QM_ABNORMAL_INT_SOURCE_CLR GENMASK(12, 0)
#define QM_ABNORMAL_INT_MASK 0x100004
#define QM_ABNORMAL_INT_MASK_VALUE 0x1fff
#define QM_ABNORMAL_INT_STATUS 0x100008
+#define QM_ABNORMAL_INT_SET 0x10000c
#define QM_ABNORMAL_INF00 0x100010
#define QM_FIFO_OVERFLOW_TYPE 0xc0
#define QM_FIFO_OVERFLOW_TYPE_SHIFT 6
#define QM_RAS_CE_TIMES_PER_IRQ 1
#define QM_RAS_MSI_INT_SEL 0x1040f4
+#define QM_DEV_RESET_FLAG 0
+#define QM_RESET_WAIT_TIMEOUT 400
+#define QM_PEH_VENDOR_ID 0x1000d8
+#define ACC_VENDOR_ID_VALUE 0x5a5a
+#define QM_PEH_DFX_INFO0 0x1000fc
+#define ACC_PEH_SRIOV_CTRL_VF_MSE_SHIFT 3
+#define ACC_PEH_MSI_DISABLE GENMASK(31, 0)
+#define ACC_MASTER_GLOBAL_CTRL_SHUTDOWN 0x1
+#define ACC_MASTER_TRANS_RETURN_RW 3
+#define ACC_MASTER_TRANS_RETURN 0x300150
+#define ACC_MASTER_GLOBAL_CTRL 0x300000
+#define ACC_AM_CFG_PORT_WR_EN 0x30001c
+#define QM_RAS_NFE_MBIT_DISABLE ~QM_ECC_MBIT
+#define ACC_AM_ROB_ECC_INT_STS 0x300104
+#define ACC_ROB_ECC_ERR_MULTPL BIT(1)
+
+#define POLL_PERIOD 10
+#define POLL_TIMEOUT 1000
+#define WAIT_PERIOD_US_MAX 200
+#define WAIT_PERIOD_US_MIN 100
+#define MAX_WAIT_COUNTS 1000
#define QM_CACHE_WB_START 0x204
#define QM_CACHE_WB_DONE 0x208
#define QM_SQE_DATA_ALIGN_MASK GENMASK(6, 0)
#define QMC_ALIGN(sz) ALIGN(sz, 32)
+#define QM_DBG_READ_LEN 256
+#define QM_DBG_WRITE_LEN 1024
#define QM_DBG_TMP_BUF_LEN 22
+#define QM_PCI_COMMAND_INVALID ~0
+
+#define QM_SQE_ADDR_MASK GENMASK(7, 0)
#define QM_MK_CQC_DW3_V1(hop_num, pg_sz, buf_sz, cqe_sz) \
(((hop_num) << QM_CQ_HOP_NUM_SHIFT) | \
CQC_VFT,
};
+enum acc_err_result {
+ ACC_ERR_NONE,
+ ACC_ERR_NEED_RESET,
+ ACC_ERR_RECOVERED,
+};
+
struct qm_cqe {
__le32 rsvd0;
__le16 cmd_id;
u8 cmd, u16 index, u8 priority);
u32 (*get_irq_num)(struct hisi_qm *qm);
int (*debug_init)(struct hisi_qm *qm);
- void (*hw_error_init)(struct hisi_qm *qm, u32 ce, u32 nfe, u32 fe,
- u32 msi);
+ void (*hw_error_init)(struct hisi_qm *qm, u32 ce, u32 nfe, u32 fe);
void (*hw_error_uninit)(struct hisi_qm *qm);
- pci_ers_result_t (*hw_error_handle)(struct hisi_qm *qm);
+ enum acc_err_result (*hw_error_handle)(struct hisi_qm *qm);
+};
+
+struct qm_dfx_item {
+ const char *name;
+ u32 offset;
+};
+
+static struct qm_dfx_item qm_dfx_files[] = {
+ {"err_irq", offsetof(struct qm_dfx, err_irq_cnt)},
+ {"aeq_irq", offsetof(struct qm_dfx, aeq_irq_cnt)},
+ {"abnormal_irq", offsetof(struct qm_dfx, abnormal_irq_cnt)},
+ {"create_qp_err", offsetof(struct qm_dfx, create_qp_err_cnt)},
+ {"mb_err", offsetof(struct qm_dfx, mb_err_cnt)},
};
static const char * const qm_debug_file_name[] = {
"cq", "eq", "aeq",
};
+static const char * const qm_s[] = {
+ "init", "start", "close", "stop",
+};
+
+static const char * const qp_s[] = {
+ "none", "init", "start", "stop", "close",
+};
+
+static bool qm_avail_state(struct hisi_qm *qm, enum qm_state new)
+{
+ enum qm_state curr = atomic_read(&qm->status.flags);
+ bool avail = false;
+
+ switch (curr) {
+ case QM_INIT:
+ if (new == QM_START || new == QM_CLOSE)
+ avail = true;
+ break;
+ case QM_START:
+ if (new == QM_STOP)
+ avail = true;
+ break;
+ case QM_STOP:
+ if (new == QM_CLOSE || new == QM_START)
+ avail = true;
+ break;
+ default:
+ break;
+ }
+
+ dev_dbg(&qm->pdev->dev, "change qm state from %s to %s\n",
+ qm_s[curr], qm_s[new]);
+
+ if (!avail)
+ dev_warn(&qm->pdev->dev, "Can not change qm state from %s to %s\n",
+ qm_s[curr], qm_s[new]);
+
+ return avail;
+}
+
+static bool qm_qp_avail_state(struct hisi_qm *qm, struct hisi_qp *qp,
+ enum qp_state new)
+{
+ enum qm_state qm_curr = atomic_read(&qm->status.flags);
+ enum qp_state qp_curr = 0;
+ bool avail = false;
+
+ if (qp)
+ qp_curr = atomic_read(&qp->qp_status.flags);
+
+ switch (new) {
+ case QP_INIT:
+ if (qm_curr == QM_START || qm_curr == QM_INIT)
+ avail = true;
+ break;
+ case QP_START:
+ if ((qm_curr == QM_START && qp_curr == QP_INIT) ||
+ (qm_curr == QM_START && qp_curr == QP_STOP))
+ avail = true;
+ break;
+ case QP_STOP:
+ if ((qm_curr == QM_START && qp_curr == QP_START) ||
+ (qp_curr == QP_INIT))
+ avail = true;
+ break;
+ case QP_CLOSE:
+ if ((qm_curr == QM_START && qp_curr == QP_INIT) ||
+ (qm_curr == QM_START && qp_curr == QP_STOP) ||
+ (qm_curr == QM_STOP && qp_curr == QP_STOP) ||
+ (qm_curr == QM_STOP && qp_curr == QP_INIT))
+ avail = true;
+ break;
+ default:
+ break;
+ }
+
+ dev_dbg(&qm->pdev->dev, "change qp state from %s to %s in QM %s\n",
+ qp_s[qp_curr], qp_s[new], qm_s[qm_curr]);
+
+ if (!avail)
+ dev_warn(&qm->pdev->dev,
+ "Can not change qp state from %s to %s in QM %s\n",
+ qp_s[qp_curr], qp_s[new], qm_s[qm_curr]);
+
+ return avail;
+}
+
/* return 0 mailbox ready, -ETIMEDOUT hardware timeout */
static int qm_wait_mb_ready(struct hisi_qm *qm)
{
busy_unlock:
mutex_unlock(&qm->mailbox_lock);
+ if (ret)
+ atomic64_inc(&qm->debug.dfx.mb_err_cnt);
return ret;
}
{
u16 cqn = le32_to_cpu(eqe->dw0) & QM_EQE_CQN_MASK;
- return qm->qp_array[cqn];
+ return &qm->qp_array[cqn];
}
static void qm_cq_head_update(struct hisi_qp *qp)
while (QM_EQE_PHASE(eqe) == qm->status.eqc_phase) {
eqe_num++;
qp = qm_to_hisi_qp(qm, eqe);
- if (qp)
- qm_poll_qp(qp, qm);
+ qm_poll_qp(qp, qm);
if (qm->status.eq_head == QM_Q_DEPTH - 1) {
qm->status.eqc_phase = !qm->status.eqc_phase;
if (readl(qm->io_base + QM_VF_EQ_INT_SOURCE))
return do_qm_irq(irq, data);
+ atomic64_inc(&qm->debug.dfx.err_irq_cnt);
dev_err(&qm->pdev->dev, "invalid int source\n");
qm_db(qm, 0, QM_DOORBELL_CMD_EQ, qm->status.eq_head, 0);
struct qm_aeqe *aeqe = qm->aeqe + qm->status.aeq_head;
u32 type;
+ atomic64_inc(&qm->debug.dfx.aeq_irq_cnt);
if (!readl(qm->io_base + QM_VF_AEQ_INT_SOURCE))
return IRQ_NONE;
return IRQ_HANDLED;
}
-static irqreturn_t qm_abnormal_irq(int irq, void *data)
-{
- const struct hisi_qm_hw_error *err = qm_hw_error;
- struct hisi_qm *qm = data;
- struct device *dev = &qm->pdev->dev;
- u32 error_status, tmp;
-
- /* read err sts */
- tmp = readl(qm->io_base + QM_ABNORMAL_INT_STATUS);
- error_status = qm->msi_mask & tmp;
-
- while (err->msg) {
- if (err->int_msk & error_status)
- dev_err(dev, "%s [error status=0x%x] found\n",
- err->msg, err->int_msk);
-
- err++;
- }
-
- /* clear err sts */
- writel(error_status, qm->io_base + QM_ABNORMAL_INT_SOURCE);
-
- return IRQ_HANDLED;
-}
-
-static int qm_irq_register(struct hisi_qm *qm)
-{
- struct pci_dev *pdev = qm->pdev;
- int ret;
-
- ret = request_irq(pci_irq_vector(pdev, QM_EQ_EVENT_IRQ_VECTOR),
- qm_irq, IRQF_SHARED, qm->dev_name, qm);
- if (ret)
- return ret;
-
- if (qm->ver == QM_HW_V2) {
- ret = request_irq(pci_irq_vector(pdev, QM_AEQ_EVENT_IRQ_VECTOR),
- qm_aeq_irq, IRQF_SHARED, qm->dev_name, qm);
- if (ret)
- goto err_aeq_irq;
-
- if (qm->fun_type == QM_HW_PF) {
- ret = request_irq(pci_irq_vector(pdev,
- QM_ABNORMAL_EVENT_IRQ_VECTOR),
- qm_abnormal_irq, IRQF_SHARED,
- qm->dev_name, qm);
- if (ret)
- goto err_abonormal_irq;
- }
- }
-
- return 0;
-
-err_abonormal_irq:
- free_irq(pci_irq_vector(pdev, QM_AEQ_EVENT_IRQ_VECTOR), qm);
-err_aeq_irq:
- free_irq(pci_irq_vector(pdev, QM_EQ_EVENT_IRQ_VECTOR), qm);
- return ret;
-}
-
static void qm_irq_unregister(struct hisi_qm *qm)
{
struct pci_dev *pdev = qm->pdev;
free_irq(pci_irq_vector(pdev, QM_EQ_EVENT_IRQ_VECTOR), qm);
- if (qm->ver == QM_HW_V2) {
- free_irq(pci_irq_vector(pdev, QM_AEQ_EVENT_IRQ_VECTOR), qm);
+ if (qm->ver == QM_HW_V1)
+ return;
+
+ free_irq(pci_irq_vector(pdev, QM_AEQ_EVENT_IRQ_VECTOR), qm);
- if (qm->fun_type == QM_HW_PF)
- free_irq(pci_irq_vector(pdev,
- QM_ABNORMAL_EVENT_IRQ_VECTOR), qm);
- }
+ if (qm->fun_type == QM_HW_PF)
+ free_irq(pci_irq_vector(pdev,
+ QM_ABNORMAL_EVENT_IRQ_VECTOR), qm);
}
static void qm_init_qp_status(struct hisi_qp *qp)
qp_status->sq_tail = 0;
qp_status->cq_head = 0;
qp_status->cqc_phase = true;
- qp_status->flags = 0;
+ atomic_set(&qp_status->flags, 0);
}
static void qm_vft_data_cfg(struct hisi_qm *qm, enum vft_type type, u32 base,
if (number > 0) {
switch (type) {
case SQC_VFT:
- switch (qm->ver) {
- case QM_HW_V1:
+ if (qm->ver == QM_HW_V1) {
tmp = QM_SQC_VFT_BUF_SIZE |
QM_SQC_VFT_SQC_SIZE |
QM_SQC_VFT_INDEX_NUMBER |
QM_SQC_VFT_VALID |
(u64)base << QM_SQC_VFT_START_SQN_SHIFT;
- break;
- case QM_HW_V2:
+ } else {
tmp = (u64)base << QM_SQC_VFT_START_SQN_SHIFT |
QM_SQC_VFT_VALID |
(u64)(number - 1) << QM_SQC_VFT_SQN_SHIFT;
- break;
- case QM_HW_UNKNOWN:
- break;
}
break;
case CQC_VFT:
- switch (qm->ver) {
- case QM_HW_V1:
+ if (qm->ver == QM_HW_V1) {
tmp = QM_CQC_VFT_BUF_SIZE |
QM_CQC_VFT_SQC_SIZE |
QM_CQC_VFT_INDEX_NUMBER |
QM_CQC_VFT_VALID;
- break;
- case QM_HW_V2:
+ } else {
tmp = QM_CQC_VFT_VALID;
- break;
- case QM_HW_UNKNOWN:
- break;
}
break;
}
.release = single_release,
};
-static int qm_create_debugfs_file(struct hisi_qm *qm, enum qm_debug_file index)
+static ssize_t qm_cmd_read(struct file *filp, char __user *buffer,
+ size_t count, loff_t *pos)
{
- struct dentry *qm_d = qm->debug.qm_d;
- struct debugfs_file *file = qm->debug.files + index;
+ char buf[QM_DBG_READ_LEN];
+ int len;
- debugfs_create_file(qm_debug_file_name[index], 0600, qm_d, file,
- &qm_debug_fops);
+ if (*pos)
+ return 0;
- file->index = index;
- mutex_init(&file->lock);
- file->debug = &qm->debug;
+ if (count < QM_DBG_READ_LEN)
+ return -ENOSPC;
- return 0;
-}
+ len = snprintf(buf, QM_DBG_READ_LEN, "%s\n",
+ "Please echo help to cmd to get help information");
-static void qm_hw_error_init_v1(struct hisi_qm *qm, u32 ce, u32 nfe, u32 fe,
- u32 msi)
-{
- writel(QM_ABNORMAL_INT_MASK_VALUE, qm->io_base + QM_ABNORMAL_INT_MASK);
+ if (copy_to_user(buffer, buf, len))
+ return -EFAULT;
+
+ return (*pos = len);
}
-static void qm_hw_error_init_v2(struct hisi_qm *qm, u32 ce, u32 nfe, u32 fe,
- u32 msi)
+static void *qm_ctx_alloc(struct hisi_qm *qm, size_t ctx_size,
+ dma_addr_t *dma_addr)
{
- u32 irq_enable = ce | nfe | fe | msi;
- u32 irq_unmask = ~irq_enable;
-
- qm->error_mask = ce | nfe | fe;
- qm->msi_mask = msi;
+ struct device *dev = &qm->pdev->dev;
+ void *ctx_addr;
- /* configure error type */
- writel(ce, qm->io_base + QM_RAS_CE_ENABLE);
- writel(QM_RAS_CE_TIMES_PER_IRQ, qm->io_base + QM_RAS_CE_THRESHOLD);
- writel(nfe, qm->io_base + QM_RAS_NFE_ENABLE);
- writel(fe, qm->io_base + QM_RAS_FE_ENABLE);
+ ctx_addr = kzalloc(ctx_size, GFP_KERNEL);
+ if (!ctx_addr)
+ return ERR_PTR(-ENOMEM);
- /* use RAS irq default, so only set QM_RAS_MSI_INT_SEL for MSI */
- writel(msi, qm->io_base + QM_RAS_MSI_INT_SEL);
+ *dma_addr = dma_map_single(dev, ctx_addr, ctx_size, DMA_FROM_DEVICE);
+ if (dma_mapping_error(dev, *dma_addr)) {
+ dev_err(dev, "DMA mapping error!\n");
+ kfree(ctx_addr);
+ return ERR_PTR(-ENOMEM);
+ }
- irq_unmask &= readl(qm->io_base + QM_ABNORMAL_INT_MASK);
- writel(irq_unmask, qm->io_base + QM_ABNORMAL_INT_MASK);
+ return ctx_addr;
}
-static void qm_hw_error_uninit_v2(struct hisi_qm *qm)
+static void qm_ctx_free(struct hisi_qm *qm, size_t ctx_size,
+ const void *ctx_addr, dma_addr_t *dma_addr)
{
- writel(QM_ABNORMAL_INT_MASK_VALUE, qm->io_base + QM_ABNORMAL_INT_MASK);
+ struct device *dev = &qm->pdev->dev;
+
+ dma_unmap_single(dev, *dma_addr, ctx_size, DMA_FROM_DEVICE);
+ kfree(ctx_addr);
}
-static void qm_log_hw_error(struct hisi_qm *qm, u32 error_status)
+static int dump_show(struct hisi_qm *qm, void *info,
+ unsigned int info_size, char *info_name)
{
- const struct hisi_qm_hw_error *err;
struct device *dev = &qm->pdev->dev;
- u32 reg_val, type, vf_num;
- int i;
-
- for (i = 0; i < ARRAY_SIZE(qm_hw_error); i++) {
- err = &qm_hw_error[i];
- if (!(err->int_msk & error_status))
- continue;
+ u8 *info_buf, *info_curr = info;
+ u32 i;
+#define BYTE_PER_DW 4
- dev_err(dev, "%s [error status=0x%x] found\n",
- err->msg, err->int_msk);
+ info_buf = kzalloc(info_size, GFP_KERNEL);
+ if (!info_buf)
+ return -ENOMEM;
- if (err->int_msk & QM_DB_TIMEOUT) {
- reg_val = readl(qm->io_base + QM_ABNORMAL_INF01);
- type = (reg_val & QM_DB_TIMEOUT_TYPE) >>
- QM_DB_TIMEOUT_TYPE_SHIFT;
- vf_num = reg_val & QM_DB_TIMEOUT_VF;
- dev_err(dev, "qm %s doorbell timeout in function %u\n",
- qm_db_timeout[type], vf_num);
- } else if (err->int_msk & QM_OF_FIFO_OF) {
- reg_val = readl(qm->io_base + QM_ABNORMAL_INF00);
- type = (reg_val & QM_FIFO_OVERFLOW_TYPE) >>
- QM_FIFO_OVERFLOW_TYPE_SHIFT;
- vf_num = reg_val & QM_FIFO_OVERFLOW_VF;
+ for (i = 0; i < info_size; i++, info_curr++) {
+ if (i % BYTE_PER_DW == 0)
+ info_buf[i + 3UL] = *info_curr;
+ else if (i % BYTE_PER_DW == 1)
+ info_buf[i + 1UL] = *info_curr;
+ else if (i % BYTE_PER_DW == 2)
+ info_buf[i - 1] = *info_curr;
+ else if (i % BYTE_PER_DW == 3)
+ info_buf[i - 3] = *info_curr;
+ }
- if (type < ARRAY_SIZE(qm_fifo_overflow))
- dev_err(dev, "qm %s fifo overflow in function %u\n",
- qm_fifo_overflow[type], vf_num);
- else
- dev_err(dev, "unknown error type\n");
- }
+ dev_info(dev, "%s DUMP\n", info_name);
+ for (i = 0; i < info_size; i += BYTE_PER_DW) {
+ pr_info("DW%d: %02X%02X %02X%02X\n", i / BYTE_PER_DW,
+ info_buf[i], info_buf[i + 1UL],
+ info_buf[i + 2UL], info_buf[i + 3UL]);
}
+
+ kfree(info_buf);
+
+ return 0;
}
-static pci_ers_result_t qm_hw_error_handle_v2(struct hisi_qm *qm)
+static int qm_dump_sqc_raw(struct hisi_qm *qm, dma_addr_t dma_addr, u16 qp_id)
{
- u32 error_status, tmp;
+ return qm_mb(qm, QM_MB_CMD_SQC, dma_addr, qp_id, 1);
+}
- /* read err sts */
- tmp = readl(qm->io_base + QM_ABNORMAL_INT_STATUS);
- error_status = qm->error_mask & tmp;
+static int qm_dump_cqc_raw(struct hisi_qm *qm, dma_addr_t dma_addr, u16 qp_id)
+{
+ return qm_mb(qm, QM_MB_CMD_CQC, dma_addr, qp_id, 1);
+}
- if (error_status) {
- qm_log_hw_error(qm, error_status);
+static int qm_sqc_dump(struct hisi_qm *qm, const char *s)
+{
+ struct device *dev = &qm->pdev->dev;
+ struct qm_sqc *sqc, *sqc_curr;
+ dma_addr_t sqc_dma;
+ u32 qp_id;
+ int ret;
- /* clear err sts */
- writel(error_status, qm->io_base + QM_ABNORMAL_INT_SOURCE);
+ if (!s)
+ return -EINVAL;
- return PCI_ERS_RESULT_NEED_RESET;
+ ret = kstrtou32(s, 0, &qp_id);
+ if (ret || qp_id >= qm->qp_num) {
+ dev_err(dev, "Please input qp num (0-%d)", qm->qp_num - 1);
+ return -EINVAL;
}
- return PCI_ERS_RESULT_RECOVERED;
-}
+ sqc = qm_ctx_alloc(qm, sizeof(*sqc), &sqc_dma);
+ if (IS_ERR(sqc))
+ return PTR_ERR(sqc);
-static const struct hisi_qm_hw_ops qm_hw_ops_v1 = {
- .qm_db = qm_db_v1,
- .get_irq_num = qm_get_irq_num_v1,
- .hw_error_init = qm_hw_error_init_v1,
-};
+ ret = qm_dump_sqc_raw(qm, sqc_dma, qp_id);
+ if (ret) {
+ down_read(&qm->qps_lock);
+ if (qm->sqc) {
+ sqc_curr = qm->sqc + qp_id;
-static const struct hisi_qm_hw_ops qm_hw_ops_v2 = {
- .get_vft = qm_get_vft_v2,
- .qm_db = qm_db_v2,
- .get_irq_num = qm_get_irq_num_v2,
- .hw_error_init = qm_hw_error_init_v2,
- .hw_error_uninit = qm_hw_error_uninit_v2,
- .hw_error_handle = qm_hw_error_handle_v2,
-};
+ ret = dump_show(qm, sqc_curr, sizeof(*sqc),
+ "SOFT SQC");
+ if (ret)
+ dev_info(dev, "Show soft sqc failed!\n");
+ }
+ up_read(&qm->qps_lock);
-static void *qm_get_avail_sqe(struct hisi_qp *qp)
-{
- struct hisi_qp_status *qp_status = &qp->qp_status;
- u16 sq_tail = qp_status->sq_tail;
+ goto err_free_ctx;
+ }
- if (unlikely(atomic_read(&qp->qp_status.used) == QM_Q_DEPTH))
- return NULL;
+ ret = dump_show(qm, sqc, sizeof(*sqc), "SQC");
+ if (ret)
+ dev_info(dev, "Show hw sqc failed!\n");
- return qp->sqe + sq_tail * qp->qm->sqe_size;
+err_free_ctx:
+ qm_ctx_free(qm, sizeof(*sqc), sqc, &sqc_dma);
+ return ret;
}
-/**
- * hisi_qm_create_qp() - Create a queue pair from qm.
- * @qm: The qm we create a qp from.
- * @alg_type: Accelerator specific algorithm type in sqc.
- *
- * return created qp, -EBUSY if all qps in qm allocated, -ENOMEM if allocating
- * qp memory fails.
- */
-struct hisi_qp *hisi_qm_create_qp(struct hisi_qm *qm, u8 alg_type)
+static int qm_cqc_dump(struct hisi_qm *qm, const char *s)
{
struct device *dev = &qm->pdev->dev;
- struct hisi_qp *qp;
- int qp_id, ret;
-
- qp = kzalloc(sizeof(*qp), GFP_KERNEL);
- if (!qp)
- return ERR_PTR(-ENOMEM);
+ struct qm_cqc *cqc, *cqc_curr;
+ dma_addr_t cqc_dma;
+ u32 qp_id;
+ int ret;
- write_lock(&qm->qps_lock);
+ if (!s)
+ return -EINVAL;
- qp_id = find_first_zero_bit(qm->qp_bitmap, qm->qp_num);
- if (qp_id >= qm->qp_num) {
- write_unlock(&qm->qps_lock);
- dev_info(&qm->pdev->dev, "QM all queues are busy!\n");
- ret = -EBUSY;
- goto err_free_qp;
+ ret = kstrtou32(s, 0, &qp_id);
+ if (ret || qp_id >= qm->qp_num) {
+ dev_err(dev, "Please input qp num (0-%d)", qm->qp_num - 1);
+ return -EINVAL;
}
- set_bit(qp_id, qm->qp_bitmap);
- qm->qp_array[qp_id] = qp;
- qm->qp_in_used++;
- write_unlock(&qm->qps_lock);
+ cqc = qm_ctx_alloc(qm, sizeof(*cqc), &cqc_dma);
+ if (IS_ERR(cqc))
+ return PTR_ERR(cqc);
- qp->qm = qm;
+ ret = qm_dump_cqc_raw(qm, cqc_dma, qp_id);
+ if (ret) {
+ down_read(&qm->qps_lock);
+ if (qm->cqc) {
+ cqc_curr = qm->cqc + qp_id;
- if (qm->use_dma_api) {
- qp->qdma.size = qm->sqe_size * QM_Q_DEPTH +
- sizeof(struct qm_cqe) * QM_Q_DEPTH;
- qp->qdma.va = dma_alloc_coherent(dev, qp->qdma.size,
- &qp->qdma.dma, GFP_KERNEL);
- if (!qp->qdma.va) {
- ret = -ENOMEM;
- goto err_clear_bit;
+ ret = dump_show(qm, cqc_curr, sizeof(*cqc),
+ "SOFT CQC");
+ if (ret)
+ dev_info(dev, "Show soft cqc failed!\n");
}
+ up_read(&qm->qps_lock);
- dev_dbg(dev, "allocate qp dma buf(va=%pK, dma=%pad, size=%zx)\n",
- qp->qdma.va, &qp->qdma.dma, qp->qdma.size);
+ goto err_free_ctx;
}
- qp->qp_id = qp_id;
- qp->alg_type = alg_type;
-
- return qp;
+ ret = dump_show(qm, cqc, sizeof(*cqc), "CQC");
+ if (ret)
+ dev_info(dev, "Show hw cqc failed!\n");
-err_clear_bit:
- write_lock(&qm->qps_lock);
- qm->qp_array[qp_id] = NULL;
- clear_bit(qp_id, qm->qp_bitmap);
- write_unlock(&qm->qps_lock);
-err_free_qp:
- kfree(qp);
- return ERR_PTR(ret);
+err_free_ctx:
+ qm_ctx_free(qm, sizeof(*cqc), cqc, &cqc_dma);
+ return ret;
}
-EXPORT_SYMBOL_GPL(hisi_qm_create_qp);
-/**
- * hisi_qm_release_qp() - Release a qp back to its qm.
- * @qp: The qp we want to release.
- *
- * This function releases the resource of a qp.
- */
-void hisi_qm_release_qp(struct hisi_qp *qp)
+static int qm_eqc_aeqc_dump(struct hisi_qm *qm, char *s, size_t size,
+ int cmd, char *name)
{
- struct hisi_qm *qm = qp->qm;
- struct qm_dma *qdma = &qp->qdma;
struct device *dev = &qm->pdev->dev;
+ dma_addr_t xeqc_dma;
+ void *xeqc;
+ int ret;
- if (qm->use_dma_api && qdma->va)
- dma_free_coherent(dev, qdma->size, qdma->va, qdma->dma);
+ if (strsep(&s, " ")) {
+ dev_err(dev, "Please do not input extra characters!\n");
+ return -EINVAL;
+ }
- write_lock(&qm->qps_lock);
- qm->qp_array[qp->qp_id] = NULL;
- clear_bit(qp->qp_id, qm->qp_bitmap);
- qm->qp_in_used--;
- write_unlock(&qm->qps_lock);
+ xeqc = qm_ctx_alloc(qm, size, &xeqc_dma);
+ if (IS_ERR(xeqc))
+ return PTR_ERR(xeqc);
- kfree(qp);
-}
-EXPORT_SYMBOL_GPL(hisi_qm_release_qp);
+ ret = qm_mb(qm, cmd, xeqc_dma, 0, 1);
+ if (ret)
+ goto err_free_ctx;
-static int qm_qp_ctx_cfg(struct hisi_qp *qp, int qp_id, int pasid)
+ ret = dump_show(qm, xeqc, size, name);
+ if (ret)
+ dev_info(dev, "Show hw %s failed!\n", name);
+
+err_free_ctx:
+ qm_ctx_free(qm, size, xeqc, &xeqc_dma);
+ return ret;
+}
+
+static int q_dump_param_parse(struct hisi_qm *qm, char *s,
+ u32 *e_id, u32 *q_id)
{
- struct hisi_qm *qm = qp->qm;
struct device *dev = &qm->pdev->dev;
- enum qm_hw_ver ver = qm->ver;
- struct qm_sqc *sqc;
- struct qm_cqc *cqc;
- dma_addr_t sqc_dma;
- dma_addr_t cqc_dma;
+ unsigned int qp_num = qm->qp_num;
+ char *presult;
int ret;
- qm_init_qp_status(qp);
+ presult = strsep(&s, " ");
+ if (!presult) {
+ dev_err(dev, "Please input qp number!\n");
+ return -EINVAL;
+ }
- sqc = kzalloc(sizeof(struct qm_sqc), GFP_KERNEL);
- if (!sqc)
- return -ENOMEM;
- sqc_dma = dma_map_single(dev, sqc, sizeof(struct qm_sqc),
- DMA_TO_DEVICE);
- if (dma_mapping_error(dev, sqc_dma)) {
- kfree(sqc);
- return -ENOMEM;
+ ret = kstrtou32(presult, 0, q_id);
+ if (ret || *q_id >= qp_num) {
+ dev_err(dev, "Please input qp num (0-%d)", qp_num - 1);
+ return -EINVAL;
}
- INIT_QC_COMMON(sqc, qp->sqe_dma, pasid);
- if (ver == QM_HW_V1) {
- sqc->dw3 = cpu_to_le32(QM_MK_SQC_DW3_V1(0, 0, 0, qm->sqe_size));
- sqc->w8 = cpu_to_le16(QM_Q_DEPTH - 1);
- } else if (ver == QM_HW_V2) {
- sqc->dw3 = cpu_to_le32(QM_MK_SQC_DW3_V2(qm->sqe_size));
- sqc->w8 = 0; /* rand_qc */
+ presult = strsep(&s, " ");
+ if (!presult) {
+ dev_err(dev, "Please input sqe number!\n");
+ return -EINVAL;
}
- sqc->cq_num = cpu_to_le16(qp_id);
- sqc->w13 = cpu_to_le16(QM_MK_SQC_W13(0, 1, qp->alg_type));
- ret = qm_mb(qm, QM_MB_CMD_SQC, sqc_dma, qp_id, 0);
- dma_unmap_single(dev, sqc_dma, sizeof(struct qm_sqc), DMA_TO_DEVICE);
- kfree(sqc);
+ ret = kstrtou32(presult, 0, e_id);
+ if (ret || *e_id >= QM_Q_DEPTH) {
+ dev_err(dev, "Please input sqe num (0-%d)", QM_Q_DEPTH - 1);
+ return -EINVAL;
+ }
+
+ if (strsep(&s, " ")) {
+ dev_err(dev, "Please do not input extra characters!\n");
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static int qm_sq_dump(struct hisi_qm *qm, char *s)
+{
+ struct device *dev = &qm->pdev->dev;
+ void *sqe, *sqe_curr;
+ struct hisi_qp *qp;
+ u32 qp_id, sqe_id;
+ int ret;
+
+ ret = q_dump_param_parse(qm, s, &sqe_id, &qp_id);
if (ret)
return ret;
- cqc = kzalloc(sizeof(struct qm_cqc), GFP_KERNEL);
- if (!cqc)
- return -ENOMEM;
- cqc_dma = dma_map_single(dev, cqc, sizeof(struct qm_cqc),
- DMA_TO_DEVICE);
- if (dma_mapping_error(dev, cqc_dma)) {
- kfree(cqc);
+ sqe = kzalloc(qm->sqe_size * QM_Q_DEPTH, GFP_KERNEL);
+ if (!sqe)
return -ENOMEM;
- }
- INIT_QC_COMMON(cqc, qp->cqe_dma, pasid);
- if (ver == QM_HW_V1) {
- cqc->dw3 = cpu_to_le32(QM_MK_CQC_DW3_V1(0, 0, 0, 4));
- cqc->w8 = cpu_to_le16(QM_Q_DEPTH - 1);
- } else if (ver == QM_HW_V2) {
- cqc->dw3 = cpu_to_le32(QM_MK_CQC_DW3_V2(4));
- cqc->w8 = 0;
- }
- cqc->dw6 = cpu_to_le32(1 << QM_CQ_PHASE_SHIFT | 1 << QM_CQ_FLAG_SHIFT);
+ qp = &qm->qp_array[qp_id];
+ memcpy(sqe, qp->sqe, qm->sqe_size * QM_Q_DEPTH);
+ sqe_curr = sqe + (u32)(sqe_id * qm->sqe_size);
+ memset(sqe_curr + qm->debug.sqe_mask_offset, QM_SQE_ADDR_MASK,
+ qm->debug.sqe_mask_len);
- ret = qm_mb(qm, QM_MB_CMD_CQC, cqc_dma, qp_id, 0);
- dma_unmap_single(dev, cqc_dma, sizeof(struct qm_cqc), DMA_TO_DEVICE);
- kfree(cqc);
+ ret = dump_show(qm, sqe_curr, qm->sqe_size, "SQE");
+ if (ret)
+ dev_info(dev, "Show sqe failed!\n");
+
+ kfree(sqe);
return ret;
}
-/**
- * hisi_qm_start_qp() - Start a qp into running.
- * @qp: The qp we want to start to run.
- * @arg: Accelerator specific argument.
- *
- * After this function, qp can receive request from user. Return 0 if
- * successful, Return -EBUSY if failed.
- */
-int hisi_qm_start_qp(struct hisi_qp *qp, unsigned long arg)
+static int qm_cq_dump(struct hisi_qm *qm, char *s)
{
- struct hisi_qm *qm = qp->qm;
struct device *dev = &qm->pdev->dev;
- enum qm_hw_ver ver = qm->ver;
- int qp_id = qp->qp_id;
- int pasid = arg;
- size_t off = 0;
+ struct qm_cqe *cqe_curr;
+ struct hisi_qp *qp;
+ u32 qp_id, cqe_id;
int ret;
-#define QP_INIT_BUF(qp, type, size) do { \
- (qp)->type = ((qp)->qdma.va + (off)); \
- (qp)->type##_dma = (qp)->qdma.dma + (off); \
- off += (size); \
-} while (0)
+ ret = q_dump_param_parse(qm, s, &cqe_id, &qp_id);
+ if (ret)
+ return ret;
+
+ qp = &qm->qp_array[qp_id];
+ cqe_curr = qp->cqe + cqe_id;
+ ret = dump_show(qm, cqe_curr, sizeof(struct qm_cqe), "CQE");
+ if (ret)
+ dev_info(dev, "Show cqe failed!\n");
+
+ return ret;
+}
+
+static int qm_eq_aeq_dump(struct hisi_qm *qm, const char *s,
+ size_t size, char *name)
+{
+ struct device *dev = &qm->pdev->dev;
+ void *xeqe;
+ u32 xeqe_id;
+ int ret;
- if (!qp->qdma.dma) {
- dev_err(dev, "cannot get qm dma buffer\n");
+ if (!s)
return -EINVAL;
- }
- /* sq need 128 bytes alignment */
- if (qp->qdma.dma & QM_SQE_DATA_ALIGN_MASK) {
- dev_err(dev, "qm sq is not aligned to 128 byte\n");
+ ret = kstrtou32(s, 0, &xeqe_id);
+ if (ret || xeqe_id >= QM_Q_DEPTH) {
+ dev_err(dev, "Please input aeqe num (0-%d)", QM_Q_DEPTH - 1);
return -EINVAL;
}
- QP_INIT_BUF(qp, sqe, qm->sqe_size * QM_Q_DEPTH);
- QP_INIT_BUF(qp, cqe, sizeof(struct qm_cqe) * QM_Q_DEPTH);
+ down_read(&qm->qps_lock);
- dev_dbg(dev, "init qp buffer(v%d):\n"
- " sqe (%pK, %lx)\n"
- " cqe (%pK, %lx)\n",
- ver, qp->sqe, (unsigned long)qp->sqe_dma,
- qp->cqe, (unsigned long)qp->cqe_dma);
+ if (qm->eqe && !strcmp(name, "EQE")) {
+ xeqe = qm->eqe + xeqe_id;
+ } else if (qm->aeqe && !strcmp(name, "AEQE")) {
+ xeqe = qm->aeqe + xeqe_id;
+ } else {
+ ret = -EINVAL;
+ goto err_unlock;
+ }
- ret = qm_qp_ctx_cfg(qp, qp_id, pasid);
+ ret = dump_show(qm, xeqe, size, name);
if (ret)
- return ret;
+ dev_info(dev, "Show %s failed!\n", name);
- dev_dbg(dev, "queue %d started\n", qp_id);
-
- return 0;
+err_unlock:
+ up_read(&qm->qps_lock);
+ return ret;
}
-EXPORT_SYMBOL_GPL(hisi_qm_start_qp);
-/**
- * hisi_qm_stop_qp() - Stop a qp in qm.
- * @qp: The qp we want to stop.
- *
- * This function is reverse of hisi_qm_start_qp. Return 0 if successful.
- */
-int hisi_qm_stop_qp(struct hisi_qp *qp)
+static int qm_dbg_help(struct hisi_qm *qm, char *s)
{
- struct device *dev = &qp->qm->pdev->dev;
- int i = 0;
-
- /* it is stopped */
- if (test_bit(QP_STOP, &qp->qp_status.flags))
- return 0;
+ struct device *dev = &qm->pdev->dev;
- while (atomic_read(&qp->qp_status.used)) {
- i++;
- msleep(20);
- if (i == 10) {
- dev_err(dev, "Cannot drain out data for stopping, Force to stop!\n");
- return 0;
- }
+ if (strsep(&s, " ")) {
+ dev_err(dev, "Please do not input extra characters!\n");
+ return -EINVAL;
}
- set_bit(QP_STOP, &qp->qp_status.flags);
-
- dev_dbg(dev, "stop queue %u!", qp->qp_id);
+ dev_info(dev, "available commands:\n");
+ dev_info(dev, "sqc <num>\n");
+ dev_info(dev, "cqc <num>\n");
+ dev_info(dev, "eqc\n");
+ dev_info(dev, "aeqc\n");
+ dev_info(dev, "sq <num> <e>\n");
+ dev_info(dev, "cq <num> <e>\n");
+ dev_info(dev, "eq <e>\n");
+ dev_info(dev, "aeq <e>\n");
return 0;
}
-EXPORT_SYMBOL_GPL(hisi_qm_stop_qp);
-/**
- * hisi_qp_send() - Queue up a task in the hardware queue.
- * @qp: The qp in which to put the message.
- * @msg: The message.
- *
- * This function will return -EBUSY if qp is currently full, and -EAGAIN
- * if qp related qm is resetting.
- */
-int hisi_qp_send(struct hisi_qp *qp, const void *msg)
+static int qm_cmd_write_dump(struct hisi_qm *qm, const char *cmd_buf)
{
- struct hisi_qp_status *qp_status = &qp->qp_status;
- u16 sq_tail = qp_status->sq_tail;
- u16 sq_tail_next = (sq_tail + 1) % QM_Q_DEPTH;
- void *sqe = qm_get_avail_sqe(qp);
+ struct device *dev = &qm->pdev->dev;
+ char *presult, *s;
+ int ret;
- if (unlikely(test_bit(QP_STOP, &qp->qp_status.flags))) {
- dev_info(&qp->qm->pdev->dev, "QP is stopped or resetting\n");
- return -EAGAIN;
+ s = kstrdup(cmd_buf, GFP_KERNEL);
+ if (!s)
+ return -ENOMEM;
+
+ presult = strsep(&s, " ");
+ if (!presult) {
+ kfree(s);
+ return -EINVAL;
}
- if (!sqe)
- return -EBUSY;
+ if (!strcmp(presult, "sqc"))
+ ret = qm_sqc_dump(qm, s);
+ else if (!strcmp(presult, "cqc"))
+ ret = qm_cqc_dump(qm, s);
+ else if (!strcmp(presult, "eqc"))
+ ret = qm_eqc_aeqc_dump(qm, s, sizeof(struct qm_eqc),
+ QM_MB_CMD_EQC, "EQC");
+ else if (!strcmp(presult, "aeqc"))
+ ret = qm_eqc_aeqc_dump(qm, s, sizeof(struct qm_aeqc),
+ QM_MB_CMD_AEQC, "AEQC");
+ else if (!strcmp(presult, "sq"))
+ ret = qm_sq_dump(qm, s);
+ else if (!strcmp(presult, "cq"))
+ ret = qm_cq_dump(qm, s);
+ else if (!strcmp(presult, "eq"))
+ ret = qm_eq_aeq_dump(qm, s, sizeof(struct qm_eqe), "EQE");
+ else if (!strcmp(presult, "aeq"))
+ ret = qm_eq_aeq_dump(qm, s, sizeof(struct qm_aeqe), "AEQE");
+ else if (!strcmp(presult, "help"))
+ ret = qm_dbg_help(qm, s);
+ else
+ ret = -EINVAL;
- memcpy(sqe, msg, qp->qm->sqe_size);
+ if (ret)
+ dev_info(dev, "Please echo help\n");
- qm_db(qp->qm, qp->qp_id, QM_DOORBELL_CMD_SQ, sq_tail_next, 0);
- atomic_inc(&qp->qp_status.used);
- qp_status->sq_tail = sq_tail_next;
+ kfree(s);
- return 0;
+ return ret;
}
-EXPORT_SYMBOL_GPL(hisi_qp_send);
-static void hisi_qm_cache_wb(struct hisi_qm *qm)
+static ssize_t qm_cmd_write(struct file *filp, const char __user *buffer,
+ size_t count, loff_t *pos)
{
- unsigned int val;
+ struct hisi_qm *qm = filp->private_data;
+ char *cmd_buf, *cmd_buf_tmp;
+ int ret;
+
+ if (*pos)
+ return 0;
+
+ /* Judge if the instance is being reset. */
+ if (unlikely(atomic_read(&qm->status.flags) == QM_STOP))
+ return 0;
+
+ if (count > QM_DBG_WRITE_LEN)
+ return -ENOSPC;
+
+ cmd_buf = kzalloc(count + 1, GFP_KERNEL);
+ if (!cmd_buf)
+ return -ENOMEM;
- if (qm->ver == QM_HW_V2) {
- writel(0x1, qm->io_base + QM_CACHE_WB_START);
- if (readl_relaxed_poll_timeout(qm->io_base + QM_CACHE_WB_DONE,
- val, val & BIT(0), 10, 1000))
- dev_err(&qm->pdev->dev, "QM writeback sqc cache fail!\n");
+ if (copy_from_user(cmd_buf, buffer, count)) {
+ kfree(cmd_buf);
+ return -EFAULT;
}
-}
-static void qm_qp_event_notifier(struct hisi_qp *qp)
-{
- wake_up_interruptible(&qp->uacce_q->wait);
-}
+ cmd_buf[count] = '\0';
-static int hisi_qm_get_available_instances(struct uacce_device *uacce)
-{
- int i, ret;
- struct hisi_qm *qm = uacce->priv;
+ cmd_buf_tmp = strchr(cmd_buf, '\n');
+ if (cmd_buf_tmp) {
+ *cmd_buf_tmp = '\0';
+ count = cmd_buf_tmp - cmd_buf + 1;
+ }
- read_lock(&qm->qps_lock);
- for (i = 0, ret = 0; i < qm->qp_num; i++)
- if (!qm->qp_array[i])
- ret++;
- read_unlock(&qm->qps_lock);
+ ret = qm_cmd_write_dump(qm, cmd_buf);
+ if (ret) {
+ kfree(cmd_buf);
+ return ret;
+ }
- return ret;
+ kfree(cmd_buf);
+
+ return count;
}
-static int hisi_qm_uacce_get_queue(struct uacce_device *uacce,
- unsigned long arg,
- struct uacce_queue *q)
+static const struct file_operations qm_cmd_fops = {
+ .owner = THIS_MODULE,
+ .open = simple_open,
+ .read = qm_cmd_read,
+ .write = qm_cmd_write,
+};
+
+static int qm_create_debugfs_file(struct hisi_qm *qm, enum qm_debug_file index)
{
- struct hisi_qm *qm = uacce->priv;
- struct hisi_qp *qp;
- u8 alg_type = 0;
+ struct dentry *qm_d = qm->debug.qm_d;
+ struct debugfs_file *file = qm->debug.files + index;
- qp = hisi_qm_create_qp(qm, alg_type);
- if (IS_ERR(qp))
- return PTR_ERR(qp);
+ debugfs_create_file(qm_debug_file_name[index], 0600, qm_d, file,
+ &qm_debug_fops);
- q->priv = qp;
- q->uacce = uacce;
- qp->uacce_q = q;
- qp->event_cb = qm_qp_event_notifier;
- qp->pasid = arg;
+ file->index = index;
+ mutex_init(&file->lock);
+ file->debug = &qm->debug;
return 0;
}
-static void hisi_qm_uacce_put_queue(struct uacce_queue *q)
+static void qm_hw_error_init_v1(struct hisi_qm *qm, u32 ce, u32 nfe, u32 fe)
{
- struct hisi_qp *qp = q->priv;
-
- hisi_qm_cache_wb(qp->qm);
- hisi_qm_release_qp(qp);
+ writel(QM_ABNORMAL_INT_MASK_VALUE, qm->io_base + QM_ABNORMAL_INT_MASK);
}
-/* map sq/cq/doorbell to user space */
-static int hisi_qm_uacce_mmap(struct uacce_queue *q,
- struct vm_area_struct *vma,
- struct uacce_qfile_region *qfr)
+static void qm_hw_error_init_v2(struct hisi_qm *qm, u32 ce, u32 nfe, u32 fe)
{
- struct hisi_qp *qp = q->priv;
- struct hisi_qm *qm = qp->qm;
- size_t sz = vma->vm_end - vma->vm_start;
- struct pci_dev *pdev = qm->pdev;
- struct device *dev = &pdev->dev;
- unsigned long vm_pgoff;
- int ret;
+ u32 irq_enable = ce | nfe | fe;
+ u32 irq_unmask = ~irq_enable;
- switch (qfr->type) {
- case UACCE_QFRT_MMIO:
- if (qm->ver == QM_HW_V2) {
- if (sz > PAGE_SIZE * (QM_DOORBELL_PAGE_NR +
- QM_DOORBELL_SQ_CQ_BASE_V2 / PAGE_SIZE))
- return -EINVAL;
- } else {
- if (sz > PAGE_SIZE * QM_DOORBELL_PAGE_NR)
- return -EINVAL;
- }
+ qm->error_mask = ce | nfe | fe;
- vma->vm_flags |= VM_IO;
+ /* clear QM hw residual error source */
+ writel(QM_ABNORMAL_INT_SOURCE_CLR,
+ qm->io_base + QM_ABNORMAL_INT_SOURCE);
- return remap_pfn_range(vma, vma->vm_start,
- qm->phys_base >> PAGE_SHIFT,
- sz, pgprot_noncached(vma->vm_page_prot));
- case UACCE_QFRT_DUS:
- if (sz != qp->qdma.size)
- return -EINVAL;
+ /* configure error type */
+ writel(ce, qm->io_base + QM_RAS_CE_ENABLE);
+ writel(QM_RAS_CE_TIMES_PER_IRQ, qm->io_base + QM_RAS_CE_THRESHOLD);
+ writel(nfe, qm->io_base + QM_RAS_NFE_ENABLE);
+ writel(fe, qm->io_base + QM_RAS_FE_ENABLE);
- /*
- * dma_mmap_coherent() requires vm_pgoff as 0
- * restore vm_pfoff to initial value for mmap()
- */
- vm_pgoff = vma->vm_pgoff;
- vma->vm_pgoff = 0;
- ret = dma_mmap_coherent(dev, vma, qp->qdma.va,
- qp->qdma.dma, sz);
- vma->vm_pgoff = vm_pgoff;
- return ret;
-
- default:
- return -EINVAL;
- }
+ irq_unmask &= readl(qm->io_base + QM_ABNORMAL_INT_MASK);
+ writel(irq_unmask, qm->io_base + QM_ABNORMAL_INT_MASK);
}
-static int hisi_qm_uacce_start_queue(struct uacce_queue *q)
+static void qm_hw_error_uninit_v2(struct hisi_qm *qm)
{
- struct hisi_qp *qp = q->priv;
-
- return hisi_qm_start_qp(qp, qp->pasid);
+ writel(QM_ABNORMAL_INT_MASK_VALUE, qm->io_base + QM_ABNORMAL_INT_MASK);
}
-static void hisi_qm_uacce_stop_queue(struct uacce_queue *q)
+static void qm_log_hw_error(struct hisi_qm *qm, u32 error_status)
{
- hisi_qm_stop_qp(q->priv);
-}
+ const struct hisi_qm_hw_error *err;
+ struct device *dev = &qm->pdev->dev;
+ u32 reg_val, type, vf_num;
+ int i;
-static int qm_set_sqctype(struct uacce_queue *q, u16 type)
-{
- struct hisi_qm *qm = q->uacce->priv;
- struct hisi_qp *qp = q->priv;
+ for (i = 0; i < ARRAY_SIZE(qm_hw_error); i++) {
+ err = &qm_hw_error[i];
+ if (!(err->int_msk & error_status))
+ continue;
- write_lock(&qm->qps_lock);
- qp->alg_type = type;
- write_unlock(&qm->qps_lock);
+ dev_err(dev, "%s [error status=0x%x] found\n",
+ err->msg, err->int_msk);
- return 0;
+ if (err->int_msk & QM_DB_TIMEOUT) {
+ reg_val = readl(qm->io_base + QM_ABNORMAL_INF01);
+ type = (reg_val & QM_DB_TIMEOUT_TYPE) >>
+ QM_DB_TIMEOUT_TYPE_SHIFT;
+ vf_num = reg_val & QM_DB_TIMEOUT_VF;
+ dev_err(dev, "qm %s doorbell timeout in function %u\n",
+ qm_db_timeout[type], vf_num);
+ } else if (err->int_msk & QM_OF_FIFO_OF) {
+ reg_val = readl(qm->io_base + QM_ABNORMAL_INF00);
+ type = (reg_val & QM_FIFO_OVERFLOW_TYPE) >>
+ QM_FIFO_OVERFLOW_TYPE_SHIFT;
+ vf_num = reg_val & QM_FIFO_OVERFLOW_VF;
+
+ if (type < ARRAY_SIZE(qm_fifo_overflow))
+ dev_err(dev, "qm %s fifo overflow in function %u\n",
+ qm_fifo_overflow[type], vf_num);
+ else
+ dev_err(dev, "unknown error type\n");
+ }
+ }
}
-static long hisi_qm_uacce_ioctl(struct uacce_queue *q, unsigned int cmd,
- unsigned long arg)
+static enum acc_err_result qm_hw_error_handle_v2(struct hisi_qm *qm)
{
- struct hisi_qp *qp = q->priv;
- struct hisi_qp_ctx qp_ctx;
+ u32 error_status, tmp;
- if (cmd == UACCE_CMD_QM_SET_QP_CTX) {
- if (copy_from_user(&qp_ctx, (void __user *)arg,
- sizeof(struct hisi_qp_ctx)))
- return -EFAULT;
+ /* read err sts */
+ tmp = readl(qm->io_base + QM_ABNORMAL_INT_STATUS);
+ error_status = qm->error_mask & tmp;
- if (qp_ctx.qc_type != 0 && qp_ctx.qc_type != 1)
- return -EINVAL;
+ if (error_status) {
+ if (error_status & QM_ECC_MBIT)
+ qm->err_status.is_qm_ecc_mbit = true;
- qm_set_sqctype(q, qp_ctx.qc_type);
- qp_ctx.id = qp->qp_id;
+ qm_log_hw_error(qm, error_status);
+ if (error_status == QM_DB_RANDOM_INVALID) {
+ writel(error_status, qm->io_base +
+ QM_ABNORMAL_INT_SOURCE);
+ return ACC_ERR_RECOVERED;
+ }
- if (copy_to_user((void __user *)arg, &qp_ctx,
- sizeof(struct hisi_qp_ctx)))
- return -EFAULT;
- } else {
- return -EINVAL;
+ return ACC_ERR_NEED_RESET;
}
- return 0;
+ return ACC_ERR_RECOVERED;
}
-static const struct uacce_ops uacce_qm_ops = {
- .get_available_instances = hisi_qm_get_available_instances,
- .get_queue = hisi_qm_uacce_get_queue,
- .put_queue = hisi_qm_uacce_put_queue,
- .start_queue = hisi_qm_uacce_start_queue,
- .stop_queue = hisi_qm_uacce_stop_queue,
- .mmap = hisi_qm_uacce_mmap,
- .ioctl = hisi_qm_uacce_ioctl,
+static const struct hisi_qm_hw_ops qm_hw_ops_v1 = {
+ .qm_db = qm_db_v1,
+ .get_irq_num = qm_get_irq_num_v1,
+ .hw_error_init = qm_hw_error_init_v1,
};
-static int qm_alloc_uacce(struct hisi_qm *qm)
+static const struct hisi_qm_hw_ops qm_hw_ops_v2 = {
+ .get_vft = qm_get_vft_v2,
+ .qm_db = qm_db_v2,
+ .get_irq_num = qm_get_irq_num_v2,
+ .hw_error_init = qm_hw_error_init_v2,
+ .hw_error_uninit = qm_hw_error_uninit_v2,
+ .hw_error_handle = qm_hw_error_handle_v2,
+};
+
+static void *qm_get_avail_sqe(struct hisi_qp *qp)
{
- struct pci_dev *pdev = qm->pdev;
- struct uacce_device *uacce;
- unsigned long mmio_page_nr;
- unsigned long dus_page_nr;
- struct uacce_interface interface = {
- .flags = UACCE_DEV_SVA,
- .ops = &uacce_qm_ops,
- };
+ struct hisi_qp_status *qp_status = &qp->qp_status;
+ u16 sq_tail = qp_status->sq_tail;
- strncpy(interface.name, pdev->driver->name, sizeof(interface.name));
+ if (unlikely(atomic_read(&qp->qp_status.used) == QM_Q_DEPTH))
+ return NULL;
- uacce = uacce_alloc(&pdev->dev, &interface);
- if (IS_ERR(uacce))
- return PTR_ERR(uacce);
+ return qp->sqe + sq_tail * qp->qm->sqe_size;
+}
- if (uacce->flags & UACCE_DEV_SVA) {
- qm->use_sva = true;
- } else {
- /* only consider sva case */
- uacce_remove(uacce);
- qm->uacce = NULL;
- return -EINVAL;
- }
+static struct hisi_qp *qm_create_qp_nolock(struct hisi_qm *qm, u8 alg_type)
+{
+ struct device *dev = &qm->pdev->dev;
+ struct hisi_qp *qp;
+ int qp_id;
- uacce->is_vf = pdev->is_virtfn;
- uacce->priv = qm;
- uacce->algs = qm->algs;
+ if (!qm_qp_avail_state(qm, NULL, QP_INIT))
+ return ERR_PTR(-EPERM);
- if (qm->ver == QM_HW_V1) {
- mmio_page_nr = QM_DOORBELL_PAGE_NR;
- uacce->api_ver = HISI_QM_API_VER_BASE;
- } else {
- mmio_page_nr = QM_DOORBELL_PAGE_NR +
- QM_DOORBELL_SQ_CQ_BASE_V2 / PAGE_SIZE;
- uacce->api_ver = HISI_QM_API_VER2_BASE;
+ if (qm->qp_in_used == qm->qp_num) {
+ dev_info_ratelimited(dev, "All %u queues of QM are busy!\n",
+ qm->qp_num);
+ atomic64_inc(&qm->debug.dfx.create_qp_err_cnt);
+ return ERR_PTR(-EBUSY);
}
- dus_page_nr = (PAGE_SIZE - 1 + qm->sqe_size * QM_Q_DEPTH +
- sizeof(struct qm_cqe) * QM_Q_DEPTH) >> PAGE_SHIFT;
+ qp_id = idr_alloc_cyclic(&qm->qp_idr, NULL, 0, qm->qp_num, GFP_ATOMIC);
+ if (qp_id < 0) {
+ dev_info_ratelimited(dev, "All %u queues of QM are busy!\n",
+ qm->qp_num);
+ atomic64_inc(&qm->debug.dfx.create_qp_err_cnt);
+ return ERR_PTR(-EBUSY);
+ }
- uacce->qf_pg_num[UACCE_QFRT_MMIO] = mmio_page_nr;
- uacce->qf_pg_num[UACCE_QFRT_DUS] = dus_page_nr;
+ qp = &qm->qp_array[qp_id];
- qm->uacce = uacce;
+ memset(qp->cqe, 0, sizeof(struct qm_cqe) * QM_Q_DEPTH);
- return 0;
+ qp->event_cb = NULL;
+ qp->req_cb = NULL;
+ qp->qp_id = qp_id;
+ qp->alg_type = alg_type;
+ qm->qp_in_used++;
+ atomic_set(&qp->qp_status.flags, QP_INIT);
+
+ return qp;
}
/**
- * hisi_qm_get_free_qp_num() - Get free number of qp in qm.
- * @qm: The qm which want to get free qp.
+ * hisi_qm_create_qp() - Create a queue pair from qm.
+ * @qm: The qm we create a qp from.
+ * @alg_type: Accelerator specific algorithm type in sqc.
*
- * This function return free number of qp in qm.
+ * return created qp, -EBUSY if all qps in qm allocated, -ENOMEM if allocating
+ * qp memory fails.
*/
-int hisi_qm_get_free_qp_num(struct hisi_qm *qm)
+struct hisi_qp *hisi_qm_create_qp(struct hisi_qm *qm, u8 alg_type)
{
- int ret;
+ struct hisi_qp *qp;
- read_lock(&qm->qps_lock);
- ret = qm->qp_num - qm->qp_in_used;
- read_unlock(&qm->qps_lock);
+ down_write(&qm->qps_lock);
+ qp = qm_create_qp_nolock(qm, alg_type);
+ up_write(&qm->qps_lock);
- return ret;
+ return qp;
}
-EXPORT_SYMBOL_GPL(hisi_qm_get_free_qp_num);
+EXPORT_SYMBOL_GPL(hisi_qm_create_qp);
/**
- * hisi_qm_init() - Initialize configures about qm.
- * @qm: The qm needing init.
+ * hisi_qm_release_qp() - Release a qp back to its qm.
+ * @qp: The qp we want to release.
*
- * This function init qm, then we can call hisi_qm_start to put qm into work.
+ * This function releases the resource of a qp.
*/
-int hisi_qm_init(struct hisi_qm *qm)
+void hisi_qm_release_qp(struct hisi_qp *qp)
{
- struct pci_dev *pdev = qm->pdev;
- struct device *dev = &pdev->dev;
- unsigned int num_vec;
- int ret;
+ struct hisi_qm *qm = qp->qm;
- switch (qm->ver) {
- case QM_HW_V1:
- qm->ops = &qm_hw_ops_v1;
- break;
- case QM_HW_V2:
- qm->ops = &qm_hw_ops_v2;
- break;
- default:
- return -EINVAL;
+ down_write(&qm->qps_lock);
+
+ if (!qm_qp_avail_state(qm, qp, QP_CLOSE)) {
+ up_write(&qm->qps_lock);
+ return;
}
- ret = qm_alloc_uacce(qm);
- if (ret < 0)
- dev_warn(&pdev->dev, "fail to alloc uacce (%d)\n", ret);
+ qm->qp_in_used--;
+ idr_remove(&qm->qp_idr, qp->qp_id);
- ret = pci_enable_device_mem(pdev);
- if (ret < 0) {
- dev_err(&pdev->dev, "Failed to enable device mem!\n");
- goto err_remove_uacce;
- }
+ up_write(&qm->qps_lock);
+}
+EXPORT_SYMBOL_GPL(hisi_qm_release_qp);
- ret = pci_request_mem_regions(pdev, qm->dev_name);
- if (ret < 0) {
- dev_err(&pdev->dev, "Failed to request mem regions!\n");
- goto err_disable_pcidev;
+static int qm_qp_ctx_cfg(struct hisi_qp *qp, int qp_id, int pasid)
+{
+ struct hisi_qm *qm = qp->qm;
+ struct device *dev = &qm->pdev->dev;
+ enum qm_hw_ver ver = qm->ver;
+ struct qm_sqc *sqc;
+ struct qm_cqc *cqc;
+ dma_addr_t sqc_dma;
+ dma_addr_t cqc_dma;
+ int ret;
+
+ qm_init_qp_status(qp);
+
+ sqc = kzalloc(sizeof(struct qm_sqc), GFP_KERNEL);
+ if (!sqc)
+ return -ENOMEM;
+ sqc_dma = dma_map_single(dev, sqc, sizeof(struct qm_sqc),
+ DMA_TO_DEVICE);
+ if (dma_mapping_error(dev, sqc_dma)) {
+ kfree(sqc);
+ return -ENOMEM;
}
- qm->phys_base = pci_resource_start(pdev, PCI_BAR_2);
- qm->phys_size = pci_resource_len(qm->pdev, PCI_BAR_2);
- qm->io_base = ioremap(qm->phys_base, qm->phys_size);
- if (!qm->io_base) {
- ret = -EIO;
- goto err_release_mem_regions;
+ INIT_QC_COMMON(sqc, qp->sqe_dma, pasid);
+ if (ver == QM_HW_V1) {
+ sqc->dw3 = cpu_to_le32(QM_MK_SQC_DW3_V1(0, 0, 0, qm->sqe_size));
+ sqc->w8 = cpu_to_le16(QM_Q_DEPTH - 1);
+ } else {
+ sqc->dw3 = cpu_to_le32(QM_MK_SQC_DW3_V2(qm->sqe_size));
+ sqc->w8 = 0; /* rand_qc */
}
+ sqc->cq_num = cpu_to_le16(qp_id);
+ sqc->w13 = cpu_to_le16(QM_MK_SQC_W13(0, 1, qp->alg_type));
- ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(64));
- if (ret < 0)
- goto err_iounmap;
- pci_set_master(pdev);
+ ret = qm_mb(qm, QM_MB_CMD_SQC, sqc_dma, qp_id, 0);
+ dma_unmap_single(dev, sqc_dma, sizeof(struct qm_sqc), DMA_TO_DEVICE);
+ kfree(sqc);
+ if (ret)
+ return ret;
- if (!qm->ops->get_irq_num) {
- ret = -EOPNOTSUPP;
- goto err_iounmap;
- }
- num_vec = qm->ops->get_irq_num(qm);
- ret = pci_alloc_irq_vectors(pdev, num_vec, num_vec, PCI_IRQ_MSI);
- if (ret < 0) {
- dev_err(dev, "Failed to enable MSI vectors!\n");
- goto err_iounmap;
+ cqc = kzalloc(sizeof(struct qm_cqc), GFP_KERNEL);
+ if (!cqc)
+ return -ENOMEM;
+ cqc_dma = dma_map_single(dev, cqc, sizeof(struct qm_cqc),
+ DMA_TO_DEVICE);
+ if (dma_mapping_error(dev, cqc_dma)) {
+ kfree(cqc);
+ return -ENOMEM;
}
- ret = qm_irq_register(qm);
- if (ret)
- goto err_free_irq_vectors;
-
- qm->qp_in_used = 0;
- mutex_init(&qm->mailbox_lock);
- rwlock_init(&qm->qps_lock);
- INIT_WORK(&qm->work, qm_work_process);
-
- dev_dbg(dev, "init qm %s with %s\n", pdev->is_physfn ? "pf" : "vf",
- qm->use_dma_api ? "dma api" : "iommu api");
-
- return 0;
+ INIT_QC_COMMON(cqc, qp->cqe_dma, pasid);
+ if (ver == QM_HW_V1) {
+ cqc->dw3 = cpu_to_le32(QM_MK_CQC_DW3_V1(0, 0, 0, 4));
+ cqc->w8 = cpu_to_le16(QM_Q_DEPTH - 1);
+ } else {
+ cqc->dw3 = cpu_to_le32(QM_MK_CQC_DW3_V2(4));
+ cqc->w8 = 0;
+ }
+ cqc->dw6 = cpu_to_le32(1 << QM_CQ_PHASE_SHIFT | 1 << QM_CQ_FLAG_SHIFT);
-err_free_irq_vectors:
- pci_free_irq_vectors(pdev);
-err_iounmap:
- iounmap(qm->io_base);
-err_release_mem_regions:
- pci_release_mem_regions(pdev);
-err_disable_pcidev:
- pci_disable_device(pdev);
-err_remove_uacce:
- uacce_remove(qm->uacce);
- qm->uacce = NULL;
+ ret = qm_mb(qm, QM_MB_CMD_CQC, cqc_dma, qp_id, 0);
+ dma_unmap_single(dev, cqc_dma, sizeof(struct qm_cqc), DMA_TO_DEVICE);
+ kfree(cqc);
return ret;
}
-EXPORT_SYMBOL_GPL(hisi_qm_init);
-/**
- * hisi_qm_uninit() - Uninitialize qm.
- * @qm: The qm needed uninit.
- *
- * This function uninits qm related device resources.
- */
-void hisi_qm_uninit(struct hisi_qm *qm)
+static int qm_start_qp_nolock(struct hisi_qp *qp, unsigned long arg)
{
- struct pci_dev *pdev = qm->pdev;
- struct device *dev = &pdev->dev;
+ struct hisi_qm *qm = qp->qm;
+ struct device *dev = &qm->pdev->dev;
+ int qp_id = qp->qp_id;
+ int pasid = arg;
+ int ret;
- uacce_remove(qm->uacce);
- qm->uacce = NULL;
+ if (!qm_qp_avail_state(qm, qp, QP_START))
+ return -EPERM;
- if (qm->use_dma_api && qm->qdma.va) {
- hisi_qm_cache_wb(qm);
- dma_free_coherent(dev, qm->qdma.size,
- qm->qdma.va, qm->qdma.dma);
- memset(&qm->qdma, 0, sizeof(qm->qdma));
- }
+ ret = qm_qp_ctx_cfg(qp, qp_id, pasid);
+ if (ret)
+ return ret;
- qm_irq_unregister(qm);
- pci_free_irq_vectors(pdev);
- iounmap(qm->io_base);
- pci_release_mem_regions(pdev);
- pci_disable_device(pdev);
+ atomic_set(&qp->qp_status.flags, QP_START);
+ dev_dbg(dev, "queue %d started\n", qp_id);
+
+ return 0;
}
-EXPORT_SYMBOL_GPL(hisi_qm_uninit);
/**
- * hisi_qm_get_vft() - Get vft from a qm.
- * @qm: The qm we want to get its vft.
- * @base: The base number of queue in vft.
- * @number: The number of queues in vft.
- *
- * We can allocate multiple queues to a qm by configuring virtual function
- * table. We get related configures by this function. Normally, we call this
- * function in VF driver to get the queue information.
+ * hisi_qm_start_qp() - Start a qp into running.
+ * @qp: The qp we want to start to run.
+ * @arg: Accelerator specific argument.
*
- * qm hw v1 does not support this interface.
+ * After this function, qp can receive request from user. Return 0 if
+ * successful, Return -EBUSY if failed.
*/
-int hisi_qm_get_vft(struct hisi_qm *qm, u32 *base, u32 *number)
+int hisi_qm_start_qp(struct hisi_qp *qp, unsigned long arg)
{
- if (!base || !number)
- return -EINVAL;
+ struct hisi_qm *qm = qp->qm;
+ int ret;
- if (!qm->ops->get_vft) {
- dev_err(&qm->pdev->dev, "Don't support vft read!\n");
- return -EINVAL;
- }
+ down_write(&qm->qps_lock);
+ ret = qm_start_qp_nolock(qp, arg);
+ up_write(&qm->qps_lock);
- return qm->ops->get_vft(qm, base, number);
+ return ret;
}
-EXPORT_SYMBOL_GPL(hisi_qm_get_vft);
+EXPORT_SYMBOL_GPL(hisi_qm_start_qp);
/**
- * hisi_qm_set_vft() - Set "virtual function table" for a qm.
- * @fun_num: Number of operated function.
- * @qm: The qm in which to set vft, alway in a PF.
- * @base: The base number of queue in vft.
- * @number: The number of queues in vft. 0 means invalid vft.
- *
- * This function is alway called in PF driver, it is used to assign queues
- * among PF and VFs.
- *
- * Assign queues A~B to PF: hisi_qm_set_vft(qm, 0, A, B - A + 1)
- * Assign queues A~B to VF: hisi_qm_set_vft(qm, 2, A, B - A + 1)
- * (VF function number 0x2)
+ * Determine whether the queue is cleared by judging the tail pointers of
+ * sq and cq.
*/
-int hisi_qm_set_vft(struct hisi_qm *qm, u32 fun_num, u32 base,
- u32 number)
-{
- u32 max_q_num = qm->ctrl_qp_num;
-
- if (base >= max_q_num || number > max_q_num ||
- (base + number) > max_q_num)
- return -EINVAL;
-
- return qm_set_sqc_cqc_vft(qm, fun_num, base, number);
-}
-EXPORT_SYMBOL_GPL(hisi_qm_set_vft);
-
-static void qm_init_eq_aeq_status(struct hisi_qm *qm)
-{
- struct hisi_qm_status *status = &qm->status;
-
- status->eq_head = 0;
- status->aeq_head = 0;
- status->eqc_phase = true;
- status->aeqc_phase = true;
-}
-
-static int qm_eq_ctx_cfg(struct hisi_qm *qm)
+static int qm_drain_qp(struct hisi_qp *qp)
{
+ size_t size = sizeof(struct qm_sqc) + sizeof(struct qm_cqc);
+ struct hisi_qm *qm = qp->qm;
struct device *dev = &qm->pdev->dev;
- struct qm_eqc *eqc;
- struct qm_aeqc *aeqc;
- dma_addr_t eqc_dma;
- dma_addr_t aeqc_dma;
- int ret;
+ struct qm_sqc *sqc;
+ struct qm_cqc *cqc;
+ dma_addr_t dma_addr;
+ int ret = 0, i = 0;
+ void *addr;
- qm_init_eq_aeq_status(qm);
+ /*
+ * No need to judge if ECC multi-bit error occurs because the
+ * master OOO will be blocked.
+ */
+ if (qm->err_status.is_qm_ecc_mbit || qm->err_status.is_dev_ecc_mbit)
+ return 0;
- eqc = kzalloc(sizeof(struct qm_eqc), GFP_KERNEL);
- if (!eqc)
- return -ENOMEM;
- eqc_dma = dma_map_single(dev, eqc, sizeof(struct qm_eqc),
- DMA_TO_DEVICE);
- if (dma_mapping_error(dev, eqc_dma)) {
- kfree(eqc);
+ addr = qm_ctx_alloc(qm, size, &dma_addr);
+ if (IS_ERR(addr)) {
+ dev_err(dev, "Failed to alloc ctx for sqc and cqc!\n");
return -ENOMEM;
}
- eqc->base_l = cpu_to_le32(lower_32_bits(qm->eqe_dma));
- eqc->base_h = cpu_to_le32(upper_32_bits(qm->eqe_dma));
- if (qm->ver == QM_HW_V1)
- eqc->dw3 = cpu_to_le32(QM_EQE_AEQE_SIZE);
- eqc->dw6 = cpu_to_le32((QM_Q_DEPTH - 1) | (1 << QM_EQC_PHASE_SHIFT));
- ret = qm_mb(qm, QM_MB_CMD_EQC, eqc_dma, 0, 0);
- dma_unmap_single(dev, eqc_dma, sizeof(struct qm_eqc), DMA_TO_DEVICE);
- kfree(eqc);
- if (ret)
- return ret;
+ while (++i) {
+ ret = qm_dump_sqc_raw(qm, dma_addr, qp->qp_id);
+ if (ret) {
+ dev_err_ratelimited(dev, "Failed to dump sqc!\n");
+ break;
+ }
+ sqc = addr;
- aeqc = kzalloc(sizeof(struct qm_aeqc), GFP_KERNEL);
- if (!aeqc)
- return -ENOMEM;
- aeqc_dma = dma_map_single(dev, aeqc, sizeof(struct qm_aeqc),
- DMA_TO_DEVICE);
- if (dma_mapping_error(dev, aeqc_dma)) {
- kfree(aeqc);
- return -ENOMEM;
- }
+ ret = qm_dump_cqc_raw(qm, (dma_addr + sizeof(struct qm_sqc)),
+ qp->qp_id);
+ if (ret) {
+ dev_err_ratelimited(dev, "Failed to dump cqc!\n");
+ break;
+ }
+ cqc = addr + sizeof(struct qm_sqc);
- aeqc->base_l = cpu_to_le32(lower_32_bits(qm->aeqe_dma));
- aeqc->base_h = cpu_to_le32(upper_32_bits(qm->aeqe_dma));
- aeqc->dw6 = cpu_to_le32((QM_Q_DEPTH - 1) | (1 << QM_EQC_PHASE_SHIFT));
+ if ((sqc->tail == cqc->tail) &&
+ (QM_SQ_TAIL_IDX(sqc) == QM_CQ_TAIL_IDX(cqc)))
+ break;
- ret = qm_mb(qm, QM_MB_CMD_AEQC, aeqc_dma, 0, 0);
- dma_unmap_single(dev, aeqc_dma, sizeof(struct qm_aeqc), DMA_TO_DEVICE);
- kfree(aeqc);
+ if (i == MAX_WAIT_COUNTS) {
+ dev_err(dev, "Fail to empty queue %u!\n", qp->qp_id);
+ ret = -EBUSY;
+ break;
+ }
+
+ usleep_range(WAIT_PERIOD_US_MIN, WAIT_PERIOD_US_MAX);
+ }
+
+ qm_ctx_free(qm, size, addr, &dma_addr);
return ret;
}
-static int __hisi_qm_start(struct hisi_qm *qm)
+static int qm_stop_qp_nolock(struct hisi_qp *qp)
{
- struct pci_dev *pdev = qm->pdev;
- struct device *dev = &pdev->dev;
- size_t off = 0;
+ struct device *dev = &qp->qm->pdev->dev;
int ret;
-#define QM_INIT_BUF(qm, type, num) do { \
- (qm)->type = ((qm)->qdma.va + (off)); \
- (qm)->type##_dma = (qm)->qdma.dma + (off); \
- off += QMC_ALIGN(sizeof(struct qm_##type) * (num)); \
-} while (0)
-
- WARN_ON(!qm->qdma.dma);
-
- if (qm->qp_num == 0)
- return -EINVAL;
-
- if (qm->fun_type == QM_HW_PF) {
- ret = qm_dev_mem_reset(qm);
- if (ret)
- return ret;
-
- ret = hisi_qm_set_vft(qm, 0, qm->qp_base, qm->qp_num);
- if (ret)
- return ret;
+ /*
+ * It is allowed to stop and release qp when reset, If the qp is
+ * stopped when reset but still want to be released then, the
+ * is_resetting flag should be set negative so that this qp will not
+ * be restarted after reset.
+ */
+ if (atomic_read(&qp->qp_status.flags) == QP_STOP) {
+ qp->is_resetting = false;
+ return 0;
}
- QM_INIT_BUF(qm, eqe, QM_Q_DEPTH);
- QM_INIT_BUF(qm, aeqe, QM_Q_DEPTH);
- QM_INIT_BUF(qm, sqc, qm->qp_num);
- QM_INIT_BUF(qm, cqc, qm->qp_num);
-
- dev_dbg(dev, "init qm buffer:\n"
- " eqe (%pK, %lx)\n"
- " aeqe (%pK, %lx)\n"
- " sqc (%pK, %lx)\n"
- " cqc (%pK, %lx)\n",
- qm->eqe, (unsigned long)qm->eqe_dma,
- qm->aeqe, (unsigned long)qm->aeqe_dma,
- qm->sqc, (unsigned long)qm->sqc_dma,
- qm->cqc, (unsigned long)qm->cqc_dma);
+ if (!qm_qp_avail_state(qp->qm, qp, QP_STOP))
+ return -EPERM;
- ret = qm_eq_ctx_cfg(qm);
- if (ret)
- return ret;
+ atomic_set(&qp->qp_status.flags, QP_STOP);
- ret = qm_mb(qm, QM_MB_CMD_SQC_BT, qm->sqc_dma, 0, 0);
+ ret = qm_drain_qp(qp);
if (ret)
- return ret;
+ dev_err(dev, "Failed to drain out data for stopping!\n");
- ret = qm_mb(qm, QM_MB_CMD_CQC_BT, qm->cqc_dma, 0, 0);
- if (ret)
- return ret;
+ if (qp->qm->wq)
+ flush_workqueue(qp->qm->wq);
+ else
+ flush_work(&qp->qm->work);
- writel(0x0, qm->io_base + QM_VF_EQ_INT_MASK);
- writel(0x0, qm->io_base + QM_VF_AEQ_INT_MASK);
+ dev_dbg(dev, "stop queue %u!", qp->qp_id);
return 0;
}
/**
- * hisi_qm_start() - start qm
- * @qm: The qm to be started.
+ * hisi_qm_stop_qp() - Stop a qp in qm.
+ * @qp: The qp we want to stop.
*
- * This function starts a qm, then we can allocate qp from this qm.
+ * This function is reverse of hisi_qm_start_qp. Return 0 if successful.
*/
-int hisi_qm_start(struct hisi_qm *qm)
+int hisi_qm_stop_qp(struct hisi_qp *qp)
{
- struct device *dev = &qm->pdev->dev;
+ int ret;
- dev_dbg(dev, "qm start with %d queue pairs\n", qm->qp_num);
+ down_write(&qp->qm->qps_lock);
+ ret = qm_stop_qp_nolock(qp);
+ up_write(&qp->qm->qps_lock);
- if (!qm->qp_num) {
- dev_err(dev, "qp_num should not be 0\n");
- return -EINVAL;
- }
+ return ret;
+}
+EXPORT_SYMBOL_GPL(hisi_qm_stop_qp);
- if (!qm->qp_bitmap) {
- qm->qp_bitmap = devm_kcalloc(dev, BITS_TO_LONGS(qm->qp_num),
- sizeof(long), GFP_KERNEL);
- qm->qp_array = devm_kcalloc(dev, qm->qp_num,
- sizeof(struct hisi_qp *),
- GFP_KERNEL);
- if (!qm->qp_bitmap || !qm->qp_array)
- return -ENOMEM;
- }
+/**
+ * hisi_qp_send() - Queue up a task in the hardware queue.
+ * @qp: The qp in which to put the message.
+ * @msg: The message.
+ *
+ * This function will return -EBUSY if qp is currently full, and -EAGAIN
+ * if qp related qm is resetting.
+ *
+ * Note: This function may run with qm_irq_thread and ACC reset at same time.
+ * It has no race with qm_irq_thread. However, during hisi_qp_send, ACC
+ * reset may happen, we have no lock here considering performance. This
+ * causes current qm_db sending fail or can not receive sended sqe. QM
+ * sync/async receive function should handle the error sqe. ACC reset
+ * done function should clear used sqe to 0.
+ */
+int hisi_qp_send(struct hisi_qp *qp, const void *msg)
+{
+ struct hisi_qp_status *qp_status = &qp->qp_status;
+ u16 sq_tail = qp_status->sq_tail;
+ u16 sq_tail_next = (sq_tail + 1) % QM_Q_DEPTH;
+ void *sqe = qm_get_avail_sqe(qp);
+
+ if (unlikely(atomic_read(&qp->qp_status.flags) == QP_STOP ||
+ atomic_read(&qp->qm->status.flags) == QM_STOP ||
+ qp->is_resetting)) {
+ dev_info(&qp->qm->pdev->dev, "QP is stopped or resetting\n");
+ return -EAGAIN;
+ }
+
+ if (!sqe)
+ return -EBUSY;
+
+ memcpy(sqe, msg, qp->qm->sqe_size);
+
+ qm_db(qp->qm, qp->qp_id, QM_DOORBELL_CMD_SQ, sq_tail_next, 0);
+ atomic_inc(&qp->qp_status.used);
+ qp_status->sq_tail = sq_tail_next;
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(hisi_qp_send);
+
+static void hisi_qm_cache_wb(struct hisi_qm *qm)
+{
+ unsigned int val;
+
+ if (qm->ver == QM_HW_V1)
+ return;
+
+ writel(0x1, qm->io_base + QM_CACHE_WB_START);
+ if (readl_relaxed_poll_timeout(qm->io_base + QM_CACHE_WB_DONE,
+ val, val & BIT(0), 10, 1000))
+ dev_err(&qm->pdev->dev, "QM writeback sqc cache fail!\n");
+}
+
+static void qm_qp_event_notifier(struct hisi_qp *qp)
+{
+ wake_up_interruptible(&qp->uacce_q->wait);
+}
+
+static int hisi_qm_get_available_instances(struct uacce_device *uacce)
+{
+ return hisi_qm_get_free_qp_num(uacce->priv);
+}
+
+static int hisi_qm_uacce_get_queue(struct uacce_device *uacce,
+ unsigned long arg,
+ struct uacce_queue *q)
+{
+ struct hisi_qm *qm = uacce->priv;
+ struct hisi_qp *qp;
+ u8 alg_type = 0;
+
+ qp = hisi_qm_create_qp(qm, alg_type);
+ if (IS_ERR(qp))
+ return PTR_ERR(qp);
+
+ q->priv = qp;
+ q->uacce = uacce;
+ qp->uacce_q = q;
+ qp->event_cb = qm_qp_event_notifier;
+ qp->pasid = arg;
+
+ return 0;
+}
+
+static void hisi_qm_uacce_put_queue(struct uacce_queue *q)
+{
+ struct hisi_qp *qp = q->priv;
+
+ hisi_qm_cache_wb(qp->qm);
+ hisi_qm_release_qp(qp);
+}
+
+/* map sq/cq/doorbell to user space */
+static int hisi_qm_uacce_mmap(struct uacce_queue *q,
+ struct vm_area_struct *vma,
+ struct uacce_qfile_region *qfr)
+{
+ struct hisi_qp *qp = q->priv;
+ struct hisi_qm *qm = qp->qm;
+ size_t sz = vma->vm_end - vma->vm_start;
+ struct pci_dev *pdev = qm->pdev;
+ struct device *dev = &pdev->dev;
+ unsigned long vm_pgoff;
+ int ret;
+
+ switch (qfr->type) {
+ case UACCE_QFRT_MMIO:
+ if (qm->ver == QM_HW_V1) {
+ if (sz > PAGE_SIZE * QM_DOORBELL_PAGE_NR)
+ return -EINVAL;
+ } else {
+ if (sz > PAGE_SIZE * (QM_DOORBELL_PAGE_NR +
+ QM_DOORBELL_SQ_CQ_BASE_V2 / PAGE_SIZE))
+ return -EINVAL;
+ }
+
+ vma->vm_flags |= VM_IO;
+
+ return remap_pfn_range(vma, vma->vm_start,
+ qm->phys_base >> PAGE_SHIFT,
+ sz, pgprot_noncached(vma->vm_page_prot));
+ case UACCE_QFRT_DUS:
+ if (sz != qp->qdma.size)
+ return -EINVAL;
+
+ /*
+ * dma_mmap_coherent() requires vm_pgoff as 0
+ * restore vm_pfoff to initial value for mmap()
+ */
+ vm_pgoff = vma->vm_pgoff;
+ vma->vm_pgoff = 0;
+ ret = dma_mmap_coherent(dev, vma, qp->qdma.va,
+ qp->qdma.dma, sz);
+ vma->vm_pgoff = vm_pgoff;
+ return ret;
+
+ default:
+ return -EINVAL;
+ }
+}
+
+static int hisi_qm_uacce_start_queue(struct uacce_queue *q)
+{
+ struct hisi_qp *qp = q->priv;
+
+ return hisi_qm_start_qp(qp, qp->pasid);
+}
+
+static void hisi_qm_uacce_stop_queue(struct uacce_queue *q)
+{
+ hisi_qm_stop_qp(q->priv);
+}
+
+static int qm_set_sqctype(struct uacce_queue *q, u16 type)
+{
+ struct hisi_qm *qm = q->uacce->priv;
+ struct hisi_qp *qp = q->priv;
+
+ down_write(&qm->qps_lock);
+ qp->alg_type = type;
+ up_write(&qm->qps_lock);
+
+ return 0;
+}
+
+static long hisi_qm_uacce_ioctl(struct uacce_queue *q, unsigned int cmd,
+ unsigned long arg)
+{
+ struct hisi_qp *qp = q->priv;
+ struct hisi_qp_ctx qp_ctx;
+
+ if (cmd == UACCE_CMD_QM_SET_QP_CTX) {
+ if (copy_from_user(&qp_ctx, (void __user *)arg,
+ sizeof(struct hisi_qp_ctx)))
+ return -EFAULT;
+
+ if (qp_ctx.qc_type != 0 && qp_ctx.qc_type != 1)
+ return -EINVAL;
+
+ qm_set_sqctype(q, qp_ctx.qc_type);
+ qp_ctx.id = qp->qp_id;
+
+ if (copy_to_user((void __user *)arg, &qp_ctx,
+ sizeof(struct hisi_qp_ctx)))
+ return -EFAULT;
+ } else {
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static const struct uacce_ops uacce_qm_ops = {
+ .get_available_instances = hisi_qm_get_available_instances,
+ .get_queue = hisi_qm_uacce_get_queue,
+ .put_queue = hisi_qm_uacce_put_queue,
+ .start_queue = hisi_qm_uacce_start_queue,
+ .stop_queue = hisi_qm_uacce_stop_queue,
+ .mmap = hisi_qm_uacce_mmap,
+ .ioctl = hisi_qm_uacce_ioctl,
+};
+
+static int qm_alloc_uacce(struct hisi_qm *qm)
+{
+ struct pci_dev *pdev = qm->pdev;
+ struct uacce_device *uacce;
+ unsigned long mmio_page_nr;
+ unsigned long dus_page_nr;
+ struct uacce_interface interface = {
+ .flags = UACCE_DEV_SVA,
+ .ops = &uacce_qm_ops,
+ };
+
+ strncpy(interface.name, pdev->driver->name, sizeof(interface.name));
+
+ uacce = uacce_alloc(&pdev->dev, &interface);
+ if (IS_ERR(uacce))
+ return PTR_ERR(uacce);
+
+ if (uacce->flags & UACCE_DEV_SVA) {
+ qm->use_sva = true;
+ } else {
+ /* only consider sva case */
+ uacce_remove(uacce);
+ qm->uacce = NULL;
+ return -EINVAL;
+ }
+
+ uacce->is_vf = pdev->is_virtfn;
+ uacce->priv = qm;
+ uacce->algs = qm->algs;
+
+ if (qm->ver == QM_HW_V1) {
+ mmio_page_nr = QM_DOORBELL_PAGE_NR;
+ uacce->api_ver = HISI_QM_API_VER_BASE;
+ } else {
+ mmio_page_nr = QM_DOORBELL_PAGE_NR +
+ QM_DOORBELL_SQ_CQ_BASE_V2 / PAGE_SIZE;
+ uacce->api_ver = HISI_QM_API_VER2_BASE;
+ }
+
+ dus_page_nr = (PAGE_SIZE - 1 + qm->sqe_size * QM_Q_DEPTH +
+ sizeof(struct qm_cqe) * QM_Q_DEPTH) >> PAGE_SHIFT;
+
+ uacce->qf_pg_num[UACCE_QFRT_MMIO] = mmio_page_nr;
+ uacce->qf_pg_num[UACCE_QFRT_DUS] = dus_page_nr;
+
+ qm->uacce = uacce;
+
+ return 0;
+}
+
+/**
+ * hisi_qm_get_free_qp_num() - Get free number of qp in qm.
+ * @qm: The qm which want to get free qp.
+ *
+ * This function return free number of qp in qm.
+ */
+int hisi_qm_get_free_qp_num(struct hisi_qm *qm)
+{
+ int ret;
+
+ down_read(&qm->qps_lock);
+ ret = qm->qp_num - qm->qp_in_used;
+ up_read(&qm->qps_lock);
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(hisi_qm_get_free_qp_num);
+
+static void hisi_qp_memory_uninit(struct hisi_qm *qm, int num)
+{
+ struct device *dev = &qm->pdev->dev;
+ struct qm_dma *qdma;
+ int i;
+
+ for (i = num - 1; i >= 0; i--) {
+ qdma = &qm->qp_array[i].qdma;
+ dma_free_coherent(dev, qdma->size, qdma->va, qdma->dma);
+ }
+
+ kfree(qm->qp_array);
+}
+
+static int hisi_qp_memory_init(struct hisi_qm *qm, size_t dma_size, int id)
+{
+ struct device *dev = &qm->pdev->dev;
+ size_t off = qm->sqe_size * QM_Q_DEPTH;
+ struct hisi_qp *qp;
+
+ qp = &qm->qp_array[id];
+ qp->qdma.va = dma_alloc_coherent(dev, dma_size, &qp->qdma.dma,
+ GFP_KERNEL);
+ if (!qp->qdma.va)
+ return -ENOMEM;
+
+ qp->sqe = qp->qdma.va;
+ qp->sqe_dma = qp->qdma.dma;
+ qp->cqe = qp->qdma.va + off;
+ qp->cqe_dma = qp->qdma.dma + off;
+ qp->qdma.size = dma_size;
+ qp->qm = qm;
+ qp->qp_id = id;
+
+ return 0;
+}
+
+static int hisi_qm_memory_init(struct hisi_qm *qm)
+{
+ struct device *dev = &qm->pdev->dev;
+ size_t qp_dma_size, off = 0;
+ int i, ret = 0;
+
+#define QM_INIT_BUF(qm, type, num) do { \
+ (qm)->type = ((qm)->qdma.va + (off)); \
+ (qm)->type##_dma = (qm)->qdma.dma + (off); \
+ off += QMC_ALIGN(sizeof(struct qm_##type) * (num)); \
+} while (0)
+
+ idr_init(&qm->qp_idr);
+ qm->qdma.size = QMC_ALIGN(sizeof(struct qm_eqe) * QM_Q_DEPTH) +
+ QMC_ALIGN(sizeof(struct qm_aeqe) * QM_Q_DEPTH) +
+ QMC_ALIGN(sizeof(struct qm_sqc) * qm->qp_num) +
+ QMC_ALIGN(sizeof(struct qm_cqc) * qm->qp_num);
+ qm->qdma.va = dma_alloc_coherent(dev, qm->qdma.size, &qm->qdma.dma,
+ GFP_ATOMIC);
+ dev_dbg(dev, "allocate qm dma buf size=%zx)\n", qm->qdma.size);
+ if (!qm->qdma.va)
+ return -ENOMEM;
+
+ QM_INIT_BUF(qm, eqe, QM_Q_DEPTH);
+ QM_INIT_BUF(qm, aeqe, QM_Q_DEPTH);
+ QM_INIT_BUF(qm, sqc, qm->qp_num);
+ QM_INIT_BUF(qm, cqc, qm->qp_num);
+
+ qm->qp_array = kcalloc(qm->qp_num, sizeof(struct hisi_qp), GFP_KERNEL);
+ if (!qm->qp_array) {
+ ret = -ENOMEM;
+ goto err_alloc_qp_array;
+ }
+
+ /* one more page for device or qp statuses */
+ qp_dma_size = qm->sqe_size * QM_Q_DEPTH +
+ sizeof(struct qm_cqe) * QM_Q_DEPTH;
+ qp_dma_size = PAGE_ALIGN(qp_dma_size);
+ for (i = 0; i < qm->qp_num; i++) {
+ ret = hisi_qp_memory_init(qm, qp_dma_size, i);
+ if (ret)
+ goto err_init_qp_mem;
+
+ dev_dbg(dev, "allocate qp dma buf size=%zx)\n", qp_dma_size);
+ }
+
+ return ret;
+
+err_init_qp_mem:
+ hisi_qp_memory_uninit(qm, i);
+err_alloc_qp_array:
+ dma_free_coherent(dev, qm->qdma.size, qm->qdma.va, qm->qdma.dma);
+
+ return ret;
+}
+
+static void hisi_qm_pre_init(struct hisi_qm *qm)
+{
+ struct pci_dev *pdev = qm->pdev;
+
+ if (qm->ver == QM_HW_V1)
+ qm->ops = &qm_hw_ops_v1;
+ else
+ qm->ops = &qm_hw_ops_v2;
+
+ pci_set_drvdata(pdev, qm);
+ mutex_init(&qm->mailbox_lock);
+ init_rwsem(&qm->qps_lock);
+ qm->qp_in_used = 0;
+}
+
+/**
+ * hisi_qm_uninit() - Uninitialize qm.
+ * @qm: The qm needed uninit.
+ *
+ * This function uninits qm related device resources.
+ */
+void hisi_qm_uninit(struct hisi_qm *qm)
+{
+ struct pci_dev *pdev = qm->pdev;
+ struct device *dev = &pdev->dev;
+
+ down_write(&qm->qps_lock);
+
+ if (!qm_avail_state(qm, QM_CLOSE)) {
+ up_write(&qm->qps_lock);
+ return;
+ }
+
+ uacce_remove(qm->uacce);
+ qm->uacce = NULL;
+
+ hisi_qp_memory_uninit(qm, qm->qp_num);
+ idr_destroy(&qm->qp_idr);
+
+ if (qm->qdma.va) {
+ hisi_qm_cache_wb(qm);
+ dma_free_coherent(dev, qm->qdma.size,
+ qm->qdma.va, qm->qdma.dma);
+ memset(&qm->qdma, 0, sizeof(qm->qdma));
+ }
+
+ qm_irq_unregister(qm);
+ pci_free_irq_vectors(pdev);
+ iounmap(qm->io_base);
+ pci_release_mem_regions(pdev);
+ pci_disable_device(pdev);
+
+ up_write(&qm->qps_lock);
+}
+EXPORT_SYMBOL_GPL(hisi_qm_uninit);
+
+/**
+ * hisi_qm_get_vft() - Get vft from a qm.
+ * @qm: The qm we want to get its vft.
+ * @base: The base number of queue in vft.
+ * @number: The number of queues in vft.
+ *
+ * We can allocate multiple queues to a qm by configuring virtual function
+ * table. We get related configures by this function. Normally, we call this
+ * function in VF driver to get the queue information.
+ *
+ * qm hw v1 does not support this interface.
+ */
+int hisi_qm_get_vft(struct hisi_qm *qm, u32 *base, u32 *number)
+{
+ if (!base || !number)
+ return -EINVAL;
+
+ if (!qm->ops->get_vft) {
+ dev_err(&qm->pdev->dev, "Don't support vft read!\n");
+ return -EINVAL;
+ }
+
+ return qm->ops->get_vft(qm, base, number);
+}
+EXPORT_SYMBOL_GPL(hisi_qm_get_vft);
+
+/**
+ * This function is alway called in PF driver, it is used to assign queues
+ * among PF and VFs.
+ *
+ * Assign queues A~B to PF: hisi_qm_set_vft(qm, 0, A, B - A + 1)
+ * Assign queues A~B to VF: hisi_qm_set_vft(qm, 2, A, B - A + 1)
+ * (VF function number 0x2)
+ */
+static int hisi_qm_set_vft(struct hisi_qm *qm, u32 fun_num, u32 base,
+ u32 number)
+{
+ u32 max_q_num = qm->ctrl_qp_num;
+
+ if (base >= max_q_num || number > max_q_num ||
+ (base + number) > max_q_num)
+ return -EINVAL;
+
+ return qm_set_sqc_cqc_vft(qm, fun_num, base, number);
+}
+
+static void qm_init_eq_aeq_status(struct hisi_qm *qm)
+{
+ struct hisi_qm_status *status = &qm->status;
+
+ status->eq_head = 0;
+ status->aeq_head = 0;
+ status->eqc_phase = true;
+ status->aeqc_phase = true;
+}
+
+static int qm_eq_ctx_cfg(struct hisi_qm *qm)
+{
+ struct device *dev = &qm->pdev->dev;
+ struct qm_eqc *eqc;
+ struct qm_aeqc *aeqc;
+ dma_addr_t eqc_dma;
+ dma_addr_t aeqc_dma;
+ int ret;
+
+ qm_init_eq_aeq_status(qm);
+
+ eqc = kzalloc(sizeof(struct qm_eqc), GFP_KERNEL);
+ if (!eqc)
+ return -ENOMEM;
+ eqc_dma = dma_map_single(dev, eqc, sizeof(struct qm_eqc),
+ DMA_TO_DEVICE);
+ if (dma_mapping_error(dev, eqc_dma)) {
+ kfree(eqc);
+ return -ENOMEM;
+ }
+
+ eqc->base_l = cpu_to_le32(lower_32_bits(qm->eqe_dma));
+ eqc->base_h = cpu_to_le32(upper_32_bits(qm->eqe_dma));
+ if (qm->ver == QM_HW_V1)
+ eqc->dw3 = cpu_to_le32(QM_EQE_AEQE_SIZE);
+ eqc->dw6 = cpu_to_le32((QM_Q_DEPTH - 1) | (1 << QM_EQC_PHASE_SHIFT));
+ ret = qm_mb(qm, QM_MB_CMD_EQC, eqc_dma, 0, 0);
+ dma_unmap_single(dev, eqc_dma, sizeof(struct qm_eqc), DMA_TO_DEVICE);
+ kfree(eqc);
+ if (ret)
+ return ret;
+
+ aeqc = kzalloc(sizeof(struct qm_aeqc), GFP_KERNEL);
+ if (!aeqc)
+ return -ENOMEM;
+ aeqc_dma = dma_map_single(dev, aeqc, sizeof(struct qm_aeqc),
+ DMA_TO_DEVICE);
+ if (dma_mapping_error(dev, aeqc_dma)) {
+ kfree(aeqc);
+ return -ENOMEM;
+ }
+
+ aeqc->base_l = cpu_to_le32(lower_32_bits(qm->aeqe_dma));
+ aeqc->base_h = cpu_to_le32(upper_32_bits(qm->aeqe_dma));
+ aeqc->dw6 = cpu_to_le32((QM_Q_DEPTH - 1) | (1 << QM_EQC_PHASE_SHIFT));
+
+ ret = qm_mb(qm, QM_MB_CMD_AEQC, aeqc_dma, 0, 0);
+ dma_unmap_single(dev, aeqc_dma, sizeof(struct qm_aeqc), DMA_TO_DEVICE);
+ kfree(aeqc);
+
+ return ret;
+}
+
+static int __hisi_qm_start(struct hisi_qm *qm)
+{
+ int ret;
+
+ WARN_ON(!qm->qdma.dma);
+
+ if (qm->fun_type == QM_HW_PF) {
+ ret = qm_dev_mem_reset(qm);
+ if (ret)
+ return ret;
+
+ ret = hisi_qm_set_vft(qm, 0, qm->qp_base, qm->qp_num);
+ if (ret)
+ return ret;
+ }
+
+ ret = qm_eq_ctx_cfg(qm);
+ if (ret)
+ return ret;
+
+ ret = qm_mb(qm, QM_MB_CMD_SQC_BT, qm->sqc_dma, 0, 0);
+ if (ret)
+ return ret;
+
+ ret = qm_mb(qm, QM_MB_CMD_CQC_BT, qm->cqc_dma, 0, 0);
+ if (ret)
+ return ret;
+
+ writel(0x0, qm->io_base + QM_VF_EQ_INT_MASK);
+ writel(0x0, qm->io_base + QM_VF_AEQ_INT_MASK);
+
+ return 0;
+}
+
+/**
+ * hisi_qm_start() - start qm
+ * @qm: The qm to be started.
+ *
+ * This function starts a qm, then we can allocate qp from this qm.
+ */
+int hisi_qm_start(struct hisi_qm *qm)
+{
+ struct device *dev = &qm->pdev->dev;
+ int ret = 0;
+
+ down_write(&qm->qps_lock);
+
+ if (!qm_avail_state(qm, QM_START)) {
+ up_write(&qm->qps_lock);
+ return -EPERM;
+ }
+
+ dev_dbg(dev, "qm start with %d queue pairs\n", qm->qp_num);
+
+ if (!qm->qp_num) {
+ dev_err(dev, "qp_num should not be 0\n");
+ ret = -EINVAL;
+ goto err_unlock;
+ }
+
+ ret = __hisi_qm_start(qm);
+ if (!ret)
+ atomic_set(&qm->status.flags, QM_START);
+
+err_unlock:
+ up_write(&qm->qps_lock);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(hisi_qm_start);
+
+static int qm_restart(struct hisi_qm *qm)
+{
+ struct device *dev = &qm->pdev->dev;
+ struct hisi_qp *qp;
+ int ret, i;
+
+ ret = hisi_qm_start(qm);
+ if (ret < 0)
+ return ret;
+
+ down_write(&qm->qps_lock);
+ for (i = 0; i < qm->qp_num; i++) {
+ qp = &qm->qp_array[i];
+ if (atomic_read(&qp->qp_status.flags) == QP_STOP &&
+ qp->is_resetting == true) {
+ ret = qm_start_qp_nolock(qp, 0);
+ if (ret < 0) {
+ dev_err(dev, "Failed to start qp%d!\n", i);
+
+ up_write(&qm->qps_lock);
+ return ret;
+ }
+ qp->is_resetting = false;
+ }
+ }
+ up_write(&qm->qps_lock);
+
+ return 0;
+}
+
+/* Stop started qps in reset flow */
+static int qm_stop_started_qp(struct hisi_qm *qm)
+{
+ struct device *dev = &qm->pdev->dev;
+ struct hisi_qp *qp;
+ int i, ret;
+
+ for (i = 0; i < qm->qp_num; i++) {
+ qp = &qm->qp_array[i];
+ if (qp && atomic_read(&qp->qp_status.flags) == QP_START) {
+ qp->is_resetting = true;
+ ret = qm_stop_qp_nolock(qp);
+ if (ret < 0) {
+ dev_err(dev, "Failed to stop qp%d!\n", i);
+ return ret;
+ }
+ }
+ }
+
+ return 0;
+}
+
+/**
+ * This function clears all queues memory in a qm. Reset of accelerator can
+ * use this to clear queues.
+ */
+static void qm_clear_queues(struct hisi_qm *qm)
+{
+ struct hisi_qp *qp;
+ int i;
+
+ for (i = 0; i < qm->qp_num; i++) {
+ qp = &qm->qp_array[i];
+ if (qp->is_resetting)
+ memset(qp->qdma.va, 0, qp->qdma.size);
+ }
+
+ memset(qm->qdma.va, 0, qm->qdma.size);
+}
+
+/**
+ * hisi_qm_stop() - Stop a qm.
+ * @qm: The qm which will be stopped.
+ *
+ * This function stops qm and its qps, then qm can not accept request.
+ * Related resources are not released at this state, we can use hisi_qm_start
+ * to let qm start again.
+ */
+int hisi_qm_stop(struct hisi_qm *qm)
+{
+ struct device *dev = &qm->pdev->dev;
+ int ret = 0;
+
+ down_write(&qm->qps_lock);
+
+ if (!qm_avail_state(qm, QM_STOP)) {
+ ret = -EPERM;
+ goto err_unlock;
+ }
+
+ if (qm->status.stop_reason == QM_SOFT_RESET ||
+ qm->status.stop_reason == QM_FLR) {
+ ret = qm_stop_started_qp(qm);
+ if (ret < 0) {
+ dev_err(dev, "Failed to stop started qp!\n");
+ goto err_unlock;
+ }
+ }
+
+ /* Mask eq and aeq irq */
+ writel(0x1, qm->io_base + QM_VF_EQ_INT_MASK);
+ writel(0x1, qm->io_base + QM_VF_AEQ_INT_MASK);
+
+ if (qm->fun_type == QM_HW_PF) {
+ ret = hisi_qm_set_vft(qm, 0, 0, 0);
+ if (ret < 0) {
+ dev_err(dev, "Failed to set vft!\n");
+ ret = -EBUSY;
+ goto err_unlock;
+ }
+ }
+
+ qm_clear_queues(qm);
+ atomic_set(&qm->status.flags, QM_STOP);
+
+err_unlock:
+ up_write(&qm->qps_lock);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(hisi_qm_stop);
+
+static ssize_t qm_status_read(struct file *filp, char __user *buffer,
+ size_t count, loff_t *pos)
+{
+ struct hisi_qm *qm = filp->private_data;
+ char buf[QM_DBG_READ_LEN];
+ int val, cp_len, len;
+
+ if (*pos)
+ return 0;
+
+ if (count < QM_DBG_READ_LEN)
+ return -ENOSPC;
+
+ val = atomic_read(&qm->status.flags);
+ len = snprintf(buf, QM_DBG_READ_LEN, "%s\n", qm_s[val]);
+ if (!len)
+ return -EFAULT;
+
+ cp_len = copy_to_user(buffer, buf, len);
+ if (cp_len)
+ return -EFAULT;
+
+ return (*pos = len);
+}
+
+static const struct file_operations qm_status_fops = {
+ .owner = THIS_MODULE,
+ .open = simple_open,
+ .read = qm_status_read,
+};
+
+static int qm_debugfs_atomic64_set(void *data, u64 val)
+{
+ if (val)
+ return -EINVAL;
+
+ atomic64_set((atomic64_t *)data, 0);
+
+ return 0;
+}
+
+static int qm_debugfs_atomic64_get(void *data, u64 *val)
+{
+ *val = atomic64_read((atomic64_t *)data);
+
+ return 0;
+}
+
+DEFINE_DEBUGFS_ATTRIBUTE(qm_atomic64_ops, qm_debugfs_atomic64_get,
+ qm_debugfs_atomic64_set, "%llu\n");
+
+/**
+ * hisi_qm_debug_init() - Initialize qm related debugfs files.
+ * @qm: The qm for which we want to add debugfs files.
+ *
+ * Create qm related debugfs files.
+ */
+int hisi_qm_debug_init(struct hisi_qm *qm)
+{
+ struct qm_dfx *dfx = &qm->debug.dfx;
+ struct dentry *qm_d;
+ void *data;
+ int i, ret;
+
+ qm_d = debugfs_create_dir("qm", qm->debug.debug_root);
+ qm->debug.qm_d = qm_d;
+
+ /* only show this in PF */
+ if (qm->fun_type == QM_HW_PF)
+ for (i = CURRENT_Q; i < DEBUG_FILE_NUM; i++)
+ if (qm_create_debugfs_file(qm, i)) {
+ ret = -ENOENT;
+ goto failed_to_create;
+ }
+
+ debugfs_create_file("regs", 0444, qm->debug.qm_d, qm, &qm_regs_fops);
+
+ debugfs_create_file("cmd", 0444, qm->debug.qm_d, qm, &qm_cmd_fops);
+
+ debugfs_create_file("status", 0444, qm->debug.qm_d, qm,
+ &qm_status_fops);
+ for (i = 0; i < ARRAY_SIZE(qm_dfx_files); i++) {
+ data = (atomic64_t *)((uintptr_t)dfx + qm_dfx_files[i].offset);
+ debugfs_create_file(qm_dfx_files[i].name,
+ 0644,
+ qm_d,
+ data,
+ &qm_atomic64_ops);
+ }
+
+ return 0;
+
+failed_to_create:
+ debugfs_remove_recursive(qm_d);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(hisi_qm_debug_init);
+
+/**
+ * hisi_qm_debug_regs_clear() - clear qm debug related registers.
+ * @qm: The qm for which we want to clear its debug registers.
+ */
+void hisi_qm_debug_regs_clear(struct hisi_qm *qm)
+{
+ struct qm_dfx_registers *regs;
+ int i;
+
+ /* clear current_q */
+ writel(0x0, qm->io_base + QM_DFX_SQE_CNT_VF_SQN);
+ writel(0x0, qm->io_base + QM_DFX_CQE_CNT_VF_CQN);
+
+ /*
+ * these registers are reading and clearing, so clear them after
+ * reading them.
+ */
+ writel(0x1, qm->io_base + QM_DFX_CNT_CLR_CE);
+
+ regs = qm_dfx_regs;
+ for (i = 0; i < CNT_CYC_REGS_NUM; i++) {
+ readl(qm->io_base + regs->reg_offset);
+ regs++;
+ }
+
+ writel(0x0, qm->io_base + QM_DFX_CNT_CLR_CE);
+}
+EXPORT_SYMBOL_GPL(hisi_qm_debug_regs_clear);
+
+static void qm_hw_error_init(struct hisi_qm *qm)
+{
+ const struct hisi_qm_err_info *err_info = &qm->err_ini->err_info;
+
+ if (!qm->ops->hw_error_init) {
+ dev_err(&qm->pdev->dev, "QM doesn't support hw error handling!\n");
+ return;
+ }
+
+ qm->ops->hw_error_init(qm, err_info->ce, err_info->nfe, err_info->fe);
+}
+
+static void qm_hw_error_uninit(struct hisi_qm *qm)
+{
+ if (!qm->ops->hw_error_uninit) {
+ dev_err(&qm->pdev->dev, "Unexpected QM hw error uninit!\n");
+ return;
+ }
+
+ qm->ops->hw_error_uninit(qm);
+}
+
+static enum acc_err_result qm_hw_error_handle(struct hisi_qm *qm)
+{
+ if (!qm->ops->hw_error_handle) {
+ dev_err(&qm->pdev->dev, "QM doesn't support hw error report!\n");
+ return ACC_ERR_NONE;
+ }
+
+ return qm->ops->hw_error_handle(qm);
+}
+
+/**
+ * hisi_qm_dev_err_init() - Initialize device error configuration.
+ * @qm: The qm for which we want to do error initialization.
+ *
+ * Initialize QM and device error related configuration.
+ */
+void hisi_qm_dev_err_init(struct hisi_qm *qm)
+{
+ if (qm->fun_type == QM_HW_VF)
+ return;
+
+ qm_hw_error_init(qm);
+
+ if (!qm->err_ini->hw_err_enable) {
+ dev_err(&qm->pdev->dev, "Device doesn't support hw error init!\n");
+ return;
+ }
+ qm->err_ini->hw_err_enable(qm);
+}
+EXPORT_SYMBOL_GPL(hisi_qm_dev_err_init);
+
+/**
+ * hisi_qm_dev_err_uninit() - Uninitialize device error configuration.
+ * @qm: The qm for which we want to do error uninitialization.
+ *
+ * Uninitialize QM and device error related configuration.
+ */
+void hisi_qm_dev_err_uninit(struct hisi_qm *qm)
+{
+ if (qm->fun_type == QM_HW_VF)
+ return;
+
+ qm_hw_error_uninit(qm);
+
+ if (!qm->err_ini->hw_err_disable) {
+ dev_err(&qm->pdev->dev, "Unexpected device hw error uninit!\n");
+ return;
+ }
+ qm->err_ini->hw_err_disable(qm);
+}
+EXPORT_SYMBOL_GPL(hisi_qm_dev_err_uninit);
+
+/**
+ * hisi_qm_free_qps() - free multiple queue pairs.
+ * @qps: The queue pairs need to be freed.
+ * @qp_num: The num of queue pairs.
+ */
+void hisi_qm_free_qps(struct hisi_qp **qps, int qp_num)
+{
+ int i;
+
+ if (!qps || qp_num <= 0)
+ return;
+
+ for (i = qp_num - 1; i >= 0; i--)
+ hisi_qm_release_qp(qps[i]);
+}
+EXPORT_SYMBOL_GPL(hisi_qm_free_qps);
+
+static void free_list(struct list_head *head)
+{
+ struct hisi_qm_resource *res, *tmp;
+
+ list_for_each_entry_safe(res, tmp, head, list) {
+ list_del(&res->list);
+ kfree(res);
+ }
+}
+
+static int hisi_qm_sort_devices(int node, struct list_head *head,
+ struct hisi_qm_list *qm_list)
+{
+ struct hisi_qm_resource *res, *tmp;
+ struct hisi_qm *qm;
+ struct list_head *n;
+ struct device *dev;
+ int dev_node = 0;
+
+ list_for_each_entry(qm, &qm_list->list, list) {
+ dev = &qm->pdev->dev;
+
+ if (IS_ENABLED(CONFIG_NUMA)) {
+ dev_node = dev_to_node(dev);
+ if (dev_node < 0)
+ dev_node = 0;
+ }
+
+ res = kzalloc(sizeof(*res), GFP_KERNEL);
+ if (!res)
+ return -ENOMEM;
+
+ res->qm = qm;
+ res->distance = node_distance(dev_node, node);
+ n = head;
+ list_for_each_entry(tmp, head, list) {
+ if (res->distance < tmp->distance) {
+ n = &tmp->list;
+ break;
+ }
+ }
+ list_add_tail(&res->list, n);
+ }
+
+ return 0;
+}
+
+/**
+ * hisi_qm_alloc_qps_node() - Create multiple queue pairs.
+ * @qm_list: The list of all available devices.
+ * @qp_num: The number of queue pairs need created.
+ * @alg_type: The algorithm type.
+ * @node: The numa node.
+ * @qps: The queue pairs need created.
+ *
+ * This function will sort all available device according to numa distance.
+ * Then try to create all queue pairs from one device, if all devices do
+ * not meet the requirements will return error.
+ */
+int hisi_qm_alloc_qps_node(struct hisi_qm_list *qm_list, int qp_num,
+ u8 alg_type, int node, struct hisi_qp **qps)
+{
+ struct hisi_qm_resource *tmp;
+ int ret = -ENODEV;
+ LIST_HEAD(head);
+ int i;
+
+ if (!qps || !qm_list || qp_num <= 0)
+ return -EINVAL;
+
+ mutex_lock(&qm_list->lock);
+ if (hisi_qm_sort_devices(node, &head, qm_list)) {
+ mutex_unlock(&qm_list->lock);
+ goto err;
+ }
+
+ list_for_each_entry(tmp, &head, list) {
+ for (i = 0; i < qp_num; i++) {
+ qps[i] = hisi_qm_create_qp(tmp->qm, alg_type);
+ if (IS_ERR(qps[i])) {
+ hisi_qm_free_qps(qps, i);
+ break;
+ }
+ }
+
+ if (i == qp_num) {
+ ret = 0;
+ break;
+ }
+ }
+
+ mutex_unlock(&qm_list->lock);
+ if (ret)
+ pr_info("Failed to create qps, node[%d], alg[%d], qp[%d]!\n",
+ node, alg_type, qp_num);
+
+err:
+ free_list(&head);
+ return ret;
+}
+EXPORT_SYMBOL_GPL(hisi_qm_alloc_qps_node);
+
+static int qm_vf_q_assign(struct hisi_qm *qm, u32 num_vfs)
+{
+ u32 remain_q_num, q_num, i, j;
+ u32 q_base = qm->qp_num;
+ int ret;
+
+ if (!num_vfs)
+ return -EINVAL;
+
+ remain_q_num = qm->ctrl_qp_num - qm->qp_num;
+
+ /* If remain queues not enough, return error. */
+ if (qm->ctrl_qp_num < qm->qp_num || remain_q_num < num_vfs)
+ return -EINVAL;
+
+ q_num = remain_q_num / num_vfs;
+ for (i = 1; i <= num_vfs; i++) {
+ if (i == num_vfs)
+ q_num += remain_q_num % num_vfs;
+ ret = hisi_qm_set_vft(qm, i, q_base, q_num);
+ if (ret) {
+ for (j = i; j > 0; j--)
+ hisi_qm_set_vft(qm, j, 0, 0);
+ return ret;
+ }
+ q_base += q_num;
+ }
+
+ return 0;
+}
+
+static int qm_clear_vft_config(struct hisi_qm *qm)
+{
+ int ret;
+ u32 i;
+
+ for (i = 1; i <= qm->vfs_num; i++) {
+ ret = hisi_qm_set_vft(qm, i, 0, 0);
+ if (ret)
+ return ret;
+ }
+ qm->vfs_num = 0;
+
+ return 0;
+}
+
+/**
+ * hisi_qm_sriov_enable() - enable virtual functions
+ * @pdev: the PCIe device
+ * @max_vfs: the number of virtual functions to enable
+ *
+ * Returns the number of enabled VFs. If there are VFs enabled already or
+ * max_vfs is more than the total number of device can be enabled, returns
+ * failure.
+ */
+int hisi_qm_sriov_enable(struct pci_dev *pdev, int max_vfs)
+{
+ struct hisi_qm *qm = pci_get_drvdata(pdev);
+ int pre_existing_vfs, num_vfs, total_vfs, ret;
+
+ total_vfs = pci_sriov_get_totalvfs(pdev);
+ pre_existing_vfs = pci_num_vf(pdev);
+ if (pre_existing_vfs) {
+ pci_err(pdev, "%d VFs already enabled. Please disable pre-enabled VFs!\n",
+ pre_existing_vfs);
+ return 0;
+ }
+
+ num_vfs = min_t(int, max_vfs, total_vfs);
+ ret = qm_vf_q_assign(qm, num_vfs);
+ if (ret) {
+ pci_err(pdev, "Can't assign queues for VF!\n");
+ return ret;
+ }
+
+ qm->vfs_num = num_vfs;
+
+ ret = pci_enable_sriov(pdev, num_vfs);
+ if (ret) {
+ pci_err(pdev, "Can't enable VF!\n");
+ qm_clear_vft_config(qm);
+ return ret;
+ }
+
+ pci_info(pdev, "VF enabled, vfs_num(=%d)!\n", num_vfs);
+
+ return num_vfs;
+}
+EXPORT_SYMBOL_GPL(hisi_qm_sriov_enable);
+
+/**
+ * hisi_qm_sriov_disable - disable virtual functions
+ * @pdev: the PCI device
+ *
+ * Return failure if there are VFs assigned already.
+ */
+int hisi_qm_sriov_disable(struct pci_dev *pdev)
+{
+ struct hisi_qm *qm = pci_get_drvdata(pdev);
+
+ if (pci_vfs_assigned(pdev)) {
+ pci_err(pdev, "Failed to disable VFs as VFs are assigned!\n");
+ return -EPERM;
+ }
+
+ /* remove in hpre_pci_driver will be called to free VF resources */
+ pci_disable_sriov(pdev);
+ return qm_clear_vft_config(qm);
+}
+EXPORT_SYMBOL_GPL(hisi_qm_sriov_disable);
+
+/**
+ * hisi_qm_sriov_configure - configure the number of VFs
+ * @pdev: The PCI device
+ * @num_vfs: The number of VFs need enabled
+ *
+ * Enable SR-IOV according to num_vfs, 0 means disable.
+ */
+int hisi_qm_sriov_configure(struct pci_dev *pdev, int num_vfs)
+{
+ if (num_vfs == 0)
+ return hisi_qm_sriov_disable(pdev);
+ else
+ return hisi_qm_sriov_enable(pdev, num_vfs);
+}
+EXPORT_SYMBOL_GPL(hisi_qm_sriov_configure);
+
+static enum acc_err_result qm_dev_err_handle(struct hisi_qm *qm)
+{
+ u32 err_sts;
+
+ if (!qm->err_ini->get_dev_hw_err_status) {
+ dev_err(&qm->pdev->dev, "Device doesn't support get hw error status!\n");
+ return ACC_ERR_NONE;
+ }
+
+ /* get device hardware error status */
+ err_sts = qm->err_ini->get_dev_hw_err_status(qm);
+ if (err_sts) {
+ if (err_sts & qm->err_ini->err_info.ecc_2bits_mask)
+ qm->err_status.is_dev_ecc_mbit = true;
+
+ if (!qm->err_ini->log_dev_hw_err) {
+ dev_err(&qm->pdev->dev, "Device doesn't support log hw error!\n");
+ return ACC_ERR_NEED_RESET;
+ }
+
+ qm->err_ini->log_dev_hw_err(qm, err_sts);
+ return ACC_ERR_NEED_RESET;
+ }
+
+ return ACC_ERR_RECOVERED;
+}
+
+static enum acc_err_result qm_process_dev_error(struct hisi_qm *qm)
+{
+ enum acc_err_result qm_ret, dev_ret;
+
+ /* log qm error */
+ qm_ret = qm_hw_error_handle(qm);
+
+ /* log device error */
+ dev_ret = qm_dev_err_handle(qm);
+
+ return (qm_ret == ACC_ERR_NEED_RESET ||
+ dev_ret == ACC_ERR_NEED_RESET) ?
+ ACC_ERR_NEED_RESET : ACC_ERR_RECOVERED;
+}
+
+/**
+ * hisi_qm_dev_err_detected() - Get device and qm error status then log it.
+ * @pdev: The PCI device which need report error.
+ * @state: The connectivity between CPU and device.
+ *
+ * We register this function into PCIe AER handlers, It will report device or
+ * qm hardware error status when error occur.
+ */
+pci_ers_result_t hisi_qm_dev_err_detected(struct pci_dev *pdev,
+ pci_channel_state_t state)
+{
+ struct hisi_qm *qm = pci_get_drvdata(pdev);
+ enum acc_err_result ret;
+
+ if (pdev->is_virtfn)
+ return PCI_ERS_RESULT_NONE;
+
+ pci_info(pdev, "PCI error detected, state(=%d)!!\n", state);
+ if (state == pci_channel_io_perm_failure)
+ return PCI_ERS_RESULT_DISCONNECT;
+
+ ret = qm_process_dev_error(qm);
+ if (ret == ACC_ERR_NEED_RESET)
+ return PCI_ERS_RESULT_NEED_RESET;
+
+ return PCI_ERS_RESULT_RECOVERED;
+}
+EXPORT_SYMBOL_GPL(hisi_qm_dev_err_detected);
+
+static int qm_get_hw_error_status(struct hisi_qm *qm)
+{
+ return readl(qm->io_base + QM_ABNORMAL_INT_STATUS);
+}
+
+static int qm_check_req_recv(struct hisi_qm *qm)
+{
+ struct pci_dev *pdev = qm->pdev;
+ int ret;
+ u32 val;
+
+ writel(ACC_VENDOR_ID_VALUE, qm->io_base + QM_PEH_VENDOR_ID);
+ ret = readl_relaxed_poll_timeout(qm->io_base + QM_PEH_VENDOR_ID, val,
+ (val == ACC_VENDOR_ID_VALUE),
+ POLL_PERIOD, POLL_TIMEOUT);
+ if (ret) {
+ dev_err(&pdev->dev, "Fails to read QM reg!\n");
+ return ret;
+ }
+
+ writel(PCI_VENDOR_ID_HUAWEI, qm->io_base + QM_PEH_VENDOR_ID);
+ ret = readl_relaxed_poll_timeout(qm->io_base + QM_PEH_VENDOR_ID, val,
+ (val == PCI_VENDOR_ID_HUAWEI),
+ POLL_PERIOD, POLL_TIMEOUT);
+ if (ret)
+ dev_err(&pdev->dev, "Fails to read QM reg in the second time!\n");
+
+ return ret;
+}
+
+static int qm_set_pf_mse(struct hisi_qm *qm, bool set)
+{
+ struct pci_dev *pdev = qm->pdev;
+ u16 cmd;
+ int i;
+
+ pci_read_config_word(pdev, PCI_COMMAND, &cmd);
+ if (set)
+ cmd |= PCI_COMMAND_MEMORY;
+ else
+ cmd &= ~PCI_COMMAND_MEMORY;
+
+ pci_write_config_word(pdev, PCI_COMMAND, cmd);
+ for (i = 0; i < MAX_WAIT_COUNTS; i++) {
+ pci_read_config_word(pdev, PCI_COMMAND, &cmd);
+ if (set == ((cmd & PCI_COMMAND_MEMORY) >> 1))
+ return 0;
+
+ udelay(1);
+ }
+
+ return -ETIMEDOUT;
+}
+
+static int qm_set_vf_mse(struct hisi_qm *qm, bool set)
+{
+ struct pci_dev *pdev = qm->pdev;
+ u16 sriov_ctrl;
+ int pos;
+ int i;
+
+ pos = pci_find_ext_capability(pdev, PCI_EXT_CAP_ID_SRIOV);
+ pci_read_config_word(pdev, pos + PCI_SRIOV_CTRL, &sriov_ctrl);
+ if (set)
+ sriov_ctrl |= PCI_SRIOV_CTRL_MSE;
+ else
+ sriov_ctrl &= ~PCI_SRIOV_CTRL_MSE;
+ pci_write_config_word(pdev, pos + PCI_SRIOV_CTRL, sriov_ctrl);
- if (!qm->use_dma_api) {
- dev_dbg(&qm->pdev->dev, "qm delay start\n");
- return 0;
- } else if (!qm->qdma.va) {
- qm->qdma.size = QMC_ALIGN(sizeof(struct qm_eqe) * QM_Q_DEPTH) +
- QMC_ALIGN(sizeof(struct qm_aeqe) * QM_Q_DEPTH) +
- QMC_ALIGN(sizeof(struct qm_sqc) * qm->qp_num) +
- QMC_ALIGN(sizeof(struct qm_cqc) * qm->qp_num);
- qm->qdma.va = dma_alloc_coherent(dev, qm->qdma.size,
- &qm->qdma.dma, GFP_KERNEL);
- dev_dbg(dev, "allocate qm dma buf(va=%pK, dma=%pad, size=%zx)\n",
- qm->qdma.va, &qm->qdma.dma, qm->qdma.size);
- if (!qm->qdma.va)
- return -ENOMEM;
+ for (i = 0; i < MAX_WAIT_COUNTS; i++) {
+ pci_read_config_word(pdev, pos + PCI_SRIOV_CTRL, &sriov_ctrl);
+ if (set == (sriov_ctrl & PCI_SRIOV_CTRL_MSE) >>
+ ACC_PEH_SRIOV_CTRL_VF_MSE_SHIFT)
+ return 0;
+
+ udelay(1);
}
- return __hisi_qm_start(qm);
+ return -ETIMEDOUT;
}
-EXPORT_SYMBOL_GPL(hisi_qm_start);
-/**
- * hisi_qm_stop() - Stop a qm.
- * @qm: The qm which will be stopped.
- *
- * This function stops qm and its qps, then qm can not accept request.
- * Related resources are not released at this state, we can use hisi_qm_start
- * to let qm start again.
- */
-int hisi_qm_stop(struct hisi_qm *qm)
+static int qm_set_msi(struct hisi_qm *qm, bool set)
{
- struct device *dev;
- struct hisi_qp *qp;
- int ret = 0, i;
+ struct pci_dev *pdev = qm->pdev;
- if (!qm || !qm->pdev) {
- WARN_ON(1);
- return -EINVAL;
+ if (set) {
+ pci_write_config_dword(pdev, pdev->msi_cap + PCI_MSI_MASK_64,
+ 0);
+ } else {
+ pci_write_config_dword(pdev, pdev->msi_cap + PCI_MSI_MASK_64,
+ ACC_PEH_MSI_DISABLE);
+ if (qm->err_status.is_qm_ecc_mbit ||
+ qm->err_status.is_dev_ecc_mbit)
+ return 0;
+
+ mdelay(1);
+ if (readl(qm->io_base + QM_PEH_DFX_INFO0))
+ return -EFAULT;
}
- dev = &qm->pdev->dev;
+ return 0;
+}
- /* Mask eq and aeq irq */
- writel(0x1, qm->io_base + QM_VF_EQ_INT_MASK);
- writel(0x1, qm->io_base + QM_VF_AEQ_INT_MASK);
+static int qm_vf_reset_prepare(struct hisi_qm *qm)
+{
+ struct hisi_qm_list *qm_list = qm->qm_list;
+ int stop_reason = qm->status.stop_reason;
+ struct pci_dev *pdev = qm->pdev;
+ struct pci_dev *virtfn;
+ struct hisi_qm *vf_qm;
+ int ret = 0;
- /* Stop all qps belong to this qm */
- for (i = 0; i < qm->qp_num; i++) {
- qp = qm->qp_array[i];
- if (qp) {
- ret = hisi_qm_stop_qp(qp);
- if (ret < 0) {
- dev_err(dev, "Failed to stop qp%d!\n", i);
- return -EBUSY;
- }
+ mutex_lock(&qm_list->lock);
+ list_for_each_entry(vf_qm, &qm_list->list, list) {
+ virtfn = vf_qm->pdev;
+ if (virtfn == pdev)
+ continue;
+
+ if (pci_physfn(virtfn) == pdev) {
+ vf_qm->status.stop_reason = stop_reason;
+ ret = hisi_qm_stop(vf_qm);
+ if (ret)
+ goto stop_fail;
}
}
- if (qm->fun_type == QM_HW_PF) {
- ret = hisi_qm_set_vft(qm, 0, 0, 0);
- if (ret < 0)
- dev_err(dev, "Failed to set vft!\n");
+stop_fail:
+ mutex_unlock(&qm_list->lock);
+ return ret;
+}
+
+static int qm_reset_prepare_ready(struct hisi_qm *qm)
+{
+ struct pci_dev *pdev = qm->pdev;
+ struct hisi_qm *pf_qm = pci_get_drvdata(pci_physfn(pdev));
+ int delay = 0;
+
+ /* All reset requests need to be queued for processing */
+ while (test_and_set_bit(QM_DEV_RESET_FLAG, &pf_qm->reset_flag)) {
+ msleep(++delay);
+ if (delay > QM_RESET_WAIT_TIMEOUT)
+ return -EBUSY;
}
- return ret;
+ return 0;
}
-EXPORT_SYMBOL_GPL(hisi_qm_stop);
-/**
- * hisi_qm_debug_init() - Initialize qm related debugfs files.
- * @qm: The qm for which we want to add debugfs files.
- *
- * Create qm related debugfs files.
- */
-int hisi_qm_debug_init(struct hisi_qm *qm)
+static int qm_controller_reset_prepare(struct hisi_qm *qm)
{
- struct dentry *qm_d;
- int i, ret;
+ struct pci_dev *pdev = qm->pdev;
+ int ret;
- qm_d = debugfs_create_dir("qm", qm->debug.debug_root);
- qm->debug.qm_d = qm_d;
+ ret = qm_reset_prepare_ready(qm);
+ if (ret) {
+ pci_err(pdev, "Controller reset not ready!\n");
+ return ret;
+ }
- /* only show this in PF */
- if (qm->fun_type == QM_HW_PF)
- for (i = CURRENT_Q; i < DEBUG_FILE_NUM; i++)
- if (qm_create_debugfs_file(qm, i)) {
- ret = -ENOENT;
- goto failed_to_create;
- }
+ if (qm->vfs_num) {
+ ret = qm_vf_reset_prepare(qm);
+ if (ret) {
+ pci_err(pdev, "Fails to stop VFs!\n");
+ return ret;
+ }
+ }
- debugfs_create_file("qm_regs", 0444, qm->debug.qm_d, qm, &qm_regs_fops);
+ qm->status.stop_reason = QM_SOFT_RESET;
+ ret = hisi_qm_stop(qm);
+ if (ret) {
+ pci_err(pdev, "Fails to stop QM!\n");
+ return ret;
+ }
return 0;
+}
-failed_to_create:
- debugfs_remove_recursive(qm_d);
- return ret;
+static void qm_dev_ecc_mbit_handle(struct hisi_qm *qm)
+{
+ u32 nfe_enb = 0;
+
+ if (!qm->err_status.is_dev_ecc_mbit &&
+ qm->err_status.is_qm_ecc_mbit &&
+ qm->err_ini->close_axi_master_ooo) {
+
+ qm->err_ini->close_axi_master_ooo(qm);
+
+ } else if (qm->err_status.is_dev_ecc_mbit &&
+ !qm->err_status.is_qm_ecc_mbit &&
+ !qm->err_ini->close_axi_master_ooo) {
+
+ nfe_enb = readl(qm->io_base + QM_RAS_NFE_ENABLE);
+ writel(nfe_enb & QM_RAS_NFE_MBIT_DISABLE,
+ qm->io_base + QM_RAS_NFE_ENABLE);
+ writel(QM_ECC_MBIT, qm->io_base + QM_ABNORMAL_INT_SET);
+ }
}
-EXPORT_SYMBOL_GPL(hisi_qm_debug_init);
-/**
- * hisi_qm_debug_regs_clear() - clear qm debug related registers.
- * @qm: The qm for which we want to clear its debug registers.
- */
-void hisi_qm_debug_regs_clear(struct hisi_qm *qm)
+static int qm_soft_reset(struct hisi_qm *qm)
{
- struct qm_dfx_registers *regs;
- int i;
+ struct pci_dev *pdev = qm->pdev;
+ int ret;
+ u32 val;
- /* clear current_q */
- writel(0x0, qm->io_base + QM_DFX_SQE_CNT_VF_SQN);
- writel(0x0, qm->io_base + QM_DFX_CQE_CNT_VF_CQN);
+ /* Ensure all doorbells and mailboxes received by QM */
+ ret = qm_check_req_recv(qm);
+ if (ret)
+ return ret;
- /*
- * these registers are reading and clearing, so clear them after
- * reading them.
- */
- writel(0x1, qm->io_base + QM_DFX_CNT_CLR_CE);
+ if (qm->vfs_num) {
+ ret = qm_set_vf_mse(qm, false);
+ if (ret) {
+ pci_err(pdev, "Fails to disable vf MSE bit.\n");
+ return ret;
+ }
+ }
- regs = qm_dfx_regs;
- for (i = 0; i < CNT_CYC_REGS_NUM; i++) {
- readl(qm->io_base + regs->reg_offset);
- regs++;
+ ret = qm_set_msi(qm, false);
+ if (ret) {
+ pci_err(pdev, "Fails to disable PEH MSI bit.\n");
+ return ret;
}
- writel(0x0, qm->io_base + QM_DFX_CNT_CLR_CE);
+ qm_dev_ecc_mbit_handle(qm);
+
+ /* OOO register set and check */
+ writel(ACC_MASTER_GLOBAL_CTRL_SHUTDOWN,
+ qm->io_base + ACC_MASTER_GLOBAL_CTRL);
+
+ /* If bus lock, reset chip */
+ ret = readl_relaxed_poll_timeout(qm->io_base + ACC_MASTER_TRANS_RETURN,
+ val,
+ (val == ACC_MASTER_TRANS_RETURN_RW),
+ POLL_PERIOD, POLL_TIMEOUT);
+ if (ret) {
+ pci_emerg(pdev, "Bus lock! Please reset system.\n");
+ return ret;
+ }
+
+ ret = qm_set_pf_mse(qm, false);
+ if (ret) {
+ pci_err(pdev, "Fails to disable pf MSE bit.\n");
+ return ret;
+ }
+
+ /* The reset related sub-control registers are not in PCI BAR */
+ if (ACPI_HANDLE(&pdev->dev)) {
+ unsigned long long value = 0;
+ acpi_status s;
+
+ s = acpi_evaluate_integer(ACPI_HANDLE(&pdev->dev),
+ qm->err_ini->err_info.acpi_rst,
+ NULL, &value);
+ if (ACPI_FAILURE(s)) {
+ pci_err(pdev, "NO controller reset method!\n");
+ return -EIO;
+ }
+
+ if (value) {
+ pci_err(pdev, "Reset step %llu failed!\n", value);
+ return -EIO;
+ }
+ } else {
+ pci_err(pdev, "No reset method!\n");
+ return -EINVAL;
+ }
+
+ return 0;
}
-EXPORT_SYMBOL_GPL(hisi_qm_debug_regs_clear);
-static void qm_hw_error_init(struct hisi_qm *qm)
+static int qm_vf_reset_done(struct hisi_qm *qm)
{
- const struct hisi_qm_err_info *err_info = &qm->err_ini->err_info;
+ struct hisi_qm_list *qm_list = qm->qm_list;
+ struct pci_dev *pdev = qm->pdev;
+ struct pci_dev *virtfn;
+ struct hisi_qm *vf_qm;
+ int ret = 0;
- if (!qm->ops->hw_error_init) {
- dev_err(&qm->pdev->dev, "QM doesn't support hw error handling!\n");
- return;
+ mutex_lock(&qm_list->lock);
+ list_for_each_entry(vf_qm, &qm_list->list, list) {
+ virtfn = vf_qm->pdev;
+ if (virtfn == pdev)
+ continue;
+
+ if (pci_physfn(virtfn) == pdev) {
+ ret = qm_restart(vf_qm);
+ if (ret)
+ goto restart_fail;
+ }
}
- qm->ops->hw_error_init(qm, err_info->ce, err_info->nfe,
- err_info->fe, err_info->msi);
+restart_fail:
+ mutex_unlock(&qm_list->lock);
+ return ret;
}
-static void qm_hw_error_uninit(struct hisi_qm *qm)
+static int qm_get_dev_err_status(struct hisi_qm *qm)
{
- if (!qm->ops->hw_error_uninit) {
- dev_err(&qm->pdev->dev, "Unexpected QM hw error uninit!\n");
+ return qm->err_ini->get_dev_hw_err_status(qm);
+}
+
+static int qm_dev_hw_init(struct hisi_qm *qm)
+{
+ return qm->err_ini->hw_init(qm);
+}
+
+static void qm_restart_prepare(struct hisi_qm *qm)
+{
+ u32 value;
+
+ if (!qm->err_status.is_qm_ecc_mbit &&
+ !qm->err_status.is_dev_ecc_mbit)
return;
- }
- qm->ops->hw_error_uninit(qm);
+ /* temporarily close the OOO port used for PEH to write out MSI */
+ value = readl(qm->io_base + ACC_AM_CFG_PORT_WR_EN);
+ writel(value & ~qm->err_ini->err_info.msi_wr_port,
+ qm->io_base + ACC_AM_CFG_PORT_WR_EN);
+
+ /* clear dev ecc 2bit error source if having */
+ value = qm_get_dev_err_status(qm) &
+ qm->err_ini->err_info.ecc_2bits_mask;
+ if (value && qm->err_ini->clear_dev_hw_err_status)
+ qm->err_ini->clear_dev_hw_err_status(qm, value);
+
+ /* clear QM ecc mbit error source */
+ writel(QM_ECC_MBIT, qm->io_base + QM_ABNORMAL_INT_SOURCE);
+
+ /* clear AM Reorder Buffer ecc mbit source */
+ writel(ACC_ROB_ECC_ERR_MULTPL, qm->io_base + ACC_AM_ROB_ECC_INT_STS);
+
+ if (qm->err_ini->open_axi_master_ooo)
+ qm->err_ini->open_axi_master_ooo(qm);
}
-static pci_ers_result_t qm_hw_error_handle(struct hisi_qm *qm)
+static void qm_restart_done(struct hisi_qm *qm)
{
- if (!qm->ops->hw_error_handle) {
- dev_err(&qm->pdev->dev, "QM doesn't support hw error report!\n");
- return PCI_ERS_RESULT_NONE;
+ u32 value;
+
+ if (!qm->err_status.is_qm_ecc_mbit &&
+ !qm->err_status.is_dev_ecc_mbit)
+ return;
+
+ /* open the OOO port for PEH to write out MSI */
+ value = readl(qm->io_base + ACC_AM_CFG_PORT_WR_EN);
+ value |= qm->err_ini->err_info.msi_wr_port;
+ writel(value, qm->io_base + ACC_AM_CFG_PORT_WR_EN);
+
+ qm->err_status.is_qm_ecc_mbit = false;
+ qm->err_status.is_dev_ecc_mbit = false;
+}
+
+static int qm_controller_reset_done(struct hisi_qm *qm)
+{
+ struct pci_dev *pdev = qm->pdev;
+ int ret;
+
+ ret = qm_set_msi(qm, true);
+ if (ret) {
+ pci_err(pdev, "Fails to enable PEH MSI bit!\n");
+ return ret;
}
- return qm->ops->hw_error_handle(qm);
+ ret = qm_set_pf_mse(qm, true);
+ if (ret) {
+ pci_err(pdev, "Fails to enable pf MSE bit!\n");
+ return ret;
+ }
+
+ if (qm->vfs_num) {
+ ret = qm_set_vf_mse(qm, true);
+ if (ret) {
+ pci_err(pdev, "Fails to enable vf MSE bit!\n");
+ return ret;
+ }
+ }
+
+ ret = qm_dev_hw_init(qm);
+ if (ret) {
+ pci_err(pdev, "Failed to init device\n");
+ return ret;
+ }
+
+ qm_restart_prepare(qm);
+
+ ret = qm_restart(qm);
+ if (ret) {
+ pci_err(pdev, "Failed to start QM!\n");
+ return ret;
+ }
+
+ if (qm->vfs_num) {
+ ret = qm_vf_q_assign(qm, qm->vfs_num);
+ if (ret) {
+ pci_err(pdev, "Failed to assign queue!\n");
+ return ret;
+ }
+ }
+
+ ret = qm_vf_reset_done(qm);
+ if (ret) {
+ pci_err(pdev, "Failed to start VFs!\n");
+ return -EPERM;
+ }
+
+ hisi_qm_dev_err_init(qm);
+ qm_restart_done(qm);
+
+ clear_bit(QM_DEV_RESET_FLAG, &qm->reset_flag);
+
+ return 0;
}
-/**
- * hisi_qm_get_hw_version() - Get hardware version of a qm.
- * @pdev: The device which hardware version we want to get.
- *
- * This function gets the hardware version of a qm. Return QM_HW_UNKNOWN
- * if the hardware version is not supported.
- */
-enum qm_hw_ver hisi_qm_get_hw_version(struct pci_dev *pdev)
+static int qm_controller_reset(struct hisi_qm *qm)
{
- switch (pdev->revision) {
- case QM_HW_V1:
- case QM_HW_V2:
- return pdev->revision;
- default:
- return QM_HW_UNKNOWN;
+ struct pci_dev *pdev = qm->pdev;
+ int ret;
+
+ pci_info(pdev, "Controller resetting...\n");
+
+ ret = qm_controller_reset_prepare(qm);
+ if (ret)
+ return ret;
+
+ ret = qm_soft_reset(qm);
+ if (ret) {
+ pci_err(pdev, "Controller reset failed (%d)\n", ret);
+ return ret;
}
+
+ ret = qm_controller_reset_done(qm);
+ if (ret)
+ return ret;
+
+ pci_info(pdev, "Controller reset complete\n");
+
+ return 0;
}
-EXPORT_SYMBOL_GPL(hisi_qm_get_hw_version);
/**
- * hisi_qm_dev_err_init() - Initialize device error configuration.
- * @qm: The qm for which we want to do error initialization.
+ * hisi_qm_dev_slot_reset() - slot reset
+ * @pdev: the PCIe device
*
- * Initialize QM and device error related configuration.
+ * This function offers QM relate PCIe device reset interface. Drivers which
+ * use QM can use this function as slot_reset in its struct pci_error_handlers.
*/
-void hisi_qm_dev_err_init(struct hisi_qm *qm)
+pci_ers_result_t hisi_qm_dev_slot_reset(struct pci_dev *pdev)
{
- if (qm->fun_type == QM_HW_VF)
- return;
+ struct hisi_qm *qm = pci_get_drvdata(pdev);
+ int ret;
- qm_hw_error_init(qm);
+ if (pdev->is_virtfn)
+ return PCI_ERS_RESULT_RECOVERED;
- if (!qm->err_ini->hw_err_enable) {
- dev_err(&qm->pdev->dev, "Device doesn't support hw error init!\n");
- return;
+ pci_aer_clear_nonfatal_status(pdev);
+
+ /* reset pcie device controller */
+ ret = qm_controller_reset(qm);
+ if (ret) {
+ pci_err(pdev, "Controller reset failed (%d)\n", ret);
+ return PCI_ERS_RESULT_DISCONNECT;
}
- qm->err_ini->hw_err_enable(qm);
+
+ return PCI_ERS_RESULT_RECOVERED;
}
-EXPORT_SYMBOL_GPL(hisi_qm_dev_err_init);
+EXPORT_SYMBOL_GPL(hisi_qm_dev_slot_reset);
-/**
- * hisi_qm_dev_err_uninit() - Uninitialize device error configuration.
- * @qm: The qm for which we want to do error uninitialization.
- *
- * Uninitialize QM and device error related configuration.
- */
-void hisi_qm_dev_err_uninit(struct hisi_qm *qm)
+/* check the interrupt is ecc-mbit error or not */
+static int qm_check_dev_error(struct hisi_qm *qm)
{
+ int ret;
+
if (qm->fun_type == QM_HW_VF)
- return;
+ return 0;
+
+ ret = qm_get_hw_error_status(qm) & QM_ECC_MBIT;
+ if (ret)
+ return ret;
+
+ return (qm_get_dev_err_status(qm) &
+ qm->err_ini->err_info.ecc_2bits_mask);
+}
+
+void hisi_qm_reset_prepare(struct pci_dev *pdev)
+{
+ struct hisi_qm *pf_qm = pci_get_drvdata(pci_physfn(pdev));
+ struct hisi_qm *qm = pci_get_drvdata(pdev);
+ u32 delay = 0;
+ int ret;
- qm_hw_error_uninit(qm);
+ hisi_qm_dev_err_uninit(pf_qm);
- if (!qm->err_ini->hw_err_disable) {
- dev_err(&qm->pdev->dev, "Unexpected device hw error uninit!\n");
+ /*
+ * Check whether there is an ECC mbit error, If it occurs, need to
+ * wait for soft reset to fix it.
+ */
+ while (qm_check_dev_error(pf_qm)) {
+ msleep(++delay);
+ if (delay > QM_RESET_WAIT_TIMEOUT)
+ return;
+ }
+
+ ret = qm_reset_prepare_ready(qm);
+ if (ret) {
+ pci_err(pdev, "FLR not ready!\n");
return;
}
- qm->err_ini->hw_err_disable(qm);
-}
-EXPORT_SYMBOL_GPL(hisi_qm_dev_err_uninit);
-/**
- * hisi_qm_free_qps() - free multiple queue pairs.
- * @qps: The queue pairs need to be freed.
- * @qp_num: The num of queue pairs.
- */
-void hisi_qm_free_qps(struct hisi_qp **qps, int qp_num)
-{
- int i;
+ if (qm->vfs_num) {
+ ret = qm_vf_reset_prepare(qm);
+ if (ret) {
+ pci_err(pdev, "Failed to prepare reset, ret = %d.\n",
+ ret);
+ return;
+ }
+ }
- if (!qps || qp_num <= 0)
+ ret = hisi_qm_stop(qm);
+ if (ret) {
+ pci_err(pdev, "Failed to stop QM, ret = %d.\n", ret);
return;
+ }
- for (i = qp_num - 1; i >= 0; i--)
- hisi_qm_release_qp(qps[i]);
+ pci_info(pdev, "FLR resetting...\n");
}
-EXPORT_SYMBOL_GPL(hisi_qm_free_qps);
+EXPORT_SYMBOL_GPL(hisi_qm_reset_prepare);
-static void free_list(struct list_head *head)
+static bool qm_flr_reset_complete(struct pci_dev *pdev)
{
- struct hisi_qm_resource *res, *tmp;
+ struct pci_dev *pf_pdev = pci_physfn(pdev);
+ struct hisi_qm *qm = pci_get_drvdata(pf_pdev);
+ u32 id;
- list_for_each_entry_safe(res, tmp, head, list) {
- list_del(&res->list);
- kfree(res);
+ pci_read_config_dword(qm->pdev, PCI_COMMAND, &id);
+ if (id == QM_PCI_COMMAND_INVALID) {
+ pci_err(pdev, "Device can not be used!\n");
+ return false;
}
+
+ clear_bit(QM_DEV_RESET_FLAG, &qm->reset_flag);
+
+ return true;
}
-static int hisi_qm_sort_devices(int node, struct list_head *head,
- struct hisi_qm_list *qm_list)
+void hisi_qm_reset_done(struct pci_dev *pdev)
{
- struct hisi_qm_resource *res, *tmp;
- struct hisi_qm *qm;
- struct list_head *n;
- struct device *dev;
- int dev_node = 0;
+ struct hisi_qm *pf_qm = pci_get_drvdata(pci_physfn(pdev));
+ struct hisi_qm *qm = pci_get_drvdata(pdev);
+ int ret;
- list_for_each_entry(qm, &qm_list->list, list) {
- dev = &qm->pdev->dev;
+ hisi_qm_dev_err_init(pf_qm);
- if (IS_ENABLED(CONFIG_NUMA)) {
- dev_node = dev_to_node(dev);
- if (dev_node < 0)
- dev_node = 0;
+ ret = qm_restart(qm);
+ if (ret) {
+ pci_err(pdev, "Failed to start QM, ret = %d.\n", ret);
+ goto flr_done;
+ }
+
+ if (qm->fun_type == QM_HW_PF) {
+ ret = qm_dev_hw_init(qm);
+ if (ret) {
+ pci_err(pdev, "Failed to init PF, ret = %d.\n", ret);
+ goto flr_done;
}
- res = kzalloc(sizeof(*res), GFP_KERNEL);
- if (!res)
- return -ENOMEM;
+ if (!qm->vfs_num)
+ goto flr_done;
- res->qm = qm;
- res->distance = node_distance(dev_node, node);
- n = head;
- list_for_each_entry(tmp, head, list) {
- if (res->distance < tmp->distance) {
- n = &tmp->list;
- break;
- }
+ ret = qm_vf_q_assign(qm, qm->vfs_num);
+ if (ret) {
+ pci_err(pdev, "Failed to assign VFs, ret = %d.\n", ret);
+ goto flr_done;
+ }
+
+ ret = qm_vf_reset_done(qm);
+ if (ret) {
+ pci_err(pdev, "Failed to start VFs, ret = %d.\n", ret);
+ goto flr_done;
}
- list_add_tail(&res->list, n);
}
- return 0;
+flr_done:
+ if (qm_flr_reset_complete(pdev))
+ pci_info(pdev, "FLR reset complete\n");
}
+EXPORT_SYMBOL_GPL(hisi_qm_reset_done);
-/**
- * hisi_qm_alloc_qps_node() - Create multiple queue pairs.
- * @qm_list: The list of all available devices.
- * @qp_num: The number of queue pairs need created.
- * @alg_type: The algorithm type.
- * @node: The numa node.
- * @qps: The queue pairs need created.
- *
- * This function will sort all available device according to numa distance.
- * Then try to create all queue pairs from one device, if all devices do
- * not meet the requirements will return error.
- */
-int hisi_qm_alloc_qps_node(struct hisi_qm_list *qm_list, int qp_num,
- u8 alg_type, int node, struct hisi_qp **qps)
+static irqreturn_t qm_abnormal_irq(int irq, void *data)
{
- struct hisi_qm_resource *tmp;
- int ret = -ENODEV;
- LIST_HEAD(head);
- int i;
+ struct hisi_qm *qm = data;
+ enum acc_err_result ret;
- if (!qps || !qm_list || qp_num <= 0)
- return -EINVAL;
+ atomic64_inc(&qm->debug.dfx.abnormal_irq_cnt);
+ ret = qm_process_dev_error(qm);
+ if (ret == ACC_ERR_NEED_RESET)
+ schedule_work(&qm->rst_work);
- mutex_lock(&qm_list->lock);
- if (hisi_qm_sort_devices(node, &head, qm_list)) {
- mutex_unlock(&qm_list->lock);
- goto err;
- }
+ return IRQ_HANDLED;
+}
- list_for_each_entry(tmp, &head, list) {
- for (i = 0; i < qp_num; i++) {
- qps[i] = hisi_qm_create_qp(tmp->qm, alg_type);
- if (IS_ERR(qps[i])) {
- hisi_qm_free_qps(qps, i);
- break;
- }
- }
+static int qm_irq_register(struct hisi_qm *qm)
+{
+ struct pci_dev *pdev = qm->pdev;
+ int ret;
- if (i == qp_num) {
- ret = 0;
- break;
+ ret = request_irq(pci_irq_vector(pdev, QM_EQ_EVENT_IRQ_VECTOR),
+ qm_irq, IRQF_SHARED, qm->dev_name, qm);
+ if (ret)
+ return ret;
+
+ if (qm->ver != QM_HW_V1) {
+ ret = request_irq(pci_irq_vector(pdev, QM_AEQ_EVENT_IRQ_VECTOR),
+ qm_aeq_irq, IRQF_SHARED, qm->dev_name, qm);
+ if (ret)
+ goto err_aeq_irq;
+
+ if (qm->fun_type == QM_HW_PF) {
+ ret = request_irq(pci_irq_vector(pdev,
+ QM_ABNORMAL_EVENT_IRQ_VECTOR),
+ qm_abnormal_irq, IRQF_SHARED,
+ qm->dev_name, qm);
+ if (ret)
+ goto err_abonormal_irq;
}
}
- mutex_unlock(&qm_list->lock);
- if (ret)
- pr_info("Failed to create qps, node[%d], alg[%d], qp[%d]!\n",
- node, alg_type, qp_num);
+ return 0;
-err:
- free_list(&head);
+err_abonormal_irq:
+ free_irq(pci_irq_vector(pdev, QM_AEQ_EVENT_IRQ_VECTOR), qm);
+err_aeq_irq:
+ free_irq(pci_irq_vector(pdev, QM_EQ_EVENT_IRQ_VECTOR), qm);
return ret;
}
-EXPORT_SYMBOL_GPL(hisi_qm_alloc_qps_node);
-static pci_ers_result_t qm_dev_err_handle(struct hisi_qm *qm)
+static void hisi_qm_controller_reset(struct work_struct *rst_work)
{
- u32 err_sts;
+ struct hisi_qm *qm = container_of(rst_work, struct hisi_qm, rst_work);
+ int ret;
- if (!qm->err_ini->get_dev_hw_err_status) {
- dev_err(&qm->pdev->dev, "Device doesn't support get hw error status!\n");
- return PCI_ERS_RESULT_NONE;
+ /* reset pcie device controller */
+ ret = qm_controller_reset(qm);
+ if (ret)
+ dev_err(&qm->pdev->dev, "controller reset failed (%d)\n", ret);
+
+}
+
+/**
+ * hisi_qm_init() - Initialize configures about qm.
+ * @qm: The qm needing init.
+ *
+ * This function init qm, then we can call hisi_qm_start to put qm into work.
+ */
+int hisi_qm_init(struct hisi_qm *qm)
+{
+ struct pci_dev *pdev = qm->pdev;
+ struct device *dev = &pdev->dev;
+ unsigned int num_vec;
+ int ret;
+
+ hisi_qm_pre_init(qm);
+
+ ret = qm_alloc_uacce(qm);
+ if (ret < 0)
+ dev_warn(&pdev->dev, "fail to alloc uacce (%d)\n", ret);
+
+ ret = pci_enable_device_mem(pdev);
+ if (ret < 0) {
+ dev_err(&pdev->dev, "Failed to enable device mem!\n");
+ goto err_remove_uacce;
}
- /* get device hardware error status */
- err_sts = qm->err_ini->get_dev_hw_err_status(qm);
- if (err_sts) {
- if (!qm->err_ini->log_dev_hw_err) {
- dev_err(&qm->pdev->dev, "Device doesn't support log hw error!\n");
- return PCI_ERS_RESULT_NEED_RESET;
- }
+ ret = pci_request_mem_regions(pdev, qm->dev_name);
+ if (ret < 0) {
+ dev_err(&pdev->dev, "Failed to request mem regions!\n");
+ goto err_disable_pcidev;
+ }
- qm->err_ini->log_dev_hw_err(qm, err_sts);
- return PCI_ERS_RESULT_NEED_RESET;
+ qm->phys_base = pci_resource_start(pdev, PCI_BAR_2);
+ qm->phys_size = pci_resource_len(qm->pdev, PCI_BAR_2);
+ qm->io_base = ioremap(qm->phys_base, qm->phys_size);
+ if (!qm->io_base) {
+ ret = -EIO;
+ goto err_release_mem_regions;
}
- return PCI_ERS_RESULT_RECOVERED;
-}
+ ret = dma_set_mask_and_coherent(dev, DMA_BIT_MASK(64));
+ if (ret < 0)
+ goto err_iounmap;
+ pci_set_master(pdev);
-static pci_ers_result_t qm_process_dev_error(struct pci_dev *pdev)
-{
- struct hisi_qm *qm = pci_get_drvdata(pdev);
- pci_ers_result_t qm_ret, dev_ret;
+ if (!qm->ops->get_irq_num) {
+ ret = -EOPNOTSUPP;
+ goto err_iounmap;
+ }
+ num_vec = qm->ops->get_irq_num(qm);
+ ret = pci_alloc_irq_vectors(pdev, num_vec, num_vec, PCI_IRQ_MSI);
+ if (ret < 0) {
+ dev_err(dev, "Failed to enable MSI vectors!\n");
+ goto err_iounmap;
+ }
- /* log qm error */
- qm_ret = qm_hw_error_handle(qm);
+ ret = qm_irq_register(qm);
+ if (ret)
+ goto err_free_irq_vectors;
- /* log device error */
- dev_ret = qm_dev_err_handle(qm);
+ if (qm->fun_type == QM_HW_VF && qm->ver != QM_HW_V1) {
+ /* v2 starts to support get vft by mailbox */
+ ret = hisi_qm_get_vft(qm, &qm->qp_base, &qm->qp_num);
+ if (ret)
+ goto err_irq_unregister;
+ }
- return (qm_ret == PCI_ERS_RESULT_NEED_RESET ||
- dev_ret == PCI_ERS_RESULT_NEED_RESET) ?
- PCI_ERS_RESULT_NEED_RESET : PCI_ERS_RESULT_RECOVERED;
-}
+ ret = hisi_qm_memory_init(qm);
+ if (ret)
+ goto err_irq_unregister;
-/**
- * hisi_qm_dev_err_detected() - Get device and qm error status then log it.
- * @pdev: The PCI device which need report error.
- * @state: The connectivity between CPU and device.
- *
- * We register this function into PCIe AER handlers, It will report device or
- * qm hardware error status when error occur.
- */
-pci_ers_result_t hisi_qm_dev_err_detected(struct pci_dev *pdev,
- pci_channel_state_t state)
-{
- if (pdev->is_virtfn)
- return PCI_ERS_RESULT_NONE;
+ INIT_WORK(&qm->work, qm_work_process);
+ if (qm->fun_type == QM_HW_PF)
+ INIT_WORK(&qm->rst_work, hisi_qm_controller_reset);
- pci_info(pdev, "PCI error detected, state(=%d)!!\n", state);
- if (state == pci_channel_io_perm_failure)
- return PCI_ERS_RESULT_DISCONNECT;
+ atomic_set(&qm->status.flags, QM_INIT);
+
+ return 0;
- return qm_process_dev_error(pdev);
+err_irq_unregister:
+ qm_irq_unregister(qm);
+err_free_irq_vectors:
+ pci_free_irq_vectors(pdev);
+err_iounmap:
+ iounmap(qm->io_base);
+err_release_mem_regions:
+ pci_release_mem_regions(pdev);
+err_disable_pcidev:
+ pci_disable_device(pdev);
+err_remove_uacce:
+ uacce_remove(qm->uacce);
+ qm->uacce = NULL;
+ return ret;
}
-EXPORT_SYMBOL_GPL(hisi_qm_dev_err_detected);
+EXPORT_SYMBOL_GPL(hisi_qm_init);
+
MODULE_LICENSE("GPL v2");
MODULE_AUTHOR("Zhou Wang <wangzhou1@hisilicon.com>");
#include <linux/module.h>
#include <linux/pci.h>
+#define QM_QNUM_V1 4096
+#define QM_QNUM_V2 1024
+#define QM_MAX_VFS_NUM_V2 63
+
/* qm user domain */
#define QM_ARUSER_M_CFG_1 0x100088
#define AXUSER_SNOOP_ENABLE BIT(30)
#define QM_BASE_NFE (QM_AXI_RRESP | QM_AXI_BRESP | QM_ECC_MBIT | \
QM_ACC_GET_TASK_TIMEOUT | QM_DB_TIMEOUT | \
- QM_OF_FIFO_OF)
+ QM_OF_FIFO_OF | QM_DB_RANDOM_INVALID)
#define QM_BASE_CE QM_ECC_1BIT
#define QM_Q_DEPTH 1024
/* page number for queue file region */
#define QM_DOORBELL_PAGE_NR 1
+enum qm_stop_reason {
+ QM_NORMAL,
+ QM_SOFT_RESET,
+ QM_FLR,
+};
+
+enum qm_state {
+ QM_INIT = 0,
+ QM_START,
+ QM_CLOSE,
+ QM_STOP,
+};
+
enum qp_state {
+ QP_INIT = 1,
+ QP_START,
QP_STOP,
+ QP_CLOSE,
};
enum qm_hw_ver {
QM_HW_UNKNOWN = -1,
QM_HW_V1 = 0x20,
QM_HW_V2 = 0x21,
+ QM_HW_V3 = 0x30,
};
enum qm_fun_type {
DEBUG_FILE_NUM,
};
+struct qm_dfx {
+ atomic64_t err_irq_cnt;
+ atomic64_t aeq_irq_cnt;
+ atomic64_t abnormal_irq_cnt;
+ atomic64_t create_qp_err_cnt;
+ atomic64_t mb_err_cnt;
+};
+
struct debugfs_file {
enum qm_debug_file index;
struct mutex lock;
struct qm_debug {
u32 curr_qm_qp_num;
+ u32 sqe_mask_offset;
+ u32 sqe_mask_len;
+ struct qm_dfx dfx;
struct dentry *debug_root;
struct dentry *qm_d;
struct debugfs_file files[DEBUG_FILE_NUM];
bool eqc_phase;
u32 aeq_head;
bool aeqc_phase;
- unsigned long flags;
+ atomic_t flags;
+ int stop_reason;
};
struct hisi_qm;
struct hisi_qm_err_info {
+ char *acpi_rst;
+ u32 msi_wr_port;
+ u32 ecc_2bits_mask;
u32 ce;
u32 nfe;
u32 fe;
- u32 msi;
+};
+
+struct hisi_qm_err_status {
+ u32 is_qm_ecc_mbit;
+ u32 is_dev_ecc_mbit;
};
struct hisi_qm_err_ini {
+ int (*hw_init)(struct hisi_qm *qm);
void (*hw_err_enable)(struct hisi_qm *qm);
void (*hw_err_disable)(struct hisi_qm *qm);
u32 (*get_dev_hw_err_status)(struct hisi_qm *qm);
+ void (*clear_dev_hw_err_status)(struct hisi_qm *qm, u32 err_sts);
+ void (*open_axi_master_ooo)(struct hisi_qm *qm);
+ void (*close_axi_master_ooo)(struct hisi_qm *qm);
void (*log_dev_hw_err)(struct hisi_qm *qm, u32 err_sts);
struct hisi_qm_err_info err_info;
};
u32 qp_num;
u32 qp_in_used;
u32 ctrl_qp_num;
+ u32 vfs_num;
struct list_head list;
+ struct hisi_qm_list *qm_list;
struct qm_dma qdma;
struct qm_sqc *sqc;
struct hisi_qm_status status;
const struct hisi_qm_err_ini *err_ini;
+ struct hisi_qm_err_status err_status;
+ unsigned long reset_flag;
- rwlock_t qps_lock;
- unsigned long *qp_bitmap;
- struct hisi_qp **qp_array;
+ struct rw_semaphore qps_lock;
+ struct idr qp_idr;
+ struct hisi_qp *qp_array;
struct mutex mailbox_lock;
struct qm_debug debug;
u32 error_mask;
- u32 msi_mask;
struct workqueue_struct *wq;
struct work_struct work;
+ struct work_struct rst_work;
const char *algs;
- bool use_dma_api;
bool use_sva;
resource_size_t phys_base;
resource_size_t phys_size;
u16 sq_tail;
u16 cq_head;
bool cqc_phase;
- unsigned long flags;
+ atomic_t flags;
};
struct hisi_qp_ops {
void (*event_cb)(struct hisi_qp *qp);
struct hisi_qm *qm;
+ bool is_resetting;
u16 pasid;
struct uacce_queue *uacce_q;
};
+static inline int q_num_set(const char *val, const struct kernel_param *kp,
+ unsigned int device)
+{
+ struct pci_dev *pdev = pci_get_device(PCI_VENDOR_ID_HUAWEI,
+ device, NULL);
+ u32 n, q_num;
+ int ret;
+
+ if (!val)
+ return -EINVAL;
+
+ if (!pdev) {
+ q_num = min_t(u32, QM_QNUM_V1, QM_QNUM_V2);
+ pr_info("No device found currently, suppose queue number is %d\n",
+ q_num);
+ } else {
+ if (pdev->revision == QM_HW_V1)
+ q_num = QM_QNUM_V1;
+ else
+ q_num = QM_QNUM_V2;
+ }
+
+ ret = kstrtou32(val, 10, &n);
+ if (ret || !n || n > q_num)
+ return -EINVAL;
+
+ return param_set_int(val, kp);
+}
+
+static inline int vfs_num_set(const char *val, const struct kernel_param *kp)
+{
+ u32 n;
+ int ret;
+
+ if (!val)
+ return -EINVAL;
+
+ ret = kstrtou32(val, 10, &n);
+ if (ret < 0)
+ return ret;
+
+ if (n > QM_MAX_VFS_NUM_V2)
+ return -EINVAL;
+
+ return param_set_int(val, kp);
+}
+
static inline void hisi_qm_init_list(struct hisi_qm_list *qm_list)
{
INIT_LIST_HEAD(&qm_list->list);
int hisi_qp_send(struct hisi_qp *qp, const void *msg);
int hisi_qm_get_free_qp_num(struct hisi_qm *qm);
int hisi_qm_get_vft(struct hisi_qm *qm, u32 *base, u32 *number);
-int hisi_qm_set_vft(struct hisi_qm *qm, u32 fun_num, u32 base, u32 number);
int hisi_qm_debug_init(struct hisi_qm *qm);
enum qm_hw_ver hisi_qm_get_hw_version(struct pci_dev *pdev);
void hisi_qm_debug_regs_clear(struct hisi_qm *qm);
+int hisi_qm_sriov_enable(struct pci_dev *pdev, int max_vfs);
+int hisi_qm_sriov_disable(struct pci_dev *pdev);
+int hisi_qm_sriov_configure(struct pci_dev *pdev, int num_vfs);
void hisi_qm_dev_err_init(struct hisi_qm *qm);
void hisi_qm_dev_err_uninit(struct hisi_qm *qm);
pci_ers_result_t hisi_qm_dev_err_detected(struct pci_dev *pdev,
pci_channel_state_t state);
+pci_ers_result_t hisi_qm_dev_slot_reset(struct pci_dev *pdev);
+void hisi_qm_reset_prepare(struct pci_dev *pdev);
+void hisi_qm_reset_done(struct pci_dev *pdev);
struct hisi_acc_sgl_pool;
struct hisi_acc_hw_sgl *hisi_acc_sg_buf_map_to_hw_sgl(struct device *dev,
struct sec_dfx {
atomic64_t send_cnt;
atomic64_t recv_cnt;
+ atomic64_t send_busy_cnt;
+ atomic64_t err_bd_cnt;
+ atomic64_t invalid_req_cnt;
+ atomic64_t done_flag_cnt;
};
struct sec_debug {
struct sec_debug debug;
u32 ctx_q_num;
bool iommu_used;
- u32 num_vfs;
unsigned long status;
};
static void sec_req_cb(struct hisi_qp *qp, void *resp)
{
struct sec_qp_ctx *qp_ctx = qp->qp_ctx;
+ struct sec_dfx *dfx = &qp_ctx->ctx->sec->debug.dfx;
struct sec_sqe *bd = resp;
struct sec_ctx *ctx;
struct sec_req *req;
type = bd->type_cipher_auth & SEC_TYPE_MASK;
if (unlikely(type != SEC_BD_TYPE2)) {
+ atomic64_inc(&dfx->err_bd_cnt);
pr_err("err bd type [%d]\n", type);
return;
}
req = qp_ctx->req_list[le16_to_cpu(bd->type2.tag)];
+ if (unlikely(!req)) {
+ atomic64_inc(&dfx->invalid_req_cnt);
+ return;
+ }
req->err_type = bd->type2.error_type;
ctx = req->ctx;
done = le16_to_cpu(bd->type2.done_flag) & SEC_DONE_MASK;
"err_type[%d],done[%d],flag[%d]\n",
req->err_type, done, flag);
err = -EIO;
+ atomic64_inc(&dfx->done_flag_cnt);
}
if (ctx->alg_type == SEC_AEAD && !req->c_req.encrypt)
err = sec_aead_verify(req);
- atomic64_inc(&ctx->sec->debug.dfx.recv_cnt);
+ atomic64_inc(&dfx->recv_cnt);
ctx->req_op->buf_unmap(ctx, req);
return -ENOBUFS;
if (!ret) {
- if (req->fake_busy)
+ if (req->fake_busy) {
+ atomic64_inc(&ctx->sec->debug.dfx.send_busy_cnt);
ret = -EBUSY;
- else
+ } else {
ret = -EINPROGRESS;
+ }
}
return ret;
struct crypto_authenc_keys *keys)
{
struct crypto_shash *hash_tfm = ctx->hash_tfm;
- SHASH_DESC_ON_STACK(shash, hash_tfm);
int blocksize, ret;
if (!keys->authkeylen) {
blocksize = crypto_shash_blocksize(hash_tfm);
if (keys->authkeylen > blocksize) {
- ret = crypto_shash_digest(shash, keys->authkey,
- keys->authkeylen, ctx->a_key);
+ ret = crypto_shash_tfm_digest(hash_tfm, keys->authkey,
+ keys->authkeylen, ctx->a_key);
if (ret) {
pr_err("hisi_sec2: aead auth digest error!\n");
return -EINVAL;
#define SEC_VF_CNT_MASK 0xffffffc0
#define SEC_DBGFS_VAL_MAX_LEN 20
+#define SEC_SQE_MASK_OFFSET 64
+#define SEC_SQE_MASK_LEN 48
+
#define SEC_ADDR(qm, offset) ((qm)->io_base + (offset) + \
SEC_ENGINE_PF_CFG_OFF + SEC_ACC_COMMON_REG_OFF)
const char *msg;
};
+struct sec_dfx_item {
+ const char *name;
+ u32 offset;
+};
+
static const char sec_name[] = "hisi_sec2";
static struct dentry *sec_debugfs_root;
static struct hisi_qm_list sec_devices;
[SEC_CLEAR_ENABLE] = "clear_enable",
};
-static struct debugfs_reg32 sec_dfx_regs[] = {
+static struct sec_dfx_item sec_dfx_labels[] = {
+ {"send_cnt", offsetof(struct sec_dfx, send_cnt)},
+ {"recv_cnt", offsetof(struct sec_dfx, recv_cnt)},
+ {"send_busy_cnt", offsetof(struct sec_dfx, send_busy_cnt)},
+ {"err_bd_cnt", offsetof(struct sec_dfx, err_bd_cnt)},
+ {"invalid_req_cnt", offsetof(struct sec_dfx, invalid_req_cnt)},
+ {"done_flag_cnt", offsetof(struct sec_dfx, done_flag_cnt)},
+};
+
+static const struct debugfs_reg32 sec_dfx_regs[] = {
{"SEC_PF_ABNORMAL_INT_SOURCE ", 0x301010},
{"SEC_SAA_EN ", 0x301270},
{"SEC_BD_LATENCY_MIN ", 0x301600},
static int sec_pf_q_num_set(const char *val, const struct kernel_param *kp)
{
- struct pci_dev *pdev;
- u32 n, q_num;
- u8 rev_id;
- int ret;
-
- if (!val)
- return -EINVAL;
-
- pdev = pci_get_device(PCI_VENDOR_ID_HUAWEI,
- SEC_PF_PCI_DEVICE_ID, NULL);
- if (!pdev) {
- q_num = min_t(u32, SEC_QUEUE_NUM_V1, SEC_QUEUE_NUM_V2);
- pr_info("No device, suppose queue number is %d!\n", q_num);
- } else {
- rev_id = pdev->revision;
-
- switch (rev_id) {
- case QM_HW_V1:
- q_num = SEC_QUEUE_NUM_V1;
- break;
- case QM_HW_V2:
- q_num = SEC_QUEUE_NUM_V2;
- break;
- default:
- return -EINVAL;
- }
- }
-
- ret = kstrtou32(val, 10, &n);
- if (ret || !n || n > q_num)
- return -EINVAL;
-
- return param_set_int(val, kp);
+ return q_num_set(val, kp, SEC_PF_PCI_DEVICE_ID);
}
static const struct kernel_param_ops sec_pf_q_num_ops = {
.set = sec_pf_q_num_set,
.get = param_get_int,
};
+
static u32 pf_q_num = SEC_PF_DEF_Q_NUM;
module_param_cb(pf_q_num, &sec_pf_q_num_ops, &pf_q_num, 0444);
MODULE_PARM_DESC(pf_q_num, "Number of queues in PF(v1 0-4096, v2 0-1024)");
module_param_cb(ctx_q_num, &sec_ctx_q_num_ops, &ctx_q_num, 0444);
MODULE_PARM_DESC(ctx_q_num, "Queue num in ctx (24 default, 2, 4, ..., 32)");
+static const struct kernel_param_ops vfs_num_ops = {
+ .set = vfs_num_set,
+ .get = param_get_int,
+};
+
+static u32 vfs_num;
+module_param_cb(vfs_num, &vfs_num_ops, &vfs_num, 0444);
+MODULE_PARM_DESC(vfs_num, "Number of VFs to enable(1-63), 0(default)");
+
void sec_destroy_qps(struct hisi_qp **qps, int qp_num)
{
hisi_qm_free_qps(qps, qp_num);
};
MODULE_DEVICE_TABLE(pci, sec_dev_ids);
-static u8 sec_get_endian(struct sec_dev *sec)
+static u8 sec_get_endian(struct hisi_qm *qm)
{
- struct hisi_qm *qm = &sec->qm;
u32 reg;
/*
return SEC_64BE;
}
-static int sec_engine_init(struct sec_dev *sec)
+static int sec_engine_init(struct hisi_qm *qm)
{
- struct hisi_qm *qm = &sec->qm;
int ret;
u32 reg;
/* config endian */
reg = readl_relaxed(SEC_ADDR(qm, SEC_CONTROL_REG));
- reg |= sec_get_endian(sec);
+ reg |= sec_get_endian(qm);
writel_relaxed(reg, SEC_ADDR(qm, SEC_CONTROL_REG));
/* Enable sm4 xts mode multiple iv */
return 0;
}
-static int sec_set_user_domain_and_cache(struct sec_dev *sec)
+static int sec_set_user_domain_and_cache(struct hisi_qm *qm)
{
- struct hisi_qm *qm = &sec->qm;
-
/* qm user domain */
writel(AXUSER_BASE, qm->io_base + QM_ARUSER_M_CFG_1);
writel(ARUSER_M_CFG_ENABLE, qm->io_base + QM_ARUSER_M_CFG_ENABLE);
CQC_CACHE_WB_ENABLE | FIELD_PREP(SQC_CACHE_WB_THRD, 1) |
FIELD_PREP(CQC_CACHE_WB_THRD, 1), qm->io_base + QM_CACHE_CTL);
- return sec_engine_init(sec);
+ return sec_engine_init(qm);
}
/* sec_debug_regs_clear() - clear the sec debug regs */
static int sec_current_qm_write(struct sec_debug_file *file, u32 val)
{
struct hisi_qm *qm = file->qm;
- struct sec_dev *sec = container_of(qm, struct sec_dev, qm);
u32 vfq_num;
u32 tmp;
- if (val > sec->num_vfs)
+ if (val > qm->vfs_num)
return -EINVAL;
/* According PF or VF Dev ID to calculation curr_qm_qp_num and store */
if (!val) {
qm->debug.curr_qm_qp_num = qm->qp_num;
} else {
- vfq_num = (qm->ctrl_qp_num - qm->qp_num) / sec->num_vfs;
+ vfq_num = (qm->ctrl_qp_num - qm->qp_num) / qm->vfs_num;
- if (val == sec->num_vfs)
+ if (val == qm->vfs_num)
qm->debug.curr_qm_qp_num =
qm->ctrl_qp_num - qm->qp_num -
- (sec->num_vfs - 1) * vfq_num;
+ (qm->vfs_num - 1) * vfq_num;
else
qm->debug.curr_qm_qp_num = vfq_num;
}
static int sec_debugfs_atomic64_get(void *data, u64 *val)
{
*val = atomic64_read((atomic64_t *)data);
+
+ return 0;
+}
+
+static int sec_debugfs_atomic64_set(void *data, u64 val)
+{
+ if (val)
+ return -EINVAL;
+
+ atomic64_set((atomic64_t *)data, 0);
+
return 0;
}
+
DEFINE_DEBUGFS_ATTRIBUTE(sec_atomic64_ops, sec_debugfs_atomic64_get,
- NULL, "%lld\n");
+ sec_debugfs_atomic64_set, "%lld\n");
static int sec_core_debug_init(struct sec_dev *sec)
{
struct sec_dfx *dfx = &sec->debug.dfx;
struct debugfs_regset32 *regset;
struct dentry *tmp_d;
+ int i;
tmp_d = debugfs_create_dir("sec_dfx", sec->qm.debug.debug_root);
regset->nregs = ARRAY_SIZE(sec_dfx_regs);
regset->base = qm->io_base;
- debugfs_create_regset32("regs", 0444, tmp_d, regset);
-
- debugfs_create_file("send_cnt", 0444, tmp_d,
- &dfx->send_cnt, &sec_atomic64_ops);
+ if (qm->pdev->device == SEC_PF_PCI_DEVICE_ID)
+ debugfs_create_regset32("regs", 0444, tmp_d, regset);
- debugfs_create_file("recv_cnt", 0444, tmp_d,
- &dfx->recv_cnt, &sec_atomic64_ops);
+ for (i = 0; i < ARRAY_SIZE(sec_dfx_labels); i++) {
+ atomic64_t *data = (atomic64_t *)((uintptr_t)dfx +
+ sec_dfx_labels[i].offset);
+ debugfs_create_file(sec_dfx_labels[i].name, 0644,
+ tmp_d, data, &sec_atomic64_ops);
+ }
return 0;
}
qm->debug.debug_root = debugfs_create_dir(dev_name(dev),
sec_debugfs_root);
+
+ qm->debug.sqe_mask_offset = SEC_SQE_MASK_OFFSET;
+ qm->debug.sqe_mask_len = SEC_SQE_MASK_LEN;
ret = hisi_qm_debug_init(qm);
if (ret)
goto failed_to_create;
}
errs++;
}
-
- writel(err_sts, qm->io_base + SEC_CORE_INT_SOURCE);
}
static u32 sec_get_hw_err_status(struct hisi_qm *qm)
return readl(qm->io_base + SEC_CORE_INT_STATUS);
}
+static void sec_clear_hw_err_status(struct hisi_qm *qm, u32 err_sts)
+{
+ writel(err_sts, qm->io_base + SEC_CORE_INT_SOURCE);
+}
+
+static void sec_open_axi_master_ooo(struct hisi_qm *qm)
+{
+ u32 val;
+
+ val = readl(SEC_ADDR(qm, SEC_CONTROL_REG));
+ writel(val & SEC_AXI_SHUTDOWN_DISABLE, SEC_ADDR(qm, SEC_CONTROL_REG));
+ writel(val | SEC_AXI_SHUTDOWN_ENABLE, SEC_ADDR(qm, SEC_CONTROL_REG));
+}
+
static const struct hisi_qm_err_ini sec_err_ini = {
+ .hw_init = sec_set_user_domain_and_cache,
.hw_err_enable = sec_hw_error_enable,
.hw_err_disable = sec_hw_error_disable,
.get_dev_hw_err_status = sec_get_hw_err_status,
+ .clear_dev_hw_err_status = sec_clear_hw_err_status,
.log_dev_hw_err = sec_log_hw_error,
+ .open_axi_master_ooo = sec_open_axi_master_ooo,
.err_info = {
.ce = QM_BASE_CE,
.nfe = QM_BASE_NFE | QM_ACC_DO_TASK_TIMEOUT |
QM_ACC_WB_NOT_READY_TIMEOUT,
.fe = 0,
- .msi = QM_DB_RANDOM_INVALID,
+ .ecc_2bits_mask = SEC_CORE_INT_STATUS_M_ECC,
+ .msi_wr_port = BIT(0),
+ .acpi_rst = "SRST",
}
};
struct hisi_qm *qm = &sec->qm;
int ret;
- switch (qm->ver) {
- case QM_HW_V1:
+ if (qm->ver == QM_HW_V1)
qm->ctrl_qp_num = SEC_QUEUE_NUM_V1;
- break;
-
- case QM_HW_V2:
+ else
qm->ctrl_qp_num = SEC_QUEUE_NUM_V2;
- break;
-
- default:
- return -EINVAL;
- }
qm->err_ini = &sec_err_ini;
- ret = sec_set_user_domain_and_cache(sec);
+ ret = sec_set_user_domain_and_cache(qm);
if (ret)
return ret;
static int sec_qm_init(struct hisi_qm *qm, struct pci_dev *pdev)
{
- enum qm_hw_ver rev_id;
-
- rev_id = hisi_qm_get_hw_version(pdev);
- if (rev_id == QM_HW_UNKNOWN)
- return -ENODEV;
+ int ret;
qm->pdev = pdev;
- qm->ver = rev_id;
-
+ qm->ver = pdev->revision;
qm->sqe_size = SEC_SQE_SIZE;
qm->dev_name = sec_name;
+
qm->fun_type = (pdev->device == SEC_PF_PCI_DEVICE_ID) ?
QM_HW_PF : QM_HW_VF;
- qm->use_dma_api = true;
-
- return hisi_qm_init(qm);
-}
-
-static void sec_qm_uninit(struct hisi_qm *qm)
-{
- hisi_qm_uninit(qm);
-}
-
-static int sec_probe_init(struct hisi_qm *qm, struct sec_dev *sec)
-{
- int ret;
+ if (qm->fun_type == QM_HW_PF) {
+ qm->qp_base = SEC_PF_DEF_Q_BASE;
+ qm->qp_num = pf_q_num;
+ qm->debug.curr_qm_qp_num = pf_q_num;
+ qm->qm_list = &sec_devices;
+ } else if (qm->fun_type == QM_HW_VF && qm->ver == QM_HW_V1) {
+ /*
+ * have no way to get qm configure in VM in v1 hardware,
+ * so currently force PF to uses SEC_PF_DEF_Q_NUM, and force
+ * to trigger only one VF in v1 hardware.
+ * v2 hardware has no such problem.
+ */
+ qm->qp_base = SEC_PF_DEF_Q_NUM;
+ qm->qp_num = SEC_QUEUE_NUM_V1 - SEC_PF_DEF_Q_NUM;
+ }
/*
* WQ_HIGHPRI: SEC request must be low delayed,
* WQ_UNBOUND: SEC task is likely with long
* running CPU intensive workloads.
*/
- qm->wq = alloc_workqueue("%s", WQ_HIGHPRI |
- WQ_MEM_RECLAIM | WQ_UNBOUND, num_online_cpus(),
- pci_name(qm->pdev));
+ qm->wq = alloc_workqueue("%s", WQ_HIGHPRI | WQ_MEM_RECLAIM |
+ WQ_UNBOUND, num_online_cpus(),
+ pci_name(qm->pdev));
if (!qm->wq) {
pci_err(qm->pdev, "fail to alloc workqueue\n");
return -ENOMEM;
}
- if (qm->fun_type == QM_HW_PF) {
- qm->qp_base = SEC_PF_DEF_Q_BASE;
- qm->qp_num = pf_q_num;
- qm->debug.curr_qm_qp_num = pf_q_num;
+ ret = hisi_qm_init(qm);
+ if (ret)
+ destroy_workqueue(qm->wq);
+
+ return ret;
+}
+static void sec_qm_uninit(struct hisi_qm *qm)
+{
+ hisi_qm_uninit(qm);
+}
+
+static int sec_probe_init(struct sec_dev *sec)
+{
+ struct hisi_qm *qm = &sec->qm;
+ int ret;
+
+ if (qm->fun_type == QM_HW_PF) {
ret = sec_pf_probe_init(sec);
if (ret)
- goto err_probe_uninit;
- } else if (qm->fun_type == QM_HW_VF) {
- /*
- * have no way to get qm configure in VM in v1 hardware,
- * so currently force PF to uses SEC_PF_DEF_Q_NUM, and force
- * to trigger only one VF in v1 hardware.
- * v2 hardware has no such problem.
- */
- if (qm->ver == QM_HW_V1) {
- qm->qp_base = SEC_PF_DEF_Q_NUM;
- qm->qp_num = SEC_QUEUE_NUM_V1 - SEC_PF_DEF_Q_NUM;
- } else if (qm->ver == QM_HW_V2) {
- /* v2 starts to support get vft by mailbox */
- ret = hisi_qm_get_vft(qm, &qm->qp_base, &qm->qp_num);
- if (ret)
- goto err_probe_uninit;
- }
- } else {
- ret = -ENODEV;
- goto err_probe_uninit;
+ return ret;
}
return 0;
-err_probe_uninit:
- destroy_workqueue(qm->wq);
- return ret;
}
static void sec_probe_uninit(struct hisi_qm *qm)
if (!sec)
return -ENOMEM;
- pci_set_drvdata(pdev, sec);
-
- sec->ctx_q_num = ctx_q_num;
- sec_iommu_used_check(sec);
-
qm = &sec->qm;
-
ret = sec_qm_init(qm, pdev);
if (ret) {
- pci_err(pdev, "Failed to pre init qm!\n");
+ pci_err(pdev, "Failed to init SEC QM (%d)!\n", ret);
return ret;
}
- ret = sec_probe_init(qm, sec);
+ sec->ctx_q_num = ctx_q_num;
+ sec_iommu_used_check(sec);
+
+ ret = sec_probe_init(sec);
if (ret) {
pci_err(pdev, "Failed to probe!\n");
goto err_qm_uninit;
goto err_remove_from_list;
}
+ if (qm->fun_type == QM_HW_PF && vfs_num) {
+ ret = hisi_qm_sriov_enable(pdev, vfs_num);
+ if (ret < 0)
+ goto err_crypto_unregister;
+ }
+
return 0;
+err_crypto_unregister:
+ sec_unregister_from_crypto();
+
err_remove_from_list:
hisi_qm_del_from_list(qm, &sec_devices);
sec_debugfs_exit(sec);
return ret;
}
-/* now we only support equal assignment */
-static int sec_vf_q_assign(struct sec_dev *sec, u32 num_vfs)
-{
- struct hisi_qm *qm = &sec->qm;
- u32 qp_num = qm->qp_num;
- u32 q_base = qp_num;
- u32 q_num, remain_q_num;
- int i, j, ret;
-
- if (!num_vfs)
- return -EINVAL;
-
- remain_q_num = qm->ctrl_qp_num - qp_num;
- q_num = remain_q_num / num_vfs;
-
- for (i = 1; i <= num_vfs; i++) {
- if (i == num_vfs)
- q_num += remain_q_num % num_vfs;
- ret = hisi_qm_set_vft(qm, i, q_base, q_num);
- if (ret) {
- for (j = i; j > 0; j--)
- hisi_qm_set_vft(qm, j, 0, 0);
- return ret;
- }
- q_base += q_num;
- }
-
- return 0;
-}
-
-static int sec_clear_vft_config(struct sec_dev *sec)
-{
- struct hisi_qm *qm = &sec->qm;
- u32 num_vfs = sec->num_vfs;
- int ret;
- u32 i;
-
- for (i = 1; i <= num_vfs; i++) {
- ret = hisi_qm_set_vft(qm, i, 0, 0);
- if (ret)
- return ret;
- }
-
- sec->num_vfs = 0;
-
- return 0;
-}
-
-static int sec_sriov_enable(struct pci_dev *pdev, int max_vfs)
-{
- struct sec_dev *sec = pci_get_drvdata(pdev);
- int pre_existing_vfs, ret;
- u32 num_vfs;
-
- pre_existing_vfs = pci_num_vf(pdev);
-
- if (pre_existing_vfs) {
- pci_err(pdev, "Can't enable VF. Please disable at first!\n");
- return 0;
- }
-
- num_vfs = min_t(u32, max_vfs, SEC_VF_NUM);
-
- ret = sec_vf_q_assign(sec, num_vfs);
- if (ret) {
- pci_err(pdev, "Can't assign queues for VF!\n");
- return ret;
- }
-
- sec->num_vfs = num_vfs;
-
- ret = pci_enable_sriov(pdev, num_vfs);
- if (ret) {
- pci_err(pdev, "Can't enable VF!\n");
- sec_clear_vft_config(sec);
- return ret;
- }
-
- return num_vfs;
-}
-
-static int sec_sriov_disable(struct pci_dev *pdev)
-{
- struct sec_dev *sec = pci_get_drvdata(pdev);
-
- if (pci_vfs_assigned(pdev)) {
- pci_err(pdev, "Can't disable VFs while VFs are assigned!\n");
- return -EPERM;
- }
-
- /* remove in sec_pci_driver will be called to free VF resources */
- pci_disable_sriov(pdev);
-
- return sec_clear_vft_config(sec);
-}
-
-static int sec_sriov_configure(struct pci_dev *pdev, int num_vfs)
-{
- if (num_vfs)
- return sec_sriov_enable(pdev, num_vfs);
- else
- return sec_sriov_disable(pdev);
-}
-
static void sec_remove(struct pci_dev *pdev)
{
struct sec_dev *sec = pci_get_drvdata(pdev);
hisi_qm_del_from_list(qm, &sec_devices);
- if (qm->fun_type == QM_HW_PF && sec->num_vfs)
- (void)sec_sriov_disable(pdev);
+ if (qm->fun_type == QM_HW_PF && qm->vfs_num)
+ hisi_qm_sriov_disable(pdev);
sec_debugfs_exit(sec);
static const struct pci_error_handlers sec_err_handler = {
.error_detected = hisi_qm_dev_err_detected,
+ .slot_reset = hisi_qm_dev_slot_reset,
+ .reset_prepare = hisi_qm_reset_prepare,
+ .reset_done = hisi_qm_reset_done,
};
static struct pci_driver sec_pci_driver = {
.probe = sec_probe,
.remove = sec_remove,
.err_handler = &sec_err_handler,
- .sriov_configure = sec_sriov_configure,
+ .sriov_configure = hisi_qm_sriov_configure,
};
static void sec_register_debugfs(void)
HZIP_NC_ERR = 0x0d,
};
+struct hisi_zip_dfx {
+ atomic64_t send_cnt;
+ atomic64_t recv_cnt;
+ atomic64_t send_busy_cnt;
+ atomic64_t err_bd_cnt;
+};
+
struct hisi_zip_ctrl;
struct hisi_zip {
struct hisi_qm qm;
struct list_head list;
struct hisi_zip_ctrl *ctrl;
+ struct hisi_zip_dfx dfx;
};
struct hisi_zip_sqe {
struct hisi_zip_qp_ctx {
struct hisi_qp *qp;
- struct hisi_zip_sqe zip_sqe;
struct hisi_zip_req_q req_q;
struct hisi_acc_sgl_pool *sgl_pool;
struct hisi_zip *zip_dev;
{
struct hisi_zip_sqe *sqe = data;
struct hisi_zip_qp_ctx *qp_ctx = qp->qp_ctx;
+ struct hisi_zip_dfx *dfx = &qp_ctx->zip_dev->dfx;
struct hisi_zip_req_q *req_q = &qp_ctx->req_q;
struct hisi_zip_req *req = req_q->q + sqe->tag;
struct acomp_req *acomp_req = req->req;
u32 status, dlen, head_size;
int err = 0;
+ atomic64_inc(&dfx->recv_cnt);
status = sqe->dw3 & HZIP_BD_STATUS_M;
if (status != 0 && status != HZIP_NC_ERR) {
dev_err(dev, "%scompress fail in qp%u: %u, output: %u\n",
(qp->alg_type == 0) ? "" : "de", qp->qp_id, status,
sqe->produced);
+ atomic64_inc(&dfx->err_bd_cnt);
err = -EIO;
}
dlen = sqe->produced;
static int hisi_zip_do_work(struct hisi_zip_req *req,
struct hisi_zip_qp_ctx *qp_ctx)
{
- struct hisi_zip_sqe *zip_sqe = &qp_ctx->zip_sqe;
struct acomp_req *a_req = req->req;
struct hisi_qp *qp = qp_ctx->qp;
struct device *dev = &qp->qm->pdev->dev;
struct hisi_acc_sgl_pool *pool = qp_ctx->sgl_pool;
+ struct hisi_zip_dfx *dfx = &qp_ctx->zip_dev->dfx;
+ struct hisi_zip_sqe zip_sqe;
dma_addr_t input;
dma_addr_t output;
int ret;
}
req->dma_dst = output;
- hisi_zip_fill_sqe(zip_sqe, qp->req_type, input, output, a_req->slen,
+ hisi_zip_fill_sqe(&zip_sqe, qp->req_type, input, output, a_req->slen,
a_req->dlen, req->sskip, req->dskip);
- hisi_zip_config_buf_type(zip_sqe, HZIP_SGL);
- hisi_zip_config_tag(zip_sqe, req->req_id);
+ hisi_zip_config_buf_type(&zip_sqe, HZIP_SGL);
+ hisi_zip_config_tag(&zip_sqe, req->req_id);
/* send command to start a task */
- ret = hisi_qp_send(qp, zip_sqe);
- if (ret < 0)
+ atomic64_inc(&dfx->send_cnt);
+ ret = hisi_qp_send(qp, &zip_sqe);
+ if (ret < 0) {
+ atomic64_inc(&dfx->send_busy_cnt);
goto err_unmap_output;
+ }
return -EINPROGRESS;
#define HZIP_CORE_INT_SOURCE 0x3010A0
#define HZIP_CORE_INT_MASK_REG 0x3010A4
+#define HZIP_CORE_INT_SET 0x3010A8
#define HZIP_CORE_INT_STATUS 0x3010AC
#define HZIP_CORE_INT_STATUS_M_ECC BIT(1)
#define HZIP_CORE_SRAM_ECC_ERR_INFO 0x301148
#define HZIP_SOFT_CTRL_CNT_CLR_CE 0x301000
#define SOFT_CTRL_CNT_CLR_CE_BIT BIT(0)
+#define HZIP_SOFT_CTRL_ZIP_CONTROL 0x30100C
+#define HZIP_AXI_SHUTDOWN_ENABLE BIT(14)
+#define HZIP_WR_PORT BIT(11)
#define HZIP_BUF_SIZE 22
+#define HZIP_SQE_MASK_OFFSET 64
+#define HZIP_SQE_MASK_LEN 48
static const char hisi_zip_name[] = "hisi_zip";
static struct dentry *hzip_debugfs_root;
const char *msg;
};
+struct zip_dfx_item {
+ const char *name;
+ u32 offset;
+};
+
+static struct zip_dfx_item zip_dfx_files[] = {
+ {"send_cnt", offsetof(struct hisi_zip_dfx, send_cnt)},
+ {"recv_cnt", offsetof(struct hisi_zip_dfx, recv_cnt)},
+ {"send_busy_cnt", offsetof(struct hisi_zip_dfx, send_busy_cnt)},
+ {"err_bd_cnt", offsetof(struct hisi_zip_dfx, err_bd_cnt)},
+};
+
static const struct hisi_zip_hw_error zip_hw_error[] = {
{ .int_msk = BIT(0), .msg = "zip_ecc_1bitt_err" },
{ .int_msk = BIT(1), .msg = "zip_ecc_2bit_err" },
* Just relevant for PF.
*/
struct hisi_zip_ctrl {
- u32 num_vfs;
struct hisi_zip *hisi_zip;
struct dentry *debug_root;
struct ctrl_debug_file files[HZIP_DEBUG_FILE_NUM];
[HZIP_DECOMP_CORE5] = 0x309000,
};
-static struct debugfs_reg32 hzip_dfx_regs[] = {
+static const struct debugfs_reg32 hzip_dfx_regs[] = {
{"HZIP_GET_BD_NUM ", 0x00ull},
{"HZIP_GET_RIGHT_BD ", 0x04ull},
{"HZIP_GET_ERROR_BD ", 0x08ull},
static int pf_q_num_set(const char *val, const struct kernel_param *kp)
{
- struct pci_dev *pdev = pci_get_device(PCI_VENDOR_ID_HUAWEI,
- PCI_DEVICE_ID_ZIP_PF, NULL);
- u32 n, q_num;
- u8 rev_id;
- int ret;
-
- if (!val)
- return -EINVAL;
-
- if (!pdev) {
- q_num = min_t(u32, HZIP_QUEUE_NUM_V1, HZIP_QUEUE_NUM_V2);
- pr_info("No device found currently, suppose queue number is %d\n",
- q_num);
- } else {
- rev_id = pdev->revision;
- switch (rev_id) {
- case QM_HW_V1:
- q_num = HZIP_QUEUE_NUM_V1;
- break;
- case QM_HW_V2:
- q_num = HZIP_QUEUE_NUM_V2;
- break;
- default:
- return -EINVAL;
- }
- }
-
- ret = kstrtou32(val, 10, &n);
- if (ret != 0 || n > q_num || n == 0)
- return -EINVAL;
-
- return param_set_int(val, kp);
+ return q_num_set(val, kp, PCI_DEVICE_ID_ZIP_PF);
}
static const struct kernel_param_ops pf_q_num_ops = {
module_param_cb(pf_q_num, &pf_q_num_ops, &pf_q_num, 0444);
MODULE_PARM_DESC(pf_q_num, "Number of queues in PF(v1 1-4096, v2 1-1024)");
+static const struct kernel_param_ops vfs_num_ops = {
+ .set = vfs_num_set,
+ .get = param_get_int,
+};
+
static u32 vfs_num;
-module_param(vfs_num, uint, 0444);
-MODULE_PARM_DESC(vfs_num, "Number of VFs to enable(1-63)");
+module_param_cb(vfs_num, &vfs_num_ops, &vfs_num, 0444);
+MODULE_PARM_DESC(vfs_num, "Number of VFs to enable(1-63), 0(default)");
static const struct pci_device_id hisi_zip_dev_ids[] = {
{ PCI_DEVICE(PCI_VENDOR_ID_HUAWEI, PCI_DEVICE_ID_ZIP_PF) },
return hisi_qm_alloc_qps_node(&zip_devices, qp_num, 0, node, qps);
}
-static void hisi_zip_set_user_domain_and_cache(struct hisi_zip *hisi_zip)
+static int hisi_zip_set_user_domain_and_cache(struct hisi_qm *qm)
{
- void __iomem *base = hisi_zip->qm.io_base;
+ void __iomem *base = qm->io_base;
/* qm user domain */
writel(AXUSER_BASE, base + QM_ARUSER_M_CFG_1);
writel(AXUSER_BASE, base + HZIP_SGL_RUSER_32_63);
writel(AXUSER_BASE, base + HZIP_BD_WUSER_32_63);
- if (hisi_zip->qm.use_sva) {
+ if (qm->use_sva) {
writel(AXUSER_BASE | AXUSER_SSV, base + HZIP_DATA_RUSER_32_63);
writel(AXUSER_BASE | AXUSER_SSV, base + HZIP_DATA_WUSER_32_63);
} else {
writel(SQC_CACHE_ENABLE | CQC_CACHE_ENABLE | SQC_CACHE_WB_ENABLE |
CQC_CACHE_WB_ENABLE | FIELD_PREP(SQC_CACHE_WB_THRD, 1) |
FIELD_PREP(CQC_CACHE_WB_THRD, 1), base + QM_CACHE_CTL);
+
+ return 0;
}
static void hisi_zip_hw_error_enable(struct hisi_qm *qm)
{
+ u32 val;
+
if (qm->ver == QM_HW_V1) {
writel(HZIP_CORE_INT_MASK_ALL,
qm->io_base + HZIP_CORE_INT_MASK_REG);
/* enable ZIP hw error interrupts */
writel(0, qm->io_base + HZIP_CORE_INT_MASK_REG);
+
+ /* enable ZIP block master OOO when m-bit error occur */
+ val = readl(qm->io_base + HZIP_SOFT_CTRL_ZIP_CONTROL);
+ val = val | HZIP_AXI_SHUTDOWN_ENABLE;
+ writel(val, qm->io_base + HZIP_SOFT_CTRL_ZIP_CONTROL);
}
static void hisi_zip_hw_error_disable(struct hisi_qm *qm)
{
+ u32 val;
+
/* disable ZIP hw error interrupts */
writel(HZIP_CORE_INT_MASK_ALL, qm->io_base + HZIP_CORE_INT_MASK_REG);
+
+ /* disable ZIP block master OOO when m-bit error occur */
+ val = readl(qm->io_base + HZIP_SOFT_CTRL_ZIP_CONTROL);
+ val = val & ~HZIP_AXI_SHUTDOWN_ENABLE;
+ writel(val, qm->io_base + HZIP_SOFT_CTRL_ZIP_CONTROL);
}
static inline struct hisi_qm *file_to_qm(struct ctrl_debug_file *file)
static int current_qm_write(struct ctrl_debug_file *file, u32 val)
{
struct hisi_qm *qm = file_to_qm(file);
- struct hisi_zip_ctrl *ctrl = file->ctrl;
u32 vfq_num;
u32 tmp;
- if (val > ctrl->num_vfs)
+ if (val > qm->vfs_num)
return -EINVAL;
/* Calculate curr_qm_qp_num and store */
if (val == 0) {
qm->debug.curr_qm_qp_num = qm->qp_num;
} else {
- vfq_num = (qm->ctrl_qp_num - qm->qp_num) / ctrl->num_vfs;
- if (val == ctrl->num_vfs)
+ vfq_num = (qm->ctrl_qp_num - qm->qp_num) / qm->vfs_num;
+ if (val == qm->vfs_num)
qm->debug.curr_qm_qp_num = qm->ctrl_qp_num -
- qm->qp_num - (ctrl->num_vfs - 1) * vfq_num;
+ qm->qp_num - (qm->vfs_num - 1) * vfq_num;
else
qm->debug.curr_qm_qp_num = vfq_num;
}
.write = ctrl_debug_write,
};
+
+static int zip_debugfs_atomic64_set(void *data, u64 val)
+{
+ if (val)
+ return -EINVAL;
+
+ atomic64_set((atomic64_t *)data, 0);
+
+ return 0;
+}
+
+static int zip_debugfs_atomic64_get(void *data, u64 *val)
+{
+ *val = atomic64_read((atomic64_t *)data);
+
+ return 0;
+}
+
+DEFINE_DEBUGFS_ATTRIBUTE(zip_atomic64_ops, zip_debugfs_atomic64_get,
+ zip_debugfs_atomic64_set, "%llu\n");
+
static int hisi_zip_core_debug_init(struct hisi_zip_ctrl *ctrl)
{
struct hisi_zip *hisi_zip = ctrl->hisi_zip;
return 0;
}
+static void hisi_zip_dfx_debug_init(struct hisi_qm *qm)
+{
+ struct hisi_zip *zip = container_of(qm, struct hisi_zip, qm);
+ struct hisi_zip_dfx *dfx = &zip->dfx;
+ struct dentry *tmp_dir;
+ void *data;
+ int i;
+
+ tmp_dir = debugfs_create_dir("zip_dfx", qm->debug.debug_root);
+ for (i = 0; i < ARRAY_SIZE(zip_dfx_files); i++) {
+ data = (atomic64_t *)((uintptr_t)dfx + zip_dfx_files[i].offset);
+ debugfs_create_file(zip_dfx_files[i].name,
+ 0644,
+ tmp_dir,
+ data,
+ &zip_atomic64_ops);
+ }
+}
+
static int hisi_zip_ctrl_debug_init(struct hisi_zip_ctrl *ctrl)
{
int i;
dev_d = debugfs_create_dir(dev_name(dev), hzip_debugfs_root);
+ qm->debug.sqe_mask_offset = HZIP_SQE_MASK_OFFSET;
+ qm->debug.sqe_mask_len = HZIP_SQE_MASK_LEN;
qm->debug.debug_root = dev_d;
ret = hisi_qm_debug_init(qm);
if (ret)
goto failed_to_create;
}
+ hisi_zip_dfx_debug_init(qm);
+
return 0;
failed_to_create:
}
err++;
}
-
- writel(err_sts, qm->io_base + HZIP_CORE_INT_SOURCE);
}
static u32 hisi_zip_get_hw_err_status(struct hisi_qm *qm)
return readl(qm->io_base + HZIP_CORE_INT_STATUS);
}
+static void hisi_zip_clear_hw_err_status(struct hisi_qm *qm, u32 err_sts)
+{
+ writel(err_sts, qm->io_base + HZIP_CORE_INT_SOURCE);
+}
+
+static void hisi_zip_open_axi_master_ooo(struct hisi_qm *qm)
+{
+ u32 val;
+
+ val = readl(qm->io_base + HZIP_SOFT_CTRL_ZIP_CONTROL);
+
+ writel(val & ~HZIP_AXI_SHUTDOWN_ENABLE,
+ qm->io_base + HZIP_SOFT_CTRL_ZIP_CONTROL);
+
+ writel(val | HZIP_AXI_SHUTDOWN_ENABLE,
+ qm->io_base + HZIP_SOFT_CTRL_ZIP_CONTROL);
+}
+
+static void hisi_zip_close_axi_master_ooo(struct hisi_qm *qm)
+{
+ u32 nfe_enb;
+
+ /* Disable ECC Mbit error report. */
+ nfe_enb = readl(qm->io_base + HZIP_CORE_INT_RAS_NFE_ENB);
+ writel(nfe_enb & ~HZIP_CORE_INT_STATUS_M_ECC,
+ qm->io_base + HZIP_CORE_INT_RAS_NFE_ENB);
+
+ /* Inject zip ECC Mbit error to block master ooo. */
+ writel(HZIP_CORE_INT_STATUS_M_ECC,
+ qm->io_base + HZIP_CORE_INT_SET);
+}
+
static const struct hisi_qm_err_ini hisi_zip_err_ini = {
+ .hw_init = hisi_zip_set_user_domain_and_cache,
.hw_err_enable = hisi_zip_hw_error_enable,
.hw_err_disable = hisi_zip_hw_error_disable,
.get_dev_hw_err_status = hisi_zip_get_hw_err_status,
+ .clear_dev_hw_err_status = hisi_zip_clear_hw_err_status,
.log_dev_hw_err = hisi_zip_log_hw_error,
+ .open_axi_master_ooo = hisi_zip_open_axi_master_ooo,
+ .close_axi_master_ooo = hisi_zip_close_axi_master_ooo,
.err_info = {
.ce = QM_BASE_CE,
.nfe = QM_BASE_NFE |
QM_ACC_WB_NOT_READY_TIMEOUT,
.fe = 0,
- .msi = QM_DB_RANDOM_INVALID,
+ .ecc_2bits_mask = HZIP_CORE_INT_STATUS_M_ECC,
+ .msi_wr_port = HZIP_WR_PORT,
+ .acpi_rst = "ZRST",
}
};
hisi_zip->ctrl = ctrl;
ctrl->hisi_zip = hisi_zip;
- switch (qm->ver) {
- case QM_HW_V1:
+ if (qm->ver == QM_HW_V1)
qm->ctrl_qp_num = HZIP_QUEUE_NUM_V1;
- break;
-
- case QM_HW_V2:
+ else
qm->ctrl_qp_num = HZIP_QUEUE_NUM_V2;
- break;
-
- default:
- return -EINVAL;
- }
qm->err_ini = &hisi_zip_err_ini;
- hisi_zip_set_user_domain_and_cache(hisi_zip);
+ hisi_zip_set_user_domain_and_cache(qm);
hisi_qm_dev_err_init(qm);
hisi_zip_debug_regs_clear(hisi_zip);
return 0;
}
-/* Currently we only support equal assignment */
-static int hisi_zip_vf_q_assign(struct hisi_zip *hisi_zip, int num_vfs)
+static int hisi_zip_qm_init(struct hisi_qm *qm, struct pci_dev *pdev)
{
- struct hisi_qm *qm = &hisi_zip->qm;
- u32 qp_num = qm->qp_num;
- u32 q_base = qp_num;
- u32 q_num, remain_q_num, i;
- int ret;
-
- if (!num_vfs)
- return -EINVAL;
-
- remain_q_num = qm->ctrl_qp_num - qp_num;
- if (remain_q_num < num_vfs)
- return -EINVAL;
+ qm->pdev = pdev;
+ qm->ver = pdev->revision;
+ qm->algs = "zlib\ngzip";
+ qm->sqe_size = HZIP_SQE_SIZE;
+ qm->dev_name = hisi_zip_name;
- q_num = remain_q_num / num_vfs;
- for (i = 1; i <= num_vfs; i++) {
- if (i == num_vfs)
- q_num += remain_q_num % num_vfs;
- ret = hisi_qm_set_vft(qm, i, q_base, q_num);
- if (ret)
- return ret;
- q_base += q_num;
+ qm->fun_type = (pdev->device == PCI_DEVICE_ID_ZIP_PF) ?
+ QM_HW_PF : QM_HW_VF;
+ if (qm->fun_type == QM_HW_PF) {
+ qm->qp_base = HZIP_PF_DEF_Q_BASE;
+ qm->qp_num = pf_q_num;
+ qm->qm_list = &zip_devices;
+ } else if (qm->fun_type == QM_HW_VF && qm->ver == QM_HW_V1) {
+ /*
+ * have no way to get qm configure in VM in v1 hardware,
+ * so currently force PF to uses HZIP_PF_DEF_Q_NUM, and force
+ * to trigger only one VF in v1 hardware.
+ *
+ * v2 hardware has no such problem.
+ */
+ qm->qp_base = HZIP_PF_DEF_Q_NUM;
+ qm->qp_num = HZIP_QUEUE_NUM_V1 - HZIP_PF_DEF_Q_NUM;
}
- return 0;
+ return hisi_qm_init(qm);
}
-static int hisi_zip_clear_vft_config(struct hisi_zip *hisi_zip)
+static int hisi_zip_probe_init(struct hisi_zip *hisi_zip)
{
- struct hisi_zip_ctrl *ctrl = hisi_zip->ctrl;
struct hisi_qm *qm = &hisi_zip->qm;
- u32 i, num_vfs = ctrl->num_vfs;
int ret;
- for (i = 1; i <= num_vfs; i++) {
- ret = hisi_qm_set_vft(qm, i, 0, 0);
+ if (qm->fun_type == QM_HW_PF) {
+ ret = hisi_zip_pf_probe_init(hisi_zip);
if (ret)
return ret;
}
- ctrl->num_vfs = 0;
-
return 0;
}
-static int hisi_zip_sriov_enable(struct pci_dev *pdev, int max_vfs)
-{
- struct hisi_zip *hisi_zip = pci_get_drvdata(pdev);
- int pre_existing_vfs, num_vfs, ret;
-
- pre_existing_vfs = pci_num_vf(pdev);
-
- if (pre_existing_vfs) {
- dev_err(&pdev->dev,
- "Can't enable VF. Please disable pre-enabled VFs!\n");
- return 0;
- }
-
- num_vfs = min_t(int, max_vfs, HZIP_VF_NUM);
-
- ret = hisi_zip_vf_q_assign(hisi_zip, num_vfs);
- if (ret) {
- dev_err(&pdev->dev, "Can't assign queues for VF!\n");
- return ret;
- }
-
- hisi_zip->ctrl->num_vfs = num_vfs;
-
- ret = pci_enable_sriov(pdev, num_vfs);
- if (ret) {
- dev_err(&pdev->dev, "Can't enable VF!\n");
- hisi_zip_clear_vft_config(hisi_zip);
- return ret;
- }
-
- return num_vfs;
-}
-
-static int hisi_zip_sriov_disable(struct pci_dev *pdev)
-{
- struct hisi_zip *hisi_zip = pci_get_drvdata(pdev);
-
- if (pci_vfs_assigned(pdev)) {
- dev_err(&pdev->dev,
- "Can't disable VFs while VFs are assigned!\n");
- return -EPERM;
- }
-
- /* remove in hisi_zip_pci_driver will be called to free VF resources */
- pci_disable_sriov(pdev);
-
- return hisi_zip_clear_vft_config(hisi_zip);
-}
-
static int hisi_zip_probe(struct pci_dev *pdev, const struct pci_device_id *id)
{
struct hisi_zip *hisi_zip;
- enum qm_hw_ver rev_id;
struct hisi_qm *qm;
int ret;
- rev_id = hisi_qm_get_hw_version(pdev);
- if (rev_id == QM_HW_UNKNOWN)
- return -EINVAL;
-
hisi_zip = devm_kzalloc(&pdev->dev, sizeof(*hisi_zip), GFP_KERNEL);
if (!hisi_zip)
return -ENOMEM;
- pci_set_drvdata(pdev, hisi_zip);
qm = &hisi_zip->qm;
- qm->use_dma_api = true;
- qm->pdev = pdev;
- qm->ver = rev_id;
- qm->algs = "zlib\ngzip";
- qm->sqe_size = HZIP_SQE_SIZE;
- qm->dev_name = hisi_zip_name;
- qm->fun_type = (pdev->device == PCI_DEVICE_ID_ZIP_PF) ? QM_HW_PF :
- QM_HW_VF;
- ret = hisi_qm_init(qm);
+ ret = hisi_zip_qm_init(qm, pdev);
if (ret) {
- dev_err(&pdev->dev, "Failed to init qm!\n");
+ pci_err(pdev, "Failed to init ZIP QM (%d)!\n", ret);
return ret;
}
- if (qm->fun_type == QM_HW_PF) {
- ret = hisi_zip_pf_probe_init(hisi_zip);
- if (ret)
- return ret;
-
- qm->qp_base = HZIP_PF_DEF_Q_BASE;
- qm->qp_num = pf_q_num;
- } else if (qm->fun_type == QM_HW_VF) {
- /*
- * have no way to get qm configure in VM in v1 hardware,
- * so currently force PF to uses HZIP_PF_DEF_Q_NUM, and force
- * to trigger only one VF in v1 hardware.
- *
- * v2 hardware has no such problem.
- */
- if (qm->ver == QM_HW_V1) {
- qm->qp_base = HZIP_PF_DEF_Q_NUM;
- qm->qp_num = HZIP_QUEUE_NUM_V1 - HZIP_PF_DEF_Q_NUM;
- } else if (qm->ver == QM_HW_V2)
- /* v2 starts to support get vft by mailbox */
- hisi_qm_get_vft(qm, &qm->qp_base, &qm->qp_num);
+ ret = hisi_zip_probe_init(hisi_zip);
+ if (ret) {
+ pci_err(pdev, "Failed to probe (%d)!\n", ret);
+ goto err_qm_uninit;
}
ret = hisi_qm_start(qm);
}
if (qm->fun_type == QM_HW_PF && vfs_num > 0) {
- ret = hisi_zip_sriov_enable(pdev, vfs_num);
+ ret = hisi_qm_sriov_enable(pdev, vfs_num);
if (ret < 0)
goto err_remove_from_list;
}
hisi_qm_stop(qm);
err_qm_uninit:
hisi_qm_uninit(qm);
- return ret;
-}
-static int hisi_zip_sriov_configure(struct pci_dev *pdev, int num_vfs)
-{
- if (num_vfs == 0)
- return hisi_zip_sriov_disable(pdev);
- else
- return hisi_zip_sriov_enable(pdev, num_vfs);
+ return ret;
}
static void hisi_zip_remove(struct pci_dev *pdev)
struct hisi_zip *hisi_zip = pci_get_drvdata(pdev);
struct hisi_qm *qm = &hisi_zip->qm;
- if (qm->fun_type == QM_HW_PF && hisi_zip->ctrl->num_vfs != 0)
- hisi_zip_sriov_disable(pdev);
+ if (qm->fun_type == QM_HW_PF && qm->vfs_num)
+ hisi_qm_sriov_disable(pdev);
hisi_zip_debugfs_exit(hisi_zip);
hisi_qm_stop(qm);
static const struct pci_error_handlers hisi_zip_err_handler = {
.error_detected = hisi_qm_dev_err_detected,
+ .slot_reset = hisi_qm_dev_slot_reset,
+ .reset_prepare = hisi_qm_reset_prepare,
+ .reset_done = hisi_qm_reset_done,
};
static struct pci_driver hisi_zip_pci_driver = {
.probe = hisi_zip_probe,
.remove = hisi_zip_remove,
.sriov_configure = IS_ENABLED(CONFIG_PCI_IOV) ?
- hisi_zip_sriov_configure : NULL,
+ hisi_qm_sriov_configure : NULL,
.err_handler = &hisi_zip_err_handler,
};
/* Check BIST status */
bist = (u64)otx_cpt_check_bist_status(cpt);
if (bist) {
- dev_err(dev, "RAM BIST failed with code 0x%llx", bist);
+ dev_err(dev, "RAM BIST failed with code 0x%llx\n", bist);
return -ENODEV;
}
bist = otx_cpt_check_exe_bist_status(cpt);
if (bist) {
- dev_err(dev, "Engine BIST failed with code 0x%llx", bist);
+ dev_err(dev, "Engine BIST failed with code 0x%llx\n", bist);
return -ENODEV;
}
hex_dump_to_buffer(mbox_msg, sizeof(struct otx_cpt_mbox), 16, 8,
raw_data_str, OTX_CPT_MAX_MBOX_DATA_STR_SIZE, false);
if (vf_id >= 0)
- pr_debug("MBOX opcode %s received from VF%d raw_data %s",
+ pr_debug("MBOX opcode %s received from VF%d raw_data %s\n",
get_mbox_opcode_str(mbox_msg->msg), vf_id,
raw_data_str);
else
- pr_debug("MBOX opcode %s received from PF raw_data %s",
+ pr_debug("MBOX opcode %s received from PF raw_data %s\n",
get_mbox_opcode_str(mbox_msg->msg), raw_data_str);
}
struct otx_cpt_ucode *ucode;
if (q >= cpt->max_vfs) {
- dev_err(dev, "Requested queue %d is > than maximum avail %d",
+ dev_err(dev, "Requested queue %d is > than maximum avail %d\n",
q, cpt->max_vfs);
return -EINVAL;
}
if (grp >= OTX_CPT_MAX_ENGINE_GROUPS) {
- dev_err(dev, "Requested group %d is > than maximum avail %d",
+ dev_err(dev, "Requested group %d is > than maximum avail %d\n",
grp, OTX_CPT_MAX_ENGINE_GROUPS);
return -EINVAL;
}
eng_grp = &cpt->eng_grps.grp[grp];
if (!eng_grp->is_enabled) {
- dev_err(dev, "Requested engine group %d is disabled", grp);
+ dev_err(dev, "Requested engine group %d is disabled\n", grp);
return -EINVAL;
}
vftype = otx_cpt_bind_vq_to_grp(cpt, vf, (u8)mbx.data);
if ((vftype != OTX_CPT_AE_TYPES) &&
(vftype != OTX_CPT_SE_TYPES)) {
- dev_err(dev, "VF%d binding to eng group %llu failed",
+ dev_err(dev, "VF%d binding to eng group %llu failed\n",
vf, mbx.data);
otx_cptpf_mbox_send_nack(cpt, vf, &mbx);
} else {
int i;
if (eng_grp->g->engs_num > OTX_CPT_MAX_ENGINES) {
- dev_err(dev, "unsupported number of engines %d on octeontx",
+ dev_err(dev, "unsupported number of engines %d on octeontx\n",
eng_grp->g->engs_num);
return bmap;
}
}
if (!found)
- dev_err(dev, "No engines reserved for engine group %d",
+ dev_err(dev, "No engines reserved for engine group %d\n",
eng_grp->idx);
return bmap;
}
ucode_size = ntohl(ucode_hdr->code_length) * 2;
if (!ucode_size || (size < round_up(ucode_size, 16) +
sizeof(struct otx_cpt_ucode_hdr) + OTX_CPT_UCODE_SIGN_LEN)) {
- dev_err(dev, "Ucode %s invalid size", filename);
+ dev_err(dev, "Ucode %s invalid size\n", filename);
return -EINVAL;
}
{
struct tar_ucode_info_t *curr;
- pr_debug("Tar archive filename %s", tar_filename);
- pr_debug("Tar archive pointer %p, size %ld", tar_arch->fw->data,
+ pr_debug("Tar archive filename %s\n", tar_filename);
+ pr_debug("Tar archive pointer %p, size %ld\n", tar_arch->fw->data,
tar_arch->fw->size);
list_for_each_entry(curr, &tar_arch->ucodes, list) {
- pr_debug("Ucode filename %s", curr->ucode.filename);
- pr_debug("Ucode version string %s", curr->ucode.ver_str);
- pr_debug("Ucode version %d.%d.%d.%d",
+ pr_debug("Ucode filename %s\n", curr->ucode.filename);
+ pr_debug("Ucode version string %s\n", curr->ucode.ver_str);
+ pr_debug("Ucode version %d.%d.%d.%d\n",
curr->ucode.ver_num.nn, curr->ucode.ver_num.xx,
curr->ucode.ver_num.yy, curr->ucode.ver_num.zz);
- pr_debug("Ucode type (%d) %s", curr->ucode.type,
+ pr_debug("Ucode type (%d) %s\n", curr->ucode.type,
get_ucode_type_str(curr->ucode.type));
- pr_debug("Ucode size %d", curr->ucode.size);
+ pr_debug("Ucode size %d\n", curr->ucode.size);
pr_debug("Ucode ptr %p\n", curr->ucode_ptr);
}
}
goto release_tar_arch;
if (tar_arch->fw->size < TAR_BLOCK_LEN) {
- dev_err(dev, "Invalid tar archive %s ", tar_filename);
+ dev_err(dev, "Invalid tar archive %s\n", tar_filename);
goto release_tar_arch;
}
tar_size = tar_arch->fw->size;
tar_blk = (struct tar_blk_t *) tar_arch->fw->data;
if (strncmp(tar_blk->hdr.magic, TAR_MAGIC, TAR_MAGIC_LEN - 1)) {
- dev_err(dev, "Unsupported format of tar archive %s",
+ dev_err(dev, "Unsupported format of tar archive %s\n",
tar_filename);
goto release_tar_arch;
}
if (tar_offs + cur_size > tar_size ||
tar_offs + 2*TAR_BLOCK_LEN > tar_size) {
- dev_err(dev, "Invalid tar archive %s ", tar_filename);
+ dev_err(dev, "Invalid tar archive %s\n", tar_filename);
goto release_tar_arch;
}
/* Check for the end of the archive */
if (tar_offs + 2*TAR_BLOCK_LEN > tar_size) {
- dev_err(dev, "Invalid tar archive %s ", tar_filename);
+ dev_err(dev, "Invalid tar archive %s\n", tar_filename);
goto release_tar_arch;
}
static void print_ucode_dbg_info(struct otx_cpt_ucode *ucode)
{
- pr_debug("Ucode info");
- pr_debug("Ucode version string %s", ucode->ver_str);
- pr_debug("Ucode version %d.%d.%d.%d", ucode->ver_num.nn,
+ pr_debug("Ucode info\n");
+ pr_debug("Ucode version string %s\n", ucode->ver_str);
+ pr_debug("Ucode version %d.%d.%d.%d\n", ucode->ver_num.nn,
ucode->ver_num.xx, ucode->ver_num.yy, ucode->ver_num.zz);
- pr_debug("Ucode type %s", get_ucode_type_str(ucode->type));
- pr_debug("Ucode size %d", ucode->size);
- pr_debug("Ucode virt address %16.16llx", (u64)ucode->align_va);
+ pr_debug("Ucode type %s\n", get_ucode_type_str(ucode->type));
+ pr_debug("Ucode size %d\n", ucode->size);
+ pr_debug("Ucode virt address %16.16llx\n", (u64)ucode->align_va);
pr_debug("Ucode phys address %16.16llx\n", ucode->align_dma);
}
u32 mask[4];
int i, j;
- pr_debug("Engine groups global info");
- pr_debug("max SE %d, max AE %d",
+ pr_debug("Engine groups global info\n");
+ pr_debug("max SE %d, max AE %d\n",
eng_grps->avail.max_se_cnt, eng_grps->avail.max_ae_cnt);
- pr_debug("free SE %d", eng_grps->avail.se_cnt);
- pr_debug("free AE %d", eng_grps->avail.ae_cnt);
+ pr_debug("free SE %d\n", eng_grps->avail.se_cnt);
+ pr_debug("free AE %d\n", eng_grps->avail.ae_cnt);
for (i = 0; i < OTX_CPT_MAX_ENGINE_GROUPS; i++) {
grp = &eng_grps->grp[i];
- pr_debug("engine_group%d, state %s", i, grp->is_enabled ?
+ pr_debug("engine_group%d, state %s\n", i, grp->is_enabled ?
"enabled" : "disabled");
if (grp->is_enabled) {
mirrored_grp = &eng_grps->grp[grp->mirror.idx];
- pr_debug("Ucode0 filename %s, version %s",
+ pr_debug("Ucode0 filename %s, version %s\n",
grp->mirror.is_ena ?
mirrored_grp->ucode[0].filename :
grp->ucode[0].filename,
if (engs->type) {
print_engs_info(grp, engs_info,
2*OTX_CPT_UCODE_NAME_LENGTH, j);
- pr_debug("Slot%d: %s", j, engs_info);
+ pr_debug("Slot%d: %s\n", j, engs_info);
bitmap_to_arr32(mask, engs->bmap,
eng_grps->engs_num);
- pr_debug("Mask: %8.8x %8.8x %8.8x %8.8x",
+ pr_debug("Mask: %8.8x %8.8x %8.8x %8.8x\n",
mask[3], mask[2], mask[1], mask[0]);
} else
- pr_debug("Slot%d not used", j);
+ pr_debug("Slot%d not used\n", j);
}
if (grp->is_enabled) {
cpt_print_engines_mask(grp, dev, engs_mask,
OTX_CPT_UCODE_NAME_LENGTH);
- pr_debug("Cmask: %s", engs_mask);
+ pr_debug("Cmask: %s\n", engs_mask);
}
}
}
if (avail_cnt < req_eng->count) {
dev_err(dev,
- "Error available %s engines %d < than requested %d",
+ "Error available %s engines %d < than requested %d\n",
get_eng_type_str(req_eng->type),
avail_cnt, req_eng->count);
return -EBUSY;
OTX_CPT_UCODE_ALIGNMENT,
&ucode->dma, GFP_KERNEL);
if (!ucode->va) {
- dev_err(dev, "Unable to allocate space for microcode");
+ dev_err(dev, "Unable to allocate space for microcode\n");
return -ENOMEM;
}
ucode->align_va = PTR_ALIGN(ucode->va, OTX_CPT_UCODE_ALIGNMENT);
ucode->size = ntohl(ucode_hdr->code_length) * 2;
if (!ucode->size || (fw->size < round_up(ucode->size, 16)
+ sizeof(struct otx_cpt_ucode_hdr) + OTX_CPT_UCODE_SIGN_LEN)) {
- dev_err(dev, "Ucode %s invalid size", ucode_filename);
+ dev_err(dev, "Ucode %s invalid size\n", ucode_filename);
ret = -EINVAL;
goto release_fw;
}
ret = get_ucode_type(ucode_hdr, &ucode->type);
if (ret) {
- dev_err(dev, "Microcode %s unknown type 0x%x", ucode->filename,
- ucode->type);
+ dev_err(dev, "Microcode %s unknown type 0x%x\n",
+ ucode->filename, ucode->type);
goto release_fw;
}
break;
default:
- dev_err(dev, "Invalid engine type %d", engs->type);
+ dev_err(dev, "Invalid engine type %d\n", engs->type);
return -EINVAL;
}
return -EINVAL;
if (eng_grp->mirror.ref_count) {
- dev_err(dev, "Can't delete engine_group%d as it is used by:",
+ dev_err(dev, "Can't delete engine_group%d as it is used by engine_group(s):",
eng_grp->idx);
for (i = 0; i < OTX_CPT_MAX_ENGINE_GROUPS; i++) {
if (eng_grp->g->grp[i].mirror.is_ena &&
eng_grp->g->grp[i].mirror.idx == eng_grp->idx)
- dev_err(dev, "engine_group%d", i);
+ pr_cont(" %d", i);
}
+ pr_cont("\n");
return -EINVAL;
}
if (!otx_cpt_uc_supports_eng_type(&eng_grp->ucode[0],
engs[i].type)) {
dev_err(dev,
- "Microcode %s does not support %s engines",
+ "Microcode %s does not support %s engines\n",
eng_grp->ucode[0].filename,
get_eng_type_str(engs[i].type));
return -EINVAL;
/* Validate if requested engine types are supported by this device */
for (i = 0; i < engs_cnt; i++)
if (!dev_supports_eng_type(eng_grps, engs[i].type)) {
- dev_err(dev, "Device does not support %s engines",
+ dev_err(dev, "Device does not support %s engines\n",
get_eng_type_str(engs[i].type));
return -EPERM;
}
/* Find engine group which is not used */
eng_grp = find_unused_eng_grp(eng_grps);
if (!eng_grp) {
- dev_err(dev, "Error all engine groups are being used");
+ dev_err(dev, "Error all engine groups are being used\n");
return -ENOSPC;
}
eng_grp->is_enabled = true;
if (eng_grp->mirror.is_ena)
dev_info(dev,
- "Engine_group%d: reuse microcode %s from group %d",
+ "Engine_group%d: reuse microcode %s from group %d\n",
eng_grp->idx, mirrored_eng_grp->ucode[0].ver_str,
mirrored_eng_grp->idx);
else
- dev_info(dev, "Engine_group%d: microcode loaded %s",
+ dev_info(dev, "Engine_group%d: microcode loaded %s\n",
eng_grp->idx, eng_grp->ucode[0].ver_str);
return 0;
} else {
if (del_grp_idx < 0 ||
del_grp_idx >= OTX_CPT_MAX_ENGINE_GROUPS) {
- dev_err(dev, "Invalid engine group index %d",
+ dev_err(dev, "Invalid engine group index %d\n",
del_grp_idx);
ret = -EINVAL;
return ret;
}
if (!eng_grps->grp[del_grp_idx].is_enabled) {
- dev_err(dev, "Error engine_group%d is not configured",
+ dev_err(dev, "Error engine_group%d is not configured\n",
del_grp_idx);
ret = -EINVAL;
return ret;
udelay(CSR_DELAY);
reg = readq(cpt->reg_base + OTX_CPT_PF_EXEC_BUSY);
if (timeout--) {
- dev_warn(&cpt->pdev->dev, "Cores still busy");
+ dev_warn(&cpt->pdev->dev, "Cores still busy\n");
break;
}
}
eng_grps->avail.max_ae_cnt;
if (eng_grps->engs_num > OTX_CPT_MAX_ENGINES) {
dev_err(&pdev->dev,
- "Number of engines %d > than max supported %d",
+ "Number of engines %d > than max supported %d\n",
eng_grps->engs_num, OTX_CPT_MAX_ENGINES);
ret = -EINVAL;
goto err;
case OTX_CPT_SE_TYPES:
count = atomic_read(&se_devices.count);
if (count >= CPT_MAX_VF_NUM) {
- dev_err(&pdev->dev, "No space to add a new device");
+ dev_err(&pdev->dev, "No space to add a new device\n");
ret = -ENOSPC;
goto err;
}
case OTX_CPT_AE_TYPES:
count = atomic_read(&ae_devices.count);
if (count >= CPT_MAX_VF_NUM) {
- dev_err(&pdev->dev, "No space to a add new device");
+ dev_err(&pdev->dev, "No space to a add new device\n");
ret = -ENOSPC;
goto err;
}
}
if (!dev_found) {
- dev_err(&pdev->dev, "%s device not found", __func__);
+ dev_err(&pdev->dev, "%s device not found\n", __func__);
goto exit;
}
cptvf_write_vq_done_ack(cptvf, intr);
wqe = get_cptvf_vq_wqe(cptvf, 0);
if (unlikely(!wqe)) {
- dev_err(&pdev->dev, "No work to schedule for VF (%d)",
+ dev_err(&pdev->dev, "No work to schedule for VF (%d)\n",
cptvf->vfid);
return IRQ_NONE;
}
if (!zalloc_cpumask_var(&cptvf->affinity_mask[vec],
GFP_KERNEL)) {
dev_err(&pdev->dev,
- "Allocation failed for affinity_mask for VF %d",
+ "Allocation failed for affinity_mask for VF %d\n",
cptvf->vfid);
return;
}
return -EINVAL;
if (val >= OTX_CPT_MAX_ENGINE_GROUPS) {
- dev_err(dev, "Engine group >= than max available groups %d",
+ dev_err(dev, "Engine group >= than max available groups %d\n",
OTX_CPT_MAX_ENGINE_GROUPS);
return -EINVAL;
}
cptvf_misc_intr_handler, 0, "CPT VF misc intr",
cptvf);
if (err) {
- dev_err(dev, "Failed to request misc irq");
+ dev_err(dev, "Failed to request misc irq\n");
goto free_vectors;
}
cptvf->cqinfo.qchunksize = OTX_CPT_CMD_QCHUNK_SIZE;
err = cptvf_sw_init(cptvf, OTX_CPT_CMD_QLEN, OTX_CPT_NUM_QS_PER_VF);
if (err) {
- dev_err(dev, "cptvf_sw_init() failed");
+ dev_err(dev, "cptvf_sw_init() failed\n");
goto free_misc_irq;
}
/* Convey VQ LEN to PF */
/* Convey DOWN to PF */
if (otx_cptvf_send_vf_down(cptvf)) {
- dev_err(&pdev->dev, "PF not responding to DOWN msg");
+ dev_err(&pdev->dev, "PF not responding to DOWN msg\n");
} else {
sysfs_remove_group(&pdev->dev.kobj, &otx_cptvf_sysfs_group);
otx_cpt_crypto_exit(pdev, THIS_MODULE, cptvf->vftype);
GFP_ATOMIC;
ret = setup_sgio_list(pdev, &info, req, gfp);
if (unlikely(ret)) {
- dev_err(&pdev->dev, "Setting up SG list failed");
+ dev_err(&pdev->dev, "Setting up SG list failed\n");
goto request_cleanup;
}
cpt_req->dlen = info->dlen;
struct otx_cptvf *cptvf = pci_get_drvdata(pdev);
if (!otx_cpt_device_ready(cptvf)) {
- dev_err(&pdev->dev, "CPT Device is not ready");
+ dev_err(&pdev->dev, "CPT Device is not ready\n");
return -ENODEV;
}
if ((cptvf->vftype == OTX_CPT_SE_TYPES) && (!req->ctrl.s.se_req)) {
- dev_err(&pdev->dev, "CPTVF-%d of SE TYPE got AE request",
+ dev_err(&pdev->dev, "CPTVF-%d of SE TYPE got AE request\n",
cptvf->vfid);
return -EINVAL;
} else if ((cptvf->vftype == OTX_CPT_AE_TYPES) &&
(req->ctrl.s.se_req)) {
- dev_err(&pdev->dev, "CPTVF-%d of AE TYPE got SE request",
+ dev_err(&pdev->dev, "CPTVF-%d of AE TYPE got SE request\n",
cptvf->vfid);
return -EINVAL;
}
/* check for timeout */
if (time_after_eq(jiffies, cpt_info->time_in +
OTX_CPT_COMMAND_TIMEOUT * HZ))
- dev_warn(&pdev->dev, "Request timed out 0x%p", req);
+ dev_warn(&pdev->dev, "Request timed out 0x%p\n", req);
else if (cpt_info->extra_time < OTX_CPT_TIME_IN_RESET_COUNT) {
cpt_info->time_in = jiffies;
cpt_info->extra_time++;
size_t ds = crypto_shash_digestsize(bctx->shash);
int err, i;
- SHASH_DESC_ON_STACK(shash, bctx->shash);
-
- shash->tfm = bctx->shash;
-
if (keylen > bs) {
- err = crypto_shash_digest(shash, key, keylen, bctx->ipad);
+ err = crypto_shash_tfm_digest(bctx->shash, key, keylen,
+ bctx->ipad);
if (err)
return err;
keylen = ds;
struct n2_hmac_ctx *ctx = crypto_ahash_ctx(tfm);
struct crypto_shash *child_shash = ctx->child_shash;
struct crypto_ahash *fallback_tfm;
- SHASH_DESC_ON_STACK(shash, child_shash);
int err, bs, ds;
fallback_tfm = ctx->base.fallback_tfm;
if (err)
return err;
- shash->tfm = child_shash;
-
bs = crypto_shash_blocksize(child_shash);
ds = crypto_shash_digestsize(child_shash);
BUG_ON(ds > N2_HASH_KEY_MAX);
if (keylen > bs) {
- err = crypto_shash_digest(shash, key, keylen,
- ctx->hash_key);
+ err = crypto_shash_tfm_digest(child_shash, key, keylen,
+ ctx->hash_key);
if (err)
return err;
keylen = ds;
#include <linux/of_irq.h>
#include <linux/delay.h>
#include <linux/crypto.h>
-#include <linux/cryptohash.h>
#include <crypto/scatterwalk.h>
#include <crypto/algapi.h>
#include <crypto/sha.h>
return omap_sham_enqueue(req, OP_UPDATE);
}
-static int omap_sham_shash_digest(struct crypto_shash *tfm, u32 flags,
- const u8 *data, unsigned int len, u8 *out)
-{
- SHASH_DESC_ON_STACK(shash, tfm);
-
- shash->tfm = tfm;
-
- return crypto_shash_digest(shash, data, len, out);
-}
-
static int omap_sham_final_shash(struct ahash_request *req)
{
struct omap_sham_ctx *tctx = crypto_tfm_ctx(req->base.tfm);
!test_bit(FLAGS_AUTO_XOR, &ctx->dd->flags))
offset = get_block_size(ctx);
- return omap_sham_shash_digest(tctx->fallback, req->base.flags,
- ctx->buffer + offset,
- ctx->bufcnt - offset, req->result);
+ return crypto_shash_tfm_digest(tctx->fallback, ctx->buffer + offset,
+ ctx->bufcnt - offset, req->result);
}
static int omap_sham_final(struct ahash_request *req)
return err;
if (keylen > bs) {
- err = omap_sham_shash_digest(bctx->shash,
- crypto_shash_get_flags(bctx->shash),
- key, keylen, bctx->ipad);
+ err = crypto_shash_tfm_digest(bctx->shash, key, keylen,
+ bctx->ipad);
if (err)
return err;
keylen = ds;
return s5p_hash_enqueue(req, true); /* HASH_OP_UPDATE */
}
-/**
- * s5p_hash_shash_digest() - calculate shash digest
- * @tfm: crypto transformation
- * @flags: tfm flags
- * @data: input data
- * @len: length of data
- * @out: output buffer
- */
-static int s5p_hash_shash_digest(struct crypto_shash *tfm, u32 flags,
- const u8 *data, unsigned int len, u8 *out)
-{
- SHASH_DESC_ON_STACK(shash, tfm);
-
- shash->tfm = tfm;
-
- return crypto_shash_digest(shash, data, len, out);
-}
-
-/**
- * s5p_hash_final_shash() - calculate shash digest
- * @req: AHASH request
- */
-static int s5p_hash_final_shash(struct ahash_request *req)
-{
- struct s5p_hash_ctx *tctx = crypto_tfm_ctx(req->base.tfm);
- struct s5p_hash_reqctx *ctx = ahash_request_ctx(req);
-
- return s5p_hash_shash_digest(tctx->fallback, req->base.flags,
- ctx->buffer, ctx->bufcnt, req->result);
-}
-
/**
* s5p_hash_final() - close up hash and calculate digest
* @req: AHASH request
if (ctx->error)
return -EINVAL; /* uncompleted hash is not needed */
- if (!ctx->digcnt && ctx->bufcnt < BUFLEN)
- return s5p_hash_final_shash(req);
+ if (!ctx->digcnt && ctx->bufcnt < BUFLEN) {
+ struct s5p_hash_ctx *tctx = crypto_tfm_ctx(req->base.tfm);
+
+ return crypto_shash_tfm_digest(tctx->fallback, ctx->buffer,
+ ctx->bufcnt, req->result);
+ }
return s5p_hash_enqueue(req, false); /* HASH_OP_FINAL */
}
/* Registers values */
#define CRC_CR_RESET BIT(0)
-#define CRC_CR_REVERSE (BIT(7) | BIT(6) | BIT(5))
-#define CRC_INIT_DEFAULT 0xFFFFFFFF
+#define CRC_CR_REV_IN_WORD (BIT(6) | BIT(5))
+#define CRC_CR_REV_IN_BYTE BIT(5)
+#define CRC_CR_REV_OUT BIT(7)
+#define CRC32C_INIT_DEFAULT 0xFFFFFFFF
#define CRC_AUTOSUSPEND_DELAY 50
+static unsigned int burst_size;
+module_param(burst_size, uint, 0644);
+MODULE_PARM_DESC(burst_size, "Select burst byte size (0 unlimited)");
+
struct stm32_crc {
struct list_head list;
struct device *dev;
void __iomem *regs;
struct clk *clk;
- u8 pending_data[sizeof(u32)];
- size_t nb_pending_bytes;
+ spinlock_t lock;
};
struct stm32_crc_list {
struct stm32_crc_desc_ctx {
u32 partial; /* crc32c: partial in first 4 bytes of that struct */
- struct stm32_crc *crc;
};
static int stm32_crc32_cra_init(struct crypto_tfm *tfm)
{
struct stm32_crc_ctx *mctx = crypto_tfm_ctx(tfm);
- mctx->key = CRC_INIT_DEFAULT;
+ mctx->key = 0;
mctx->poly = CRC32_POLY_LE;
return 0;
}
{
struct stm32_crc_ctx *mctx = crypto_tfm_ctx(tfm);
- mctx->key = CRC_INIT_DEFAULT;
+ mctx->key = CRC32C_INIT_DEFAULT;
mctx->poly = CRC32C_POLY_LE;
return 0;
}
return 0;
}
+static struct stm32_crc *stm32_crc_get_next_crc(void)
+{
+ struct stm32_crc *crc;
+
+ spin_lock_bh(&crc_list.lock);
+ crc = list_first_entry(&crc_list.dev_list, struct stm32_crc, list);
+ if (crc)
+ list_move_tail(&crc->list, &crc_list.dev_list);
+ spin_unlock_bh(&crc_list.lock);
+
+ return crc;
+}
+
static int stm32_crc_init(struct shash_desc *desc)
{
struct stm32_crc_desc_ctx *ctx = shash_desc_ctx(desc);
struct stm32_crc_ctx *mctx = crypto_shash_ctx(desc->tfm);
struct stm32_crc *crc;
+ unsigned long flags;
- spin_lock_bh(&crc_list.lock);
- list_for_each_entry(crc, &crc_list.dev_list, list) {
- ctx->crc = crc;
- break;
- }
- spin_unlock_bh(&crc_list.lock);
+ crc = stm32_crc_get_next_crc();
+ if (!crc)
+ return -ENODEV;
+
+ pm_runtime_get_sync(crc->dev);
- pm_runtime_get_sync(ctx->crc->dev);
+ spin_lock_irqsave(&crc->lock, flags);
/* Reset, set key, poly and configure in bit reverse mode */
- writel_relaxed(bitrev32(mctx->key), ctx->crc->regs + CRC_INIT);
- writel_relaxed(bitrev32(mctx->poly), ctx->crc->regs + CRC_POL);
- writel_relaxed(CRC_CR_RESET | CRC_CR_REVERSE, ctx->crc->regs + CRC_CR);
+ writel_relaxed(bitrev32(mctx->key), crc->regs + CRC_INIT);
+ writel_relaxed(bitrev32(mctx->poly), crc->regs + CRC_POL);
+ writel_relaxed(CRC_CR_RESET | CRC_CR_REV_IN_WORD | CRC_CR_REV_OUT,
+ crc->regs + CRC_CR);
/* Store partial result */
- ctx->partial = readl_relaxed(ctx->crc->regs + CRC_DR);
- ctx->crc->nb_pending_bytes = 0;
+ ctx->partial = readl_relaxed(crc->regs + CRC_DR);
- pm_runtime_mark_last_busy(ctx->crc->dev);
- pm_runtime_put_autosuspend(ctx->crc->dev);
+ spin_unlock_irqrestore(&crc->lock, flags);
+
+ pm_runtime_mark_last_busy(crc->dev);
+ pm_runtime_put_autosuspend(crc->dev);
return 0;
}
-static int stm32_crc_update(struct shash_desc *desc, const u8 *d8,
- unsigned int length)
+static int burst_update(struct shash_desc *desc, const u8 *d8,
+ size_t length)
{
struct stm32_crc_desc_ctx *ctx = shash_desc_ctx(desc);
- struct stm32_crc *crc = ctx->crc;
- u32 *d32;
- unsigned int i;
+ struct stm32_crc_ctx *mctx = crypto_shash_ctx(desc->tfm);
+ struct stm32_crc *crc;
+ unsigned long flags;
+
+ crc = stm32_crc_get_next_crc();
+ if (!crc)
+ return -ENODEV;
pm_runtime_get_sync(crc->dev);
- if (unlikely(crc->nb_pending_bytes)) {
- while (crc->nb_pending_bytes != sizeof(u32) && length) {
- /* Fill in pending data */
- crc->pending_data[crc->nb_pending_bytes++] = *(d8++);
+ spin_lock_irqsave(&crc->lock, flags);
+
+ /*
+ * Restore previously calculated CRC for this context as init value
+ * Restore polynomial configuration
+ * Configure in register for word input data,
+ * Configure out register in reversed bit mode data.
+ */
+ writel_relaxed(bitrev32(ctx->partial), crc->regs + CRC_INIT);
+ writel_relaxed(bitrev32(mctx->poly), crc->regs + CRC_POL);
+ writel_relaxed(CRC_CR_RESET | CRC_CR_REV_IN_WORD | CRC_CR_REV_OUT,
+ crc->regs + CRC_CR);
+
+ if (d8 != PTR_ALIGN(d8, sizeof(u32))) {
+ /* Configure for byte data */
+ writel_relaxed(CRC_CR_REV_IN_BYTE | CRC_CR_REV_OUT,
+ crc->regs + CRC_CR);
+ while (d8 != PTR_ALIGN(d8, sizeof(u32)) && length) {
+ writeb_relaxed(*d8++, crc->regs + CRC_DR);
length--;
}
-
- if (crc->nb_pending_bytes == sizeof(u32)) {
- /* Process completed pending data */
- writel_relaxed(*(u32 *)crc->pending_data,
- crc->regs + CRC_DR);
- crc->nb_pending_bytes = 0;
- }
+ /* Configure for word data */
+ writel_relaxed(CRC_CR_REV_IN_WORD | CRC_CR_REV_OUT,
+ crc->regs + CRC_CR);
}
- d32 = (u32 *)d8;
- for (i = 0; i < length >> 2; i++)
- /* Process 32 bits data */
- writel_relaxed(*(d32++), crc->regs + CRC_DR);
+ for (; length >= sizeof(u32); d8 += sizeof(u32), length -= sizeof(u32))
+ writel_relaxed(*((u32 *)d8), crc->regs + CRC_DR);
+
+ if (length) {
+ /* Configure for byte data */
+ writel_relaxed(CRC_CR_REV_IN_BYTE | CRC_CR_REV_OUT,
+ crc->regs + CRC_CR);
+ while (length--)
+ writeb_relaxed(*d8++, crc->regs + CRC_DR);
+ }
/* Store partial result */
ctx->partial = readl_relaxed(crc->regs + CRC_DR);
+ spin_unlock_irqrestore(&crc->lock, flags);
+
pm_runtime_mark_last_busy(crc->dev);
pm_runtime_put_autosuspend(crc->dev);
- /* Check for pending data (non 32 bits) */
- length &= 3;
- if (likely(!length))
- return 0;
+ return 0;
+}
- if ((crc->nb_pending_bytes + length) >= sizeof(u32)) {
- /* Shall not happen */
- dev_err(crc->dev, "Pending data overflow\n");
- return -EINVAL;
- }
+static int stm32_crc_update(struct shash_desc *desc, const u8 *d8,
+ unsigned int length)
+{
+ const unsigned int burst_sz = burst_size;
+ unsigned int rem_sz;
+ const u8 *cur;
+ size_t size;
+ int ret;
- d8 = (const u8 *)d32;
- for (i = 0; i < length; i++)
- /* Store pending data */
- crc->pending_data[crc->nb_pending_bytes++] = *(d8++);
+ if (!burst_sz)
+ return burst_update(desc, d8, length);
+
+ /* Digest first bytes not 32bit aligned at first pass in the loop */
+ size = min(length,
+ burst_sz + (unsigned int)d8 - ALIGN_DOWN((unsigned int)d8,
+ sizeof(u32)));
+ for (rem_sz = length, cur = d8; rem_sz;
+ rem_sz -= size, cur += size, size = min(rem_sz, burst_sz)) {
+ ret = burst_update(desc, cur, size);
+ if (ret)
+ return ret;
+ }
return 0;
}
return stm32_crc_init(desc) ?: stm32_crc_finup(desc, data, length, out);
}
+static unsigned int refcnt;
+static DEFINE_MUTEX(refcnt_lock);
static struct shash_alg algs[] = {
/* CRC-32 */
{
pm_runtime_get_noresume(dev);
pm_runtime_set_active(dev);
+ pm_runtime_irq_safe(dev);
pm_runtime_enable(dev);
+ spin_lock_init(&crc->lock);
+
platform_set_drvdata(pdev, crc);
spin_lock(&crc_list.lock);
list_add(&crc->list, &crc_list.dev_list);
spin_unlock(&crc_list.lock);
- ret = crypto_register_shashes(algs, ARRAY_SIZE(algs));
- if (ret) {
- dev_err(dev, "Failed to register\n");
- clk_disable_unprepare(crc->clk);
- return ret;
+ mutex_lock(&refcnt_lock);
+ if (!refcnt) {
+ ret = crypto_register_shashes(algs, ARRAY_SIZE(algs));
+ if (ret) {
+ mutex_unlock(&refcnt_lock);
+ dev_err(dev, "Failed to register\n");
+ clk_disable_unprepare(crc->clk);
+ return ret;
+ }
}
+ refcnt++;
+ mutex_unlock(&refcnt_lock);
dev_info(dev, "Initialized\n");
list_del(&crc->list);
spin_unlock(&crc_list.lock);
- crypto_unregister_shashes(algs, ARRAY_SIZE(algs));
+ mutex_lock(&refcnt_lock);
+ if (!--refcnt)
+ crypto_unregister_shashes(algs, ARRAY_SIZE(algs));
+ mutex_unlock(&refcnt_lock);
pm_runtime_disable(crc->dev);
pm_runtime_put_noidle(crc->dev);
return 0;
}
-#ifdef CONFIG_PM
-static int stm32_crc_runtime_suspend(struct device *dev)
+static int __maybe_unused stm32_crc_suspend(struct device *dev)
{
struct stm32_crc *crc = dev_get_drvdata(dev);
+ int ret;
- clk_disable_unprepare(crc->clk);
+ ret = pm_runtime_force_suspend(dev);
+ if (ret)
+ return ret;
+
+ clk_unprepare(crc->clk);
return 0;
}
-static int stm32_crc_runtime_resume(struct device *dev)
+static int __maybe_unused stm32_crc_resume(struct device *dev)
{
struct stm32_crc *crc = dev_get_drvdata(dev);
int ret;
- ret = clk_prepare_enable(crc->clk);
+ ret = clk_prepare(crc->clk);
if (ret) {
- dev_err(crc->dev, "Failed to prepare_enable clock\n");
+ dev_err(crc->dev, "Failed to prepare clock\n");
+ return ret;
+ }
+
+ return pm_runtime_force_resume(dev);
+}
+
+static int __maybe_unused stm32_crc_runtime_suspend(struct device *dev)
+{
+ struct stm32_crc *crc = dev_get_drvdata(dev);
+
+ clk_disable(crc->clk);
+
+ return 0;
+}
+
+static int __maybe_unused stm32_crc_runtime_resume(struct device *dev)
+{
+ struct stm32_crc *crc = dev_get_drvdata(dev);
+ int ret;
+
+ ret = clk_enable(crc->clk);
+ if (ret) {
+ dev_err(crc->dev, "Failed to enable clock\n");
return ret;
}
return 0;
}
-#endif
static const struct dev_pm_ops stm32_crc_pm_ops = {
- SET_SYSTEM_SLEEP_PM_OPS(pm_runtime_force_suspend,
- pm_runtime_force_resume)
+ SET_SYSTEM_SLEEP_PM_OPS(stm32_crc_suspend,
+ stm32_crc_resume)
SET_RUNTIME_PM_OPS(stm32_crc_runtime_suspend,
stm32_crc_runtime_resume, NULL)
};
static int stm32_hash_dma_init(struct stm32_hash_dev *hdev)
{
struct dma_slave_config dma_conf;
+ struct dma_chan *chan;
int err;
memset(&dma_conf, 0, sizeof(dma_conf));
dma_conf.dst_maxburst = hdev->dma_maxburst;
dma_conf.device_fc = false;
- hdev->dma_lch = dma_request_chan(hdev->dev, "in");
- if (IS_ERR(hdev->dma_lch)) {
- dev_err(hdev->dev, "Couldn't acquire a slave DMA channel.\n");
- return PTR_ERR(hdev->dma_lch);
- }
+ chan = dma_request_chan(hdev->dev, "in");
+ if (IS_ERR(chan))
+ return PTR_ERR(chan);
+
+ hdev->dma_lch = chan;
err = dmaengine_slave_config(hdev->dma_lch, &dma_conf);
if (err) {
hdev->clk = devm_clk_get(&pdev->dev, NULL);
if (IS_ERR(hdev->clk)) {
- dev_err(dev, "failed to get clock for hash (%lu)\n",
- PTR_ERR(hdev->clk));
+ if (PTR_ERR(hdev->clk) != -EPROBE_DEFER) {
+ dev_err(dev, "failed to get clock for hash (%lu)\n",
+ PTR_ERR(hdev->clk));
+ }
+
return PTR_ERR(hdev->clk);
}
pm_runtime_enable(dev);
hdev->rst = devm_reset_control_get(&pdev->dev, NULL);
- if (!IS_ERR(hdev->rst)) {
+ if (IS_ERR(hdev->rst)) {
+ if (PTR_ERR(hdev->rst) == -EPROBE_DEFER) {
+ ret = -EPROBE_DEFER;
+ goto err_reset;
+ }
+ } else {
reset_control_assert(hdev->rst);
udelay(2);
reset_control_deassert(hdev->rst);
platform_set_drvdata(pdev, hdev);
ret = stm32_hash_dma_init(hdev);
- if (ret)
+ switch (ret) {
+ case 0:
+ break;
+ case -ENOENT:
dev_dbg(dev, "DMA mode not available\n");
+ break;
+ default:
+ goto err_dma;
+ }
spin_lock(&stm32_hash.lock);
list_add_tail(&hdev->list, &stm32_hash.dev_list);
spin_lock(&stm32_hash.lock);
list_del(&hdev->list);
spin_unlock(&stm32_hash.lock);
-
+err_dma:
if (hdev->dma_lch)
dma_release_channel(hdev->dma_lch);
-
+err_reset:
pm_runtime_disable(dev);
pm_runtime_put_noidle(dev);
resource_size_t kmem_size;
resource_size_t kmem_end;
struct resource *new_res;
+ const char *new_res_name;
int numa_node;
int rc;
kmem_size &= ~(memory_block_size_bytes() - 1);
kmem_end = kmem_start + kmem_size;
- /* Region is permanently reserved. Hot-remove not yet implemented. */
- new_res = request_mem_region(kmem_start, kmem_size, dev_name(dev));
+ new_res_name = kstrdup(dev_name(dev), GFP_KERNEL);
+ if (!new_res_name)
+ return -ENOMEM;
+
+ /* Region is permanently reserved if hotremove fails. */
+ new_res = request_mem_region(kmem_start, kmem_size, new_res_name);
if (!new_res) {
dev_warn(dev, "could not reserve region [%pa-%pa]\n",
&kmem_start, &kmem_end);
+ kfree(new_res_name);
return -EBUSY;
}
* unknown to us that will break add_memory() below.
*/
new_res->flags = IORESOURCE_SYSTEM_RAM;
- new_res->name = dev_name(dev);
rc = add_memory(numa_node, new_res->start, resource_size(new_res));
if (rc) {
release_resource(new_res);
kfree(new_res);
+ kfree(new_res_name);
return rc;
}
dev_dax->dax_kmem_res = new_res;
struct resource *res = dev_dax->dax_kmem_res;
resource_size_t kmem_start = res->start;
resource_size_t kmem_size = resource_size(res);
+ const char *res_name = res->name;
int rc;
/*
/* Release and free dax resources */
release_resource(res);
kfree(res);
+ kfree(res_name);
dev_dax->dax_kmem_res = NULL;
return 0;
mutex_unlock(&info->lock);
return ret;
} else if (dmatest_run) {
- if (is_threaded_test_pending(info))
- start_threaded_tests(info);
- else
- pr_info("Could not start test, no channels configured\n");
+ if (!is_threaded_test_pending(info)) {
+ pr_info("No channels configured, continue with any\n");
+ add_threaded_test(info);
+ }
+ start_threaded_tests(info);
} else {
stop_threaded_test(info);
}
perm.ignore = 0;
iowrite32(perm.bits, idxd->reg_base + offset);
+ /*
+ * A readback from the device ensures that any previously generated
+ * completion record writes are visible to software based on PCI
+ * ordering rules.
+ */
+ perm.bits = ioread32(idxd->reg_base + offset);
+
return 0;
}
struct llist_node *head;
int queued = 0;
+ *processed = 0;
head = llist_del_all(&irq_entry->pending_llist);
if (!head)
return 0;
struct list_head *node, *next;
int queued = 0;
+ *processed = 0;
if (list_empty(&irq_entry->work_list))
return 0;
return queued;
}
-irqreturn_t idxd_wq_thread(int irq, void *data)
+static int idxd_desc_process(struct idxd_irq_entry *irq_entry)
{
- struct idxd_irq_entry *irq_entry = data;
- int rc, processed = 0, retry = 0;
+ int rc, processed, total = 0;
/*
* There are two lists we are processing. The pending_llist is where
*/
do {
rc = irq_process_work_list(irq_entry, &processed);
- if (rc != 0) {
- retry++;
+ total += processed;
+ if (rc != 0)
continue;
- }
rc = irq_process_pending_llist(irq_entry, &processed);
- } while (rc != 0 && retry != 10);
+ total += processed;
+ } while (rc != 0);
+
+ return total;
+}
+
+irqreturn_t idxd_wq_thread(int irq, void *data)
+{
+ struct idxd_irq_entry *irq_entry = data;
+ int processed;
+ processed = idxd_desc_process(irq_entry);
idxd_unmask_msix_vector(irq_entry->idxd, irq_entry->id);
+ /* catch anything unprocessed after unmasking */
+ processed += idxd_desc_process(irq_entry);
if (processed == 0)
return IRQ_NONE;
* @id: physical index to this channel
* @base: virtual memory base for the dma channel
* @vchan: the virtual channel currently being served by this physical channel
- * @lock: a lock to use when altering an instance of this struct
*/
struct owl_dma_pchan {
u32 id;
void __iomem *base;
struct owl_dma_vchan *vchan;
- spinlock_t lock;
};
/**
for (i = 0; i < od->nr_pchans; i++) {
pchan = &od->pchans[i];
- spin_lock_irqsave(&pchan->lock, flags);
+ spin_lock_irqsave(&od->lock, flags);
if (!pchan->vchan) {
pchan->vchan = vchan;
- spin_unlock_irqrestore(&pchan->lock, flags);
+ spin_unlock_irqrestore(&od->lock, flags);
break;
}
- spin_unlock_irqrestore(&pchan->lock, flags);
+ spin_unlock_irqrestore(&od->lock, flags);
}
return pchan;
ret = dma_async_device_register(&tdma->dma_dev);
if (ret < 0) {
dev_err(&pdev->dev, "ADMA registration failed: %d\n", ret);
- goto irq_dispose;
+ goto rpm_put;
}
ret = of_dma_controller_register(pdev->dev.of_node,
d->residue += sg_dma_len(sgent);
}
- cppi5_tr_csf_set(&tr_req[tr_idx - 1].flags, CPPI5_TR_CSF_EOP);
+ cppi5_tr_csf_set(&tr_req[tr_idx - 1].flags,
+ CPPI5_TR_CSF_SUPR_EVT | CPPI5_TR_CSF_EOP);
return d;
}
tr_req[1].dicnt3 = 1;
}
- cppi5_tr_csf_set(&tr_req[num_tr - 1].flags, CPPI5_TR_CSF_EOP);
+ cppi5_tr_csf_set(&tr_req[num_tr - 1].flags,
+ CPPI5_TR_CSF_SUPR_EVT | CPPI5_TR_CSF_EOP);
if (uc->config.metadata_size)
d->vd.tx.metadata_ops = &metadata_ops;
struct zynqmp_dma_desc_sw *child, *next;
chan->desc_free_cnt++;
+ list_del(&sdesc->node);
list_add_tail(&sdesc->node, &chan->free_list);
list_for_each_entry_safe(child, next, &sdesc->tx_list, node) {
chan->desc_free_cnt++;
dma_async_tx_callback callback;
void *callback_param;
- list_del(&desc->node);
-
callback = desc->async_tx.callback;
callback_param = desc->async_tx.callback_param;
if (callback) {
static int hw_info_get(struct amd64_pvt *pvt)
{
u16 pci_id1, pci_id2;
- int ret = -EINVAL;
+ int ret;
if (pvt->fam >= 0x17) {
pvt->umc = kcalloc(fam_type->max_mcs, sizeof(struct amd64_umc), GFP_KERNEL);
" PCI Access Write Error at 0x%x\n", reg);
}
-static char * const bridge_str[] = {
- [NORTH_A] = "NORTH A",
- [NORTH_B] = "NORTH B",
- [SOUTH_A] = "SOUTH A",
- [SOUTH_B] = "SOUTH B",
- [NO_BRIDGE] = "NO BRIDGE",
-};
-
/* Support up to two AMD8131 chipsets on a platform */
static struct amd8131_dev_info amd8131_devices[] = {
{
char msg[128];
};
-/* derived from "DRAM Address Multiplexing" in the ARAMDA XP Functional Spec */
+/* derived from "DRAM Address Multiplexing" in the ARMADA XP Functional Spec */
static uint32_t axp_mc_calc_address(struct axp_mc_drvdata *drvdata,
uint8_t cs, uint8_t bank, uint16_t row,
uint16_t col)
if (cnt_sbe)
cnt_sbe--;
else
- dev_warn(mci->pdev, "inconsistent SBE count detected");
+ dev_warn(mci->pdev, "inconsistent SBE count detected\n");
} else {
if (cnt_dbe)
cnt_dbe--;
else
- dev_warn(mci->pdev, "inconsistent DBE count detected");
+ dev_warn(mci->pdev, "inconsistent DBE count detected\n");
}
/* report earlier errors */
config = readl(base + SDRAM_CONFIG_REG);
if (!(config & SDRAM_CONFIG_ECC_MASK)) {
- dev_warn(&pdev->dev, "SDRAM ECC is not enabled");
+ dev_warn(&pdev->dev, "SDRAM ECC is not enabled\n");
return -EINVAL;
}
l2x0_aux_ctrl = readl(base + L2X0_AUX_CTRL);
if (!(l2x0_aux_ctrl & AURORA_ACR_PARITY_EN))
- dev_warn(&pdev->dev, "tag parity is not enabled");
+ dev_warn(&pdev->dev, "tag parity is not enabled\n");
if (!(l2x0_aux_ctrl & AURORA_ACR_ECC_EN))
- dev_warn(&pdev->dev, "data ECC is not enabled");
+ dev_warn(&pdev->dev, "data ECC is not enabled\n");
dci = edac_device_alloc_ctl_info(sizeof(*drvdata),
"cpu", 1, "L", 1, 2, NULL, 0, 0);
res = platform_register_drivers(drivers, ARRAY_SIZE(drivers));
if (res)
- pr_warn("Aramda XP EDAC drivers fail to register\n");
+ pr_warn("Armada XP EDAC drivers fail to register\n");
return 0;
}
return 0;
}
+static struct res_config i10nm_cfg0 = {
+ .type = I10NM,
+ .decs_did = 0x3452,
+ .busno_cfg_offset = 0xcc,
+};
+
+static struct res_config i10nm_cfg1 = {
+ .type = I10NM,
+ .decs_did = 0x3452,
+ .busno_cfg_offset = 0xd0,
+};
+
static const struct x86_cpu_id i10nm_cpuids[] = {
- X86_MATCH_INTEL_FAM6_MODEL(ATOM_TREMONT_D, NULL),
- X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_X, NULL),
- X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_D, NULL),
+ X86_MATCH_INTEL_FAM6_MODEL(ATOM_TREMONT_D, &i10nm_cfg0),
+ X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_X, &i10nm_cfg0),
+ X86_MATCH_INTEL_FAM6_MODEL(ICELAKE_D, &i10nm_cfg1),
{}
};
MODULE_DEVICE_TABLE(x86cpu, i10nm_cpuids);
mtr, mcddrtcfg, imc->mc, i, j);
if (IS_DIMM_PRESENT(mtr))
- ndimms += skx_get_dimm_info(mtr, 0, dimm,
+ ndimms += skx_get_dimm_info(mtr, 0, 0, dimm,
imc, i, j);
else if (IS_NVDIMM_PRESENT(mcddrtcfg, j))
ndimms += skx_get_nvdimm_info(dimm, imc, i, j,
{
u8 mc = 0, src_id = 0, node_id = 0;
const struct x86_cpu_id *id;
+ struct res_config *cfg;
const char *owner;
struct skx_dev *d;
int rc, i, off[3] = {0xd0, 0xc8, 0xcc};
if (!id)
return -ENODEV;
+ cfg = (struct res_config *)id->driver_data;
+
+ /* Newer steppings have different offset for ATOM_TREMONT_D/ICELAKE_X */
+ if (boot_cpu_data.x86_stepping >= 4)
+ cfg->busno_cfg_offset = 0xd0;
+
rc = skx_get_hi_lo(0x09a2, off, &tolm, &tohm);
if (rc)
return rc;
- rc = skx_get_all_bus_mappings(0x3452, 0xcc, I10NM, &i10nm_edac_list);
+ rc = skx_get_all_bus_mappings(cfg, &i10nm_edac_list);
if (rc < 0)
goto fail;
if (rc == 0) {
return -ENODEV;
}
+static struct res_config skx_cfg = {
+ .type = SKX,
+ .decs_did = 0x2016,
+ .busno_cfg_offset = 0xcc,
+};
+
static const struct x86_cpu_id skx_cpuids[] = {
- X86_MATCH_INTEL_FAM6_MODEL(SKYLAKE_X, NULL),
+ X86_MATCH_INTEL_FAM6_MODEL(SKYLAKE_X, &skx_cfg),
{ }
};
MODULE_DEVICE_TABLE(x86cpu, skx_cpuids);
-#define SKX_GET_MTMTR(dev, reg) \
- pci_read_config_dword((dev), 0x87c, &(reg))
-
-static bool skx_check_ecc(struct pci_dev *pdev)
+static bool skx_check_ecc(u32 mcmtr)
{
- u32 mtmtr;
-
- SKX_GET_MTMTR(pdev, mtmtr);
-
- return !!GET_BITFIELD(mtmtr, 2, 2);
+ return !!GET_BITFIELD(mcmtr, 2, 2);
}
static int skx_get_dimm_config(struct mem_ctl_info *mci)
{
struct skx_pvt *pvt = mci->pvt_info;
+ u32 mtr, mcmtr, amap, mcddrtcfg;
struct skx_imc *imc = pvt->imc;
- u32 mtr, amap, mcddrtcfg;
struct dimm_info *dimm;
int i, j;
int ndimms;
+ /* Only the mcmtr on the first channel is effective */
+ pci_read_config_dword(imc->chan[0].cdev, 0x87c, &mcmtr);
+
for (i = 0; i < SKX_NUM_CHANNELS; i++) {
ndimms = 0;
pci_read_config_dword(imc->chan[i].cdev, 0x8C, &amap);
pci_read_config_dword(imc->chan[i].cdev,
0x80 + 4 * j, &mtr);
if (IS_DIMM_PRESENT(mtr)) {
- ndimms += skx_get_dimm_info(mtr, amap, dimm, imc, i, j);
+ ndimms += skx_get_dimm_info(mtr, mcmtr, amap, dimm, imc, i, j);
} else if (IS_NVDIMM_PRESENT(mcddrtcfg, j)) {
ndimms += skx_get_nvdimm_info(dimm, imc, i, j,
EDAC_MOD_STR);
nvdimm_count++;
}
}
- if (ndimms && !skx_check_ecc(imc->chan[0].cdev)) {
+ if (ndimms && !skx_check_ecc(mcmtr)) {
skx_printk(KERN_ERR, "ECC is disabled on imc %d\n", imc->mc);
return -ENODEV;
}
static int __init skx_init(void)
{
const struct x86_cpu_id *id;
+ struct res_config *cfg;
const struct munit *m;
const char *owner;
int rc = 0, i, off[3] = {0xd0, 0xd4, 0xd8};
if (!id)
return -ENODEV;
+ cfg = (struct res_config *)id->driver_data;
+
rc = skx_get_hi_lo(0x2034, off, &skx_tolm, &skx_tohm);
if (rc)
return rc;
- rc = skx_get_all_bus_mappings(0x2016, 0xcc, SKX, &skx_edac_list);
+ rc = skx_get_all_bus_mappings(cfg, &skx_edac_list);
if (rc < 0)
goto fail;
if (rc == 0) {
}
/*
- * We use the per-socket device @did to count how many sockets are present,
+ * We use the per-socket device @cfg->did to count how many sockets are present,
* and to detemine which PCI buses are associated with each socket. Allocate
* and build the full list of all the skx_dev structures that we need here.
*/
-int skx_get_all_bus_mappings(unsigned int did, int off, enum type type,
- struct list_head **list)
+int skx_get_all_bus_mappings(struct res_config *cfg, struct list_head **list)
{
struct pci_dev *pdev, *prev;
struct skx_dev *d;
prev = NULL;
for (;;) {
- pdev = pci_get_device(PCI_VENDOR_ID_INTEL, did, prev);
+ pdev = pci_get_device(PCI_VENDOR_ID_INTEL, cfg->decs_did, prev);
if (!pdev)
break;
ndev++;
return -ENOMEM;
}
- if (pci_read_config_dword(pdev, off, ®)) {
+ if (pci_read_config_dword(pdev, cfg->busno_cfg_offset, ®)) {
kfree(d);
pci_dev_put(pdev);
skx_printk(KERN_ERR, "Failed to read bus idx\n");
d->bus[0] = GET_BITFIELD(reg, 0, 7);
d->bus[1] = GET_BITFIELD(reg, 8, 15);
- if (type == SKX) {
+ if (cfg->type == SKX) {
d->seg = pci_domain_nr(pdev->bus);
d->bus[2] = GET_BITFIELD(reg, 16, 23);
d->bus[3] = GET_BITFIELD(reg, 24, 31);
#define numrow(reg) skx_get_dimm_attr(reg, 2, 4, 12, 1, 6, "rows")
#define numcol(reg) skx_get_dimm_attr(reg, 0, 1, 10, 0, 2, "cols")
-int skx_get_dimm_info(u32 mtr, u32 amap, struct dimm_info *dimm,
+int skx_get_dimm_info(u32 mtr, u32 mcmtr, u32 amap, struct dimm_info *dimm,
struct skx_imc *imc, int chan, int dimmno)
{
int banks = 16, ranks, rows, cols, npages;
imc->mc, chan, dimmno, size, npages,
banks, 1 << ranks, rows, cols);
- imc->chan[chan].dimms[dimmno].close_pg = GET_BITFIELD(mtr, 0, 0);
- imc->chan[chan].dimms[dimmno].bank_xor_enable = GET_BITFIELD(mtr, 9, 9);
+ imc->chan[chan].dimms[dimmno].close_pg = GET_BITFIELD(mcmtr, 0, 0);
+ imc->chan[chan].dimms[dimmno].bank_xor_enable = GET_BITFIELD(mcmtr, 9, 9);
imc->chan[chan].dimms[dimmno].fine_grain_bank = GET_BITFIELD(amap, 0, 0);
imc->chan[chan].dimms[dimmno].rowbits = rows;
imc->chan[chan].dimms[dimmno].colbits = cols;
int bank_group;
};
+struct res_config {
+ enum type type;
+ /* Configuration agent device ID */
+ unsigned int decs_did;
+ /* Default bus number configuration register offset */
+ int busno_cfg_offset;
+};
+
typedef int (*get_dimm_config_f)(struct mem_ctl_info *mci);
typedef bool (*skx_decode_f)(struct decoded_addr *res);
typedef void (*skx_show_retry_log_f)(struct decoded_addr *res, char *msg, int len);
int skx_get_src_id(struct skx_dev *d, int off, u8 *id);
int skx_get_node_id(struct skx_dev *d, u8 *id);
-int skx_get_all_bus_mappings(unsigned int did, int off, enum type,
- struct list_head **list);
+int skx_get_all_bus_mappings(struct res_config *cfg, struct list_head **list);
int skx_get_hi_lo(unsigned int did, int off[], u64 *tolm, u64 *tohm);
-int skx_get_dimm_info(u32 mtr, u32 amap, struct dimm_info *dimm,
+int skx_get_dimm_info(u32 mtr, u32 mcmtr, u32 amap, struct dimm_info *dimm,
struct skx_imc *imc, int chan, int dimmno);
int skx_get_nvdimm_info(struct dimm_info *dimm, struct skx_imc *imc,
OCX_DEBUGFS_ATTR(com_int, OCX_COM_INT_W1S);
-struct debugfs_entry *ocx_dfs_ents[] = {
+static struct debugfs_entry *ocx_dfs_ents[] = {
&debugfs_tlk0_ecc_ctl,
&debugfs_tlk1_ecc_ctl,
&debugfs_tlk2_ecc_ctl,
L2C_DEBUGFS_ATTR(tad_int, L2C_TAD_INT_W1S);
-struct debugfs_entry *l2c_tad_dfs_ents[] = {
+static struct debugfs_entry *l2c_tad_dfs_ents[] = {
&debugfs_tad_int,
};
L2C_DEBUGFS_ATTR(cbc_int, L2C_CBC_INT_W1S);
-struct debugfs_entry *l2c_cbc_dfs_ents[] = {
+static struct debugfs_entry *l2c_cbc_dfs_ents[] = {
&debugfs_cbc_int,
};
L2C_DEBUGFS_ATTR(mci_int, L2C_MCI_INT_W1S);
-struct debugfs_entry *l2c_mci_dfs_ents[] = {
+static struct debugfs_entry *l2c_mci_dfs_ents[] = {
&debugfs_mci_int,
};
#define WORD_ALIGNED_ERR_MASK BIT(28)
#define PAGE_ACCESS_ERR_MASK BIT(27)
#define WRITE_ACCESS_MASK BIT(26)
-#define RBERRADDR_RD(src) ((src) & 0x03FFFFFF)
static const char * const soc_mem_err_v1[] = {
"10GbE0",
return;
if (reg & STICKYERR_MASK) {
bool write;
- u32 address;
dev_err(edac_dev->dev, "IOB bus access error(s)\n");
if (regmap_read(ctx->edac->rb_map, RBEIR, ®))
return;
write = reg & WRITE_ACCESS_MASK ? 1 : 0;
- address = RBERRADDR_RD(reg);
if (reg & AGENT_OFFLINE_ERR_MASK)
dev_err(edac_dev->dev,
"IOB bus %s access to offline agent error\n",
config EFI_RUNTIME_WRAPPERS
bool
-config EFI_ARMSTUB
+config EFI_GENERIC_STUB
bool
config EFI_ARMSTUB_DTB_LOADER
bool "Enable the DTB loader"
- depends on EFI_ARMSTUB
+ depends on EFI_GENERIC_STUB
default y
help
Select this config option to add support for the dtb= command
functionality for bootloaders that do not have such support
this option is necessary.
+config EFI_GENERIC_STUB_INITRD_CMDLINE_LOADER
+ bool "Enable the command line initrd loader" if !X86
+ depends on EFI_STUB && (EFI_GENERIC_STUB || X86)
+ default y
+ help
+ Select this config option to add support for the initrd= command
+ line parameter, allowing an initrd that resides on the same volume
+ as the kernel image to be loaded into memory.
+
+ This method is deprecated.
+
config EFI_BOOTLOADER_CONTROL
tristate "EFI Bootloader Control"
depends on EFI_VARS
static __initdata unsigned long screen_info_table = EFI_INVALID_TABLE_ADDR;
static const efi_config_table_type_t arch_tables[] __initconst = {
- {LINUX_EFI_ARM_SCREEN_INFO_TABLE_GUID, NULL, &screen_info_table},
- {NULL_GUID, NULL, NULL}
+ {LINUX_EFI_ARM_SCREEN_INFO_TABLE_GUID, &screen_info_table},
+ {}
};
static void __init init_screen_info(void)
}
}
+static const char * const fw_err_rec_type_strs[] = {
+ "IPF SAL Error Record",
+ "SOC Firmware Error Record Type1 (Legacy CrashLog Support)",
+ "SOC Firmware Error Record Type2",
+};
+
+static void cper_print_fw_err(const char *pfx,
+ struct acpi_hest_generic_data *gdata,
+ const struct cper_sec_fw_err_rec_ref *fw_err)
+{
+ void *buf = acpi_hest_get_payload(gdata);
+ u32 offset, length = gdata->error_data_length;
+
+ printk("%s""Firmware Error Record Type: %s\n", pfx,
+ fw_err->record_type < ARRAY_SIZE(fw_err_rec_type_strs) ?
+ fw_err_rec_type_strs[fw_err->record_type] : "unknown");
+ printk("%s""Revision: %d\n", pfx, fw_err->revision);
+
+ /* Record Type based on UEFI 2.7 */
+ if (fw_err->revision == 0) {
+ printk("%s""Record Identifier: %08llx\n", pfx,
+ fw_err->record_identifier);
+ } else if (fw_err->revision == 2) {
+ printk("%s""Record Identifier: %pUl\n", pfx,
+ &fw_err->record_identifier_guid);
+ }
+
+ /*
+ * The FW error record may contain trailing data beyond the
+ * structure defined by the specification. As the fields
+ * defined (and hence the offset of any trailing data) vary
+ * with the revision, set the offset to account for this
+ * variation.
+ */
+ if (fw_err->revision == 0) {
+ /* record_identifier_guid not defined */
+ offset = offsetof(struct cper_sec_fw_err_rec_ref,
+ record_identifier_guid);
+ } else if (fw_err->revision == 1) {
+ /* record_identifier not defined */
+ offset = offsetof(struct cper_sec_fw_err_rec_ref,
+ record_identifier);
+ } else {
+ offset = sizeof(*fw_err);
+ }
+
+ buf += offset;
+ length -= offset;
+
+ print_hex_dump(pfx, "", DUMP_PREFIX_OFFSET, 16, 4, buf, length, true);
+}
+
static void cper_print_tstamp(const char *pfx,
struct acpi_hest_generic_data_v300 *gdata)
{
else
goto err_section_too_small;
#endif
+ } else if (guid_equal(sec_type, &CPER_SEC_FW_ERR_REC_REF)) {
+ struct cper_sec_fw_err_rec_ref *fw_err = acpi_hest_get_payload(gdata);
+
+ printk("%ssection_type: Firmware Error Record Reference\n",
+ newpfx);
+ /* The minimal FW Error Record contains 16 bytes */
+ if (gdata->error_data_length >= SZ_16)
+ cper_print_fw_err(newpfx, gdata, fw_err);
+ else
+ goto err_section_too_small;
} else {
const void *err = acpi_hest_get_payload(gdata);
const u32 color_black = 0x00000000;
const u32 color_white = 0x00ffffff;
const u8 *src;
- u8 s8;
- int m;
+ int m, n, bytes;
+ u8 x;
- src = font->data + c * font->height;
- s8 = *(src + h);
+ bytes = BITS_TO_BYTES(font->width);
+ src = font->data + c * font->height * bytes + h * bytes;
- for (m = 0; m < 8; m++) {
- if ((s8 >> (7 - m)) & 1)
+ for (m = 0; m < font->width; m++) {
+ n = m % 8;
+ x = *(src + m / 8);
+ if ((x >> (7 - n)) & 1)
*dst = color_white;
else
*dst = color_black;
if (efi.smbios != EFI_INVALID_TABLE_ADDR)
str += sprintf(str, "SMBIOS=0x%lx\n", efi.smbios);
- if (IS_ENABLED(CONFIG_IA64) || IS_ENABLED(CONFIG_X86)) {
- extern char *efi_systab_show_arch(char *str);
-
+ if (IS_ENABLED(CONFIG_IA64) || IS_ENABLED(CONFIG_X86))
str = efi_systab_show_arch(str);
- }
return str - buf;
}
}
static const efi_config_table_type_t common_tables[] __initconst = {
- {ACPI_20_TABLE_GUID, "ACPI 2.0", &efi.acpi20},
- {ACPI_TABLE_GUID, "ACPI", &efi.acpi},
- {SMBIOS_TABLE_GUID, "SMBIOS", &efi.smbios},
- {SMBIOS3_TABLE_GUID, "SMBIOS 3.0", &efi.smbios3},
- {EFI_SYSTEM_RESOURCE_TABLE_GUID, "ESRT", &efi.esrt},
- {EFI_MEMORY_ATTRIBUTES_TABLE_GUID, "MEMATTR", &efi_mem_attr_table},
- {LINUX_EFI_RANDOM_SEED_TABLE_GUID, "RNG", &efi_rng_seed},
- {LINUX_EFI_TPM_EVENT_LOG_GUID, "TPMEventLog", &efi.tpm_log},
- {LINUX_EFI_TPM_FINAL_LOG_GUID, "TPMFinalLog", &efi.tpm_final_log},
- {LINUX_EFI_MEMRESERVE_TABLE_GUID, "MEMRESERVE", &mem_reserve},
- {EFI_RT_PROPERTIES_TABLE_GUID, "RTPROP", &rt_prop},
+ {ACPI_20_TABLE_GUID, &efi.acpi20, "ACPI 2.0" },
+ {ACPI_TABLE_GUID, &efi.acpi, "ACPI" },
+ {SMBIOS_TABLE_GUID, &efi.smbios, "SMBIOS" },
+ {SMBIOS3_TABLE_GUID, &efi.smbios3, "SMBIOS 3.0" },
+ {EFI_SYSTEM_RESOURCE_TABLE_GUID, &efi.esrt, "ESRT" },
+ {EFI_MEMORY_ATTRIBUTES_TABLE_GUID, &efi_mem_attr_table, "MEMATTR" },
+ {LINUX_EFI_RANDOM_SEED_TABLE_GUID, &efi_rng_seed, "RNG" },
+ {LINUX_EFI_TPM_EVENT_LOG_GUID, &efi.tpm_log, "TPMEventLog" },
+ {LINUX_EFI_TPM_FINAL_LOG_GUID, &efi.tpm_final_log, "TPMFinalLog" },
+ {LINUX_EFI_MEMRESERVE_TABLE_GUID, &mem_reserve, "MEMRESERVE" },
+ {EFI_RT_PROPERTIES_TABLE_GUID, &rt_prop, "RTPROP" },
#ifdef CONFIG_EFI_RCI2_TABLE
- {DELLEMC_EFI_RCI2_TABLE_GUID, NULL, &rci2_table_phys},
+ {DELLEMC_EFI_RCI2_TABLE_GUID, &rci2_table_phys },
#endif
- {NULL_GUID, NULL, NULL},
+ {},
};
static __init int match_config_table(const efi_guid_t *guid,
{
int i;
- if (table_types) {
- for (i = 0; efi_guidcmp(table_types[i].guid, NULL_GUID); i++) {
- if (!efi_guidcmp(*guid, table_types[i].guid)) {
- *(table_types[i].ptr) = table;
- if (table_types[i].name)
- pr_cont(" %s=0x%lx ",
- table_types[i].name, table);
- return 1;
- }
+ for (i = 0; efi_guidcmp(table_types[i].guid, NULL_GUID); i++) {
+ if (!efi_guidcmp(*guid, table_types[i].guid)) {
+ *(table_types[i].ptr) = table;
+ if (table_types[i].name[0])
+ pr_cont("%s=0x%lx ",
+ table_types[i].name, table);
+ return 1;
}
}
table = tbl32[i].table;
}
- if (!match_config_table(guid, table, common_tables))
+ if (!match_config_table(guid, table, common_tables) && arch_tables)
match_config_table(guid, table, arch_tables);
}
pr_cont("\n");
ret = kobject_init_and_add(&new_var->kobj, &efivar_ktype,
NULL, "%s", short_name);
kfree(short_name);
- if (ret)
+ if (ret) {
+ kobject_put(&new_var->kobj);
return ret;
+ }
kobject_uevent(&new_var->kobj, KOBJ_ADD);
if (efivar_entry_add(new_var, &efivar_sysfs_list)) {
#
cflags-$(CONFIG_X86_32) := -march=i386
cflags-$(CONFIG_X86_64) := -mcmodel=small
-cflags-$(CONFIG_X86) += -m$(BITS) -D__KERNEL__ -O2 \
+cflags-$(CONFIG_X86) += -m$(BITS) -D__KERNEL__ \
-fPIC -fno-strict-aliasing -mno-red-zone \
-mno-mmx -mno-sse -fshort-wchar \
-Wno-pointer-sign \
-fno-builtin -fpic \
$(call cc-option,-mno-single-pic-base)
-cflags-$(CONFIG_EFI_ARMSTUB) += -I$(srctree)/scripts/dtc/libfdt
+cflags-$(CONFIG_EFI_GENERIC_STUB) += -I$(srctree)/scripts/dtc/libfdt
-KBUILD_CFLAGS := $(cflags-y) -DDISABLE_BRANCH_PROFILING \
+KBUILD_CFLAGS := $(cflags-y) -Os -DDISABLE_BRANCH_PROFILING \
-include $(srctree)/drivers/firmware/efi/libstub/hidden.h \
-D__NO_FORTIFY \
$(call cc-option,-ffreestanding) \
$(call cc-option,-fno-stack-protector) \
+ $(call cc-option,-fno-addrsig) \
-D__DISABLE_EXPORTS
GCOV_PROFILE := n
lib-y := efi-stub-helper.o gop.o secureboot.o tpm.o \
file.o mem.o random.o randomalloc.o pci.o \
- skip_spaces.o lib-cmdline.o lib-ctype.o
+ skip_spaces.o lib-cmdline.o lib-ctype.o \
+ alignedmem.o relocate.o vsprintf.o
# include the stub's generic dependencies from lib/ when building for ARM/arm64
-arm-deps-y := fdt_rw.c fdt_ro.c fdt_wip.c fdt.c fdt_empty_tree.c fdt_sw.c
+efi-deps-y := fdt_rw.c fdt_ro.c fdt_wip.c fdt.c fdt_empty_tree.c fdt_sw.c
$(obj)/lib-%.o: $(srctree)/lib/%.c FORCE
$(call if_changed_rule,cc_o_c)
-lib-$(CONFIG_EFI_ARMSTUB) += arm-stub.o fdt.o string.o \
- $(patsubst %.c,lib-%.o,$(arm-deps-y))
+lib-$(CONFIG_EFI_GENERIC_STUB) += efi-stub.o fdt.o string.o \
+ $(patsubst %.c,lib-%.o,$(efi-deps-y))
lib-$(CONFIG_ARM) += arm32-stub.o
lib-$(CONFIG_ARM64) += arm64-stub.o
CFLAGS_arm32-stub.o := -DTEXT_OFFSET=$(TEXT_OFFSET)
CFLAGS_arm64-stub.o := -DTEXT_OFFSET=$(TEXT_OFFSET)
+#
+# For x86, bootloaders like systemd-boot or grub-efi do not zero-initialize the
+# .bss section, so the .bss section of the EFI stub needs to be included in the
+# .data section of the compressed kernel to ensure initialization. Rename the
+# .bss section here so it's easy to pick out in the linker script.
+#
+STUBCOPY_FLAGS-$(CONFIG_X86) += --rename-section .bss=.bss.efistub,load,alloc
+STUBCOPY_RELOC-$(CONFIG_X86_32) := R_386_32
+STUBCOPY_RELOC-$(CONFIG_X86_64) := R_X86_64_64
+
+#
+# ARM discards the .data section because it disallows r/w data in the
+# decompressor. So move our .data to .data.efistub and .bss to .bss.efistub,
+# which are preserved explicitly by the decompressor linker script.
+#
+STUBCOPY_FLAGS-$(CONFIG_ARM) += --rename-section .data=.data.efistub \
+ --rename-section .bss=.bss.efistub,load,alloc
+STUBCOPY_RELOC-$(CONFIG_ARM) := R_ARM_ABS
+
#
# arm64 puts the stub in the kernel proper, which will unnecessarily retain all
# code indefinitely unless it is annotated as __init/__initdata/__initconst etc.
# a verification pass to see if any absolute relocations exist in any of the
# object files.
#
-extra-$(CONFIG_EFI_ARMSTUB) := $(lib-y)
-lib-$(CONFIG_EFI_ARMSTUB) := $(patsubst %.o,%.stub.o,$(lib-y))
+extra-y := $(lib-y)
+lib-y := $(patsubst %.o,%.stub.o,$(lib-y))
STUBCOPY_FLAGS-$(CONFIG_ARM64) += --prefix-alloc-sections=.init \
--prefix-symbols=__efistub_
/bin/false; \
fi; \
$(OBJCOPY) $(STUBCOPY_FLAGS-y) $< $@
-
-#
-# ARM discards the .data section because it disallows r/w data in the
-# decompressor. So move our .data to .data.efistub, which is preserved
-# explicitly by the decompressor linker script.
-#
-STUBCOPY_FLAGS-$(CONFIG_ARM) += --rename-section .data=.data.efistub
-STUBCOPY_RELOC-$(CONFIG_ARM) := R_ARM_ABS
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/efi.h>
+#include <asm/efi.h>
+
+#include "efistub.h"
+
+/**
+ * efi_allocate_pages_aligned() - Allocate memory pages
+ * @size: minimum number of bytes to allocate
+ * @addr: On return the address of the first allocated page. The first
+ * allocated page has alignment EFI_ALLOC_ALIGN which is an
+ * architecture dependent multiple of the page size.
+ * @max: the address that the last allocated memory page shall not
+ * exceed
+ * @align: minimum alignment of the base of the allocation
+ *
+ * Allocate pages as EFI_LOADER_DATA. The allocated pages are aligned according
+ * to @align, which should be >= EFI_ALLOC_ALIGN. The last allocated page will
+ * not exceed the address given by @max.
+ *
+ * Return: status code
+ */
+efi_status_t efi_allocate_pages_aligned(unsigned long size, unsigned long *addr,
+ unsigned long max, unsigned long align)
+{
+ efi_physical_addr_t alloc_addr;
+ efi_status_t status;
+ int slack;
+
+ if (align < EFI_ALLOC_ALIGN)
+ align = EFI_ALLOC_ALIGN;
+
+ alloc_addr = ALIGN_DOWN(max + 1, align) - 1;
+ size = round_up(size, EFI_ALLOC_ALIGN);
+ slack = align / EFI_PAGE_SIZE - 1;
+
+ status = efi_bs_call(allocate_pages, EFI_ALLOCATE_MAX_ADDRESS,
+ EFI_LOADER_DATA, size / EFI_PAGE_SIZE + slack,
+ &alloc_addr);
+ if (status != EFI_SUCCESS)
+ return status;
+
+ *addr = ALIGN((unsigned long)alloc_addr, align);
+
+ if (slack > 0) {
+ int l = (alloc_addr % align) / EFI_PAGE_SIZE;
+
+ if (l) {
+ efi_bs_call(free_pages, alloc_addr, slack - l + 1);
+ slack = l - 1;
+ }
+ if (slack)
+ efi_bs_call(free_pages, *addr + size, slack);
+ }
+ return EFI_SUCCESS;
+}
/* LPAE kernels need compatible hardware */
block = cpuid_feature_extract(CPUID_EXT_MMFR0, 0);
if (block < 5) {
- pr_efi_err("This LPAE kernel is not supported by your CPU\n");
+ efi_err("This LPAE kernel is not supported by your CPU\n");
return EFI_UNSUPPORTED;
}
return EFI_SUCCESS;
*/
status = efi_get_memory_map(&map);
if (status != EFI_SUCCESS) {
- pr_efi_err("reserve_kernel_base(): Unable to retrieve memory map.\n");
+ efi_err("reserve_kernel_base(): Unable to retrieve memory map.\n");
return status;
}
(end - start) / EFI_PAGE_SIZE,
&start);
if (status != EFI_SUCCESS) {
- pr_efi_err("reserve_kernel_base(): alloc failed.\n");
+ efi_err("reserve_kernel_base(): alloc failed.\n");
goto out;
}
break;
status = reserve_kernel_base(kernel_base, reserve_addr, reserve_size);
if (status != EFI_SUCCESS) {
- pr_efi_err("Unable to allocate memory for uncompressed kernel.\n");
+ efi_err("Unable to allocate memory for uncompressed kernel.\n");
return status;
}
status = efi_relocate_kernel(image_addr, *image_size, *image_size,
kernel_base + MAX_UNCOMP_KERNEL_SIZE, 0, 0);
if (status != EFI_SUCCESS) {
- pr_efi_err("Failed to relocate kernel.\n");
+ efi_err("Failed to relocate kernel.\n");
efi_free(*reserve_size, *reserve_addr);
*reserve_size = 0;
return status;
* address at which the zImage is loaded.
*/
if (*image_addr + *image_size > dram_base + ZIMAGE_OFFSET_LIMIT) {
- pr_efi_err("Failed to relocate kernel, no low memory available.\n");
+ efi_err("Failed to relocate kernel, no low memory available.\n");
efi_free(*reserve_size, *reserve_addr);
*reserve_size = 0;
efi_free(*image_size, *image_addr);
tg = (read_cpuid(ID_AA64MMFR0_EL1) >> ID_AA64MMFR0_TGRAN_SHIFT) & 0xf;
if (tg != ID_AA64MMFR0_TGRAN_SUPPORTED) {
if (IS_ENABLED(CONFIG_ARM64_64K_PAGES))
- pr_efi_err("This 64 KB granular kernel is not supported by your CPU\n");
+ efi_err("This 64 KB granular kernel is not supported by your CPU\n");
else
- pr_efi_err("This 16 KB granular kernel is not supported by your CPU\n");
+ efi_err("This 16 KB granular kernel is not supported by your CPU\n");
return EFI_UNSUPPORTED;
}
return EFI_SUCCESS;
}
+/*
+ * Relocatable kernels can fix up the misalignment with respect to
+ * MIN_KIMG_ALIGN, so they only require a minimum alignment of EFI_KIMG_ALIGN
+ * (which accounts for the alignment of statically allocated objects such as
+ * the swapper stack.)
+ */
+static const u64 min_kimg_align = IS_ENABLED(CONFIG_RELOCATABLE) ? EFI_KIMG_ALIGN
+ : MIN_KIMG_ALIGN;
+
efi_status_t handle_kernel_image(unsigned long *image_addr,
unsigned long *image_size,
unsigned long *reserve_addr,
{
efi_status_t status;
unsigned long kernel_size, kernel_memsize = 0;
- unsigned long preferred_offset;
- u64 phys_seed = 0;
+ u32 phys_seed = 0;
if (IS_ENABLED(CONFIG_RANDOMIZE_BASE)) {
- if (!nokaslr()) {
+ if (!efi_nokaslr) {
status = efi_get_random_bytes(sizeof(phys_seed),
(u8 *)&phys_seed);
if (status == EFI_NOT_FOUND) {
- pr_efi("EFI_RNG_PROTOCOL unavailable, no randomness supplied\n");
+ efi_info("EFI_RNG_PROTOCOL unavailable, no randomness supplied\n");
} else if (status != EFI_SUCCESS) {
- pr_efi_err("efi_get_random_bytes() failed\n");
+ efi_err("efi_get_random_bytes() failed\n");
return status;
}
} else {
- pr_efi("KASLR disabled on kernel command line\n");
+ efi_info("KASLR disabled on kernel command line\n");
}
}
- /*
- * The preferred offset of the kernel Image is TEXT_OFFSET bytes beyond
- * a 2 MB aligned base, which itself may be lower than dram_base, as
- * long as the resulting offset equals or exceeds it.
- */
- preferred_offset = round_down(dram_base, MIN_KIMG_ALIGN) + TEXT_OFFSET;
- if (preferred_offset < dram_base)
- preferred_offset += MIN_KIMG_ALIGN;
+ if (image->image_base != _text)
+ efi_err("FIRMWARE BUG: efi_loaded_image_t::image_base has bogus value\n");
kernel_size = _edata - _text;
kernel_memsize = kernel_size + (_end - _edata);
+ *reserve_size = kernel_memsize + TEXT_OFFSET % min_kimg_align;
if (IS_ENABLED(CONFIG_RANDOMIZE_BASE) && phys_seed != 0) {
- /*
- * Produce a displacement in the interval [0, MIN_KIMG_ALIGN)
- * that doesn't violate this kernel's de-facto alignment
- * constraints.
- */
- u32 mask = (MIN_KIMG_ALIGN - 1) & ~(EFI_KIMG_ALIGN - 1);
- u32 offset = (phys_seed >> 32) & mask;
-
- /*
- * With CONFIG_RANDOMIZE_TEXT_OFFSET=y, TEXT_OFFSET may not
- * be a multiple of EFI_KIMG_ALIGN, and we must ensure that
- * we preserve the misalignment of 'offset' relative to
- * EFI_KIMG_ALIGN so that statically allocated objects whose
- * alignment exceeds PAGE_SIZE appear correctly aligned in
- * memory.
- */
- offset |= TEXT_OFFSET % EFI_KIMG_ALIGN;
-
/*
* If KASLR is enabled, and we have some randomness available,
* locate the kernel at a randomized offset in physical memory.
*/
- *reserve_size = kernel_memsize + offset;
- status = efi_random_alloc(*reserve_size,
- MIN_KIMG_ALIGN, reserve_addr,
- (u32)phys_seed);
-
- *image_addr = *reserve_addr + offset;
+ status = efi_random_alloc(*reserve_size, min_kimg_align,
+ reserve_addr, phys_seed);
} else {
- /*
- * Else, try a straight allocation at the preferred offset.
- * This will work around the issue where, if dram_base == 0x0,
- * efi_low_alloc() refuses to allocate at 0x0 (to prevent the
- * address of the allocation to be mistaken for a FAIL return
- * value or a NULL pointer). It will also ensure that, on
- * platforms where the [dram_base, dram_base + TEXT_OFFSET)
- * interval is partially occupied by the firmware (like on APM
- * Mustang), we can still place the kernel at the address
- * 'dram_base + TEXT_OFFSET'.
- */
- *image_addr = (unsigned long)_text;
- if (*image_addr == preferred_offset)
- return EFI_SUCCESS;
-
- *image_addr = *reserve_addr = preferred_offset;
- *reserve_size = round_up(kernel_memsize, EFI_ALLOC_ALIGN);
-
- status = efi_bs_call(allocate_pages, EFI_ALLOCATE_ADDRESS,
- EFI_LOADER_DATA,
- *reserve_size / EFI_PAGE_SIZE,
- (efi_physical_addr_t *)reserve_addr);
+ status = EFI_OUT_OF_RESOURCES;
}
if (status != EFI_SUCCESS) {
- *reserve_size = kernel_memsize + TEXT_OFFSET;
- status = efi_low_alloc(*reserve_size,
- MIN_KIMG_ALIGN, reserve_addr);
+ if (IS_ALIGNED((u64)_text - TEXT_OFFSET, min_kimg_align)) {
+ /*
+ * Just execute from wherever we were loaded by the
+ * UEFI PE/COFF loader if the alignment is suitable.
+ */
+ *image_addr = (u64)_text;
+ *reserve_size = 0;
+ return EFI_SUCCESS;
+ }
+
+ status = efi_allocate_pages_aligned(*reserve_size, reserve_addr,
+ ULONG_MAX, min_kimg_align);
if (status != EFI_SUCCESS) {
- pr_efi_err("Failed to relocate kernel\n");
+ efi_err("Failed to relocate kernel\n");
*reserve_size = 0;
return status;
}
- *image_addr = *reserve_addr + TEXT_OFFSET;
}
- if (image->image_base != _text)
- pr_efi_err("FIRMWARE BUG: efi_loaded_image_t::image_base has bogus value\n");
-
+ *image_addr = *reserve_addr + TEXT_OFFSET % min_kimg_align;
memcpy((void *)*image_addr, _text, kernel_size);
return EFI_SUCCESS;
* Copyright 2011 Intel Corporation; author Matt Fleming
*/
+#include <stdarg.h>
+
+#include <linux/ctype.h>
#include <linux/efi.h>
+#include <linux/kernel.h>
+#include <linux/printk.h> /* For CONSOLE_LOGLEVEL_* */
#include <asm/efi.h>
+#include <asm/setup.h>
#include "efistub.h"
-static bool __efistub_global efi_nochunk;
-static bool __efistub_global efi_nokaslr;
-static bool __efistub_global efi_noinitrd;
-static bool __efistub_global efi_quiet;
-static bool __efistub_global efi_novamap;
-static bool __efistub_global efi_nosoftreserve;
-static bool __efistub_global efi_disable_pci_dma =
- IS_ENABLED(CONFIG_EFI_DISABLE_PCI_DMA);
+bool efi_nochunk;
+bool efi_nokaslr;
+bool efi_noinitrd;
+int efi_loglevel = CONSOLE_LOGLEVEL_DEFAULT;
+bool efi_novamap;
-bool __pure nochunk(void)
-{
- return efi_nochunk;
-}
-bool __pure nokaslr(void)
-{
- return efi_nokaslr;
-}
-bool __pure noinitrd(void)
+static bool efi_nosoftreserve;
+static bool efi_disable_pci_dma = IS_ENABLED(CONFIG_EFI_DISABLE_PCI_DMA);
+
+bool __pure __efi_soft_reserve_enabled(void)
{
- return efi_noinitrd;
+ return !efi_nosoftreserve;
}
-bool __pure is_quiet(void)
+
+void efi_char16_puts(efi_char16_t *str)
{
- return efi_quiet;
+ efi_call_proto(efi_table_attr(efi_system_table, con_out),
+ output_string, str);
}
-bool __pure novamap(void)
+
+static
+u32 utf8_to_utf32(const u8 **s8)
{
- return efi_novamap;
+ u32 c32;
+ u8 c0, cx;
+ size_t clen, i;
+
+ c0 = cx = *(*s8)++;
+ /*
+ * The position of the most-significant 0 bit gives us the length of
+ * a multi-octet encoding.
+ */
+ for (clen = 0; cx & 0x80; ++clen)
+ cx <<= 1;
+ /*
+ * If the 0 bit is in position 8, this is a valid single-octet
+ * encoding. If the 0 bit is in position 7 or positions 1-3, the
+ * encoding is invalid.
+ * In either case, we just return the first octet.
+ */
+ if (clen < 2 || clen > 4)
+ return c0;
+ /* Get the bits from the first octet. */
+ c32 = cx >> clen--;
+ for (i = 0; i < clen; ++i) {
+ /* Trailing octets must have 10 in most significant bits. */
+ cx = (*s8)[i] ^ 0x80;
+ if (cx & 0xc0)
+ return c0;
+ c32 = (c32 << 6) | cx;
+ }
+ /*
+ * Check for validity:
+ * - The character must be in the Unicode range.
+ * - It must not be a surrogate.
+ * - It must be encoded using the correct number of octets.
+ */
+ if (c32 > 0x10ffff ||
+ (c32 & 0xf800) == 0xd800 ||
+ clen != (c32 >= 0x80) + (c32 >= 0x800) + (c32 >= 0x10000))
+ return c0;
+ *s8 += clen;
+ return c32;
}
-bool __pure __efi_soft_reserve_enabled(void)
+
+void efi_puts(const char *str)
{
- return !efi_nosoftreserve;
+ efi_char16_t buf[128];
+ size_t pos = 0, lim = ARRAY_SIZE(buf);
+ const u8 *s8 = (const u8 *)str;
+ u32 c32;
+
+ while (*s8) {
+ if (*s8 == '\n')
+ buf[pos++] = L'\r';
+ c32 = utf8_to_utf32(&s8);
+ if (c32 < 0x10000) {
+ /* Characters in plane 0 use a single word. */
+ buf[pos++] = c32;
+ } else {
+ /*
+ * Characters in other planes encode into a surrogate
+ * pair.
+ */
+ buf[pos++] = (0xd800 - (0x10000 >> 10)) + (c32 >> 10);
+ buf[pos++] = 0xdc00 + (c32 & 0x3ff);
+ }
+ if (*s8 == '\0' || pos >= lim - 2) {
+ buf[pos] = L'\0';
+ efi_char16_puts(buf);
+ pos = 0;
+ }
+ }
}
-void efi_printk(char *str)
+int efi_printk(const char *fmt, ...)
{
- char *s8;
+ char printf_buf[256];
+ va_list args;
+ int printed;
+ int loglevel = printk_get_level(fmt);
+
+ switch (loglevel) {
+ case '0' ... '9':
+ loglevel -= '0';
+ break;
+ default:
+ /*
+ * Use loglevel -1 for cases where we just want to print to
+ * the screen.
+ */
+ loglevel = -1;
+ break;
+ }
- for (s8 = str; *s8; s8++) {
- efi_char16_t ch[2] = { 0 };
+ if (loglevel >= efi_loglevel)
+ return 0;
- ch[0] = *s8;
- if (*s8 == '\n') {
- efi_char16_t nl[2] = { '\r', 0 };
- efi_char16_printk(nl);
- }
+ if (loglevel >= 0)
+ efi_puts("EFI stub: ");
+
+ fmt = printk_skip_level(fmt);
+
+ va_start(args, fmt);
+ printed = vsnprintf(printf_buf, sizeof(printf_buf), fmt, args);
+ va_end(args);
- efi_char16_printk(ch);
+ efi_puts(printf_buf);
+ if (printed >= sizeof(printf_buf)) {
+ efi_puts("[Message truncated]\n");
+ return -1;
}
+
+ return printed;
}
/*
if (!strcmp(param, "nokaslr")) {
efi_nokaslr = true;
} else if (!strcmp(param, "quiet")) {
- efi_quiet = true;
+ efi_loglevel = CONSOLE_LOGLEVEL_QUIET;
} else if (!strcmp(param, "noinitrd")) {
efi_noinitrd = true;
} else if (!strcmp(param, "efi") && val) {
efi_disable_pci_dma = true;
if (parse_option_str(val, "no_disable_early_pci_dma"))
efi_disable_pci_dma = false;
+ if (parse_option_str(val, "debug"))
+ efi_loglevel = CONSOLE_LOGLEVEL_DEBUG;
+ } else if (!strcmp(param, "video") &&
+ val && strstarts(val, "efifb:")) {
+ efi_parse_option_graphics(val + strlen("efifb:"));
}
}
efi_bs_call(free_pool, buf);
return EFI_SUCCESS;
}
-/*
- * Get the number of UTF-8 bytes corresponding to an UTF-16 character.
- * This overestimates for surrogates, but that is okay.
- */
-static int efi_utf8_bytes(u16 c)
-{
- return 1 + (c >= 0x80) + (c >= 0x800);
-}
-
-/*
- * Convert an UTF-16 string, not necessarily null terminated, to UTF-8.
- */
-static u8 *efi_utf16_to_utf8(u8 *dst, const u16 *src, int n)
-{
- unsigned int c;
-
- while (n--) {
- c = *src++;
- if (n && c >= 0xd800 && c <= 0xdbff &&
- *src >= 0xdc00 && *src <= 0xdfff) {
- c = 0x10000 + ((c & 0x3ff) << 10) + (*src & 0x3ff);
- src++;
- n--;
- }
- if (c >= 0xd800 && c <= 0xdfff)
- c = 0xfffd; /* Unmatched surrogate */
- if (c < 0x80) {
- *dst++ = c;
- continue;
- }
- if (c < 0x800) {
- *dst++ = 0xc0 + (c >> 6);
- goto t1;
- }
- if (c < 0x10000) {
- *dst++ = 0xe0 + (c >> 12);
- goto t2;
- }
- *dst++ = 0xf0 + (c >> 18);
- *dst++ = 0x80 + ((c >> 12) & 0x3f);
- t2:
- *dst++ = 0x80 + ((c >> 6) & 0x3f);
- t1:
- *dst++ = 0x80 + (c & 0x3f);
- }
-
- return dst;
-}
-
/*
* Convert the unicode UEFI command line to ASCII to pass to kernel.
* Size of memory allocated return in *cmd_line_len.
* Returns NULL on error.
*/
-char *efi_convert_cmdline(efi_loaded_image_t *image,
- int *cmd_line_len, unsigned long max_addr)
+char *efi_convert_cmdline(efi_loaded_image_t *image, int *cmd_line_len)
{
const u16 *s2;
- u8 *s1 = NULL;
unsigned long cmdline_addr = 0;
- int load_options_chars = efi_table_attr(image, load_options_size) / 2;
+ int options_chars = efi_table_attr(image, load_options_size) / 2;
const u16 *options = efi_table_attr(image, load_options);
- int options_bytes = 0; /* UTF-8 bytes */
- int options_chars = 0; /* UTF-16 chars */
+ int options_bytes = 0, safe_options_bytes = 0; /* UTF-8 bytes */
+ bool in_quote = false;
efi_status_t status;
- u16 zero = 0;
if (options) {
s2 = options;
- while (*s2 && *s2 != '\n'
- && options_chars < load_options_chars) {
- options_bytes += efi_utf8_bytes(*s2++);
- options_chars++;
+ while (options_bytes < COMMAND_LINE_SIZE && options_chars--) {
+ u16 c = *s2++;
+
+ if (c < 0x80) {
+ if (c == L'\0' || c == L'\n')
+ break;
+ if (c == L'"')
+ in_quote = !in_quote;
+ else if (!in_quote && isspace((char)c))
+ safe_options_bytes = options_bytes;
+
+ options_bytes++;
+ continue;
+ }
+
+ /*
+ * Get the number of UTF-8 bytes corresponding to a
+ * UTF-16 character.
+ * The first part handles everything in the BMP.
+ */
+ options_bytes += 2 + (c >= 0x800);
+ /*
+ * Add one more byte for valid surrogate pairs. Invalid
+ * surrogates will be replaced with 0xfffd and take up
+ * only 3 bytes.
+ */
+ if ((c & 0xfc00) == 0xd800) {
+ /*
+ * If the very last word is a high surrogate,
+ * we must ignore it since we can't access the
+ * low surrogate.
+ */
+ if (!options_chars) {
+ options_bytes -= 3;
+ } else if ((*s2 & 0xfc00) == 0xdc00) {
+ options_bytes++;
+ options_chars--;
+ s2++;
+ }
+ }
+ }
+ if (options_bytes >= COMMAND_LINE_SIZE) {
+ options_bytes = safe_options_bytes;
+ efi_err("Command line is too long: truncated to %d bytes\n",
+ options_bytes);
}
- }
-
- if (!options_chars) {
- /* No command line options, so return empty string*/
- options = &zero;
}
options_bytes++; /* NUL termination */
- status = efi_allocate_pages(options_bytes, &cmdline_addr, max_addr);
+ status = efi_bs_call(allocate_pool, EFI_LOADER_DATA, options_bytes,
+ (void **)&cmdline_addr);
if (status != EFI_SUCCESS)
return NULL;
- s1 = (u8 *)cmdline_addr;
- s2 = (const u16 *)options;
-
- s1 = efi_utf16_to_utf8(s1, s2, options_chars);
- *s1 = '\0';
+ snprintf((char *)cmdline_addr, options_bytes, "%.*ls",
+ options_bytes - 1, options);
*cmd_line_len = options_bytes;
return (char *)cmdline_addr;
void *get_efi_config_table(efi_guid_t guid)
{
- unsigned long tables = efi_table_attr(efi_system_table(), tables);
- int nr_tables = efi_table_attr(efi_system_table(), nr_tables);
+ unsigned long tables = efi_table_attr(efi_system_table, tables);
+ int nr_tables = efi_table_attr(efi_system_table, nr_tables);
int i;
for (i = 0; i < nr_tables; i++) {
return NULL;
}
-void efi_char16_printk(efi_char16_t *str)
-{
- efi_call_proto(efi_table_attr(efi_system_table(), con_out),
- output_string, str);
-}
-
/*
* The LINUX_EFI_INITRD_MEDIA_GUID vendor media device path below provides a way
* for the firmware or bootloader to expose the initrd data directly to the stub
* %EFI_OUT_OF_RESOURCES if memory allocation failed
* %EFI_LOAD_ERROR in all other cases
*/
+static
efi_status_t efi_load_initrd_dev_path(unsigned long *load_addr,
unsigned long *load_size,
unsigned long max)
efi_handle_t handle;
efi_status_t status;
- if (!load_addr || !load_size)
- return EFI_INVALID_PARAMETER;
-
dp = (efi_device_path_protocol_t *)&initrd_dev_path;
status = efi_bs_call(locate_device_path, &lf2_proto_guid, &dp, &handle);
if (status != EFI_SUCCESS)
*load_size = initrd_size;
return EFI_SUCCESS;
}
+
+static
+efi_status_t efi_load_initrd_cmdline(efi_loaded_image_t *image,
+ unsigned long *load_addr,
+ unsigned long *load_size,
+ unsigned long soft_limit,
+ unsigned long hard_limit)
+{
+ if (!IS_ENABLED(CONFIG_EFI_GENERIC_STUB_INITRD_CMDLINE_LOADER) ||
+ (IS_ENABLED(CONFIG_X86) && (!efi_is_native() || image == NULL))) {
+ *load_addr = *load_size = 0;
+ return EFI_SUCCESS;
+ }
+
+ return handle_cmdline_files(image, L"initrd=", sizeof(L"initrd=") - 2,
+ soft_limit, hard_limit,
+ load_addr, load_size);
+}
+
+efi_status_t efi_load_initrd(efi_loaded_image_t *image,
+ unsigned long *load_addr,
+ unsigned long *load_size,
+ unsigned long soft_limit,
+ unsigned long hard_limit)
+{
+ efi_status_t status;
+
+ if (!load_addr || !load_size)
+ return EFI_INVALID_PARAMETER;
+
+ status = efi_load_initrd_dev_path(load_addr, load_size, hard_limit);
+ if (status == EFI_SUCCESS) {
+ efi_info("Loaded initrd from LINUX_EFI_INITRD_MEDIA_GUID device path\n");
+ } else if (status == EFI_NOT_FOUND) {
+ status = efi_load_initrd_cmdline(image, load_addr, load_size,
+ soft_limit, hard_limit);
+ if (status == EFI_SUCCESS && *load_size > 0)
+ efi_info("Loaded initrd from command line option\n");
+ }
+
+ return status;
+}
+
+efi_status_t efi_wait_for_key(unsigned long usec, efi_input_key_t *key)
+{
+ efi_event_t events[2], timer;
+ unsigned long index;
+ efi_simple_text_input_protocol_t *con_in;
+ efi_status_t status;
+
+ con_in = efi_table_attr(efi_system_table, con_in);
+ if (!con_in)
+ return EFI_UNSUPPORTED;
+ efi_set_event_at(events, 0, efi_table_attr(con_in, wait_for_key));
+
+ status = efi_bs_call(create_event, EFI_EVT_TIMER, 0, NULL, NULL, &timer);
+ if (status != EFI_SUCCESS)
+ return status;
+
+ status = efi_bs_call(set_timer, timer, EfiTimerRelative,
+ EFI_100NSEC_PER_USEC * usec);
+ if (status != EFI_SUCCESS)
+ return status;
+ efi_set_event_at(events, 1, timer);
+
+ status = efi_bs_call(wait_for_event, 2, events, &index);
+ if (status == EFI_SUCCESS) {
+ if (index == 0)
+ status = efi_call_proto(con_in, read_keystroke, key);
+ else
+ status = EFI_TIMEOUT;
+ }
+
+ efi_bs_call(close_event, timer);
+
+ return status;
+}
#endif
static u64 virtmap_base = EFI_RT_VIRTUAL_BASE;
-static bool __efistub_global flat_va_mapping;
+static bool flat_va_mapping;
-static efi_system_table_t *__efistub_global sys_table;
-
-__pure efi_system_table_t *efi_system_table(void)
-{
- return sys_table;
-}
+const efi_system_table_t *efi_system_table;
static struct screen_info *setup_graphics(void)
{
si = alloc_screen_info();
if (!si)
return NULL;
- efi_setup_gop(si, &gop_proto, size);
+ status = efi_setup_gop(si, &gop_proto, size);
+ if (status != EFI_SUCCESS) {
+ free_screen_info(si);
+ return NULL;
+ }
}
return si;
}
-void install_memreserve_table(void)
+static void install_memreserve_table(void)
{
struct linux_efi_memreserve *rsv;
efi_guid_t memreserve_table_guid = LINUX_EFI_MEMRESERVE_TABLE_GUID;
status = efi_bs_call(allocate_pool, EFI_LOADER_DATA, sizeof(*rsv),
(void **)&rsv);
if (status != EFI_SUCCESS) {
- pr_efi_err("Failed to allocate memreserve entry!\n");
+ efi_err("Failed to allocate memreserve entry!\n");
return;
}
status = efi_bs_call(install_configuration_table,
&memreserve_table_guid, rsv);
if (status != EFI_SUCCESS)
- pr_efi_err("Failed to install memreserve config table!\n");
+ efi_err("Failed to install memreserve config table!\n");
}
static unsigned long get_dram_base(void)
* for both archictectures, with the arch-specific code provided in the
* handle_kernel_image() function.
*/
-efi_status_t efi_entry(efi_handle_t handle, efi_system_table_t *sys_table_arg)
+efi_status_t __efiapi efi_pe_entry(efi_handle_t handle,
+ efi_system_table_t *sys_table_arg)
{
efi_loaded_image_t *image;
efi_status_t status;
efi_properties_table_t *prop_tbl;
unsigned long max_addr;
- sys_table = sys_table_arg;
+ efi_system_table = sys_table_arg;
/* Check if we were booted by the EFI firmware */
- if (sys_table->hdr.signature != EFI_SYSTEM_TABLE_SIGNATURE) {
+ if (efi_system_table->hdr.signature != EFI_SYSTEM_TABLE_SIGNATURE) {
status = EFI_INVALID_PARAMETER;
goto fail;
}
* information about the running image, such as size and the command
* line.
*/
- status = sys_table->boottime->handle_protocol(handle,
+ status = efi_system_table->boottime->handle_protocol(handle,
&loaded_image_proto, (void *)&image);
if (status != EFI_SUCCESS) {
- pr_efi_err("Failed to get loaded image protocol\n");
+ efi_err("Failed to get loaded image protocol\n");
goto fail;
}
dram_base = get_dram_base();
if (dram_base == EFI_ERROR) {
- pr_efi_err("Failed to find DRAM base\n");
+ efi_err("Failed to find DRAM base\n");
status = EFI_LOAD_ERROR;
goto fail;
}
* protocol. We are going to copy the command line into the
* device tree, so this can be allocated anywhere.
*/
- cmdline_ptr = efi_convert_cmdline(image, &cmdline_size, ULONG_MAX);
+ cmdline_ptr = efi_convert_cmdline(image, &cmdline_size);
if (!cmdline_ptr) {
- pr_efi_err("getting command line via LOADED_IMAGE_PROTOCOL\n");
+ efi_err("getting command line via LOADED_IMAGE_PROTOCOL\n");
status = EFI_OUT_OF_RESOURCES;
goto fail;
}
if (IS_ENABLED(CONFIG_CMDLINE_EXTEND) ||
IS_ENABLED(CONFIG_CMDLINE_FORCE) ||
- cmdline_size == 0)
- efi_parse_options(CONFIG_CMDLINE);
+ cmdline_size == 0) {
+ status = efi_parse_options(CONFIG_CMDLINE);
+ if (status != EFI_SUCCESS) {
+ efi_err("Failed to parse options\n");
+ goto fail_free_cmdline;
+ }
+ }
- if (!IS_ENABLED(CONFIG_CMDLINE_FORCE) && cmdline_size > 0)
- efi_parse_options(cmdline_ptr);
+ if (!IS_ENABLED(CONFIG_CMDLINE_FORCE) && cmdline_size > 0) {
+ status = efi_parse_options(cmdline_ptr);
+ if (status != EFI_SUCCESS) {
+ efi_err("Failed to parse options\n");
+ goto fail_free_cmdline;
+ }
+ }
- pr_efi("Booting Linux Kernel...\n");
+ efi_info("Booting Linux Kernel...\n");
si = setup_graphics();
&reserve_size,
dram_base, image);
if (status != EFI_SUCCESS) {
- pr_efi_err("Failed to relocate kernel\n");
- goto fail_free_cmdline;
+ efi_err("Failed to relocate kernel\n");
+ goto fail_free_screeninfo;
}
efi_retrieve_tpm2_eventlog();
if (!IS_ENABLED(CONFIG_EFI_ARMSTUB_DTB_LOADER) ||
secure_boot != efi_secureboot_mode_disabled) {
if (strstr(cmdline_ptr, "dtb="))
- pr_efi("Ignoring DTB from command line.\n");
+ efi_err("Ignoring DTB from command line.\n");
} else {
status = efi_load_dtb(image, &fdt_addr, &fdt_size);
if (status != EFI_SUCCESS) {
- pr_efi_err("Failed to load device tree!\n");
+ efi_err("Failed to load device tree!\n");
goto fail_free_image;
}
}
if (fdt_addr) {
- pr_efi("Using DTB from command line\n");
+ efi_info("Using DTB from command line\n");
} else {
/* Look for a device tree configuration table entry. */
fdt_addr = (uintptr_t)get_fdt(&fdt_size);
if (fdt_addr)
- pr_efi("Using DTB from configuration table\n");
+ efi_info("Using DTB from configuration table\n");
}
if (!fdt_addr)
- pr_efi("Generating empty DTB\n");
+ efi_info("Generating empty DTB\n");
- if (!noinitrd()) {
+ if (!efi_noinitrd) {
max_addr = efi_get_max_initrd_addr(dram_base, image_addr);
- status = efi_load_initrd_dev_path(&initrd_addr, &initrd_size,
- max_addr);
- if (status == EFI_SUCCESS) {
- pr_efi("Loaded initrd from LINUX_EFI_INITRD_MEDIA_GUID device path\n");
- } else if (status == EFI_NOT_FOUND) {
- status = efi_load_initrd(image, &initrd_addr, &initrd_size,
- ULONG_MAX, max_addr);
- if (status == EFI_SUCCESS && initrd_size > 0)
- pr_efi("Loaded initrd from command line option\n");
- }
+ status = efi_load_initrd(image, &initrd_addr, &initrd_size,
+ ULONG_MAX, max_addr);
if (status != EFI_SUCCESS)
- pr_efi_err("Failed to load initrd!\n");
+ efi_err("Failed to load initrd!\n");
}
efi_random_get_seed();
EFI_PROPERTIES_RUNTIME_MEMORY_PROTECTION_NON_EXECUTABLE_PE_DATA);
/* hibernation expects the runtime regions to stay in the same place */
- if (!IS_ENABLED(CONFIG_HIBERNATION) && !nokaslr() && !flat_va_mapping) {
+ if (!IS_ENABLED(CONFIG_HIBERNATION) && !efi_nokaslr && !flat_va_mapping) {
/*
* Randomize the base of the UEFI runtime services region.
* Preserve the 2 MB alignment of the region by taking a
/* not reached */
fail_free_initrd:
- pr_efi_err("Failed to update FDT and exit boot services\n");
+ efi_err("Failed to update FDT and exit boot services\n");
efi_free(initrd_size, initrd_addr);
efi_free(fdt_size, fdt_addr);
fail_free_image:
efi_free(image_size, image_addr);
efi_free(reserve_size, reserve_addr);
-fail_free_cmdline:
+fail_free_screeninfo:
free_screen_info(si);
- efi_free(cmdline_size, (unsigned long)cmdline_ptr);
+fail_free_cmdline:
+ efi_bs_call(free_pool, cmdline_ptr);
fail:
return status;
}
size = in->num_pages * EFI_PAGE_SIZE;
in->virt_addr = in->phys_addr;
- if (novamap()) {
+ if (efi_novamap) {
continue;
}
#ifndef _DRIVERS_FIRMWARE_EFI_EFISTUB_H
#define _DRIVERS_FIRMWARE_EFI_EFISTUB_H
+#include <linux/compiler.h>
+#include <linux/efi.h>
+#include <linux/kernel.h>
+#include <linux/kern_levels.h>
+#include <linux/types.h>
+#include <asm/efi.h>
+
/* error code which can't be mistaken for valid address */
#define EFI_ERROR (~0UL)
#define EFI_ALLOC_ALIGN EFI_PAGE_SIZE
#endif
-#if defined(CONFIG_ARM) || defined(CONFIG_X86)
-#define __efistub_global __section(.data)
-#else
-#define __efistub_global
-#endif
+extern bool efi_nochunk;
+extern bool efi_nokaslr;
+extern bool efi_noinitrd;
+extern int efi_loglevel;
+extern bool efi_novamap;
-extern bool __pure nochunk(void);
-extern bool __pure nokaslr(void);
-extern bool __pure noinitrd(void);
-extern bool __pure is_quiet(void);
-extern bool __pure novamap(void);
+extern const efi_system_table_t *efi_system_table;
-extern __pure efi_system_table_t *efi_system_table(void);
+efi_status_t __efiapi efi_pe_entry(efi_handle_t handle,
+ efi_system_table_t *sys_table_arg);
-#define pr_efi(msg) do { \
- if (!is_quiet()) efi_printk("EFI stub: "msg); \
-} while (0)
+#ifndef ARCH_HAS_EFISTUB_WRAPPERS
-#define pr_efi_err(msg) efi_printk("EFI stub: ERROR: "msg)
+#define efi_is_native() (true)
+#define efi_bs_call(func, ...) efi_system_table->boottime->func(__VA_ARGS__)
+#define efi_rt_call(func, ...) efi_system_table->runtime->func(__VA_ARGS__)
+#define efi_table_attr(inst, attr) (inst->attr)
+#define efi_call_proto(inst, func, ...) inst->func(inst, ##__VA_ARGS__)
+
+#endif
+
+#define efi_info(fmt, ...) \
+ efi_printk(KERN_INFO fmt, ##__VA_ARGS__)
+#define efi_err(fmt, ...) \
+ efi_printk(KERN_ERR "ERROR: " fmt, ##__VA_ARGS__)
+#define efi_debug(fmt, ...) \
+ efi_printk(KERN_DEBUG "DEBUG: " fmt, ##__VA_ARGS__)
/* Helper macros for the usual case of using simple C variables: */
#ifndef fdt_setprop_inplace_var
((handle = efi_get_handle_at((array), i)) || true); \
i++)
+static inline
+void efi_set_u64_split(u64 data, u32 *lo, u32 *hi)
+{
+ *lo = lower_32_bits(data);
+ *hi = upper_32_bits(data);
+}
+
/*
* Allocation types for calls to boottime->allocate_pages.
*/
#define EFI_LOCATE_BY_REGISTER_NOTIFY 1
#define EFI_LOCATE_BY_PROTOCOL 2
+/*
+ * boottime->stall takes the time period in microseconds
+ */
+#define EFI_USEC_PER_SEC 1000000
+
+/*
+ * boottime->set_timer takes the time in 100ns units
+ */
+#define EFI_100NSEC_PER_USEC ((u64)10)
+
+/*
+ * An efi_boot_memmap is used by efi_get_memory_map() to return the
+ * EFI memory map in a dynamically allocated buffer.
+ *
+ * The buffer allocated for the EFI memory map includes extra room for
+ * a minimum of EFI_MMAP_NR_SLACK_SLOTS additional EFI memory descriptors.
+ * This facilitates the reuse of the EFI memory map buffer when a second
+ * call to ExitBootServices() is needed because of intervening changes to
+ * the EFI memory map. Other related structures, e.g. x86 e820ext, need
+ * to factor in this headroom requirement as well.
+ */
+#define EFI_MMAP_NR_SLACK_SLOTS 8
+
struct efi_boot_memmap {
efi_memory_desc_t **map;
unsigned long *map_size;
typedef struct efi_generic_dev_path efi_device_path_protocol_t;
+typedef void *efi_event_t;
+/* Note that notifications won't work in mixed mode */
+typedef void (__efiapi *efi_event_notify_t)(efi_event_t, void *);
+
+#define EFI_EVT_TIMER 0x80000000U
+#define EFI_EVT_RUNTIME 0x40000000U
+#define EFI_EVT_NOTIFY_WAIT 0x00000100U
+#define EFI_EVT_NOTIFY_SIGNAL 0x00000200U
+
+/*
+ * boottime->wait_for_event takes an array of events as input.
+ * Provide a helper to set it up correctly for mixed mode.
+ */
+static inline
+void efi_set_event_at(efi_event_t *events, size_t idx, efi_event_t event)
+{
+ if (efi_is_native())
+ events[idx] = event;
+ else
+ ((u32 *)events)[idx] = (u32)(unsigned long)event;
+}
+
+#define EFI_TPL_APPLICATION 4
+#define EFI_TPL_CALLBACK 8
+#define EFI_TPL_NOTIFY 16
+#define EFI_TPL_HIGH_LEVEL 31
+
+typedef enum {
+ EfiTimerCancel,
+ EfiTimerPeriodic,
+ EfiTimerRelative
+} EFI_TIMER_DELAY;
+
/*
* EFI Boot Services table
*/
efi_status_t (__efiapi *allocate_pool)(int, unsigned long,
void **);
efi_status_t (__efiapi *free_pool)(void *);
- void *create_event;
- void *set_timer;
- void *wait_for_event;
+ efi_status_t (__efiapi *create_event)(u32, unsigned long,
+ efi_event_notify_t, void *,
+ efi_event_t *);
+ efi_status_t (__efiapi *set_timer)(efi_event_t,
+ EFI_TIMER_DELAY, u64);
+ efi_status_t (__efiapi *wait_for_event)(unsigned long,
+ efi_event_t *,
+ unsigned long *);
void *signal_event;
- void *close_event;
+ efi_status_t (__efiapi *close_event)(efi_event_t);
void *check_event;
void *install_protocol_interface;
void *reinstall_protocol_interface;
efi_status_t (__efiapi *exit_boot_services)(efi_handle_t,
unsigned long);
void *get_next_monotonic_count;
- void *stall;
+ efi_status_t (__efiapi *stall)(unsigned long);
void *set_watchdog_timer;
void *connect_controller;
efi_status_t (__efiapi *disconnect_controller)(efi_handle_t,
} mixed_mode;
};
+typedef struct {
+ u16 scan_code;
+ efi_char16_t unicode_char;
+} efi_input_key_t;
+
+union efi_simple_text_input_protocol {
+ struct {
+ void *reset;
+ efi_status_t (__efiapi *read_keystroke)(efi_simple_text_input_protocol_t *,
+ efi_input_key_t *);
+ efi_event_t wait_for_key;
+ };
+ struct {
+ u32 reset;
+ u32 read_keystroke;
+ u32 wait_for_key;
+ } mixed_mode;
+};
+
+efi_status_t efi_wait_for_key(unsigned long usec, efi_input_key_t *key);
+
union efi_simple_text_output_protocol {
struct {
void *reset;
union efi_graphics_output_protocol {
struct {
- void *query_mode;
- void *set_mode;
+ efi_status_t (__efiapi *query_mode)(efi_graphics_output_protocol_t *,
+ u32, unsigned long *,
+ efi_graphics_output_mode_info_t **);
+ efi_status_t (__efiapi *set_mode) (efi_graphics_output_protocol_t *, u32);
void *blt;
efi_graphics_output_protocol_mode_t *mode;
};
void *priv,
efi_exit_boot_map_processing priv_func);
-void efi_char16_printk(efi_char16_t *);
-
efi_status_t allocate_new_fdt_and_exit_boot(void *handle,
unsigned long *new_fdt_addr,
unsigned long max_addr,
void *get_efi_config_table(efi_guid_t guid);
-void efi_printk(char *str);
+/* NOTE: These functions do not print a trailing newline after the string */
+void efi_char16_puts(efi_char16_t *);
+void efi_puts(const char *str);
+
+__printf(1, 2) int efi_printk(char const *fmt, ...);
void efi_free(unsigned long size, unsigned long addr);
-char *efi_convert_cmdline(efi_loaded_image_t *image, int *cmd_line_len,
- unsigned long max_addr);
+char *efi_convert_cmdline(efi_loaded_image_t *image, int *cmd_line_len);
efi_status_t efi_get_memory_map(struct efi_boot_memmap *map);
-efi_status_t efi_low_alloc_above(unsigned long size, unsigned long align,
- unsigned long *addr, unsigned long min);
-
-static inline
-efi_status_t efi_low_alloc(unsigned long size, unsigned long align,
- unsigned long *addr)
-{
- /*
- * Don't allocate at 0x0. It will confuse code that
- * checks pointers against NULL. Skip the first 8
- * bytes so we start at a nice even number.
- */
- return efi_low_alloc_above(size, align, addr, 0x8);
-}
-
efi_status_t efi_allocate_pages(unsigned long size, unsigned long *addr,
unsigned long max);
+efi_status_t efi_allocate_pages_aligned(unsigned long size, unsigned long *addr,
+ unsigned long max, unsigned long align);
+
efi_status_t efi_relocate_kernel(unsigned long *image_addr,
unsigned long image_size,
unsigned long alloc_size,
efi_status_t efi_parse_options(char const *cmdline);
+void efi_parse_option_graphics(char *option);
+
efi_status_t efi_setup_gop(struct screen_info *si, efi_guid_t *proto,
unsigned long size);
-efi_status_t efi_load_dtb(efi_loaded_image_t *image,
- unsigned long *load_addr,
- unsigned long *load_size);
+efi_status_t handle_cmdline_files(efi_loaded_image_t *image,
+ const efi_char16_t *optstr,
+ int optstr_size,
+ unsigned long soft_limit,
+ unsigned long hard_limit,
+ unsigned long *load_addr,
+ unsigned long *load_size);
+
+
+static inline efi_status_t efi_load_dtb(efi_loaded_image_t *image,
+ unsigned long *load_addr,
+ unsigned long *load_size)
+{
+ return handle_cmdline_files(image, L"dtb=", sizeof(L"dtb=") - 2,
+ ULONG_MAX, ULONG_MAX, load_addr, load_size);
+}
efi_status_t efi_load_initrd(efi_loaded_image_t *image,
unsigned long *load_addr,
unsigned long soft_limit,
unsigned long hard_limit);
-efi_status_t efi_load_initrd_dev_path(unsigned long *load_addr,
- unsigned long *load_size,
- unsigned long max);
-
#endif
/* Do some checks on provided FDT, if it exists: */
if (orig_fdt) {
if (fdt_check_header(orig_fdt)) {
- pr_efi_err("Device Tree header not valid!\n");
+ efi_err("Device Tree header not valid!\n");
return EFI_LOAD_ERROR;
}
/*
* configuration table:
*/
if (orig_fdt_size && fdt_totalsize(orig_fdt) > orig_fdt_size) {
- pr_efi_err("Truncated device tree! foo!\n");
+ efi_err("Truncated device tree! foo!\n");
return EFI_LOAD_ERROR;
}
}
/* Add FDT entries for EFI runtime services in chosen node. */
node = fdt_subnode_offset(fdt, 0, "chosen");
- fdt_val64 = cpu_to_fdt64((u64)(unsigned long)efi_system_table());
+ fdt_val64 = cpu_to_fdt64((u64)(unsigned long)efi_system_table);
status = fdt_setprop_var(fdt, node, "linux,uefi-system-table", fdt_val64);
if (status)
*/
status = efi_get_memory_map(&map);
if (status != EFI_SUCCESS) {
- pr_efi_err("Unable to retrieve UEFI memory map.\n");
+ efi_err("Unable to retrieve UEFI memory map.\n");
return status;
}
- pr_efi("Exiting boot services and installing virtual address map...\n");
+ efi_info("Exiting boot services and installing virtual address map...\n");
map.map = &memory_map;
status = efi_allocate_pages(MAX_FDT_SIZE, new_fdt_addr, max_addr);
if (status != EFI_SUCCESS) {
- pr_efi_err("Unable to allocate memory for new device tree.\n");
+ efi_err("Unable to allocate memory for new device tree.\n");
goto fail;
}
initrd_addr, initrd_size);
if (status != EFI_SUCCESS) {
- pr_efi_err("Unable to construct new device tree.\n");
+ efi_err("Unable to construct new device tree.\n");
goto fail_free_new_fdt;
}
if (status == EFI_SUCCESS) {
efi_set_virtual_address_map_t *svam;
- if (novamap())
+ if (efi_novamap)
return EFI_SUCCESS;
/* Install the new virtual address map */
- svam = efi_system_table()->runtime->set_virtual_address_map;
+ svam = efi_system_table->runtime->set_virtual_address_map;
status = svam(runtime_entry_count * desc_size, desc_size,
desc_ver, runtime_map);
return EFI_SUCCESS;
}
- pr_efi_err("Exit boot services failed.\n");
+ efi_err("Exit boot services failed.\n");
fail_free_new_fdt:
efi_free(MAX_FDT_SIZE, *new_fdt_addr);
fail:
- efi_system_table()->boottime->free_pool(runtime_map);
+ efi_system_table->boottime->free_pool(runtime_map);
return EFI_LOAD_ERROR;
}
return NULL;
if (fdt_check_header(fdt) != 0) {
- pr_efi_err("Invalid header detected on UEFI supplied FDT, ignoring ...\n");
+ efi_err("Invalid header detected on UEFI supplied FDT, ignoring ...\n");
return NULL;
}
*fdt_size = fdt_totalsize(fdt);
status = volume->open(volume, &fh, fi->filename, EFI_FILE_MODE_READ, 0);
if (status != EFI_SUCCESS) {
- pr_efi_err("Failed to open file: ");
- efi_char16_printk(fi->filename);
- efi_printk("\n");
+ efi_err("Failed to open file: %ls\n", fi->filename);
return status;
}
info_sz = sizeof(struct finfo);
status = fh->get_info(fh, &info_guid, &info_sz, fi);
if (status != EFI_SUCCESS) {
- pr_efi_err("Failed to get file info\n");
+ efi_err("Failed to get file info\n");
fh->close(fh);
return status;
}
status = efi_bs_call(handle_protocol, image->device_handle, &fs_proto,
(void **)&io);
if (status != EFI_SUCCESS) {
- pr_efi_err("Failed to handle fs_proto\n");
+ efi_err("Failed to handle fs_proto\n");
return status;
}
status = io->open_volume(io, fh);
if (status != EFI_SUCCESS)
- pr_efi_err("Failed to open volume\n");
+ efi_err("Failed to open volume\n");
return status;
}
* We only support loading a file from the same filesystem as
* the kernel image.
*/
-static efi_status_t handle_cmdline_files(efi_loaded_image_t *image,
- const efi_char16_t *optstr,
- int optstr_size,
- unsigned long soft_limit,
- unsigned long hard_limit,
- unsigned long *load_addr,
- unsigned long *load_size)
+efi_status_t handle_cmdline_files(efi_loaded_image_t *image,
+ const efi_char16_t *optstr,
+ int optstr_size,
+ unsigned long soft_limit,
+ unsigned long hard_limit,
+ unsigned long *load_addr,
+ unsigned long *load_size)
{
const efi_char16_t *cmdline = image->load_options;
int cmdline_len = image->load_options_size / 2;
if (!load_addr || !load_size)
return EFI_INVALID_PARAMETER;
- if (IS_ENABLED(CONFIG_X86) && !nochunk())
+ if (IS_ENABLED(CONFIG_X86) && !efi_nochunk)
efi_chunk_size = EFI_READ_CHUNK_SIZE;
alloc_addr = alloc_size = 0;
&alloc_addr,
hard_limit);
if (status != EFI_SUCCESS) {
- pr_efi_err("Failed to allocate memory for files\n");
+ efi_err("Failed to allocate memory for files\n");
goto err_close_file;
}
status = file->read(file, &chunksize, addr);
if (status != EFI_SUCCESS) {
- pr_efi_err("Failed to read file\n");
+ efi_err("Failed to read file\n");
goto err_close_file;
}
addr += chunksize;
efi_free(alloc_size, alloc_addr);
return status;
}
-
-efi_status_t efi_load_dtb(efi_loaded_image_t *image,
- unsigned long *load_addr,
- unsigned long *load_size)
-{
- return handle_cmdline_files(image, L"dtb=", sizeof(L"dtb=") - 2,
- ULONG_MAX, ULONG_MAX, load_addr, load_size);
-}
-
-efi_status_t efi_load_initrd(efi_loaded_image_t *image,
- unsigned long *load_addr,
- unsigned long *load_size,
- unsigned long soft_limit,
- unsigned long hard_limit)
-{
- return handle_cmdline_files(image, L"initrd=", sizeof(L"initrd=") - 2,
- soft_limit, hard_limit, load_addr, load_size);
-}
*
* ----------------------------------------------------------------------- */
+#include <linux/bitops.h>
+#include <linux/ctype.h>
#include <linux/efi.h>
#include <linux/screen_info.h>
+#include <linux/string.h>
#include <asm/efi.h>
#include <asm/setup.h>
#include "efistub.h"
-static void find_bits(unsigned long mask, u8 *pos, u8 *size)
+enum efi_cmdline_option {
+ EFI_CMDLINE_NONE,
+ EFI_CMDLINE_MODE_NUM,
+ EFI_CMDLINE_RES,
+ EFI_CMDLINE_AUTO,
+ EFI_CMDLINE_LIST
+};
+
+static struct {
+ enum efi_cmdline_option option;
+ union {
+ u32 mode;
+ struct {
+ u32 width, height;
+ int format;
+ u8 depth;
+ } res;
+ };
+} cmdline = { .option = EFI_CMDLINE_NONE };
+
+static bool parse_modenum(char *option, char **next)
+{
+ u32 m;
+
+ if (!strstarts(option, "mode="))
+ return false;
+ option += strlen("mode=");
+ m = simple_strtoull(option, &option, 0);
+ if (*option && *option++ != ',')
+ return false;
+ cmdline.option = EFI_CMDLINE_MODE_NUM;
+ cmdline.mode = m;
+
+ *next = option;
+ return true;
+}
+
+static bool parse_res(char *option, char **next)
+{
+ u32 w, h, d = 0;
+ int pf = -1;
+
+ if (!isdigit(*option))
+ return false;
+ w = simple_strtoull(option, &option, 10);
+ if (*option++ != 'x' || !isdigit(*option))
+ return false;
+ h = simple_strtoull(option, &option, 10);
+ if (*option == '-') {
+ option++;
+ if (strstarts(option, "rgb")) {
+ option += strlen("rgb");
+ pf = PIXEL_RGB_RESERVED_8BIT_PER_COLOR;
+ } else if (strstarts(option, "bgr")) {
+ option += strlen("bgr");
+ pf = PIXEL_BGR_RESERVED_8BIT_PER_COLOR;
+ } else if (isdigit(*option))
+ d = simple_strtoull(option, &option, 10);
+ else
+ return false;
+ }
+ if (*option && *option++ != ',')
+ return false;
+ cmdline.option = EFI_CMDLINE_RES;
+ cmdline.res.width = w;
+ cmdline.res.height = h;
+ cmdline.res.format = pf;
+ cmdline.res.depth = d;
+
+ *next = option;
+ return true;
+}
+
+static bool parse_auto(char *option, char **next)
+{
+ if (!strstarts(option, "auto"))
+ return false;
+ option += strlen("auto");
+ if (*option && *option++ != ',')
+ return false;
+ cmdline.option = EFI_CMDLINE_AUTO;
+
+ *next = option;
+ return true;
+}
+
+static bool parse_list(char *option, char **next)
{
- u8 first, len;
+ if (!strstarts(option, "list"))
+ return false;
+ option += strlen("list");
+ if (*option && *option++ != ',')
+ return false;
+ cmdline.option = EFI_CMDLINE_LIST;
+
+ *next = option;
+ return true;
+}
+
+void efi_parse_option_graphics(char *option)
+{
+ while (*option) {
+ if (parse_modenum(option, &option))
+ continue;
+ if (parse_res(option, &option))
+ continue;
+ if (parse_auto(option, &option))
+ continue;
+ if (parse_list(option, &option))
+ continue;
+
+ while (*option && *option++ != ',')
+ ;
+ }
+}
+
+static u32 choose_mode_modenum(efi_graphics_output_protocol_t *gop)
+{
+ efi_status_t status;
+
+ efi_graphics_output_protocol_mode_t *mode;
+ efi_graphics_output_mode_info_t *info;
+ unsigned long info_size;
+
+ u32 max_mode, cur_mode;
+ int pf;
+
+ mode = efi_table_attr(gop, mode);
+
+ cur_mode = efi_table_attr(mode, mode);
+ if (cmdline.mode == cur_mode)
+ return cur_mode;
+
+ max_mode = efi_table_attr(mode, max_mode);
+ if (cmdline.mode >= max_mode) {
+ efi_err("Requested mode is invalid\n");
+ return cur_mode;
+ }
+
+ status = efi_call_proto(gop, query_mode, cmdline.mode,
+ &info_size, &info);
+ if (status != EFI_SUCCESS) {
+ efi_err("Couldn't get mode information\n");
+ return cur_mode;
+ }
+
+ pf = info->pixel_format;
+
+ efi_bs_call(free_pool, info);
+
+ if (pf == PIXEL_BLT_ONLY || pf >= PIXEL_FORMAT_MAX) {
+ efi_err("Invalid PixelFormat\n");
+ return cur_mode;
+ }
+
+ return cmdline.mode;
+}
+
+static u8 pixel_bpp(int pixel_format, efi_pixel_bitmask_t pixel_info)
+{
+ if (pixel_format == PIXEL_BIT_MASK) {
+ u32 mask = pixel_info.red_mask | pixel_info.green_mask |
+ pixel_info.blue_mask | pixel_info.reserved_mask;
+ if (!mask)
+ return 0;
+ return __fls(mask) - __ffs(mask) + 1;
+ } else
+ return 32;
+}
+
+static u32 choose_mode_res(efi_graphics_output_protocol_t *gop)
+{
+ efi_status_t status;
+
+ efi_graphics_output_protocol_mode_t *mode;
+ efi_graphics_output_mode_info_t *info;
+ unsigned long info_size;
+
+ u32 max_mode, cur_mode;
+ int pf;
+ efi_pixel_bitmask_t pi;
+ u32 m, w, h;
+
+ mode = efi_table_attr(gop, mode);
+
+ cur_mode = efi_table_attr(mode, mode);
+ info = efi_table_attr(mode, info);
+ pf = info->pixel_format;
+ pi = info->pixel_information;
+ w = info->horizontal_resolution;
+ h = info->vertical_resolution;
+
+ if (w == cmdline.res.width && h == cmdline.res.height &&
+ (cmdline.res.format < 0 || cmdline.res.format == pf) &&
+ (!cmdline.res.depth || cmdline.res.depth == pixel_bpp(pf, pi)))
+ return cur_mode;
+
+ max_mode = efi_table_attr(mode, max_mode);
+
+ for (m = 0; m < max_mode; m++) {
+ if (m == cur_mode)
+ continue;
+
+ status = efi_call_proto(gop, query_mode, m,
+ &info_size, &info);
+ if (status != EFI_SUCCESS)
+ continue;
+
+ pf = info->pixel_format;
+ pi = info->pixel_information;
+ w = info->horizontal_resolution;
+ h = info->vertical_resolution;
+
+ efi_bs_call(free_pool, info);
+
+ if (pf == PIXEL_BLT_ONLY || pf >= PIXEL_FORMAT_MAX)
+ continue;
+ if (w == cmdline.res.width && h == cmdline.res.height &&
+ (cmdline.res.format < 0 || cmdline.res.format == pf) &&
+ (!cmdline.res.depth || cmdline.res.depth == pixel_bpp(pf, pi)))
+ return m;
+ }
+
+ efi_err("Couldn't find requested mode\n");
+
+ return cur_mode;
+}
+
+static u32 choose_mode_auto(efi_graphics_output_protocol_t *gop)
+{
+ efi_status_t status;
+
+ efi_graphics_output_protocol_mode_t *mode;
+ efi_graphics_output_mode_info_t *info;
+ unsigned long info_size;
+
+ u32 max_mode, cur_mode, best_mode, area;
+ u8 depth;
+ int pf;
+ efi_pixel_bitmask_t pi;
+ u32 m, w, h, a;
+ u8 d;
+
+ mode = efi_table_attr(gop, mode);
+
+ cur_mode = efi_table_attr(mode, mode);
+ max_mode = efi_table_attr(mode, max_mode);
- first = 0;
- len = 0;
+ info = efi_table_attr(mode, info);
- if (mask) {
- while (!(mask & 0x1)) {
- mask = mask >> 1;
- first++;
+ pf = info->pixel_format;
+ pi = info->pixel_information;
+ w = info->horizontal_resolution;
+ h = info->vertical_resolution;
+
+ best_mode = cur_mode;
+ area = w * h;
+ depth = pixel_bpp(pf, pi);
+
+ for (m = 0; m < max_mode; m++) {
+ if (m == cur_mode)
+ continue;
+
+ status = efi_call_proto(gop, query_mode, m,
+ &info_size, &info);
+ if (status != EFI_SUCCESS)
+ continue;
+
+ pf = info->pixel_format;
+ pi = info->pixel_information;
+ w = info->horizontal_resolution;
+ h = info->vertical_resolution;
+
+ efi_bs_call(free_pool, info);
+
+ if (pf == PIXEL_BLT_ONLY || pf >= PIXEL_FORMAT_MAX)
+ continue;
+ a = w * h;
+ if (a < area)
+ continue;
+ d = pixel_bpp(pf, pi);
+ if (a > area || d > depth) {
+ best_mode = m;
+ area = a;
+ depth = d;
}
+ }
+
+ return best_mode;
+}
+
+static u32 choose_mode_list(efi_graphics_output_protocol_t *gop)
+{
+ efi_status_t status;
+
+ efi_graphics_output_protocol_mode_t *mode;
+ efi_graphics_output_mode_info_t *info;
+ unsigned long info_size;
+
+ u32 max_mode, cur_mode;
+ int pf;
+ efi_pixel_bitmask_t pi;
+ u32 m, w, h;
+ u8 d;
+ const char *dstr;
+ bool valid;
+ efi_input_key_t key;
- while (mask & 0x1) {
- mask = mask >> 1;
- len++;
+ mode = efi_table_attr(gop, mode);
+
+ cur_mode = efi_table_attr(mode, mode);
+ max_mode = efi_table_attr(mode, max_mode);
+
+ efi_printk("Available graphics modes are 0-%u\n", max_mode-1);
+ efi_puts(" * = current mode\n"
+ " - = unusable mode\n");
+ for (m = 0; m < max_mode; m++) {
+ status = efi_call_proto(gop, query_mode, m,
+ &info_size, &info);
+ if (status != EFI_SUCCESS)
+ continue;
+
+ pf = info->pixel_format;
+ pi = info->pixel_information;
+ w = info->horizontal_resolution;
+ h = info->vertical_resolution;
+
+ efi_bs_call(free_pool, info);
+
+ valid = !(pf == PIXEL_BLT_ONLY || pf >= PIXEL_FORMAT_MAX);
+ d = 0;
+ switch (pf) {
+ case PIXEL_RGB_RESERVED_8BIT_PER_COLOR:
+ dstr = "rgb";
+ break;
+ case PIXEL_BGR_RESERVED_8BIT_PER_COLOR:
+ dstr = "bgr";
+ break;
+ case PIXEL_BIT_MASK:
+ dstr = "";
+ d = pixel_bpp(pf, pi);
+ break;
+ case PIXEL_BLT_ONLY:
+ dstr = "blt";
+ break;
+ default:
+ dstr = "xxx";
+ break;
}
+
+ efi_printk("Mode %3u %c%c: Resolution %ux%u-%s%.0hhu\n",
+ m,
+ m == cur_mode ? '*' : ' ',
+ !valid ? '-' : ' ',
+ w, h, dstr, d);
+ }
+
+ efi_puts("\nPress any key to continue (or wait 10 seconds)\n");
+ status = efi_wait_for_key(10 * EFI_USEC_PER_SEC, &key);
+ if (status != EFI_SUCCESS && status != EFI_TIMEOUT) {
+ efi_err("Unable to read key, continuing in 10 seconds\n");
+ efi_bs_call(stall, 10 * EFI_USEC_PER_SEC);
+ }
+
+ return cur_mode;
+}
+
+static void set_mode(efi_graphics_output_protocol_t *gop)
+{
+ efi_graphics_output_protocol_mode_t *mode;
+ u32 cur_mode, new_mode;
+
+ switch (cmdline.option) {
+ case EFI_CMDLINE_MODE_NUM:
+ new_mode = choose_mode_modenum(gop);
+ break;
+ case EFI_CMDLINE_RES:
+ new_mode = choose_mode_res(gop);
+ break;
+ case EFI_CMDLINE_AUTO:
+ new_mode = choose_mode_auto(gop);
+ break;
+ case EFI_CMDLINE_LIST:
+ new_mode = choose_mode_list(gop);
+ break;
+ default:
+ return;
+ }
+
+ mode = efi_table_attr(gop, mode);
+ cur_mode = efi_table_attr(mode, mode);
+
+ if (new_mode == cur_mode)
+ return;
+
+ if (efi_call_proto(gop, set_mode, new_mode) != EFI_SUCCESS)
+ efi_err("Failed to set requested mode\n");
+}
+
+static void find_bits(u32 mask, u8 *pos, u8 *size)
+{
+ if (!mask) {
+ *pos = *size = 0;
+ return;
}
- *pos = first;
- *size = len;
+ /* UEFI spec guarantees that the set bits are contiguous */
+ *pos = __ffs(mask);
+ *size = __fls(mask) - *pos + 1;
}
static void
setup_pixel_info(struct screen_info *si, u32 pixels_per_scan_line,
efi_pixel_bitmask_t pixel_info, int pixel_format)
{
- if (pixel_format == PIXEL_RGB_RESERVED_8BIT_PER_COLOR) {
- si->lfb_depth = 32;
- si->lfb_linelength = pixels_per_scan_line * 4;
- si->red_size = 8;
- si->red_pos = 0;
- si->green_size = 8;
- si->green_pos = 8;
- si->blue_size = 8;
- si->blue_pos = 16;
- si->rsvd_size = 8;
- si->rsvd_pos = 24;
- } else if (pixel_format == PIXEL_BGR_RESERVED_8BIT_PER_COLOR) {
- si->lfb_depth = 32;
- si->lfb_linelength = pixels_per_scan_line * 4;
- si->red_size = 8;
- si->red_pos = 16;
- si->green_size = 8;
- si->green_pos = 8;
- si->blue_size = 8;
- si->blue_pos = 0;
- si->rsvd_size = 8;
- si->rsvd_pos = 24;
- } else if (pixel_format == PIXEL_BIT_MASK) {
- find_bits(pixel_info.red_mask, &si->red_pos, &si->red_size);
- find_bits(pixel_info.green_mask, &si->green_pos,
- &si->green_size);
- find_bits(pixel_info.blue_mask, &si->blue_pos, &si->blue_size);
- find_bits(pixel_info.reserved_mask, &si->rsvd_pos,
- &si->rsvd_size);
+ if (pixel_format == PIXEL_BIT_MASK) {
+ find_bits(pixel_info.red_mask,
+ &si->red_pos, &si->red_size);
+ find_bits(pixel_info.green_mask,
+ &si->green_pos, &si->green_size);
+ find_bits(pixel_info.blue_mask,
+ &si->blue_pos, &si->blue_size);
+ find_bits(pixel_info.reserved_mask,
+ &si->rsvd_pos, &si->rsvd_size);
si->lfb_depth = si->red_size + si->green_size +
si->blue_size + si->rsvd_size;
si->lfb_linelength = (pixels_per_scan_line * si->lfb_depth) / 8;
} else {
- si->lfb_depth = 4;
- si->lfb_linelength = si->lfb_width / 2;
- si->red_size = 0;
- si->red_pos = 0;
- si->green_size = 0;
- si->green_pos = 0;
- si->blue_size = 0;
- si->blue_pos = 0;
- si->rsvd_size = 0;
- si->rsvd_pos = 0;
+ if (pixel_format == PIXEL_RGB_RESERVED_8BIT_PER_COLOR) {
+ si->red_pos = 0;
+ si->blue_pos = 16;
+ } else /* PIXEL_BGR_RESERVED_8BIT_PER_COLOR */ {
+ si->blue_pos = 0;
+ si->red_pos = 16;
+ }
+
+ si->green_pos = 8;
+ si->rsvd_pos = 24;
+ si->red_size = si->green_size =
+ si->blue_size = si->rsvd_size = 8;
+
+ si->lfb_depth = 32;
+ si->lfb_linelength = pixels_per_scan_line * 4;
}
}
-static efi_status_t setup_gop(struct screen_info *si, efi_guid_t *proto,
- unsigned long size, void **handles)
+static efi_graphics_output_protocol_t *
+find_gop(efi_guid_t *proto, unsigned long size, void **handles)
{
- efi_graphics_output_protocol_t *gop, *first_gop;
- u16 width, height;
- u32 pixels_per_scan_line;
- u32 ext_lfb_base;
- efi_physical_addr_t fb_base;
- efi_pixel_bitmask_t pixel_info;
- int pixel_format;
- efi_status_t status;
+ efi_graphics_output_protocol_t *first_gop;
efi_handle_t h;
int i;
first_gop = NULL;
- gop = NULL;
for_each_efi_handle(h, handles, size, i) {
+ efi_status_t status;
+
+ efi_graphics_output_protocol_t *gop;
efi_graphics_output_protocol_mode_t *mode;
- efi_graphics_output_mode_info_t *info = NULL;
+ efi_graphics_output_mode_info_t *info;
+
efi_guid_t conout_proto = EFI_CONSOLE_OUT_DEVICE_GUID;
- bool conout_found = false;
void *dummy = NULL;
- efi_physical_addr_t current_fb_base;
status = efi_bs_call(handle_protocol, h, proto, (void **)&gop);
if (status != EFI_SUCCESS)
continue;
+ mode = efi_table_attr(gop, mode);
+ info = efi_table_attr(mode, info);
+ if (info->pixel_format == PIXEL_BLT_ONLY ||
+ info->pixel_format >= PIXEL_FORMAT_MAX)
+ continue;
+
+ /*
+ * Systems that use the UEFI Console Splitter may
+ * provide multiple GOP devices, not all of which are
+ * backed by real hardware. The workaround is to search
+ * for a GOP implementing the ConOut protocol, and if
+ * one isn't found, to just fall back to the first GOP.
+ *
+ * Once we've found a GOP supporting ConOut,
+ * don't bother looking any further.
+ */
status = efi_bs_call(handle_protocol, h, &conout_proto, &dummy);
if (status == EFI_SUCCESS)
- conout_found = true;
+ return gop;
- mode = efi_table_attr(gop, mode);
- info = efi_table_attr(mode, info);
- current_fb_base = efi_table_attr(mode, frame_buffer_base);
-
- if ((!first_gop || conout_found) &&
- info->pixel_format != PIXEL_BLT_ONLY) {
- /*
- * Systems that use the UEFI Console Splitter may
- * provide multiple GOP devices, not all of which are
- * backed by real hardware. The workaround is to search
- * for a GOP implementing the ConOut protocol, and if
- * one isn't found, to just fall back to the first GOP.
- */
- width = info->horizontal_resolution;
- height = info->vertical_resolution;
- pixel_format = info->pixel_format;
- pixel_info = info->pixel_information;
- pixels_per_scan_line = info->pixels_per_scan_line;
- fb_base = current_fb_base;
-
- /*
- * Once we've found a GOP supporting ConOut,
- * don't bother looking any further.
- */
+ if (!first_gop)
first_gop = gop;
- if (conout_found)
- break;
- }
}
+ return first_gop;
+}
+
+static efi_status_t setup_gop(struct screen_info *si, efi_guid_t *proto,
+ unsigned long size, void **handles)
+{
+ efi_graphics_output_protocol_t *gop;
+ efi_graphics_output_protocol_mode_t *mode;
+ efi_graphics_output_mode_info_t *info;
+
+ gop = find_gop(proto, size, handles);
+
/* Did we find any GOPs? */
- if (!first_gop)
+ if (!gop)
return EFI_NOT_FOUND;
+ /* Change mode if requested */
+ set_mode(gop);
+
/* EFI framebuffer */
+ mode = efi_table_attr(gop, mode);
+ info = efi_table_attr(mode, info);
+
si->orig_video_isVGA = VIDEO_TYPE_EFI;
- si->lfb_width = width;
- si->lfb_height = height;
- si->lfb_base = fb_base;
+ si->lfb_width = info->horizontal_resolution;
+ si->lfb_height = info->vertical_resolution;
- ext_lfb_base = (u64)(unsigned long)fb_base >> 32;
- if (ext_lfb_base) {
+ efi_set_u64_split(efi_table_attr(mode, frame_buffer_base),
+ &si->lfb_base, &si->ext_lfb_base);
+ if (si->ext_lfb_base)
si->capabilities |= VIDEO_CAPABILITY_64BIT_BASE;
- si->ext_lfb_base = ext_lfb_base;
- }
si->pages = 1;
- setup_pixel_info(si, pixels_per_scan_line, pixel_info, pixel_format);
+ setup_pixel_info(si, info->pixels_per_scan_line,
+ info->pixel_information, info->pixel_format);
si->lfb_size = si->lfb_linelength * si->lfb_height;
#include "efistub.h"
-#define EFI_MMAP_NR_SLACK_SLOTS 8
-
static inline bool mmap_has_headroom(unsigned long buff_size,
unsigned long map_size,
unsigned long desc_size)
efi_status_t efi_allocate_pages(unsigned long size, unsigned long *addr,
unsigned long max)
{
- efi_physical_addr_t alloc_addr = ALIGN_DOWN(max + 1, EFI_ALLOC_ALIGN) - 1;
- int slack = EFI_ALLOC_ALIGN / EFI_PAGE_SIZE - 1;
+ efi_physical_addr_t alloc_addr;
efi_status_t status;
- size = round_up(size, EFI_ALLOC_ALIGN);
+ if (EFI_ALLOC_ALIGN > EFI_PAGE_SIZE)
+ return efi_allocate_pages_aligned(size, addr, max,
+ EFI_ALLOC_ALIGN);
+
+ alloc_addr = ALIGN_DOWN(max + 1, EFI_ALLOC_ALIGN) - 1;
status = efi_bs_call(allocate_pages, EFI_ALLOCATE_MAX_ADDRESS,
- EFI_LOADER_DATA, size / EFI_PAGE_SIZE + slack,
+ EFI_LOADER_DATA, DIV_ROUND_UP(size, EFI_PAGE_SIZE),
&alloc_addr);
if (status != EFI_SUCCESS)
return status;
- *addr = ALIGN((unsigned long)alloc_addr, EFI_ALLOC_ALIGN);
-
- if (slack > 0) {
- int l = (alloc_addr % EFI_ALLOC_ALIGN) / EFI_PAGE_SIZE;
-
- if (l) {
- efi_bs_call(free_pages, alloc_addr, slack - l + 1);
- slack = l - 1;
- }
- if (slack)
- efi_bs_call(free_pages, *addr + size, slack);
- }
+ *addr = alloc_addr;
return EFI_SUCCESS;
}
-/**
- * efi_low_alloc_above() - allocate pages at or above given address
- * @size: size of the memory area to allocate
- * @align: minimum alignment of the allocated memory area. It should
- * a power of two.
- * @addr: on exit the address of the allocated memory
- * @min: minimum address to used for the memory allocation
- *
- * Allocate at the lowest possible address that is not below @min as
- * EFI_LOADER_DATA. The allocated pages are aligned according to @align but at
- * least EFI_ALLOC_ALIGN. The first allocated page will not below the address
- * given by @min.
- *
- * Return: status code
- */
-efi_status_t efi_low_alloc_above(unsigned long size, unsigned long align,
- unsigned long *addr, unsigned long min)
-{
- unsigned long map_size, desc_size, buff_size;
- efi_memory_desc_t *map;
- efi_status_t status;
- unsigned long nr_pages;
- int i;
- struct efi_boot_memmap boot_map;
-
- boot_map.map = ↦
- boot_map.map_size = &map_size;
- boot_map.desc_size = &desc_size;
- boot_map.desc_ver = NULL;
- boot_map.key_ptr = NULL;
- boot_map.buff_size = &buff_size;
-
- status = efi_get_memory_map(&boot_map);
- if (status != EFI_SUCCESS)
- goto fail;
-
- /*
- * Enforce minimum alignment that EFI or Linux requires when
- * requesting a specific address. We are doing page-based (or
- * larger) allocations, and both the address and size must meet
- * alignment constraints.
- */
- if (align < EFI_ALLOC_ALIGN)
- align = EFI_ALLOC_ALIGN;
-
- size = round_up(size, EFI_ALLOC_ALIGN);
- nr_pages = size / EFI_PAGE_SIZE;
- for (i = 0; i < map_size / desc_size; i++) {
- efi_memory_desc_t *desc;
- unsigned long m = (unsigned long)map;
- u64 start, end;
-
- desc = efi_early_memdesc_ptr(m, desc_size, i);
-
- if (desc->type != EFI_CONVENTIONAL_MEMORY)
- continue;
-
- if (efi_soft_reserve_enabled() &&
- (desc->attribute & EFI_MEMORY_SP))
- continue;
-
- if (desc->num_pages < nr_pages)
- continue;
-
- start = desc->phys_addr;
- end = start + desc->num_pages * EFI_PAGE_SIZE;
-
- if (start < min)
- start = min;
-
- start = round_up(start, align);
- if ((start + size) > end)
- continue;
-
- status = efi_bs_call(allocate_pages, EFI_ALLOCATE_ADDRESS,
- EFI_LOADER_DATA, nr_pages, &start);
- if (status == EFI_SUCCESS) {
- *addr = start;
- break;
- }
- }
-
- if (i == map_size / desc_size)
- status = EFI_NOT_FOUND;
-
- efi_bs_call(free_pool, map);
-fail:
- return status;
-}
/**
* efi_free() - free memory pages
nr_pages = round_up(size, EFI_ALLOC_ALIGN) / EFI_PAGE_SIZE;
efi_bs_call(free_pages, addr, nr_pages);
}
-
-/**
- * efi_relocate_kernel() - copy memory area
- * @image_addr: pointer to address of memory area to copy
- * @image_size: size of memory area to copy
- * @alloc_size: minimum size of memory to allocate, must be greater or
- * equal to image_size
- * @preferred_addr: preferred target address
- * @alignment: minimum alignment of the allocated memory area. It
- * should be a power of two.
- * @min_addr: minimum target address
- *
- * Copy a memory area to a newly allocated memory area aligned according
- * to @alignment but at least EFI_ALLOC_ALIGN. If the preferred address
- * is not available, the allocated address will not be below @min_addr.
- * On exit, @image_addr is updated to the target copy address that was used.
- *
- * This function is used to copy the Linux kernel verbatim. It does not apply
- * any relocation changes.
- *
- * Return: status code
- */
-efi_status_t efi_relocate_kernel(unsigned long *image_addr,
- unsigned long image_size,
- unsigned long alloc_size,
- unsigned long preferred_addr,
- unsigned long alignment,
- unsigned long min_addr)
-{
- unsigned long cur_image_addr;
- unsigned long new_addr = 0;
- efi_status_t status;
- unsigned long nr_pages;
- efi_physical_addr_t efi_addr = preferred_addr;
-
- if (!image_addr || !image_size || !alloc_size)
- return EFI_INVALID_PARAMETER;
- if (alloc_size < image_size)
- return EFI_INVALID_PARAMETER;
-
- cur_image_addr = *image_addr;
-
- /*
- * The EFI firmware loader could have placed the kernel image
- * anywhere in memory, but the kernel has restrictions on the
- * max physical address it can run at. Some architectures
- * also have a prefered address, so first try to relocate
- * to the preferred address. If that fails, allocate as low
- * as possible while respecting the required alignment.
- */
- nr_pages = round_up(alloc_size, EFI_ALLOC_ALIGN) / EFI_PAGE_SIZE;
- status = efi_bs_call(allocate_pages, EFI_ALLOCATE_ADDRESS,
- EFI_LOADER_DATA, nr_pages, &efi_addr);
- new_addr = efi_addr;
- /*
- * If preferred address allocation failed allocate as low as
- * possible.
- */
- if (status != EFI_SUCCESS) {
- status = efi_low_alloc_above(alloc_size, alignment, &new_addr,
- min_addr);
- }
- if (status != EFI_SUCCESS) {
- pr_efi_err("Failed to allocate usable memory for kernel.\n");
- return status;
- }
-
- /*
- * We know source/dest won't overlap since both memory ranges
- * have been allocated by UEFI, so we can safely use memcpy.
- */
- memcpy((void *)new_addr, (void *)cur_image_addr, image_size);
-
- /* Return the new address of the relocated image. */
- *image_addr = new_addr;
-
- return status;
-}
if (status != EFI_BUFFER_TOO_SMALL) {
if (status != EFI_SUCCESS && status != EFI_NOT_FOUND)
- pr_efi_err("Failed to locate PCI I/O handles'\n");
+ efi_err("Failed to locate PCI I/O handles'\n");
return;
}
status = efi_bs_call(allocate_pool, EFI_LOADER_DATA, pci_handle_size,
(void **)&pci_handle);
if (status != EFI_SUCCESS) {
- pr_efi_err("Failed to allocate memory for 'pci_handle'\n");
+ efi_err("Failed to allocate memory for 'pci_handle'\n");
return;
}
status = efi_bs_call(locate_handle, EFI_LOCATE_BY_PROTOCOL, &pci_proto,
NULL, &pci_handle_size, pci_handle);
if (status != EFI_SUCCESS) {
- pr_efi_err("Failed to locate PCI I/O handles'\n");
+ efi_err("Failed to locate PCI I/O handles'\n");
goto free_handle;
}
* access to the framebuffer. Drivers for true PCIe graphics
* controllers that are behind a PCIe root port do not use
* DMA to implement the GOP framebuffer anyway [although they
- * may use it in their implentation of Gop->Blt()], and so
+ * may use it in their implementation of Gop->Blt()], and so
* disabling DMA in the PCI bridge should not interfere with
* normal operation of the device.
*/
status = efi_call_proto(pci, pci.write, EfiPciIoWidthUint16,
PCI_COMMAND, 1, &command);
if (status != EFI_SUCCESS)
- pr_efi_err("Failed to disable PCI busmastering\n");
+ efi_err("Failed to disable PCI busmastering\n");
}
free_handle:
if (align < EFI_ALLOC_ALIGN)
align = EFI_ALLOC_ALIGN;
+ size = round_up(size, EFI_ALLOC_ALIGN);
+
/* count the suitable slots in each memory map entry */
for (map_offset = 0; map_offset < map_size; map_offset += desc_size) {
efi_memory_desc_t *md = (void *)memory_map + map_offset;
}
/* find a random number between 0 and total_slots */
- target_slot = (total_slots * (u16)random_seed) >> 16;
+ target_slot = (total_slots * (u64)(random_seed & U32_MAX)) >> 32;
/*
* target_slot is now a value in the range [0, total_slots), and so
}
target = round_up(md->phys_addr, align) + target_slot * align;
- pages = round_up(size, EFI_PAGE_SIZE) / EFI_PAGE_SIZE;
+ pages = size / EFI_PAGE_SIZE;
status = efi_bs_call(allocate_pages, EFI_ALLOCATE_ADDRESS,
EFI_LOADER_DATA, pages, &target);
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0
+
+#include <linux/efi.h>
+#include <asm/efi.h>
+
+#include "efistub.h"
+
+/**
+ * efi_low_alloc_above() - allocate pages at or above given address
+ * @size: size of the memory area to allocate
+ * @align: minimum alignment of the allocated memory area. It should
+ * a power of two.
+ * @addr: on exit the address of the allocated memory
+ * @min: minimum address to used for the memory allocation
+ *
+ * Allocate at the lowest possible address that is not below @min as
+ * EFI_LOADER_DATA. The allocated pages are aligned according to @align but at
+ * least EFI_ALLOC_ALIGN. The first allocated page will not below the address
+ * given by @min.
+ *
+ * Return: status code
+ */
+static efi_status_t efi_low_alloc_above(unsigned long size, unsigned long align,
+ unsigned long *addr, unsigned long min)
+{
+ unsigned long map_size, desc_size, buff_size;
+ efi_memory_desc_t *map;
+ efi_status_t status;
+ unsigned long nr_pages;
+ int i;
+ struct efi_boot_memmap boot_map;
+
+ boot_map.map = ↦
+ boot_map.map_size = &map_size;
+ boot_map.desc_size = &desc_size;
+ boot_map.desc_ver = NULL;
+ boot_map.key_ptr = NULL;
+ boot_map.buff_size = &buff_size;
+
+ status = efi_get_memory_map(&boot_map);
+ if (status != EFI_SUCCESS)
+ goto fail;
+
+ /*
+ * Enforce minimum alignment that EFI or Linux requires when
+ * requesting a specific address. We are doing page-based (or
+ * larger) allocations, and both the address and size must meet
+ * alignment constraints.
+ */
+ if (align < EFI_ALLOC_ALIGN)
+ align = EFI_ALLOC_ALIGN;
+
+ size = round_up(size, EFI_ALLOC_ALIGN);
+ nr_pages = size / EFI_PAGE_SIZE;
+ for (i = 0; i < map_size / desc_size; i++) {
+ efi_memory_desc_t *desc;
+ unsigned long m = (unsigned long)map;
+ u64 start, end;
+
+ desc = efi_early_memdesc_ptr(m, desc_size, i);
+
+ if (desc->type != EFI_CONVENTIONAL_MEMORY)
+ continue;
+
+ if (efi_soft_reserve_enabled() &&
+ (desc->attribute & EFI_MEMORY_SP))
+ continue;
+
+ if (desc->num_pages < nr_pages)
+ continue;
+
+ start = desc->phys_addr;
+ end = start + desc->num_pages * EFI_PAGE_SIZE;
+
+ if (start < min)
+ start = min;
+
+ start = round_up(start, align);
+ if ((start + size) > end)
+ continue;
+
+ status = efi_bs_call(allocate_pages, EFI_ALLOCATE_ADDRESS,
+ EFI_LOADER_DATA, nr_pages, &start);
+ if (status == EFI_SUCCESS) {
+ *addr = start;
+ break;
+ }
+ }
+
+ if (i == map_size / desc_size)
+ status = EFI_NOT_FOUND;
+
+ efi_bs_call(free_pool, map);
+fail:
+ return status;
+}
+
+/**
+ * efi_relocate_kernel() - copy memory area
+ * @image_addr: pointer to address of memory area to copy
+ * @image_size: size of memory area to copy
+ * @alloc_size: minimum size of memory to allocate, must be greater or
+ * equal to image_size
+ * @preferred_addr: preferred target address
+ * @alignment: minimum alignment of the allocated memory area. It
+ * should be a power of two.
+ * @min_addr: minimum target address
+ *
+ * Copy a memory area to a newly allocated memory area aligned according
+ * to @alignment but at least EFI_ALLOC_ALIGN. If the preferred address
+ * is not available, the allocated address will not be below @min_addr.
+ * On exit, @image_addr is updated to the target copy address that was used.
+ *
+ * This function is used to copy the Linux kernel verbatim. It does not apply
+ * any relocation changes.
+ *
+ * Return: status code
+ */
+efi_status_t efi_relocate_kernel(unsigned long *image_addr,
+ unsigned long image_size,
+ unsigned long alloc_size,
+ unsigned long preferred_addr,
+ unsigned long alignment,
+ unsigned long min_addr)
+{
+ unsigned long cur_image_addr;
+ unsigned long new_addr = 0;
+ efi_status_t status;
+ unsigned long nr_pages;
+ efi_physical_addr_t efi_addr = preferred_addr;
+
+ if (!image_addr || !image_size || !alloc_size)
+ return EFI_INVALID_PARAMETER;
+ if (alloc_size < image_size)
+ return EFI_INVALID_PARAMETER;
+
+ cur_image_addr = *image_addr;
+
+ /*
+ * The EFI firmware loader could have placed the kernel image
+ * anywhere in memory, but the kernel has restrictions on the
+ * max physical address it can run at. Some architectures
+ * also have a preferred address, so first try to relocate
+ * to the preferred address. If that fails, allocate as low
+ * as possible while respecting the required alignment.
+ */
+ nr_pages = round_up(alloc_size, EFI_ALLOC_ALIGN) / EFI_PAGE_SIZE;
+ status = efi_bs_call(allocate_pages, EFI_ALLOCATE_ADDRESS,
+ EFI_LOADER_DATA, nr_pages, &efi_addr);
+ new_addr = efi_addr;
+ /*
+ * If preferred address allocation failed allocate as low as
+ * possible.
+ */
+ if (status != EFI_SUCCESS) {
+ status = efi_low_alloc_above(alloc_size, alignment, &new_addr,
+ min_addr);
+ }
+ if (status != EFI_SUCCESS) {
+ efi_err("Failed to allocate usable memory for kernel.\n");
+ return status;
+ }
+
+ /*
+ * We know source/dest won't overlap since both memory ranges
+ * have been allocated by UEFI, so we can safely use memcpy.
+ */
+ memcpy((void *)new_addr, (void *)cur_image_addr, image_size);
+
+ /* Return the new address of the relocated image. */
+ *image_addr = new_addr;
+
+ return status;
+}
return efi_secureboot_mode_disabled;
secure_boot_enabled:
- pr_efi("UEFI Secure Boot is enabled.\n");
+ efi_info("UEFI Secure Boot is enabled.\n");
return efi_secureboot_mode_enabled;
out_efi_err:
- pr_efi_err("Could not determine UEFI Secure Boot status.\n");
+ efi_err("Could not determine UEFI Secure Boot status.\n");
return efi_secureboot_mode_unknown;
}
efi_status_t status;
efi_physical_addr_t log_location = 0, log_last_entry = 0;
struct linux_efi_tpm_eventlog *log_tbl = NULL;
- struct efi_tcg2_final_events_table *final_events_table;
+ struct efi_tcg2_final_events_table *final_events_table = NULL;
unsigned long first_entry_addr, last_entry_addr;
size_t log_size, last_entry_size;
efi_bool_t truncated;
sizeof(*log_tbl) + log_size, (void **)&log_tbl);
if (status != EFI_SUCCESS) {
- efi_printk("Unable to allocate memory for event log\n");
+ efi_err("Unable to allocate memory for event log\n");
return;
}
* Figure out whether any events have already been logged to the
* final events structure, and if so how much space they take up
*/
- final_events_table = get_efi_config_table(LINUX_EFI_TPM_FINAL_LOG_GUID);
+ if (version == EFI_TCG2_EVENT_LOG_FORMAT_TCG_2)
+ final_events_table = get_efi_config_table(LINUX_EFI_TPM_FINAL_LOG_GUID);
if (final_events_table && final_events_table->nr_events) {
struct tcg_pcr_event2_head *header;
int offset;
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0-only
+/* -*- linux-c -*- ------------------------------------------------------- *
+ *
+ * Copyright (C) 1991, 1992 Linus Torvalds
+ * Copyright 2007 rPath, Inc. - All Rights Reserved
+ *
+ * ----------------------------------------------------------------------- */
+
+/*
+ * Oh, it's a waste of space, but oh-so-yummy for debugging.
+ */
+
+#include <stdarg.h>
+
+#include <linux/compiler.h>
+#include <linux/ctype.h>
+#include <linux/kernel.h>
+#include <linux/limits.h>
+#include <linux/string.h>
+#include <linux/types.h>
+
+static
+int skip_atoi(const char **s)
+{
+ int i = 0;
+
+ while (isdigit(**s))
+ i = i * 10 + *((*s)++) - '0';
+ return i;
+}
+
+/*
+ * put_dec_full4 handles numbers in the range 0 <= r < 10000.
+ * The multiplier 0xccd is round(2^15/10), and the approximation
+ * r/10 == (r * 0xccd) >> 15 is exact for all r < 16389.
+ */
+static
+void put_dec_full4(char *end, unsigned int r)
+{
+ int i;
+
+ for (i = 0; i < 3; i++) {
+ unsigned int q = (r * 0xccd) >> 15;
+ *--end = '0' + (r - q * 10);
+ r = q;
+ }
+ *--end = '0' + r;
+}
+
+/* put_dec is copied from lib/vsprintf.c with small modifications */
+
+/*
+ * Call put_dec_full4 on x % 10000, return x / 10000.
+ * The approximation x/10000 == (x * 0x346DC5D7) >> 43
+ * holds for all x < 1,128,869,999. The largest value this
+ * helper will ever be asked to convert is 1,125,520,955.
+ * (second call in the put_dec code, assuming n is all-ones).
+ */
+static
+unsigned int put_dec_helper4(char *end, unsigned int x)
+{
+ unsigned int q = (x * 0x346DC5D7ULL) >> 43;
+
+ put_dec_full4(end, x - q * 10000);
+ return q;
+}
+
+/* Based on code by Douglas W. Jones found at
+ * <http://www.cs.uiowa.edu/~jones/bcd/decimal.html#sixtyfour>
+ * (with permission from the author).
+ * Performs no 64-bit division and hence should be fast on 32-bit machines.
+ */
+static
+char *put_dec(char *end, unsigned long long n)
+{
+ unsigned int d3, d2, d1, q, h;
+ char *p = end;
+
+ d1 = ((unsigned int)n >> 16); /* implicit "& 0xffff" */
+ h = (n >> 32);
+ d2 = (h ) & 0xffff;
+ d3 = (h >> 16); /* implicit "& 0xffff" */
+
+ /* n = 2^48 d3 + 2^32 d2 + 2^16 d1 + d0
+ = 281_4749_7671_0656 d3 + 42_9496_7296 d2 + 6_5536 d1 + d0 */
+ q = 656 * d3 + 7296 * d2 + 5536 * d1 + ((unsigned int)n & 0xffff);
+ q = put_dec_helper4(p, q);
+ p -= 4;
+
+ q += 7671 * d3 + 9496 * d2 + 6 * d1;
+ q = put_dec_helper4(p, q);
+ p -= 4;
+
+ q += 4749 * d3 + 42 * d2;
+ q = put_dec_helper4(p, q);
+ p -= 4;
+
+ q += 281 * d3;
+ q = put_dec_helper4(p, q);
+ p -= 4;
+
+ put_dec_full4(p, q);
+ p -= 4;
+
+ /* strip off the extra 0's we printed */
+ while (p < end && *p == '0')
+ ++p;
+
+ return p;
+}
+
+static
+char *number(char *end, unsigned long long num, int base, char locase)
+{
+ /*
+ * locase = 0 or 0x20. ORing digits or letters with 'locase'
+ * produces same digits or (maybe lowercased) letters
+ */
+
+ /* we are called with base 8, 10 or 16, only, thus don't need "G..." */
+ static const char digits[16] = "0123456789ABCDEF"; /* "GHIJKLMNOPQRSTUVWXYZ"; */
+
+ switch (base) {
+ case 10:
+ if (num != 0)
+ end = put_dec(end, num);
+ break;
+ case 8:
+ for (; num != 0; num >>= 3)
+ *--end = '0' + (num & 07);
+ break;
+ case 16:
+ for (; num != 0; num >>= 4)
+ *--end = digits[num & 0xf] | locase;
+ break;
+ default:
+ unreachable();
+ };
+
+ return end;
+}
+
+#define ZEROPAD 1 /* pad with zero */
+#define SIGN 2 /* unsigned/signed long */
+#define PLUS 4 /* show plus */
+#define SPACE 8 /* space if plus */
+#define LEFT 16 /* left justified */
+#define SMALL 32 /* Must be 32 == 0x20 */
+#define SPECIAL 64 /* 0x */
+#define WIDE 128 /* UTF-16 string */
+
+static
+int get_flags(const char **fmt)
+{
+ int flags = 0;
+
+ do {
+ switch (**fmt) {
+ case '-':
+ flags |= LEFT;
+ break;
+ case '+':
+ flags |= PLUS;
+ break;
+ case ' ':
+ flags |= SPACE;
+ break;
+ case '#':
+ flags |= SPECIAL;
+ break;
+ case '0':
+ flags |= ZEROPAD;
+ break;
+ default:
+ return flags;
+ }
+ ++(*fmt);
+ } while (1);
+}
+
+static
+int get_int(const char **fmt, va_list *ap)
+{
+ if (isdigit(**fmt))
+ return skip_atoi(fmt);
+ if (**fmt == '*') {
+ ++(*fmt);
+ /* it's the next argument */
+ return va_arg(*ap, int);
+ }
+ return 0;
+}
+
+static
+unsigned long long get_number(int sign, int qualifier, va_list *ap)
+{
+ if (sign) {
+ switch (qualifier) {
+ case 'L':
+ return va_arg(*ap, long long);
+ case 'l':
+ return va_arg(*ap, long);
+ case 'h':
+ return (short)va_arg(*ap, int);
+ case 'H':
+ return (signed char)va_arg(*ap, int);
+ default:
+ return va_arg(*ap, int);
+ };
+ } else {
+ switch (qualifier) {
+ case 'L':
+ return va_arg(*ap, unsigned long long);
+ case 'l':
+ return va_arg(*ap, unsigned long);
+ case 'h':
+ return (unsigned short)va_arg(*ap, int);
+ case 'H':
+ return (unsigned char)va_arg(*ap, int);
+ default:
+ return va_arg(*ap, unsigned int);
+ }
+ }
+}
+
+static
+char get_sign(long long *num, int flags)
+{
+ if (!(flags & SIGN))
+ return 0;
+ if (*num < 0) {
+ *num = -(*num);
+ return '-';
+ }
+ if (flags & PLUS)
+ return '+';
+ if (flags & SPACE)
+ return ' ';
+ return 0;
+}
+
+static
+size_t utf16s_utf8nlen(const u16 *s16, size_t maxlen)
+{
+ size_t len, clen;
+
+ for (len = 0; len < maxlen && *s16; len += clen) {
+ u16 c0 = *s16++;
+
+ /* First, get the length for a BMP character */
+ clen = 1 + (c0 >= 0x80) + (c0 >= 0x800);
+ if (len + clen > maxlen)
+ break;
+ /*
+ * If this is a high surrogate, and we're already at maxlen, we
+ * can't include the character if it's a valid surrogate pair.
+ * Avoid accessing one extra word just to check if it's valid
+ * or not.
+ */
+ if ((c0 & 0xfc00) == 0xd800) {
+ if (len + clen == maxlen)
+ break;
+ if ((*s16 & 0xfc00) == 0xdc00) {
+ ++s16;
+ ++clen;
+ }
+ }
+ }
+
+ return len;
+}
+
+static
+u32 utf16_to_utf32(const u16 **s16)
+{
+ u16 c0, c1;
+
+ c0 = *(*s16)++;
+ /* not a surrogate */
+ if ((c0 & 0xf800) != 0xd800)
+ return c0;
+ /* invalid: low surrogate instead of high */
+ if (c0 & 0x0400)
+ return 0xfffd;
+ c1 = **s16;
+ /* invalid: missing low surrogate */
+ if ((c1 & 0xfc00) != 0xdc00)
+ return 0xfffd;
+ /* valid surrogate pair */
+ ++(*s16);
+ return (0x10000 - (0xd800 << 10) - 0xdc00) + (c0 << 10) + c1;
+}
+
+#define PUTC(c) \
+do { \
+ if (pos < size) \
+ buf[pos] = (c); \
+ ++pos; \
+} while (0);
+
+int vsnprintf(char *buf, size_t size, const char *fmt, va_list ap)
+{
+ /* The maximum space required is to print a 64-bit number in octal */
+ char tmp[(sizeof(unsigned long long) * 8 + 2) / 3];
+ char *tmp_end = &tmp[ARRAY_SIZE(tmp)];
+ long long num;
+ int base;
+ const char *s;
+ size_t len, pos;
+ char sign;
+
+ int flags; /* flags to number() */
+
+ int field_width; /* width of output field */
+ int precision; /* min. # of digits for integers; max
+ number of chars for from string */
+ int qualifier; /* 'h', 'hh', 'l' or 'll' for integer fields */
+
+ va_list args;
+
+ /*
+ * We want to pass our input va_list to helper functions by reference,
+ * but there's an annoying edge case. If va_list was originally passed
+ * to us by value, we could just pass &ap down to the helpers. This is
+ * the case on, for example, X86_32.
+ * However, on X86_64 (and possibly others), va_list is actually a
+ * size-1 array containing a structure. Our function parameter ap has
+ * decayed from T[1] to T*, and &ap has type T** rather than T(*)[1],
+ * which is what will be expected by a function taking a va_list *
+ * parameter.
+ * One standard way to solve this mess is by creating a copy in a local
+ * variable of type va_list and then passing a pointer to that local
+ * copy instead, which is what we do here.
+ */
+ va_copy(args, ap);
+
+ for (pos = 0; *fmt; ++fmt) {
+ if (*fmt != '%' || *++fmt == '%') {
+ PUTC(*fmt);
+ continue;
+ }
+
+ /* process flags */
+ flags = get_flags(&fmt);
+
+ /* get field width */
+ field_width = get_int(&fmt, &args);
+ if (field_width < 0) {
+ field_width = -field_width;
+ flags |= LEFT;
+ }
+
+ if (flags & LEFT)
+ flags &= ~ZEROPAD;
+
+ /* get the precision */
+ precision = -1;
+ if (*fmt == '.') {
+ ++fmt;
+ precision = get_int(&fmt, &args);
+ if (precision >= 0)
+ flags &= ~ZEROPAD;
+ }
+
+ /* get the conversion qualifier */
+ qualifier = -1;
+ if (*fmt == 'h' || *fmt == 'l') {
+ qualifier = *fmt;
+ ++fmt;
+ if (qualifier == *fmt) {
+ qualifier -= 'a'-'A';
+ ++fmt;
+ }
+ }
+
+ sign = 0;
+
+ switch (*fmt) {
+ case 'c':
+ flags &= LEFT;
+ s = tmp;
+ if (qualifier == 'l') {
+ ((u16 *)tmp)[0] = (u16)va_arg(args, unsigned int);
+ ((u16 *)tmp)[1] = L'\0';
+ precision = INT_MAX;
+ goto wstring;
+ } else {
+ tmp[0] = (unsigned char)va_arg(args, int);
+ precision = len = 1;
+ }
+ goto output;
+
+ case 's':
+ flags &= LEFT;
+ if (precision < 0)
+ precision = INT_MAX;
+ s = va_arg(args, void *);
+ if (!s)
+ s = precision < 6 ? "" : "(null)";
+ else if (qualifier == 'l') {
+ wstring:
+ flags |= WIDE;
+ precision = len = utf16s_utf8nlen((const u16 *)s, precision);
+ goto output;
+ }
+ precision = len = strnlen(s, precision);
+ goto output;
+
+ /* integer number formats - set up the flags and "break" */
+ case 'o':
+ base = 8;
+ break;
+
+ case 'p':
+ if (precision < 0)
+ precision = 2 * sizeof(void *);
+ fallthrough;
+ case 'x':
+ flags |= SMALL;
+ fallthrough;
+ case 'X':
+ base = 16;
+ break;
+
+ case 'd':
+ case 'i':
+ flags |= SIGN;
+ fallthrough;
+ case 'u':
+ flags &= ~SPECIAL;
+ base = 10;
+ break;
+
+ default:
+ /*
+ * Bail out if the conversion specifier is invalid.
+ * There's probably a typo in the format string and the
+ * remaining specifiers are unlikely to match up with
+ * the arguments.
+ */
+ goto fail;
+ }
+ if (*fmt == 'p') {
+ num = (unsigned long)va_arg(args, void *);
+ } else {
+ num = get_number(flags & SIGN, qualifier, &args);
+ }
+
+ sign = get_sign(&num, flags);
+ if (sign)
+ --field_width;
+
+ s = number(tmp_end, num, base, flags & SMALL);
+ len = tmp_end - s;
+ /* default precision is 1 */
+ if (precision < 0)
+ precision = 1;
+ /* precision is minimum number of digits to print */
+ if (precision < len)
+ precision = len;
+ if (flags & SPECIAL) {
+ /*
+ * For octal, a leading 0 is printed only if necessary,
+ * i.e. if it's not already there because of the
+ * precision.
+ */
+ if (base == 8 && precision == len)
+ ++precision;
+ /*
+ * For hexadecimal, the leading 0x is skipped if the
+ * output is empty, i.e. both the number and the
+ * precision are 0.
+ */
+ if (base == 16 && precision > 0)
+ field_width -= 2;
+ else
+ flags &= ~SPECIAL;
+ }
+ /*
+ * For zero padding, increase the precision to fill the field
+ * width.
+ */
+ if ((flags & ZEROPAD) && field_width > precision)
+ precision = field_width;
+
+output:
+ /* Calculate the padding necessary */
+ field_width -= precision;
+ /* Leading padding with ' ' */
+ if (!(flags & LEFT))
+ while (field_width-- > 0)
+ PUTC(' ');
+ /* sign */
+ if (sign)
+ PUTC(sign);
+ /* 0x/0X for hexadecimal */
+ if (flags & SPECIAL) {
+ PUTC('0');
+ PUTC( 'X' | (flags & SMALL));
+ }
+ /* Zero padding and excess precision */
+ while (precision-- > len)
+ PUTC('0');
+ /* Actual output */
+ if (flags & WIDE) {
+ const u16 *ws = (const u16 *)s;
+
+ while (len-- > 0) {
+ u32 c32 = utf16_to_utf32(&ws);
+ u8 *s8;
+ size_t clen;
+
+ if (c32 < 0x80) {
+ PUTC(c32);
+ continue;
+ }
+
+ /* Number of trailing octets */
+ clen = 1 + (c32 >= 0x800) + (c32 >= 0x10000);
+
+ len -= clen;
+ s8 = (u8 *)&buf[pos];
+
+ /* Avoid writing partial character */
+ PUTC('\0');
+ pos += clen;
+ if (pos >= size)
+ continue;
+
+ /* Set high bits of leading octet */
+ *s8 = (0xf00 >> 1) >> clen;
+ /* Write trailing octets in reverse order */
+ for (s8 += clen; clen; --clen, c32 >>= 6)
+ *s8-- = 0x80 | (c32 & 0x3f);
+ /* Set low bits of leading octet */
+ *s8 |= c32;
+ }
+ } else {
+ while (len-- > 0)
+ PUTC(*s++);
+ }
+ /* Trailing padding with ' ' */
+ while (field_width-- > 0)
+ PUTC(' ');
+ }
+fail:
+ va_end(args);
+
+ if (size)
+ buf[min(pos, size-1)] = '\0';
+
+ return pos;
+}
+
+int snprintf(char *buf, size_t size, const char *fmt, ...)
+{
+ va_list args;
+ int i;
+
+ va_start(args, fmt);
+ i = vsnprintf(buf, size, fmt, args);
+ va_end(args);
+ return i;
+}
/* Maximum physical address for 64-bit kernel with 4-level paging */
#define MAXMEM_X86_64_4LEVEL (1ull << 46)
-static efi_system_table_t *sys_table __efistub_global;
-extern const bool efi_is64;
+const efi_system_table_t *efi_system_table;
extern u32 image_offset;
-
-__pure efi_system_table_t *efi_system_table(void)
-{
- return sys_table;
-}
-
-__attribute_const__ bool efi_is_64bit(void)
-{
- if (IS_ENABLED(CONFIG_EFI_MIXED))
- return efi_is64;
- return IS_ENABLED(CONFIG_X86_64);
-}
+static efi_loaded_image_t *image = NULL;
static efi_status_t
preserve_pci_rom_image(efi_pci_io_protocol_t *pci, struct pci_setup_rom **__rom)
status = efi_bs_call(allocate_pool, EFI_LOADER_DATA, size,
(void **)&rom);
if (status != EFI_SUCCESS) {
- efi_printk("Failed to allocate memory for 'rom'\n");
+ efi_err("Failed to allocate memory for 'rom'\n");
return status;
}
PCI_VENDOR_ID, 1, &rom->vendor);
if (status != EFI_SUCCESS) {
- efi_printk("Failed to read rom->vendor\n");
+ efi_err("Failed to read rom->vendor\n");
goto free_struct;
}
PCI_DEVICE_ID, 1, &rom->devid);
if (status != EFI_SUCCESS) {
- efi_printk("Failed to read rom->devid\n");
+ efi_err("Failed to read rom->devid\n");
goto free_struct;
}
(void **)&pci_handle);
if (status != EFI_SUCCESS) {
- efi_printk("Failed to allocate memory for 'pci_handle'\n");
+ efi_err("Failed to allocate memory for 'pci_handle'\n");
return;
}
return;
if (efi_table_attr(p, version) != 0x10000) {
- efi_printk("Unsupported properties proto version\n");
+ efi_err("Unsupported properties proto version\n");
return;
}
size + sizeof(struct setup_data),
(void **)&new);
if (status != EFI_SUCCESS) {
- efi_printk("Failed to allocate memory for 'properties'\n");
+ efi_err("Failed to allocate memory for 'properties'\n");
return;
}
static void setup_quirks(struct boot_params *boot_params)
{
efi_char16_t *fw_vendor = (efi_char16_t *)(unsigned long)
- efi_table_attr(efi_system_table(), fw_vendor);
+ efi_table_attr(efi_system_table, fw_vendor);
if (!memcmp(fw_vendor, apple, sizeof(apple))) {
if (IS_ENABLED(CONFIG_APPLE_PROPERTIES))
{
struct boot_params *boot_params;
struct setup_header *hdr;
- efi_loaded_image_t *image;
void *image_base;
efi_guid_t proto = LOADED_IMAGE_PROTOCOL_GUID;
int options_size = 0;
unsigned long ramdisk_addr;
unsigned long ramdisk_size;
- sys_table = sys_table_arg;
+ efi_system_table = sys_table_arg;
/* Check if we were booted by the EFI firmware */
- if (sys_table->hdr.signature != EFI_SYSTEM_TABLE_SIGNATURE)
+ if (efi_system_table->hdr.signature != EFI_SYSTEM_TABLE_SIGNATURE)
efi_exit(handle, EFI_INVALID_PARAMETER);
status = efi_bs_call(handle_protocol, handle, &proto, (void **)&image);
if (status != EFI_SUCCESS) {
- efi_printk("Failed to get handle for LOADED_IMAGE_PROTOCOL\n");
+ efi_err("Failed to get handle for LOADED_IMAGE_PROTOCOL\n");
efi_exit(handle, status);
}
image_base = efi_table_attr(image, image_base);
image_offset = (void *)startup_32 - image_base;
- status = efi_allocate_pages(0x4000, (unsigned long *)&boot_params, ULONG_MAX);
+ status = efi_allocate_pages(sizeof(struct boot_params),
+ (unsigned long *)&boot_params, ULONG_MAX);
if (status != EFI_SUCCESS) {
- efi_printk("Failed to allocate lowmem for boot params\n");
+ efi_err("Failed to allocate lowmem for boot params\n");
efi_exit(handle, status);
}
- memset(boot_params, 0x0, 0x4000);
+ memset(boot_params, 0x0, sizeof(struct boot_params));
hdr = &boot_params->hdr;
hdr->type_of_loader = 0x21;
/* Convert unicode cmdline to ascii */
- cmdline_ptr = efi_convert_cmdline(image, &options_size, ULONG_MAX);
+ cmdline_ptr = efi_convert_cmdline(image, &options_size);
if (!cmdline_ptr)
goto fail;
- hdr->cmd_line_ptr = (unsigned long)cmdline_ptr;
- /* Fill in upper bits of command line address, NOP on 32 bit */
- boot_params->ext_cmd_line_ptr = (u64)(unsigned long)cmdline_ptr >> 32;
+ efi_set_u64_split((unsigned long)cmdline_ptr,
+ &hdr->cmd_line_ptr, &boot_params->ext_cmd_line_ptr);
hdr->ramdisk_image = 0;
hdr->ramdisk_size = 0;
- if (efi_is_native()) {
- status = efi_parse_options(cmdline_ptr);
- if (status != EFI_SUCCESS)
- goto fail2;
-
- if (!noinitrd()) {
- status = efi_load_initrd(image, &ramdisk_addr,
- &ramdisk_size,
- hdr->initrd_addr_max,
- ULONG_MAX);
- if (status != EFI_SUCCESS)
- goto fail2;
- hdr->ramdisk_image = ramdisk_addr & 0xffffffff;
- hdr->ramdisk_size = ramdisk_size & 0xffffffff;
- boot_params->ext_ramdisk_image = (u64)ramdisk_addr >> 32;
- boot_params->ext_ramdisk_size = (u64)ramdisk_size >> 32;
- }
- }
-
- efi_stub_entry(handle, sys_table, boot_params);
+ efi_stub_entry(handle, sys_table_arg, boot_params);
/* not reached */
-fail2:
- efi_free(options_size, (unsigned long)cmdline_ptr);
fail:
- efi_free(0x4000, (unsigned long)boot_params);
+ efi_free(sizeof(struct boot_params), (unsigned long)boot_params);
efi_exit(handle, status);
}
struct setup_data **e820ext,
u32 *e820ext_size)
{
- unsigned long map_size, desc_size, buff_size;
- struct efi_boot_memmap boot_map;
- efi_memory_desc_t *map;
+ unsigned long map_size, desc_size, map_key;
efi_status_t status;
- __u32 nr_desc;
+ __u32 nr_desc, desc_version;
- boot_map.map = ↦
- boot_map.map_size = &map_size;
- boot_map.desc_size = &desc_size;
- boot_map.desc_ver = NULL;
- boot_map.key_ptr = NULL;
- boot_map.buff_size = &buff_size;
+ /* Only need the size of the mem map and size of each mem descriptor */
+ map_size = 0;
+ status = efi_bs_call(get_memory_map, &map_size, NULL, &map_key,
+ &desc_size, &desc_version);
+ if (status != EFI_BUFFER_TOO_SMALL)
+ return (status != EFI_SUCCESS) ? status : EFI_UNSUPPORTED;
- status = efi_get_memory_map(&boot_map);
- if (status != EFI_SUCCESS)
- return status;
-
- nr_desc = buff_size / desc_size;
+ nr_desc = map_size / desc_size + EFI_MMAP_NR_SLACK_SLOTS;
if (nr_desc > ARRAY_SIZE(params->e820_table)) {
u32 nr_e820ext = nr_desc - ARRAY_SIZE(params->e820_table);
: EFI32_LOADER_SIGNATURE;
memcpy(&p->efi->efi_loader_signature, signature, sizeof(__u32));
- p->efi->efi_systab = (unsigned long)efi_system_table();
+ efi_set_u64_split((unsigned long)efi_system_table,
+ &p->efi->efi_systab, &p->efi->efi_systab_hi);
p->efi->efi_memdesc_size = *map->desc_size;
p->efi->efi_memdesc_version = *map->desc_ver;
- p->efi->efi_memmap = (unsigned long)*map->map;
+ efi_set_u64_split((unsigned long)*map->map,
+ &p->efi->efi_memmap, &p->efi->efi_memmap_hi);
p->efi->efi_memmap_size = *map->map_size;
-#ifdef CONFIG_X86_64
- p->efi->efi_systab_hi = (unsigned long)efi_system_table() >> 32;
- p->efi->efi_memmap_hi = (unsigned long)*map->map >> 32;
-#endif
-
return EFI_SUCCESS;
}
unsigned long buffer_start, buffer_end;
struct setup_header *hdr = &boot_params->hdr;
efi_status_t status;
- unsigned long cmdline_paddr;
- sys_table = sys_table_arg;
+ efi_system_table = sys_table_arg;
/* Check if we were booted by the EFI firmware */
- if (sys_table->hdr.signature != EFI_SYSTEM_TABLE_SIGNATURE)
+ if (efi_system_table->hdr.signature != EFI_SYSTEM_TABLE_SIGNATURE)
efi_exit(handle, EFI_INVALID_PARAMETER);
/*
hdr->kernel_alignment,
LOAD_PHYSICAL_ADDR);
if (status != EFI_SUCCESS) {
- efi_printk("efi_relocate_kernel() failed!\n");
+ efi_err("efi_relocate_kernel() failed!\n");
goto fail;
}
/*
image_offset = 0;
}
- /*
- * efi_pe_entry() may have been called before efi_main(), in which
- * case this is the second time we parse the cmdline. This is ok,
- * parsing the cmdline multiple times does not have side-effects.
- */
- cmdline_paddr = ((u64)hdr->cmd_line_ptr |
- ((u64)boot_params->ext_cmd_line_ptr << 32));
- efi_parse_options((char *)cmdline_paddr);
+#ifdef CONFIG_CMDLINE_BOOL
+ status = efi_parse_options(CONFIG_CMDLINE);
+ if (status != EFI_SUCCESS) {
+ efi_err("Failed to parse options\n");
+ goto fail;
+ }
+#endif
+ if (!IS_ENABLED(CONFIG_CMDLINE_OVERRIDE)) {
+ unsigned long cmdline_paddr = ((u64)hdr->cmd_line_ptr |
+ ((u64)boot_params->ext_cmd_line_ptr << 32));
+ status = efi_parse_options((char *)cmdline_paddr);
+ if (status != EFI_SUCCESS) {
+ efi_err("Failed to parse options\n");
+ goto fail;
+ }
+ }
/*
- * At this point, an initrd may already have been loaded, either by
- * the bootloader and passed via bootparams, or loaded from a initrd=
- * command line option by efi_pe_entry() above. In either case, we
- * permit an initrd loaded from the LINUX_EFI_INITRD_MEDIA_GUID device
- * path to supersede it.
+ * At this point, an initrd may already have been loaded by the
+ * bootloader and passed via bootparams. We permit an initrd loaded
+ * from the LINUX_EFI_INITRD_MEDIA_GUID device path to supersede it.
+ *
+ * If the device path is not present, any command-line initrd=
+ * arguments will be processed only if image is not NULL, which will be
+ * the case only if we were loaded via the PE entry point.
*/
- if (!noinitrd()) {
+ if (!efi_noinitrd) {
unsigned long addr, size;
- status = efi_load_initrd_dev_path(&addr, &size, ULONG_MAX);
- if (status == EFI_SUCCESS) {
- hdr->ramdisk_image = (u32)addr;
- hdr->ramdisk_size = (u32)size;
- boot_params->ext_ramdisk_image = (u64)addr >> 32;
- boot_params->ext_ramdisk_size = (u64)size >> 32;
- } else if (status != EFI_NOT_FOUND) {
- efi_printk("efi_load_initrd_dev_path() failed!\n");
+ status = efi_load_initrd(image, &addr, &size,
+ hdr->initrd_addr_max, ULONG_MAX);
+
+ if (status != EFI_SUCCESS) {
+ efi_err("Failed to load initrd!\n");
goto fail;
}
+ if (size > 0) {
+ efi_set_u64_split(addr, &hdr->ramdisk_image,
+ &boot_params->ext_ramdisk_image);
+ efi_set_u64_split(size, &hdr->ramdisk_size,
+ &boot_params->ext_ramdisk_size);
+ }
}
/*
status = exit_boot(boot_params, handle);
if (status != EFI_SUCCESS) {
- efi_printk("exit_boot() failed!\n");
+ efi_err("exit_boot() failed!\n");
goto fail;
}
return bzimage_addr;
fail:
- efi_printk("efi_main() failed!\n");
+ efi_err("efi_main() failed!\n");
efi_exit(handle, status);
}
tbl_size = sizeof(*log_tbl) + log_tbl->size;
memblock_reserve(efi.tpm_log, tbl_size);
- if (efi.tpm_final_log == EFI_INVALID_TABLE_ADDR)
+ if (efi.tpm_final_log == EFI_INVALID_TABLE_ADDR ||
+ log_tbl->version != EFI_TCG2_EVENT_LOG_FORMAT_TCG_2) {
+ pr_warn(FW_BUG "TPM Final Events table missing or invalid\n");
goto out;
+ }
final_tbl = early_memremap(efi.tpm_final_log, sizeof(*final_tbl));
RPI_FIRMWARE_GET_FIRMWARE_REVISION,
&packet, sizeof(packet));
- if (ret == 0) {
- struct tm tm;
-
- time64_to_tm(packet, 0, &tm);
+ if (ret)
+ return;
- dev_info(fw->cl.dev,
- "Attached to firmware from %04ld-%02d-%02d %02d:%02d\n",
- tm.tm_year + 1900, tm.tm_mon + 1, tm.tm_mday,
- tm.tm_hour, tm.tm_min);
- }
+ dev_info(fw->cl.dev, "Attached to firmware from %ptT\n", &packet);
}
static void
kona_gpio->reg_base = devm_platform_ioremap_resource(pdev, 0);
if (IS_ERR(kona_gpio->reg_base)) {
- ret = -ENXIO;
+ ret = PTR_ERR(kona_gpio->reg_base);
goto err_irq_domain;
}
mutex_init(&exar_gpio->lock);
index = ida_simple_get(&ida_index, 0, 0, GFP_KERNEL);
- if (index < 0)
- goto err_destroy;
+ if (index < 0) {
+ ret = index;
+ goto err_mutex_destroy;
+ }
sprintf(exar_gpio->name, "exar_gpio%d", index);
exar_gpio->gpio_chip.label = exar_gpio->name;
err_destroy:
ida_simple_remove(&ida_index, index);
+err_mutex_destroy:
mutex_destroy(&exar_gpio->lock);
return ret;
}
{
u32 arm_gpio_lock_val;
- spin_lock(&gs->gc.bgpio_lock);
mutex_lock(yu_arm_gpio_lock_param.lock);
+ spin_lock(&gs->gc.bgpio_lock);
arm_gpio_lock_val = readl(yu_arm_gpio_lock_param.io);
* When lock active bit[31] is set, ModeX is write enabled
*/
if (YU_LOCK_ACTIVE_BIT(arm_gpio_lock_val)) {
- mutex_unlock(yu_arm_gpio_lock_param.lock);
spin_unlock(&gs->gc.bgpio_lock);
+ mutex_unlock(yu_arm_gpio_lock_param.lock);
return -EINVAL;
}
static void mlxbf2_gpio_lock_release(struct mlxbf2_gpio_context *gs)
{
writel(YU_ARM_GPIO_LOCK_RELEASE, yu_arm_gpio_lock_param.io);
- mutex_unlock(yu_arm_gpio_lock_param.lock);
spin_unlock(&gs->gc.bgpio_lock);
+ mutex_unlock(yu_arm_gpio_lock_param.lock);
}
/*
"marvell,armada-370-gpio"))
return 0;
+ /*
+ * There are only two sets of PWM configuration registers for
+ * all the GPIO lines on those SoCs which this driver reserves
+ * for the first two GPIO chips. So if the resource is missing
+ * we can't treat it as an error.
+ */
+ if (!platform_get_resource_byname(pdev, IORESOURCE_MEM, "pwm"))
+ return 0;
+
if (IS_ERR(mvchip->clk))
return PTR_ERR(mvchip->clk);
mvchip->mvpwm = mvpwm;
mvpwm->mvchip = mvchip;
- /*
- * There are only two sets of PWM configuration registers for
- * all the GPIO lines on those SoCs which this driver reserves
- * for the first two GPIO chips. So if the resource is missing
- * we can't treat it as an error.
- */
mvpwm->membase = devm_platform_ioremap_resource_byname(pdev, "pwm");
if (IS_ERR(mvpwm->membase))
return PTR_ERR(mvpwm->membase);
{
struct pca953x_chip *chip = gpiochip_get_data(gc);
- switch (config) {
+ switch (pinconf_to_config_param(config)) {
case PIN_CONFIG_BIAS_PULL_UP:
case PIN_CONFIG_BIAS_PULL_DOWN:
return pca953x_gpio_set_pull_up_down(chip, offset, config);
pchip->irq1 = irq1;
gpio_reg_base = devm_platform_ioremap_resource(pdev, 0);
- if (!gpio_reg_base)
- return -EINVAL;
+ if (IS_ERR(gpio_reg_base))
+ return PTR_ERR(gpio_reg_base);
clk = clk_get(&pdev->dev, NULL);
if (IS_ERR(clk)) {
struct tegra_gpio_info *tgi = bank->tgi;
unsigned int gpio = d->hwirq;
+ tegra_gpio_irq_mask(d);
gpiochip_unlock_as_irq(&tgi->gc, gpio);
}
if (ret)
goto out_free_descs;
}
+
+ atomic_notifier_call_chain(&desc->gdev->notifier,
+ GPIOLINE_CHANGED_REQUESTED, desc);
+
dev_dbg(&gdev->dev, "registered chardev handle for line %d\n",
offset);
}
if (ret)
goto out_free_desc;
+ atomic_notifier_call_chain(&desc->gdev->notifier,
+ GPIOLINE_CHANGED_REQUESTED, desc);
+
le->irq = gpiod_to_irq(desc);
if (le->irq <= 0) {
ret = -ENODEV;
struct gpioline_info *info)
{
struct gpio_chip *gc = desc->gdev->chip;
+ bool ok_for_pinctrl;
unsigned long flags;
+ /*
+ * This function takes a mutex so we must check this before taking
+ * the spinlock.
+ *
+ * FIXME: find a non-racy way to retrieve this information. Maybe a
+ * lock common to both frameworks?
+ */
+ ok_for_pinctrl =
+ pinctrl_gpio_can_use_line(gc->base + info->line_offset);
+
spin_lock_irqsave(&gpio_lock, flags);
if (desc->name) {
test_bit(FLAG_USED_AS_IRQ, &desc->flags) ||
test_bit(FLAG_EXPORT, &desc->flags) ||
test_bit(FLAG_SYSFS, &desc->flags) ||
- !pinctrl_gpio_can_use_line(gc->base + info->line_offset))
+ !ok_for_pinctrl)
info->flags |= GPIOLINE_FLAG_KERNEL;
if (test_bit(FLAG_IS_OUT, &desc->flags))
info->flags |= GPIOLINE_FLAG_IS_OUT;
void __user *ip = (void __user *)arg;
struct gpio_desc *desc;
__u32 offset;
+ int hwgpio;
/* We fail any subsequent ioctl():s when the chip is gone */
if (!gc)
if (IS_ERR(desc))
return PTR_ERR(desc);
+ hwgpio = gpio_chip_hwgpio(desc);
+
+ if (cmd == GPIO_GET_LINEINFO_WATCH_IOCTL &&
+ test_bit(hwgpio, priv->watched_lines))
+ return -EBUSY;
+
gpio_desc_to_lineinfo(desc, &lineinfo);
if (copy_to_user(ip, &lineinfo, sizeof(lineinfo)))
return -EFAULT;
if (cmd == GPIO_GET_LINEINFO_WATCH_IOCTL)
- set_bit(gpio_chip_hwgpio(desc), priv->watched_lines);
+ set_bit(hwgpio, priv->watched_lines);
return 0;
} else if (cmd == GPIO_GET_LINEHANDLE_IOCTL) {
if (IS_ERR(desc))
return PTR_ERR(desc);
- clear_bit(gpio_chip_hwgpio(desc), priv->watched_lines);
+ hwgpio = gpio_chip_hwgpio(desc);
+
+ if (!test_bit(hwgpio, priv->watched_lines))
+ return -EBUSY;
+
+ clear_bit(hwgpio, priv->watched_lines);
return 0;
}
return -EINVAL;
}
done:
spin_unlock_irqrestore(&gpio_lock, flags);
- atomic_notifier_call_chain(&desc->gdev->notifier,
- GPIOLINE_CHANGED_REQUESTED, desc);
return ret;
}
}
}
- if (test_bit(FLAG_IS_OUT, &desc->flags)) {
+ /* To be valid for IRQ the line needs to be input or open drain */
+ if (test_bit(FLAG_IS_OUT, &desc->flags) &&
+ !test_bit(FLAG_OPEN_DRAIN, &desc->flags)) {
chip_err(gc,
"%s: tried to flag a GPIO set as output for IRQ\n",
__func__);
if (!IS_ERR(desc) &&
!WARN_ON(!test_bit(FLAG_USED_AS_IRQ, &desc->flags))) {
- WARN_ON(test_bit(FLAG_IS_OUT, &desc->flags));
+ /*
+ * We must not be output when using IRQ UNLESS we are
+ * open drain.
+ */
+ WARN_ON(test_bit(FLAG_IS_OUT, &desc->flags) &&
+ !test_bit(FLAG_OPEN_DRAIN, &desc->flags));
set_bit(FLAG_IRQ_IS_ENABLED, &desc->flags);
}
}
return ERR_PTR(ret);
}
+ atomic_notifier_call_chain(&desc->gdev->notifier,
+ GPIOLINE_CHANGED_REQUESTED, desc);
+
return desc;
}
EXPORT_SYMBOL_GPL(gpiod_get_index);
return ERR_PTR(ret);
}
+ atomic_notifier_call_chain(&desc->gdev->notifier,
+ GPIOLINE_CHANGED_REQUESTED, desc);
+
return desc;
}
EXPORT_SYMBOL_GPL(fwnode_get_named_gpiod);
gpiolib_initialized = true;
gpiochip_setup_devs();
- if (IS_ENABLED(CONFIG_OF_DYNAMIC))
- WARN_ON(of_reconfig_notifier_register(&gpio_of_notifier));
+#if IS_ENABLED(CONFIG_OF_DYNAMIC) && IS_ENABLED(CONFIG_OF_GPIO)
+ WARN_ON(of_reconfig_notifier_register(&gpio_of_notifier));
+#endif /* CONFIG_OF_DYNAMIC && CONFIG_OF_GPIO */
return ret;
}
/* s3/s4 mask */
bool in_suspend;
+ bool in_hibernate;
/* record last mm index being written through WREG32*/
unsigned long last_mm_index;
}
/* Free the BO*/
- amdgpu_bo_unref(&mem->bo);
+ drm_gem_object_put_unlocked(&mem->bo->tbo.base);
mutex_destroy(&mem->lock);
kfree(mem);
| KFD_IOC_ALLOC_MEM_FLAGS_WRITABLE
| KFD_IOC_ALLOC_MEM_FLAGS_EXECUTABLE;
- (*mem)->bo = amdgpu_bo_ref(bo);
+ drm_gem_object_get(&bo->tbo.base);
+ (*mem)->bo = bo;
(*mem)->va = va;
(*mem)->domain = (bo->preferred_domains & AMDGPU_GEM_DOMAIN_VRAM) ?
AMDGPU_GEM_DOMAIN_VRAM : AMDGPU_GEM_DOMAIN_GTT;
struct amdgpu_device *adev = drm_dev->dev_private;
int r;
+ adev->in_hibernate = true;
r = amdgpu_device_suspend(drm_dev, true);
+ adev->in_hibernate = false;
if (r)
return r;
return amdgpu_asic_reset(adev);
u32 cpp;
u64 flags = AMDGPU_GEM_CREATE_CPU_ACCESS_REQUIRED |
AMDGPU_GEM_CREATE_VRAM_CONTIGUOUS |
- AMDGPU_GEM_CREATE_VRAM_CLEARED |
- AMDGPU_GEM_CREATE_CPU_GTT_USWC;
+ AMDGPU_GEM_CREATE_VRAM_CLEARED;
info = drm_get_format_info(adev->ddev, mode_cmd);
cpp = info->cpp[0];
/* === CGCG /CGLS for GFX 3D Only === */
gfx_v10_0_update_3d_clock_gating(adev, enable);
/* === MGCG + MGLS === */
- /* gfx_v10_0_update_medium_grain_clock_gating(adev, enable); */
+ gfx_v10_0_update_medium_grain_clock_gating(adev, enable);
}
if (adev->cg_flags &
switch (adev->asic_type) {
case CHIP_NAVI10:
case CHIP_NAVI14:
- if (!enable) {
- amdgpu_gfx_off_ctrl(adev, false);
- cancel_delayed_work_sync(&adev->gfx.gfx_off_delay_work);
- } else
- amdgpu_gfx_off_ctrl(adev, true);
+ amdgpu_gfx_off_ctrl(adev, enable);
break;
default:
break;
ref, mask);
}
+static void gfx_v10_0_ring_soft_recovery(struct amdgpu_ring *ring,
+ unsigned vmid)
+{
+ struct amdgpu_device *adev = ring->adev;
+ uint32_t value = 0;
+
+ value = REG_SET_FIELD(value, SQ_CMD, CMD, 0x03);
+ value = REG_SET_FIELD(value, SQ_CMD, MODE, 0x01);
+ value = REG_SET_FIELD(value, SQ_CMD, CHECK_VMID, 1);
+ value = REG_SET_FIELD(value, SQ_CMD, VM_ID, vmid);
+ WREG32_SOC15(GC, 0, mmSQ_CMD, value);
+}
+
static void
gfx_v10_0_set_gfx_eop_interrupt_state(struct amdgpu_device *adev,
uint32_t me, uint32_t pipe,
.emit_wreg = gfx_v10_0_ring_emit_wreg,
.emit_reg_wait = gfx_v10_0_ring_emit_reg_wait,
.emit_reg_write_reg_wait = gfx_v10_0_ring_emit_reg_write_reg_wait,
+ .soft_recovery = gfx_v10_0_ring_soft_recovery,
};
static const struct amdgpu_ring_funcs gfx_v10_0_ring_funcs_compute = {
{ 0x1002, 0x15dd, 0x1002, 0x15dd, 0xc8 },
/* https://bugzilla.kernel.org/show_bug.cgi?id=207171 */
{ 0x1002, 0x15dd, 0x103c, 0x83e7, 0xd3 },
+ /* GFXOFF is unstable on C6 parts with a VBIOS 113-RAVEN-114 */
+ { 0x1002, 0x15dd, 0x1002, 0x15dd, 0xc6 },
{ 0, 0, 0, 0, 0 },
};
switch (adev->asic_type) {
case CHIP_RAVEN:
case CHIP_RENOIR:
- if (!enable) {
+ if (!enable)
amdgpu_gfx_off_ctrl(adev, false);
- cancel_delayed_work_sync(&adev->gfx.gfx_off_delay_work);
- }
+
if (adev->pg_flags & AMD_PG_SUPPORT_RLC_SMU_HS) {
gfx_v9_0_enable_sck_slow_down_on_power_up(adev, true);
gfx_v9_0_enable_sck_slow_down_on_power_down(adev, true);
amdgpu_gfx_off_ctrl(adev, true);
break;
case CHIP_VEGA12:
- if (!enable) {
- amdgpu_gfx_off_ctrl(adev, false);
- cancel_delayed_work_sync(&adev->gfx.gfx_off_delay_work);
- } else {
- amdgpu_gfx_off_ctrl(adev, true);
- }
+ amdgpu_gfx_off_ctrl(adev, enable);
break;
default:
break;
/* Check with device cgroup if @kfd device is accessible */
static inline int kfd_devcgroup_check_permission(struct kfd_dev *kfd)
{
-#if defined(CONFIG_CGROUP_DEVICE)
+#if defined(CONFIG_CGROUP_DEVICE) || defined(CONFIG_CGROUP_BPF)
struct drm_device *ddev = kfd->ddev;
return devcgroup_check_permission(DEVCG_DEV_CHAR, ddev->driver->major,
/**
* dm_crtc_high_irq() - Handles CRTC interrupt
- * @interrupt_params: ignored
+ * @interrupt_params: used for determining the CRTC instance
*
* Handles the CRTC/VSYNC interrupt by notfying DRM's VBLANK
* event handler.
unsigned long flags;
acrtc = get_crtc_by_otg_inst(adev, irq_params->irq_src - IRQ_TYPE_VBLANK);
-
- if (acrtc) {
- acrtc_state = to_dm_crtc_state(acrtc->base.state);
-
- DRM_DEBUG_VBL("crtc:%d, vupdate-vrr:%d\n",
- acrtc->crtc_id,
- amdgpu_dm_vrr_active(acrtc_state));
-
- /* Core vblank handling at start of front-porch is only possible
- * in non-vrr mode, as only there vblank timestamping will give
- * valid results while done in front-porch. Otherwise defer it
- * to dm_vupdate_high_irq after end of front-porch.
- */
- if (!amdgpu_dm_vrr_active(acrtc_state))
- drm_crtc_handle_vblank(&acrtc->base);
-
- /* Following stuff must happen at start of vblank, for crc
- * computation and below-the-range btr support in vrr mode.
- */
- amdgpu_dm_crtc_handle_crc_irq(&acrtc->base);
-
- if (acrtc_state->stream && adev->family >= AMDGPU_FAMILY_AI &&
- acrtc_state->vrr_params.supported &&
- acrtc_state->freesync_config.state == VRR_STATE_ACTIVE_VARIABLE) {
- spin_lock_irqsave(&adev->ddev->event_lock, flags);
- mod_freesync_handle_v_update(
- adev->dm.freesync_module,
- acrtc_state->stream,
- &acrtc_state->vrr_params);
-
- dc_stream_adjust_vmin_vmax(
- adev->dm.dc,
- acrtc_state->stream,
- &acrtc_state->vrr_params.adjust);
- spin_unlock_irqrestore(&adev->ddev->event_lock, flags);
- }
- }
-}
-
-#if defined(CONFIG_DRM_AMD_DC_DCN)
-/**
- * dm_dcn_crtc_high_irq() - Handles VStartup interrupt for DCN generation ASICs
- * @interrupt params - interrupt parameters
- *
- * Notify DRM's vblank event handler at VSTARTUP
- *
- * Unlike DCE hardware, we trigger the handler at VSTARTUP. at which:
- * * We are close enough to VUPDATE - the point of no return for hw
- * * We are in the fixed portion of variable front porch when vrr is enabled
- * * We are before VUPDATE, where double-buffered vrr registers are swapped
- *
- * It is therefore the correct place to signal vblank, send user flip events,
- * and update VRR.
- */
-static void dm_dcn_crtc_high_irq(void *interrupt_params)
-{
- struct common_irq_params *irq_params = interrupt_params;
- struct amdgpu_device *adev = irq_params->adev;
- struct amdgpu_crtc *acrtc;
- struct dm_crtc_state *acrtc_state;
- unsigned long flags;
-
- acrtc = get_crtc_by_otg_inst(adev, irq_params->irq_src - IRQ_TYPE_VBLANK);
-
if (!acrtc)
return;
amdgpu_dm_vrr_active(acrtc_state),
acrtc_state->active_planes);
+ /**
+ * Core vblank handling at start of front-porch is only possible
+ * in non-vrr mode, as only there vblank timestamping will give
+ * valid results while done in front-porch. Otherwise defer it
+ * to dm_vupdate_high_irq after end of front-porch.
+ */
+ if (!amdgpu_dm_vrr_active(acrtc_state))
+ drm_crtc_handle_vblank(&acrtc->base);
+
+ /**
+ * Following stuff must happen at start of vblank, for crc
+ * computation and below-the-range btr support in vrr mode.
+ */
amdgpu_dm_crtc_handle_crc_irq(&acrtc->base);
- drm_crtc_handle_vblank(&acrtc->base);
+
+ /* BTR updates need to happen before VUPDATE on Vega and above. */
+ if (adev->family < AMDGPU_FAMILY_AI)
+ return;
spin_lock_irqsave(&adev->ddev->event_lock, flags);
- if (acrtc_state->vrr_params.supported &&
+ if (acrtc_state->stream && acrtc_state->vrr_params.supported &&
acrtc_state->freesync_config.state == VRR_STATE_ACTIVE_VARIABLE) {
- mod_freesync_handle_v_update(
- adev->dm.freesync_module,
- acrtc_state->stream,
- &acrtc_state->vrr_params);
+ mod_freesync_handle_v_update(adev->dm.freesync_module,
+ acrtc_state->stream,
+ &acrtc_state->vrr_params);
- dc_stream_adjust_vmin_vmax(
- adev->dm.dc,
- acrtc_state->stream,
- &acrtc_state->vrr_params.adjust);
+ dc_stream_adjust_vmin_vmax(adev->dm.dc, acrtc_state->stream,
+ &acrtc_state->vrr_params.adjust);
}
/*
* avoid race conditions between flip programming and completion,
* which could cause too early flip completion events.
*/
- if (acrtc->pflip_status == AMDGPU_FLIP_SUBMITTED &&
+ if (adev->family >= AMDGPU_FAMILY_RV &&
+ acrtc->pflip_status == AMDGPU_FLIP_SUBMITTED &&
acrtc_state->active_planes == 0) {
if (acrtc->event) {
drm_crtc_send_vblank_event(&acrtc->base, acrtc->event);
spin_unlock_irqrestore(&adev->ddev->event_lock, flags);
}
-#endif
static int dm_set_clockgating_state(void *handle,
enum amd_clockgating_state state)
c_irq_params->adev = adev;
c_irq_params->irq_src = int_params.irq_source;
+ amdgpu_dm_irq_register_interrupt(
+ adev, &int_params, dm_crtc_high_irq, c_irq_params);
+ }
+
+ /* Use VUPDATE_NO_LOCK interrupt on DCN, which seems to correspond to
+ * the regular VUPDATE interrupt on DCE. We want DC_IRQ_SOURCE_VUPDATEx
+ * to trigger at end of each vblank, regardless of state of the lock,
+ * matching DCE behaviour.
+ */
+ for (i = DCN_1_0__SRCID__OTG0_IHC_V_UPDATE_NO_LOCK_INTERRUPT;
+ i <= DCN_1_0__SRCID__OTG0_IHC_V_UPDATE_NO_LOCK_INTERRUPT + adev->mode_info.num_crtc - 1;
+ i++) {
+ r = amdgpu_irq_add_id(adev, SOC15_IH_CLIENTID_DCE, i, &adev->vupdate_irq);
+
+ if (r) {
+ DRM_ERROR("Failed to add vupdate irq id!\n");
+ return r;
+ }
+
+ int_params.int_context = INTERRUPT_HIGH_IRQ_CONTEXT;
+ int_params.irq_source =
+ dc_interrupt_to_irq_source(dc, i, 0);
+
+ c_irq_params = &adev->dm.vupdate_params[int_params.irq_source - DC_IRQ_SOURCE_VUPDATE1];
+
+ c_irq_params->adev = adev;
+ c_irq_params->irq_src = int_params.irq_source;
+
amdgpu_dm_irq_register_interrupt(adev, &int_params,
- dm_dcn_crtc_high_irq, c_irq_params);
+ dm_vupdate_high_irq, c_irq_params);
}
/* Use GRPH_PFLIP interrupt */
struct amdgpu_device *adev = crtc->dev->dev_private;
int rc;
- /* Do not set vupdate for DCN hardware */
- if (adev->family > AMDGPU_FAMILY_AI)
- return 0;
-
irq_source = IRQ_TYPE_VUPDATE + acrtc->otg_inst;
rc = dc_interrupt_set(adev->dm.dc, irq_source, enable) ? 0 : -EBUSY;
struct drm_crtc_state *old_crtc_state, *new_crtc_state;
struct dm_crtc_state *dm_new_crtc_state, *dm_old_crtc_state;
struct dm_plane_state *dm_new_plane_state, *dm_old_plane_state;
+ struct amdgpu_crtc *new_acrtc;
bool needs_reset;
int ret = 0;
dm_new_plane_state = to_dm_plane_state(new_plane_state);
dm_old_plane_state = to_dm_plane_state(old_plane_state);
- /*TODO Implement atomic check for cursor plane */
- if (plane->type == DRM_PLANE_TYPE_CURSOR)
+ /*TODO Implement better atomic check for cursor plane */
+ if (plane->type == DRM_PLANE_TYPE_CURSOR) {
+ if (!enable || !new_plane_crtc ||
+ drm_atomic_plane_disabling(plane->state, new_plane_state))
+ return 0;
+
+ new_acrtc = to_amdgpu_crtc(new_plane_crtc);
+
+ if ((new_plane_state->crtc_w > new_acrtc->max_cursor_width) ||
+ (new_plane_state->crtc_h > new_acrtc->max_cursor_height)) {
+ DRM_DEBUG_ATOMIC("Bad cursor size %d x %d\n",
+ new_plane_state->crtc_w, new_plane_state->crtc_h);
+ return -EINVAL;
+ }
+
return 0;
+ }
needs_reset = should_reset_plane(state, plane, old_plane_state,
new_plane_state);
struct mod_hdcp_display *display = &hdcp_work[link_index].display;
struct mod_hdcp_link *link = &hdcp_work[link_index].link;
- memset(display, 0, sizeof(*display));
- memset(link, 0, sizeof(*link));
-
- display->index = aconnector->base.index;
-
if (config->dpms_off) {
hdcp_remove_display(hdcp_work, link_index, aconnector);
return;
}
+
+ memset(display, 0, sizeof(*display));
+ memset(link, 0, sizeof(*link));
+
+ display->index = aconnector->base.index;
display->state = MOD_HDCP_DISPLAY_ACTIVE;
if (aconnector->dc_sink != NULL)
return dpcd_tr_pattern;
}
+static uint8_t dc_dp_initialize_scrambling_data_symbols(
+ struct dc_link *link,
+ enum dc_dp_training_pattern pattern)
+{
+ uint8_t disable_scrabled_data_symbols = 0;
+
+ switch (pattern) {
+ case DP_TRAINING_PATTERN_SEQUENCE_1:
+ case DP_TRAINING_PATTERN_SEQUENCE_2:
+ case DP_TRAINING_PATTERN_SEQUENCE_3:
+ disable_scrabled_data_symbols = 1;
+ break;
+ case DP_TRAINING_PATTERN_SEQUENCE_4:
+ disable_scrabled_data_symbols = 0;
+ break;
+ default:
+ ASSERT(0);
+ DC_LOG_HW_LINK_TRAINING("%s: Invalid HW Training pattern: %d\n",
+ __func__, pattern);
+ break;
+ }
+ return disable_scrabled_data_symbols;
+}
+
static inline bool is_repeater(struct dc_link *link, uint32_t offset)
{
return (!link->is_lttpr_mode_transparent && offset != 0);
dpcd_pattern.v1_4.TRAINING_PATTERN_SET =
dc_dp_training_pattern_to_dpcd_training_pattern(link, pattern);
+ dpcd_pattern.v1_4.SCRAMBLING_DISABLE =
+ dc_dp_initialize_scrambling_data_symbols(link, pattern);
+
dpcd_lt_buffer[DP_TRAINING_PATTERN_SET - DP_TRAINING_PATTERN_SET]
= dpcd_pattern.raw;
hws->funcs.verify_allow_pstate_change_high(dc);
}
+/**
+ * delay_cursor_until_vupdate() - Delay cursor update if too close to VUPDATE.
+ *
+ * Software keepout workaround to prevent cursor update locking from stalling
+ * out cursor updates indefinitely or from old values from being retained in
+ * the case where the viewport changes in the same frame as the cursor.
+ *
+ * The idea is to calculate the remaining time from VPOS to VUPDATE. If it's
+ * too close to VUPDATE, then stall out until VUPDATE finishes.
+ *
+ * TODO: Optimize cursor programming to be once per frame before VUPDATE
+ * to avoid the need for this workaround.
+ */
+static void delay_cursor_until_vupdate(struct dc *dc, struct pipe_ctx *pipe_ctx)
+{
+ struct dc_stream_state *stream = pipe_ctx->stream;
+ struct crtc_position position;
+ uint32_t vupdate_start, vupdate_end;
+ unsigned int lines_to_vupdate, us_to_vupdate, vpos;
+ unsigned int us_per_line, us_vupdate;
+
+ if (!dc->hwss.calc_vupdate_position || !dc->hwss.get_position)
+ return;
+
+ if (!pipe_ctx->stream_res.stream_enc || !pipe_ctx->stream_res.tg)
+ return;
+
+ dc->hwss.calc_vupdate_position(dc, pipe_ctx, &vupdate_start,
+ &vupdate_end);
+
+ dc->hwss.get_position(&pipe_ctx, 1, &position);
+ vpos = position.vertical_count;
+
+ /* Avoid wraparound calculation issues */
+ vupdate_start += stream->timing.v_total;
+ vupdate_end += stream->timing.v_total;
+ vpos += stream->timing.v_total;
+
+ if (vpos <= vupdate_start) {
+ /* VPOS is in VACTIVE or back porch. */
+ lines_to_vupdate = vupdate_start - vpos;
+ } else if (vpos > vupdate_end) {
+ /* VPOS is in the front porch. */
+ return;
+ } else {
+ /* VPOS is in VUPDATE. */
+ lines_to_vupdate = 0;
+ }
+
+ /* Calculate time until VUPDATE in microseconds. */
+ us_per_line =
+ stream->timing.h_total * 10000u / stream->timing.pix_clk_100hz;
+ us_to_vupdate = lines_to_vupdate * us_per_line;
+
+ /* 70 us is a conservative estimate of cursor update time*/
+ if (us_to_vupdate > 70)
+ return;
+
+ /* Stall out until the cursor update completes. */
+ if (vupdate_end < vupdate_start)
+ vupdate_end += stream->timing.v_total;
+ us_vupdate = (vupdate_end - vupdate_start + 1) * us_per_line;
+ udelay(us_to_vupdate + us_vupdate);
+}
+
void dcn10_cursor_lock(struct dc *dc, struct pipe_ctx *pipe, bool lock)
{
/* cursor lock is per MPCC tree, so only need to lock one pipe per stream */
if (!pipe || pipe->top_pipe)
return;
+ /* Prevent cursor lock from stalling out cursor updates. */
+ if (lock)
+ delay_cursor_until_vupdate(dc, pipe);
+
dc->res_pool->mpc->funcs->cursor_lock(dc->res_pool->mpc,
pipe->stream_res.opp->inst, lock);
}
return vertical_line_start;
}
-static void dcn10_calc_vupdate_position(
+void dcn10_calc_vupdate_position(
struct dc *dc,
struct pipe_ctx *pipe_ctx,
uint32_t *start_line,
void dcn10_hw_sequencer_construct(struct dc *dc);
int dcn10_get_vupdate_offset_from_vsync(struct pipe_ctx *pipe_ctx);
+void dcn10_calc_vupdate_position(
+ struct dc *dc,
+ struct pipe_ctx *pipe_ctx,
+ uint32_t *start_line,
+ uint32_t *end_line);
void dcn10_setup_vupdate_interrupt(struct dc *dc, struct pipe_ctx *pipe_ctx);
enum dc_status dcn10_enable_stream_timing(
struct pipe_ctx *pipe_ctx,
.set_clock = dcn10_set_clock,
.get_clock = dcn10_get_clock,
.get_vupdate_offset_from_vsync = dcn10_get_vupdate_offset_from_vsync,
+ .calc_vupdate_position = dcn10_calc_vupdate_position,
};
static const struct hwseq_private_funcs dcn10_private_funcs = {
.init_vm_ctx = dcn20_init_vm_ctx,
.set_flip_control_gsl = dcn20_set_flip_control_gsl,
.get_vupdate_offset_from_vsync = dcn10_get_vupdate_offset_from_vsync,
+ .calc_vupdate_position = dcn10_calc_vupdate_position,
};
static const struct hwseq_private_funcs dcn20_private_funcs = {
.optimize_pwr_state = dcn21_optimize_pwr_state,
.exit_optimized_pwr_state = dcn21_exit_optimized_pwr_state,
.get_vupdate_offset_from_vsync = dcn10_get_vupdate_offset_from_vsync,
+ .calc_vupdate_position = dcn10_calc_vupdate_position,
.set_cursor_position = dcn10_set_cursor_position,
.set_cursor_attribute = dcn10_set_cursor_attribute,
.set_cursor_sdr_white_level = dcn10_set_cursor_sdr_white_level,
endif
CFLAGS_$(AMDDALPATH)/dc/dml/dml1_display_rq_dlg_calc.o := $(dml_ccflags)
CFLAGS_$(AMDDALPATH)/dc/dml/display_rq_dlg_helpers.o := $(dml_ccflags)
-CFLAGS_$(AMDDALPATH)/dc/dml/dml_common_defs.o := $(dml_ccflags)
DML = display_mode_lib.o display_rq_dlg_helpers.o dml1_display_rq_dlg_calc.o \
- dml_common_defs.o
ifdef CONFIG_DRM_AMD_DC_DCN
DML += display_mode_vba.o dcn20/display_rq_dlg_calc_20.o dcn20/display_mode_vba_20.o
#ifndef __DML20_DISPLAY_RQ_DLG_CALC_H__
#define __DML20_DISPLAY_RQ_DLG_CALC_H__
-#include "../dml_common_defs.h"
#include "../display_rq_dlg_helpers.h"
struct display_mode_lib;
#ifndef __DML20V2_DISPLAY_RQ_DLG_CALC_H__
#define __DML20V2_DISPLAY_RQ_DLG_CALC_H__
-#include "../dml_common_defs.h"
#include "../display_rq_dlg_helpers.h"
struct display_mode_lib;
#ifndef __DML21_DISPLAY_RQ_DLG_CALC_H__
#define __DML21_DISPLAY_RQ_DLG_CALC_H__
-#include "../dml_common_defs.h"
+#include "dm_services.h"
#include "../display_rq_dlg_helpers.h"
struct display_mode_lib;
#ifndef __DISPLAY_MODE_LIB_H__
#define __DISPLAY_MODE_LIB_H__
-
-#include "dml_common_defs.h"
+#include "dm_services.h"
+#include "dc_features.h"
+#include "display_mode_structs.h"
+#include "display_mode_enums.h"
#include "display_mode_vba.h"
enum dml_project {
#ifndef __DML2_DISPLAY_MODE_VBA_H__
#define __DML2_DISPLAY_MODE_VBA_H__
-#include "dml_common_defs.h"
-
struct display_mode_lib;
void ModeSupportAndSystemConfiguration(struct display_mode_lib *mode_lib);
#ifndef __DISPLAY_RQ_DLG_HELPERS_H__
#define __DISPLAY_RQ_DLG_HELPERS_H__
-#include "dml_common_defs.h"
#include "display_mode_lib.h"
/* Function: Printer functions
#ifndef __DISPLAY_RQ_DLG_CALC_H__
#define __DISPLAY_RQ_DLG_CALC_H__
-#include "dml_common_defs.h"
-
struct display_mode_lib;
#include "display_rq_dlg_helpers.h"
+++ /dev/null
-/*
- * Copyright 2017 Advanced Micro Devices, Inc.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
- * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- * OTHER DEALINGS IN THE SOFTWARE.
- *
- * Authors: AMD
- *
- */
-
-#include "dml_common_defs.h"
-#include "dcn_calc_math.h"
-
-#include "dml_inline_defs.h"
-
-double dml_round(double a)
-{
- double round_pt = 0.5;
- double ceil = dml_ceil(a, 1);
- double floor = dml_floor(a, 1);
-
- if (a - floor >= round_pt)
- return ceil;
- else
- return floor;
-}
-
-
+++ /dev/null
-/*
- * Copyright 2017 Advanced Micro Devices, Inc.
- *
- * Permission is hereby granted, free of charge, to any person obtaining a
- * copy of this software and associated documentation files (the "Software"),
- * to deal in the Software without restriction, including without limitation
- * the rights to use, copy, modify, merge, publish, distribute, sublicense,
- * and/or sell copies of the Software, and to permit persons to whom the
- * Software is furnished to do so, subject to the following conditions:
- *
- * The above copyright notice and this permission notice shall be included in
- * all copies or substantial portions of the Software.
- *
- * THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
- * IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
- * FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
- * THE COPYRIGHT HOLDER(S) OR AUTHOR(S) BE LIABLE FOR ANY CLAIM, DAMAGES OR
- * OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE,
- * ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR
- * OTHER DEALINGS IN THE SOFTWARE.
- *
- * Authors: AMD
- *
- */
-
-#ifndef __DC_COMMON_DEFS_H__
-#define __DC_COMMON_DEFS_H__
-
-#include "dm_services.h"
-#include "dc_features.h"
-#include "display_mode_structs.h"
-#include "display_mode_enums.h"
-
-
-double dml_round(double a);
-
-#endif /* __DC_COMMON_DEFS_H__ */
#ifndef __DML_INLINE_DEFS_H__
#define __DML_INLINE_DEFS_H__
-#include "dml_common_defs.h"
#include "dcn_calc_math.h"
#include "dml_logger.h"
return (double) dcn_bw_floor2(a, granularity);
}
+static inline double dml_round(double a)
+{
+ double round_pt = 0.5;
+ double ceil = dml_ceil(a, 1);
+ double floor = dml_floor(a, 1);
+
+ if (a - floor >= round_pt)
+ return ceil;
+ else
+ return floor;
+}
+
static inline int dml_log2(double x)
{
return dml_round((double)dcn_bw_log(x, 2));
static inline unsigned int dml_round_to_multiple(unsigned int num,
unsigned int multiple,
- bool up)
+ unsigned char up)
{
unsigned int remainder;
void (*get_position)(struct pipe_ctx **pipe_ctx, int num_pipes,
struct crtc_position *position);
int (*get_vupdate_offset_from_vsync)(struct pipe_ctx *pipe_ctx);
+ void (*calc_vupdate_position)(
+ struct dc *dc,
+ struct pipe_ctx *pipe_ctx,
+ uint32_t *start_line,
+ uint32_t *end_line);
void (*enable_per_frame_crtc_position_reset)(struct dc *dc,
int group_size, struct pipe_ctx *grouped_pipes[]);
void (*enable_timing_synchronization)(struct dc *dc,
if (*level & profile_mode_mask) {
hwmgr->saved_dpm_level = hwmgr->dpm_level;
hwmgr->en_umd_pstate = true;
- amdgpu_device_ip_set_clockgating_state(hwmgr->adev,
- AMD_IP_BLOCK_TYPE_GFX,
- AMD_CG_STATE_UNGATE);
amdgpu_device_ip_set_powergating_state(hwmgr->adev,
AMD_IP_BLOCK_TYPE_GFX,
AMD_PG_STATE_UNGATE);
+ amdgpu_device_ip_set_clockgating_state(hwmgr->adev,
+ AMD_IP_BLOCK_TYPE_GFX,
+ AMD_CG_STATE_UNGATE);
}
} else {
/* exit umd pstate, restore level, enable gfx cg*/
bool use_baco = !smu->is_apu &&
((adev->in_gpu_reset &&
(amdgpu_asic_reset_method(adev) == AMD_RESET_METHOD_BACO)) ||
- (adev->in_runpm && amdgpu_asic_supports_baco(adev)));
+ ((adev->in_runpm || adev->in_hibernate) && amdgpu_asic_supports_baco(adev)));
ret = smu_get_smc_version(smu, NULL, &smu_version);
if (ret) {
if (*level & profile_mode_mask) {
smu_dpm_ctx->saved_dpm_level = smu_dpm_ctx->dpm_level;
smu_dpm_ctx->enable_umd_pstate = true;
- amdgpu_device_ip_set_clockgating_state(smu->adev,
- AMD_IP_BLOCK_TYPE_GFX,
- AMD_CG_STATE_UNGATE);
amdgpu_device_ip_set_powergating_state(smu->adev,
AMD_IP_BLOCK_TYPE_GFX,
AMD_PG_STATE_UNGATE);
+ amdgpu_device_ip_set_clockgating_state(smu->adev,
+ AMD_IP_BLOCK_TYPE_GFX,
+ AMD_CG_STATE_UNGATE);
}
} else {
/* exit umd pstate, restore level, enable gfx cg*/
{
struct drm_dp_mst_port *immediate_upstream_port;
struct drm_dp_mst_port *fec_port;
- struct drm_dp_desc desc = { 0 };
+ struct drm_dp_desc desc = { };
u8 endpoint_fec;
u8 endpoint_dsc;
{ "HVR", 0xaa01, EDID_QUIRK_NON_DESKTOP },
{ "HVR", 0xaa02, EDID_QUIRK_NON_DESKTOP },
- /* Oculus Rift DK1, DK2, and CV1 VR Headsets */
+ /* Oculus Rift DK1, DK2, CV1 and Rift S VR Headsets */
{ "OVR", 0x0001, EDID_QUIRK_NON_DESKTOP },
{ "OVR", 0x0003, EDID_QUIRK_NON_DESKTOP },
{ "OVR", 0x0004, EDID_QUIRK_NON_DESKTOP },
+ { "OVR", 0x0012, EDID_QUIRK_NON_DESKTOP },
/* Windows Mixed Reality Headsets */
{ "ACR", 0x7fce, EDID_QUIRK_NON_DESKTOP },
}
if ((submit->flags & ETNA_SUBMIT_SOFTPIN) &&
- submit->bos[i].va != mapping->iova)
+ submit->bos[i].va != mapping->iova) {
+ etnaviv_gem_mapping_unreference(mapping);
return -EINVAL;
+ }
atomic_inc(&etnaviv_obj->gpu_active);
if (!(gpu->identity.features & meta->feature))
continue;
- if (meta->nr_domains < (index - offset)) {
+ if (index - offset >= meta->nr_domains) {
offset += meta->nr_domains;
continue;
}
if (!ret)
goto err_llb;
else if (ret > 1) {
- DRM_INFO("Reducing the compressed framebuffer size. This may lead to less power savings than a non-reduced-size. Try to increase stolen memory size if available in BIOS.\n");
-
+ DRM_INFO_ONCE("Reducing the compressed framebuffer size. This may lead to less power savings than a non-reduced-size. Try to increase stolen memory size if available in BIOS.\n");
}
fbc->threshold = ret;
struct drm_i915_private *i915 = to_i915(obj->base.dev);
struct i915_vma *vma;
- GEM_BUG_ON(!i915_gem_object_has_pinned_pages(obj));
if (!atomic_read(&obj->bind_count))
return;
void
i915_gem_object_unpin_from_display_plane(struct i915_vma *vma)
{
- struct drm_i915_gem_object *obj = vma->obj;
-
- assert_object_held(obj);
-
/* Bump the LRU to try and avoid premature eviction whilst flipping */
- i915_gem_object_bump_inactive_ggtt(obj);
+ i915_gem_object_bump_inactive_ggtt(vma->obj);
i915_vma_unpin(vma);
}
#define CONTEXT_NOPREEMPT 7
u32 *lrc_reg_state;
- u64 lrc_desc;
+ union {
+ struct {
+ u32 lrca;
+ u32 ccid;
+ };
+ u64 desc;
+ } lrc;
u32 tag; /* cookie passed to HW to track this context on submission */
/* Time on GPU as tracked by the hw. */
return intel_engine_has_preemption(engine);
}
-static inline bool
-intel_engine_has_timeslices(const struct intel_engine_cs *engine)
-{
- if (!IS_ACTIVE(CONFIG_DRM_I915_TIMESLICE_DURATION))
- return false;
-
- return intel_engine_has_semaphores(engine);
-}
-
#endif /* _INTEL_RINGBUFFER_H_ */
if (engine->id == RENDER_CLASS && IS_GEN_RANGE(dev_priv, 4, 7))
drm_printf(m, "\tCCID: 0x%08x\n", ENGINE_READ(engine, CCID));
+ if (HAS_EXECLISTS(dev_priv)) {
+ drm_printf(m, "\tEL_STAT_HI: 0x%08x\n",
+ ENGINE_READ(engine, RING_EXECLIST_STATUS_HI));
+ drm_printf(m, "\tEL_STAT_LO: 0x%08x\n",
+ ENGINE_READ(engine, RING_EXECLIST_STATUS_LO));
+ }
drm_printf(m, "\tRING_START: 0x%08x\n",
ENGINE_READ(engine, RING_START));
drm_printf(m, "\tRING_HEAD: 0x%08x\n",
*/
struct i915_priolist default_priolist;
+ /**
+ * @ccid: identifier for contexts submitted to this engine
+ */
+ u32 ccid;
+
+ /**
+ * @yield: CCID at the time of the last semaphore-wait interrupt.
+ *
+ * Instead of leaving a semaphore busy-spinning on an engine, we would
+ * like to switch to another ready context, i.e. yielding the semaphore
+ * timeslice.
+ */
+ u32 yield;
+
/**
* @error_interrupt: CS Master EIR
*
u32 context_size;
u32 mmio_base;
- unsigned int context_tag;
-#define NUM_CONTEXT_TAG roundup_pow_of_two(2 * EXECLIST_MAX_PORTS)
+ unsigned long context_tag;
struct rb_node uabi_node;
#define I915_ENGINE_SUPPORTS_STATS BIT(1)
#define I915_ENGINE_HAS_PREEMPTION BIT(2)
#define I915_ENGINE_HAS_SEMAPHORES BIT(3)
-#define I915_ENGINE_NEEDS_BREADCRUMB_TASKLET BIT(4)
-#define I915_ENGINE_IS_VIRTUAL BIT(5)
-#define I915_ENGINE_HAS_RELATIVE_MMIO BIT(6)
-#define I915_ENGINE_REQUIRES_CMD_PARSER BIT(7)
+#define I915_ENGINE_HAS_TIMESLICES BIT(4)
+#define I915_ENGINE_NEEDS_BREADCRUMB_TASKLET BIT(5)
+#define I915_ENGINE_IS_VIRTUAL BIT(6)
+#define I915_ENGINE_HAS_RELATIVE_MMIO BIT(7)
+#define I915_ENGINE_REQUIRES_CMD_PARSER BIT(8)
unsigned int flags;
/*
return engine->flags & I915_ENGINE_HAS_SEMAPHORES;
}
+static inline bool
+intel_engine_has_timeslices(const struct intel_engine_cs *engine)
+{
+ if (!IS_ACTIVE(CONFIG_DRM_I915_TIMESLICE_DURATION))
+ return false;
+
+ return engine->flags & I915_ENGINE_HAS_TIMESLICES;
+}
+
static inline bool
intel_engine_needs_breadcrumb_tasklet(const struct intel_engine_cs *engine)
{
}
}
+ if (iir & GT_WAIT_SEMAPHORE_INTERRUPT) {
+ WRITE_ONCE(engine->execlists.yield,
+ ENGINE_READ_FW(engine, RING_EXECLIST_STATUS_HI));
+ ENGINE_TRACE(engine, "semaphore yield: %08x\n",
+ engine->execlists.yield);
+ if (del_timer(&engine->execlists.timer))
+ tasklet = true;
+ }
+
if (iir & GT_CONTEXT_SWITCH_INTERRUPT)
tasklet = true;
const u32 irqs =
GT_CS_MASTER_ERROR_INTERRUPT |
GT_RENDER_USER_INTERRUPT |
- GT_CONTEXT_SWITCH_INTERRUPT;
+ GT_CONTEXT_SWITCH_INTERRUPT |
+ GT_WAIT_SEMAPHORE_INTERRUPT;
struct intel_uncore *uncore = gt->uncore;
const u32 dmask = irqs << 16 | irqs;
const u32 smask = irqs << 16;
const u32 irqs =
GT_CS_MASTER_ERROR_INTERRUPT |
GT_RENDER_USER_INTERRUPT |
- GT_CONTEXT_SWITCH_INTERRUPT;
+ GT_CONTEXT_SWITCH_INTERRUPT |
+ GT_WAIT_SEMAPHORE_INTERRUPT;
const u32 gt_interrupts[] = {
irqs << GEN8_RCS_IRQ_SHIFT | irqs << GEN8_BCS_IRQ_SHIFT,
irqs << GEN8_VCS0_IRQ_SHIFT | irqs << GEN8_VCS1_IRQ_SHIFT,
* engine info, SW context ID and SW counter need to form a unique number
* (Context ID) per lrc.
*/
-static u64
+static u32
lrc_descriptor(struct intel_context *ce, struct intel_engine_cs *engine)
{
- u64 desc;
+ u32 desc;
desc = INTEL_LEGACY_32B_CONTEXT;
if (i915_vm_is_4lvl(ce->vm))
if (IS_GEN(engine->i915, 8))
desc |= GEN8_CTX_L3LLC_COHERENT;
- desc |= i915_ggtt_offset(ce->state); /* bits 12-31 */
- /*
- * The following 32bits are copied into the OA reports (dword 2).
- * Consider updating oa_get_render_ctx_id in i915_perf.c when changing
- * anything below.
- */
- if (INTEL_GEN(engine->i915) >= 11) {
- desc |= (u64)engine->instance << GEN11_ENGINE_INSTANCE_SHIFT;
- /* bits 48-53 */
-
- desc |= (u64)engine->class << GEN11_ENGINE_CLASS_SHIFT;
- /* bits 61-63 */
- }
-
- return desc;
+ return i915_ggtt_offset(ce->state) | desc;
}
static inline unsigned int dword_in_page(void *addr)
__execlists_update_reg_state(ce, engine, head);
/* We've switched away, so this should be a no-op, but intent matters */
- ce->lrc_desc |= CTX_DESC_FORCE_RESTORE;
+ ce->lrc.desc |= CTX_DESC_FORCE_RESTORE;
}
static u32 intel_context_get_runtime(const struct intel_context *ce)
if (IS_ENABLED(CONFIG_DRM_I915_DEBUG_GEM))
execlists_check_context(ce, engine);
- ce->lrc_desc &= ~GENMASK_ULL(47, 37);
if (ce->tag) {
/* Use a fixed tag for OA and friends */
- ce->lrc_desc |= (u64)ce->tag << 32;
+ GEM_BUG_ON(ce->tag <= BITS_PER_LONG);
+ ce->lrc.ccid = ce->tag;
} else {
/* We don't need a strict matching tag, just different values */
- ce->lrc_desc |=
- (u64)(++engine->context_tag % NUM_CONTEXT_TAG) <<
- GEN11_SW_CTX_ID_SHIFT;
- BUILD_BUG_ON(NUM_CONTEXT_TAG > GEN12_MAX_CONTEXT_HW_ID);
+ unsigned int tag = ffs(engine->context_tag);
+
+ GEM_BUG_ON(tag == 0 || tag >= BITS_PER_LONG);
+ clear_bit(tag - 1, &engine->context_tag);
+ ce->lrc.ccid = tag << (GEN11_SW_CTX_ID_SHIFT - 32);
+
+ BUILD_BUG_ON(BITS_PER_LONG > GEN12_MAX_CONTEXT_HW_ID);
}
+ ce->lrc.ccid |= engine->execlists.ccid;
+
__intel_gt_pm_get(engine->gt);
execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_IN);
intel_engine_context_in(engine);
static inline void
__execlists_schedule_out(struct i915_request *rq,
- struct intel_engine_cs * const engine)
+ struct intel_engine_cs * const engine,
+ unsigned int ccid)
{
struct intel_context * const ce = rq->context;
i915_request_completed(rq))
intel_engine_add_retire(engine, ce->timeline);
+ ccid >>= GEN11_SW_CTX_ID_SHIFT - 32;
+ ccid &= GEN12_MAX_CONTEXT_HW_ID;
+ if (ccid < BITS_PER_LONG) {
+ GEM_BUG_ON(ccid == 0);
+ GEM_BUG_ON(test_bit(ccid - 1, &engine->context_tag));
+ set_bit(ccid - 1, &engine->context_tag);
+ }
+
intel_context_update_runtime(ce);
intel_engine_context_out(engine);
execlists_context_status_change(rq, INTEL_CONTEXT_SCHEDULE_OUT);
{
struct intel_context * const ce = rq->context;
struct intel_engine_cs *cur, *old;
+ u32 ccid;
trace_i915_request_out(rq);
+ ccid = rq->context->lrc.ccid;
old = READ_ONCE(ce->inflight);
do
cur = ptr_unmask_bits(old, 2) ? ptr_dec(old) : NULL;
while (!try_cmpxchg(&ce->inflight, &old, cur));
if (!cur)
- __execlists_schedule_out(rq, old);
+ __execlists_schedule_out(rq, old, ccid);
i915_request_put(rq);
}
static u64 execlists_update_context(struct i915_request *rq)
{
struct intel_context *ce = rq->context;
- u64 desc = ce->lrc_desc;
+ u64 desc = ce->lrc.desc;
u32 tail, prev;
/*
*/
wmb();
- ce->lrc_desc &= ~CTX_DESC_FORCE_RESTORE;
+ ce->lrc.desc &= ~CTX_DESC_FORCE_RESTORE;
return desc;
}
struct i915_request *w =
container_of(p->waiter, typeof(*w), sched);
+ if (p->flags & I915_DEPENDENCY_WEAK)
+ continue;
+
/* Leave semaphores spinning on the other engines */
if (w->engine != rq->engine)
continue;
}
static bool
-need_timeslice(struct intel_engine_cs *engine, const struct i915_request *rq)
+need_timeslice(const struct intel_engine_cs *engine,
+ const struct i915_request *rq)
{
int hint;
return hint >= effective_prio(rq);
}
+static bool
+timeslice_yield(const struct intel_engine_execlists *el,
+ const struct i915_request *rq)
+{
+ /*
+ * Once bitten, forever smitten!
+ *
+ * If the active context ever busy-waited on a semaphore,
+ * it will be treated as a hog until the end of its timeslice (i.e.
+ * until it is scheduled out and replaced by a new submission,
+ * possibly even its own lite-restore). The HW only sends an interrupt
+ * on the first miss, and we do know if that semaphore has been
+ * signaled, or even if it is now stuck on another semaphore. Play
+ * safe, yield if it might be stuck -- it will be given a fresh
+ * timeslice in the near future.
+ */
+ return rq->context->lrc.ccid == READ_ONCE(el->yield);
+}
+
+static bool
+timeslice_expired(const struct intel_engine_execlists *el,
+ const struct i915_request *rq)
+{
+ return timer_expired(&el->timer) || timeslice_yield(el, rq);
+}
+
static int
switch_prio(struct intel_engine_cs *engine, const struct i915_request *rq)
{
return READ_ONCE(engine->props.timeslice_duration_ms);
}
-static unsigned long
-active_timeslice(const struct intel_engine_cs *engine)
+static unsigned long active_timeslice(const struct intel_engine_cs *engine)
{
const struct intel_engine_execlists *execlists = &engine->execlists;
const struct i915_request *rq = *execlists->active;
last = NULL;
} else if (need_timeslice(engine, last) &&
- timer_expired(&engine->execlists.timer)) {
+ timeslice_expired(execlists, last)) {
ENGINE_TRACE(engine,
- "expired last=%llx:%lld, prio=%d, hint=%d\n",
+ "expired last=%llx:%lld, prio=%d, hint=%d, yield?=%s\n",
last->fence.context,
last->fence.seqno,
last->sched.attr.priority,
- execlists->queue_priority_hint);
+ execlists->queue_priority_hint,
+ yesno(timeslice_yield(execlists, last)));
ring_set_paused(engine, 1);
defer_active(engine);
}
clear_ports(port + 1, last_port - port);
+ WRITE_ONCE(execlists->yield, -1);
execlists_submit_ports(engine);
set_preempt_timeout(engine, *active);
} else {
if (IS_ERR(vaddr))
return PTR_ERR(vaddr);
- ce->lrc_desc = lrc_descriptor(ce, engine) | CTX_DESC_FORCE_RESTORE;
+ ce->lrc.lrca = lrc_descriptor(ce, engine) | CTX_DESC_FORCE_RESTORE;
ce->lrc_reg_state = vaddr + LRC_STATE_PN * PAGE_SIZE;
__execlists_update_reg_state(ce, engine, ce->ring->tail);
ce, ce->engine, ce->ring, true);
__execlists_update_reg_state(ce, ce->engine, ce->ring->tail);
- ce->lrc_desc |= CTX_DESC_FORCE_RESTORE;
+ ce->lrc.desc |= CTX_DESC_FORCE_RESTORE;
}
static const struct intel_context_ops execlists_context_ops = {
enable_error_interrupt(engine);
- engine->context_tag = 0;
+ engine->context_tag = GENMASK(BITS_PER_LONG - 2, 0);
}
static bool unexpected_starting_state(struct intel_engine_cs *engine)
head, ce->ring->tail);
__execlists_reset_reg_state(ce, engine);
__execlists_update_reg_state(ce, engine, head);
- ce->lrc_desc |= CTX_DESC_FORCE_RESTORE; /* paranoid: GPU was reset! */
+ ce->lrc.desc |= CTX_DESC_FORCE_RESTORE; /* paranoid: GPU was reset! */
unwind:
/* Push back any incomplete requests for replay after the reset. */
engine->flags |= I915_ENGINE_SUPPORTS_STATS;
if (!intel_vgpu_active(engine->i915)) {
engine->flags |= I915_ENGINE_HAS_SEMAPHORES;
- if (HAS_LOGICAL_RING_PREEMPTION(engine->i915))
+ if (HAS_LOGICAL_RING_PREEMPTION(engine->i915)) {
engine->flags |= I915_ENGINE_HAS_PREEMPTION;
+ if (IS_ACTIVE(CONFIG_DRM_I915_TIMESLICE_DURATION))
+ engine->flags |= I915_ENGINE_HAS_TIMESLICES;
+ }
}
if (INTEL_GEN(engine->i915) >= 12)
engine->irq_enable_mask = GT_RENDER_USER_INTERRUPT << shift;
engine->irq_keep_mask = GT_CONTEXT_SWITCH_INTERRUPT << shift;
engine->irq_keep_mask |= GT_CS_MASTER_ERROR_INTERRUPT << shift;
+ engine->irq_keep_mask |= GT_WAIT_SEMAPHORE_INTERRUPT << shift;
}
static void rcs_submission_override(struct intel_engine_cs *engine)
else
execlists->csb_size = GEN11_CSB_ENTRIES;
+ if (INTEL_GEN(engine->i915) >= 11) {
+ execlists->ccid |= engine->instance << (GEN11_ENGINE_INSTANCE_SHIFT - 32);
+ execlists->ccid |= engine->class << (GEN11_ENGINE_CLASS_SHIFT - 32);
+ }
+
reset_csb_pointers(engine);
/* Finally, take ownership and responsibility for cleanup! */
goto err;
}
- cs = intel_ring_begin(rq, 10);
+ cs = intel_ring_begin(rq, 14);
if (IS_ERR(cs)) {
err = PTR_ERR(cs);
goto err;
*cs++ = MI_SEMAPHORE_WAIT |
MI_SEMAPHORE_GLOBAL_GTT |
MI_SEMAPHORE_POLL |
- MI_SEMAPHORE_SAD_NEQ_SDD;
- *cs++ = 0;
+ MI_SEMAPHORE_SAD_GTE_SDD;
+ *cs++ = idx;
*cs++ = offset;
*cs++ = 0;
*cs++ = offset + idx * sizeof(u32);
*cs++ = 0;
+ *cs++ = MI_STORE_DWORD_IMM_GEN4 | MI_USE_GGTT;
+ *cs++ = offset;
+ *cs++ = 0;
+ *cs++ = idx + 1;
+
intel_ring_advance(rq, cs);
rq->sched.attr.priority = I915_PRIORITY_MASK;
for_each_engine(engine, gt, id) {
enum { A1, A2, B1 };
- enum { X = 1, Y, Z };
+ enum { X = 1, Z, Y };
struct i915_request *rq[3] = {};
struct intel_context *ce;
unsigned long heartbeat;
goto err;
}
- rq[0] = create_rewinder(ce, NULL, slot, 1);
+ rq[0] = create_rewinder(ce, NULL, slot, X);
if (IS_ERR(rq[0])) {
intel_context_put(ce);
goto err;
}
- rq[1] = create_rewinder(ce, NULL, slot, 2);
+ rq[1] = create_rewinder(ce, NULL, slot, Y);
intel_context_put(ce);
if (IS_ERR(rq[1]))
goto err;
goto err;
}
- rq[2] = create_rewinder(ce, rq[0], slot, 3);
+ rq[2] = create_rewinder(ce, rq[0], slot, Z);
intel_context_put(ce);
if (IS_ERR(rq[2]))
goto err;
GEM_BUG_ON(!timer_pending(&engine->execlists.timer));
/* ELSP[] = { { A:rq1, A:rq2 }, { B:rq1 } } */
- GEM_BUG_ON(!i915_request_is_active(rq[A1]));
- GEM_BUG_ON(!i915_request_is_active(rq[A2]));
- GEM_BUG_ON(!i915_request_is_active(rq[B1]));
-
- /* Wait for the timeslice to kick in */
- del_timer(&engine->execlists.timer);
- tasklet_hi_schedule(&engine->execlists.tasklet);
- intel_engine_flush_submission(engine);
-
+ if (i915_request_is_active(rq[A2])) { /* semaphore yielded! */
+ /* Wait for the timeslice to kick in */
+ del_timer(&engine->execlists.timer);
+ tasklet_hi_schedule(&engine->execlists.tasklet);
+ intel_engine_flush_submission(engine);
+ }
/* -> ELSP[] = { { A:rq1 }, { B:rq1 } } */
GEM_BUG_ON(!i915_request_is_active(rq[A1]));
GEM_BUG_ON(!i915_request_is_active(rq[B1]));
static void guc_add_request(struct intel_guc *guc, struct i915_request *rq)
{
struct intel_engine_cs *engine = rq->engine;
- u32 ctx_desc = lower_32_bits(rq->context->lrc_desc);
+ u32 ctx_desc = rq->context->lrc.ccid;
u32 ring_tail = intel_ring_set_tail(rq->ring, rq->tail) / sizeof(u64);
guc_wq_item_append(guc, engine->guc_id, ctx_desc,
SKL_FUSE_PG_DIST_STATUS(SKL_PG0) |
SKL_FUSE_PG_DIST_STATUS(SKL_PG1) |
SKL_FUSE_PG_DIST_STATUS(SKL_PG2);
- vgpu_vreg_t(vgpu, LCPLL1_CTL) |=
- LCPLL_PLL_ENABLE |
- LCPLL_PLL_LOCK;
- vgpu_vreg_t(vgpu, LCPLL2_CTL) |= LCPLL_PLL_ENABLE;
-
+ /*
+ * Only 1 PIPE enabled in current vGPU display and PIPE_A is
+ * tied to TRANSCODER_A in HW, so it's safe to assume PIPE_A,
+ * TRANSCODER_A can be enabled. PORT_x depends on the input of
+ * setup_virtual_dp_monitor, we can bind DPLL0 to any PORT_x
+ * so we fixed to DPLL0 here.
+ * Setup DPLL0: DP link clk 1620 MHz, non SSC, DP Mode
+ */
+ vgpu_vreg_t(vgpu, DPLL_CTRL1) =
+ DPLL_CTRL1_OVERRIDE(DPLL_ID_SKL_DPLL0);
+ vgpu_vreg_t(vgpu, DPLL_CTRL1) |=
+ DPLL_CTRL1_LINK_RATE(DPLL_CTRL1_LINK_RATE_1620, DPLL_ID_SKL_DPLL0);
+ vgpu_vreg_t(vgpu, LCPLL1_CTL) =
+ LCPLL_PLL_ENABLE | LCPLL_PLL_LOCK;
+ vgpu_vreg_t(vgpu, DPLL_STATUS) = DPLL_LOCK(DPLL_ID_SKL_DPLL0);
+ /*
+ * Golden M/N are calculated based on:
+ * 24 bpp, 4 lanes, 154000 pixel clk (from virtual EDID),
+ * DP link clk 1620 MHz and non-constant_n.
+ * TODO: calculate DP link symbol clk and stream clk m/n.
+ */
+ vgpu_vreg_t(vgpu, PIPE_DATA_M1(TRANSCODER_A)) = 63 << TU_SIZE_SHIFT;
+ vgpu_vreg_t(vgpu, PIPE_DATA_M1(TRANSCODER_A)) |= 0x5b425e;
+ vgpu_vreg_t(vgpu, PIPE_DATA_N1(TRANSCODER_A)) = 0x800000;
+ vgpu_vreg_t(vgpu, PIPE_LINK_M1(TRANSCODER_A)) = 0x3cd6e;
+ vgpu_vreg_t(vgpu, PIPE_LINK_N1(TRANSCODER_A)) = 0x80000;
}
if (intel_vgpu_has_monitor_on_port(vgpu, PORT_B)) {
+ vgpu_vreg_t(vgpu, DPLL_CTRL2) &=
+ ~DPLL_CTRL2_DDI_CLK_OFF(PORT_B);
+ vgpu_vreg_t(vgpu, DPLL_CTRL2) |=
+ DPLL_CTRL2_DDI_CLK_SEL(DPLL_ID_SKL_DPLL0, PORT_B);
+ vgpu_vreg_t(vgpu, DPLL_CTRL2) |=
+ DPLL_CTRL2_DDI_SEL_OVERRIDE(PORT_B);
vgpu_vreg_t(vgpu, SFUSE_STRAP) |= SFUSE_STRAP_DDIB_DETECTED;
vgpu_vreg_t(vgpu, TRANS_DDI_FUNC_CTL(TRANSCODER_A)) &=
~(TRANS_DDI_BPC_MASK | TRANS_DDI_MODE_SELECT_MASK |
}
if (intel_vgpu_has_monitor_on_port(vgpu, PORT_C)) {
+ vgpu_vreg_t(vgpu, DPLL_CTRL2) &=
+ ~DPLL_CTRL2_DDI_CLK_OFF(PORT_C);
+ vgpu_vreg_t(vgpu, DPLL_CTRL2) |=
+ DPLL_CTRL2_DDI_CLK_SEL(DPLL_ID_SKL_DPLL0, PORT_C);
+ vgpu_vreg_t(vgpu, DPLL_CTRL2) |=
+ DPLL_CTRL2_DDI_SEL_OVERRIDE(PORT_C);
vgpu_vreg_t(vgpu, SDEISR) |= SDE_PORTC_HOTPLUG_CPT;
vgpu_vreg_t(vgpu, TRANS_DDI_FUNC_CTL(TRANSCODER_A)) &=
~(TRANS_DDI_BPC_MASK | TRANS_DDI_MODE_SELECT_MASK |
}
if (intel_vgpu_has_monitor_on_port(vgpu, PORT_D)) {
+ vgpu_vreg_t(vgpu, DPLL_CTRL2) &=
+ ~DPLL_CTRL2_DDI_CLK_OFF(PORT_D);
+ vgpu_vreg_t(vgpu, DPLL_CTRL2) |=
+ DPLL_CTRL2_DDI_CLK_SEL(DPLL_ID_SKL_DPLL0, PORT_D);
+ vgpu_vreg_t(vgpu, DPLL_CTRL2) |=
+ DPLL_CTRL2_DDI_SEL_OVERRIDE(PORT_D);
vgpu_vreg_t(vgpu, SDEISR) |= SDE_PORTD_HOTPLUG_CPT;
vgpu_vreg_t(vgpu, TRANS_DDI_FUNC_CTL(TRANSCODER_A)) &=
~(TRANS_DDI_BPC_MASK | TRANS_DDI_MODE_SELECT_MASK |
shadow_context_descriptor_update(struct intel_context *ce,
struct intel_vgpu_workload *workload)
{
- u64 desc = ce->lrc_desc;
+ u64 desc = ce->lrc.desc;
/*
* Update bits 0-11 of the context descriptor which includes flags
desc |= (u64)workload->ctx_desc.addressing_mode <<
GEN8_CTX_ADDRESSING_MODE_SHIFT;
- ce->lrc_desc = desc;
+ ce->lrc.desc = desc;
}
static int copy_workload_to_ring_buffer(struct intel_vgpu_workload *workload)
for (i = 0; i < GVT_RING_CTX_NR_PDPS; i++) {
struct i915_page_directory * const pd =
i915_pd_entry(ppgtt->pd, i);
-
+ /* skip now as current i915 ppgtt alloc won't allocate
+ top level pdp for non 4-level table, won't impact
+ shadow ppgtt. */
+ if (!pd)
+ break;
px_dma(pd) = mm->ppgtt_mm.shadow_pdps[i];
}
}
active = NULL;
INIT_LIST_HEAD(&eviction_list);
list_for_each_entry_safe(vma, next, &vm->bound_list, vm_link) {
+ if (vma == active) { /* now seen this vma twice */
+ if (flags & PIN_NONBLOCK)
+ break;
+
+ active = ERR_PTR(-EAGAIN);
+ }
+
/*
* We keep this list in a rough least-recently scanned order
* of active elements (inactive elements are cheap to reap).
* To notice when we complete one full cycle, we record the
* first active element seen, before moving it to the tail.
*/
- if (i915_vma_is_active(vma)) {
- if (vma == active) {
- if (flags & PIN_NONBLOCK)
- break;
-
- active = ERR_PTR(-EAGAIN);
- }
-
- if (active != ERR_PTR(-EAGAIN)) {
- if (!active)
- active = vma;
+ if (active != ERR_PTR(-EAGAIN) && i915_vma_is_active(vma)) {
+ if (!active)
+ active = vma;
- list_move_tail(&vma->vm_link, &vm->bound_list);
- continue;
- }
+ list_move_tail(&vma->vm_link, &vm->bound_list);
+ continue;
}
if (mark_free(&scan, vma, flags, &eviction_list))
static void record_request(const struct i915_request *request,
struct i915_request_coredump *erq)
{
- const struct i915_gem_context *ctx;
-
erq->flags = request->fence.flags;
erq->context = request->fence.context;
erq->seqno = request->fence.seqno;
erq->pid = 0;
rcu_read_lock();
- ctx = rcu_dereference(request->context->gem_context);
- if (ctx)
- erq->pid = pid_nr(ctx->pid);
+ if (!intel_context_is_closed(request->context)) {
+ const struct i915_gem_context *ctx;
+
+ ctx = rcu_dereference(request->context->gem_context);
+ if (ctx)
+ erq->pid = pid_nr(ctx->pid);
+ }
rcu_read_unlock();
}
u32 de_pipe_masked = gen8_de_pipe_fault_mask(dev_priv) |
GEN8_PIPE_CDCLK_CRC_DONE;
u32 de_pipe_enables;
- u32 de_port_masked = GEN8_AUX_CHANNEL_A;
+ u32 de_port_masked = gen8_de_port_aux_mask(dev_priv);
u32 de_port_enables;
u32 de_misc_masked = GEN8_DE_EDP_PSR;
enum pipe pipe;
if (INTEL_GEN(dev_priv) <= 10)
de_misc_masked |= GEN8_DE_MISC_GSE;
- if (INTEL_GEN(dev_priv) >= 9) {
- de_port_masked |= GEN9_AUX_CHANNEL_B | GEN9_AUX_CHANNEL_C |
- GEN9_AUX_CHANNEL_D;
- if (IS_GEN9_LP(dev_priv))
- de_port_masked |= BXT_DE_PORT_GMBUS;
- }
-
- if (INTEL_GEN(dev_priv) >= 11)
- de_port_masked |= ICL_AUX_CHANNEL_E;
-
- if (IS_CNL_WITH_PORT_F(dev_priv) || INTEL_GEN(dev_priv) >= 11)
- de_port_masked |= CNL_AUX_CHANNEL_F;
+ if (IS_GEN9_LP(dev_priv))
+ de_port_masked |= BXT_DE_PORT_GMBUS;
de_pipe_enables = de_pipe_masked | GEN8_PIPE_VBLANK |
GEN8_PIPE_FIFO_UNDERRUN;
* dropped by GuC. They won't be part of the context
* ID in the OA reports, so squash those lower bits.
*/
- stream->specific_ctx_id =
- lower_32_bits(ce->lrc_desc) >> 12;
+ stream->specific_ctx_id = ce->lrc.lrca >> 12;
/*
* GuC uses the top bit to signal proxy submission, so
((1U << GEN11_SW_CTX_ID_WIDTH) - 1) << (GEN11_SW_CTX_ID_SHIFT - 32);
/*
* Pick an unused context id
- * 0 - (NUM_CONTEXT_TAG - 1) are used by other contexts
+ * 0 - BITS_PER_LONG are used by other contexts
* GEN12_MAX_CONTEXT_HW_ID (0x7ff) is used by idle context
*/
stream->specific_ctx_id = (GEN12_MAX_CONTEXT_HW_ID - 1) << (GEN11_SW_CTX_ID_SHIFT - 32);
- BUILD_BUG_ON((GEN12_MAX_CONTEXT_HW_ID - 1) < NUM_CONTEXT_TAG);
break;
}
/* Similar to perf's kernel.perf_paranoid_cpu sysctl option
* we check a dev.i915.perf_stream_paranoid sysctl option
* to determine if it's ok to access system wide OA counters
- * without CAP_SYS_ADMIN privileges.
+ * without CAP_PERFMON or CAP_SYS_ADMIN privileges.
*/
if (privileged_op &&
- i915_perf_stream_paranoid && !capable(CAP_SYS_ADMIN)) {
+ i915_perf_stream_paranoid && !perfmon_capable()) {
DRM_DEBUG("Insufficient privileges to open i915 perf stream\n");
ret = -EACCES;
goto err_ctx;
} else
oa_freq_hz = 0;
- if (oa_freq_hz > i915_oa_max_sample_rate &&
- !capable(CAP_SYS_ADMIN)) {
- DRM_DEBUG("OA exponent would exceed the max sampling frequency (sysctl dev.i915.oa_max_sample_rate) %uHz without root privileges\n",
+ if (oa_freq_hz > i915_oa_max_sample_rate && !perfmon_capable()) {
+ DRM_DEBUG("OA exponent would exceed the max sampling frequency (sysctl dev.i915.oa_max_sample_rate) %uHz without CAP_PERFMON or CAP_SYS_ADMIN privileges\n",
i915_oa_max_sample_rate);
return -EACCES;
}
return -EINVAL;
}
- if (i915_perf_stream_paranoid && !capable(CAP_SYS_ADMIN)) {
+ if (i915_perf_stream_paranoid && !perfmon_capable()) {
DRM_DEBUG("Insufficient privileges to add i915 OA config\n");
return -EACCES;
}
return -ENOTSUPP;
}
- if (i915_perf_stream_paranoid && !capable(CAP_SYS_ADMIN)) {
+ if (i915_perf_stream_paranoid && !perfmon_capable()) {
DRM_DEBUG("Insufficient privileges to remove i915 OA config\n");
return -EACCES;
}
#define GT_BSD_CS_ERROR_INTERRUPT (1 << 15)
#define GT_BSD_USER_INTERRUPT (1 << 12)
#define GT_RENDER_L3_PARITY_ERROR_INTERRUPT_S1 (1 << 11) /* hsw+; rsvd on snb, ivb, vlv */
+#define GT_WAIT_SEMAPHORE_INTERRUPT REG_BIT(11) /* bdw+ */
#define GT_CONTEXT_SWITCH_INTERRUPT (1 << 8)
#define GT_RENDER_L3_PARITY_ERROR_INTERRUPT (1 << 5) /* !snb */
#define GT_RENDER_PIPECTL_NOTIFY_INTERRUPT (1 << 4)
GEM_BUG_ON(to == from);
GEM_BUG_ON(to->timeline == from->timeline);
- if (i915_request_completed(from))
+ if (i915_request_completed(from)) {
+ i915_sw_fence_set_error_once(&to->submit, from->fence.error);
return 0;
+ }
if (to->engine->schedule) {
- ret = i915_sched_node_add_dependency(&to->sched, &from->sched);
+ ret = i915_sched_node_add_dependency(&to->sched,
+ &from->sched,
+ I915_DEPENDENCY_EXTERNAL);
if (ret < 0)
return ret;
}
/* Couple the dependency tree for PI on this exposed to->fence */
if (to->engine->schedule) {
- err = i915_sched_node_add_dependency(&to->sched, &from->sched);
+ err = i915_sched_node_add_dependency(&to->sched,
+ &from->sched,
+ I915_DEPENDENCY_WEAK);
if (err < 0)
return err;
}
}
int i915_sched_node_add_dependency(struct i915_sched_node *node,
- struct i915_sched_node *signal)
+ struct i915_sched_node *signal,
+ unsigned long flags)
{
struct i915_dependency *dep;
return -ENOMEM;
if (!__i915_sched_node_add_dependency(node, signal, dep,
- I915_DEPENDENCY_EXTERNAL |
- I915_DEPENDENCY_ALLOC))
+ flags | I915_DEPENDENCY_ALLOC))
i915_dependency_free(dep);
return 0;
unsigned long flags);
int i915_sched_node_add_dependency(struct i915_sched_node *node,
- struct i915_sched_node *signal);
+ struct i915_sched_node *signal,
+ unsigned long flags);
void i915_sched_node_fini(struct i915_sched_node *node);
unsigned long flags;
#define I915_DEPENDENCY_ALLOC BIT(0)
#define I915_DEPENDENCY_EXTERNAL BIT(1)
+#define I915_DEPENDENCY_WEAK BIT(2)
};
#endif /* _I915_SCHEDULER_TYPES_H_ */
lockdep_assert_held(&vma->vm->mutex);
- /*
- * First wait upon any activity as retiring the request may
- * have side-effects such as unpinning or even unbinding this vma.
- *
- * XXX Actually waiting under the vm->mutex is a hinderance and
- * should be pipelined wherever possible. In cases where that is
- * unavoidable, we should lift the wait to before the mutex.
- */
- ret = i915_vma_sync(vma);
- if (ret)
- return ret;
-
if (i915_vma_is_pinned(vma)) {
vma_print_allocator(vma, "is pinned");
return -EAGAIN;
if (!drm_mm_node_allocated(&vma->node))
return 0;
- if (i915_vma_is_bound(vma, I915_VMA_GLOBAL_BIND))
- /* XXX not always required: nop_clear_range */
- wakeref = intel_runtime_pm_get(&vm->i915->runtime_pm);
-
/* Optimistic wait before taking the mutex */
err = i915_vma_sync(vma);
if (err)
goto out_rpm;
+ if (i915_vma_is_pinned(vma)) {
+ vma_print_allocator(vma, "is pinned");
+ return -EAGAIN;
+ }
+
+ if (i915_vma_is_bound(vma, I915_VMA_GLOBAL_BIND))
+ /* XXX not always required: nop_clear_range */
+ wakeref = intel_runtime_pm_get(&vm->i915->runtime_pm);
+
err = mutex_lock_interruptible(&vm->mutex);
if (err)
goto out_rpm;
* WaIncreaseLatencyIPCEnabled: kbl,cfl
* Display WA #1141: kbl,cfl
*/
- if ((IS_KABYLAKE(dev_priv) || IS_COFFEELAKE(dev_priv)) ||
+ if ((IS_KABYLAKE(dev_priv) || IS_COFFEELAKE(dev_priv)) &&
dev_priv->ipc_enabled)
latency += 4;
}
nc = 0;
- for_each_prime_number(num_ctx, 2 * NUM_CONTEXT_TAG) {
+ for_each_prime_number(num_ctx, 2 * BITS_PER_LONG) {
for (; nc < num_ctx; nc++) {
ctx = mock_context(i915, "mock");
if (!ctx)
if (!drm_atomic_crtc_needs_modeset(state))
return 0;
- if (state->mode.hdisplay > priv->soc_info->max_height ||
- state->mode.vdisplay > priv->soc_info->max_width)
+ if (state->mode.hdisplay > priv->soc_info->max_width ||
+ state->mode.vdisplay > priv->soc_info->max_height)
return -EINVAL;
rate = clk_round_rate(priv->pix_clk,
static irqreturn_t ingenic_drm_irq_handler(int irq, void *arg)
{
- struct ingenic_drm *priv = arg;
+ struct ingenic_drm *priv = drm_device_get_priv(arg);
unsigned int state;
regmap_read(priv->map, JZ_REG_LCD_STATE, &state);
if (priv->afbcd.ops)
priv->afbcd.ops->init(priv);
- drm_mode_config_helper_resume(priv->drm);
-
- return 0;
+ return drm_mode_config_helper_resume(priv->drm);
}
static int compare_of(struct device *dev, void *data)
static bool host1x_drm_wants_iommu(struct host1x_device *dev)
{
+ struct host1x *host1x = dev_get_drvdata(dev->dev.parent);
struct iommu_domain *domain;
/*
* sufficient and whether or not the host1x is attached to an IOMMU
* doesn't matter.
*/
- if (!domain && dma_get_mask(dev->dev.parent) <= DMA_BIT_MASK(32))
+ if (!domain && host1x_get_dma_mask(host1x) <= DMA_BIT_MASK(32))
return true;
return domain != NULL;
extern int vmw_bo_init(struct vmw_private *dev_priv,
struct vmw_buffer_object *vmw_bo,
size_t size, struct ttm_placement *placement,
- bool interuptable,
+ bool interruptible,
void (*bo_free)(struct ttm_buffer_object *bo));
extern int vmw_user_bo_verify_access(struct ttm_buffer_object *bo,
struct ttm_object_file *tfile);
struct vmw_fence_manager *fman = fman_from_fence(fence);
if (test_bit(DMA_FENCE_FLAG_SIGNALED_BIT, &fence->base.flags))
- return 1;
+ return true;
vmw_fences_update(fman);
struct vmw_surface_metadata *metadata;
struct ttm_base_object *base;
uint32_t backup_handle;
- int ret = -EINVAL;
+ int ret;
ret = vmw_surface_handle_reference(dev_priv, file_priv, req->sid,
req->handle_type, &base);
}
}
+static bool host1x_wants_iommu(struct host1x *host1x)
+{
+ /*
+ * If we support addressing a maximum of 32 bits of physical memory
+ * and if the host1x firewall is enabled, there's no need to enable
+ * IOMMU support. This can happen for example on Tegra20, Tegra30
+ * and Tegra114.
+ *
+ * Tegra124 and later can address up to 34 bits of physical memory and
+ * many platforms come equipped with more than 2 GiB of system memory,
+ * which requires crossing the 4 GiB boundary. But there's a catch: on
+ * SoCs before Tegra186 (i.e. Tegra124 and Tegra210), the host1x can
+ * only address up to 32 bits of memory in GATHER opcodes, which means
+ * that command buffers need to either be in the first 2 GiB of system
+ * memory (which could quickly lead to memory exhaustion), or command
+ * buffers need to be treated differently from other buffers (which is
+ * not possible with the current ABI).
+ *
+ * A third option is to use the IOMMU in these cases to make sure all
+ * buffers will be mapped into a 32-bit IOVA space that host1x can
+ * address. This allows all of the system memory to be used and works
+ * within the limitations of the host1x on these SoCs.
+ *
+ * In summary, default to enable IOMMU on Tegra124 and later. For any
+ * of the earlier SoCs, only use the IOMMU for additional safety when
+ * the host1x firewall is disabled.
+ */
+ if (host1x->info->dma_mask <= DMA_BIT_MASK(32)) {
+ if (IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL))
+ return false;
+ }
+
+ return true;
+}
+
static struct iommu_domain *host1x_iommu_attach(struct host1x *host)
{
struct iommu_domain *domain = iommu_get_domain_for_dev(host->dev);
int err;
/*
- * If the host1x firewall is enabled, there's no need to enable IOMMU
- * support. Similarly, if host1x is already attached to an IOMMU (via
- * the DMA API), don't try to attach again.
+ * We may not always want to enable IOMMU support (for example if the
+ * host1x firewall is already enabled and we don't support addressing
+ * more than 32 bits of physical memory), so check for that first.
+ *
+ * Similarly, if host1x is already attached to an IOMMU (via the DMA
+ * API), don't try to attach again.
*/
- if (IS_ENABLED(CONFIG_TEGRA_HOST1X_FIREWALL) || domain)
+ if (!host1x_wants_iommu(host) || domain)
return domain;
host->group = iommu_group_get(host->dev);
}
module_exit(tegra_host1x_exit);
+/**
+ * host1x_get_dma_mask() - query the supported DMA mask for host1x
+ * @host1x: host1x instance
+ *
+ * Note that this returns the supported DMA mask for host1x, which can be
+ * different from the applicable DMA mask under certain circumstances.
+ */
+u64 host1x_get_dma_mask(struct host1x *host1x)
+{
+ return host1x->info->dma_mask;
+}
+EXPORT_SYMBOL(host1x_get_dma_mask);
+
MODULE_AUTHOR("Thierry Reding <thierry.reding@avionic-design.de>");
MODULE_AUTHOR("Terje Bergstrom <tbergstrom@nvidia.com>");
MODULE_DESCRIPTION("Host1x driver for Tegra products");
This driver can also be built as a module. If so, the module
will be called fam15h_power.
+config SENSORS_AMD_ENERGY
+ tristate "AMD RAPL MSR based Energy driver"
+ depends on X86
+ help
+ If you say yes here you get support for core and package energy
+ sensors, based on RAPL MSR for AMD family 17h and above CPUs.
+
+ This driver can also be built as a module. If so, the module
+ will be called as amd_energy.
+
config SENSORS_APPLESMC
tristate "Apple SMC (Motion sensor, light sensor, keyboard backlight)"
depends on INPUT && X86
This driver can also be built as a module. If so, the module
will be called atxp1.
+config SENSORS_BT1_PVT
+ tristate "Baikal-T1 Process, Voltage, Temperature sensor driver"
+ depends on MIPS_BAIKAL_T1 || COMPILE_TEST
+ help
+ If you say yes here you get support for Baikal-T1 PVT sensor
+ embedded into the SoC.
+
+ This driver can also be built as a module. If so, the module will be
+ called bt1-pvt.
+
+config SENSORS_BT1_PVT_ALARMS
+ bool "Enable Baikal-T1 PVT sensor alarms"
+ depends on SENSORS_BT1_PVT
+ help
+ Baikal-T1 PVT IP-block provides threshold registers for each
+ supported sensor. But the corresponding interrupts might be
+ generated by the thresholds comparator only in synchronization with
+ a data conversion. Additionally there is only one sensor data can
+ be converted at a time. All of these makes the interface impossible
+ to be used for the hwmon alarms implementation without periodic
+ switch between the PVT sensors. By default the data conversion is
+ performed on demand from the user-space. If this config is enabled
+ the data conversion will be periodically performed and the data will be
+ saved in the internal driver cache.
+
config SENSORS_DRIVETEMP
tristate "Hard disk drives with temperature sensors"
depends on SCSI && ATA
This driver can also be built as a module. If so, the module
will be called f75375s.
+config SENSORS_GSC
+ tristate "Gateworks System Controller ADC"
+ depends on MFD_GATEWORKS_GSC
+ help
+ Support for the Gateworks System Controller A/D converters.
+
+ To compile this driver as a module, choose M here:
+ the module will be called gsc-hwmon.
+
config SENSORS_MC13783_ADC
tristate "Freescale MC13783/MC13892 ADC"
depends on MFD_MC13XXX
help
If you say yes here you get support for National Semiconductor LM90,
LM86, LM89 and LM99, Analog Devices ADM1032, ADT7461, and ADT7461A,
- Maxim MAX6646, MAX6647, MAX6648, MAX6649, MAX6657, MAX6658, MAX6659,
- MAX6680, MAX6681, MAX6692, MAX6695, MAX6696, ON Semiconductor NCT1008,
- Winbond/Nuvoton W83L771W/G/AWG/ASG, Philips SA56004, GMT G781, and
- Texas Instruments TMP451 sensor chips.
+ Maxim MAX6646, MAX6647, MAX6648, MAX6649, MAX6654, MAX6657, MAX6658,
+ MAX6659, MAX6680, MAX6681, MAX6692, MAX6695, MAX6696,
+ ON Semiconductor NCT1008, Winbond/Nuvoton W83L771W/G/AWG/ASG,
+ Philips SA56004, GMT G781, and Texas Instruments TMP451
+ sensor chips.
This driver can also be built as a module. If so, the module
will be called lm90.
config SENSORS_NCT7904
tristate "Nuvoton NCT7904"
- depends on I2C
+ depends on I2C && WATCHDOG
+ select WATCHDOG_CORE
help
If you say yes here you get support for the Nuvoton NCT7904
- hardware monitoring chip, including manual fan speed control.
+ hardware monitoring chip, including manual fan speed control
+ and support for the integrated watchdog.
This driver can also be built as a module. If so, the module
will be called nct7904.
obj-$(CONFIG_SENSORS_ADT7462) += adt7462.o
obj-$(CONFIG_SENSORS_ADT7470) += adt7470.o
obj-$(CONFIG_SENSORS_ADT7475) += adt7475.o
+obj-$(CONFIG_SENSORS_AMD_ENERGY) += amd_energy.o
obj-$(CONFIG_SENSORS_APPLESMC) += applesmc.o
obj-$(CONFIG_SENSORS_ARM_SCMI) += scmi-hwmon.o
obj-$(CONFIG_SENSORS_ARM_SCPI) += scpi-hwmon.o
obj-$(CONFIG_SENSORS_ASPEED) += aspeed-pwm-tacho.o
obj-$(CONFIG_SENSORS_ATXP1) += atxp1.o
obj-$(CONFIG_SENSORS_AXI_FAN_CONTROL) += axi-fan-control.o
+obj-$(CONFIG_SENSORS_BT1_PVT) += bt1-pvt.o
obj-$(CONFIG_SENSORS_CORETEMP) += coretemp.o
obj-$(CONFIG_SENSORS_DA9052_ADC)+= da9052-hwmon.o
obj-$(CONFIG_SENSORS_DA9055)+= da9055-hwmon.o
obj-$(CONFIG_SENSORS_G762) += g762.o
obj-$(CONFIG_SENSORS_GL518SM) += gl518sm.o
obj-$(CONFIG_SENSORS_GL520SM) += gl520sm.o
+obj-$(CONFIG_SENSORS_GSC) += gsc-hwmon.o
obj-$(CONFIG_SENSORS_GPIO_FAN) += gpio-fan.o
obj-$(CONFIG_SENSORS_HIH6130) += hih6130.o
obj-$(CONFIG_SENSORS_ULTRA45) += ultra45_env.o
module_i2c_driver(adt7411_driver);
-MODULE_AUTHOR("Sascha Hauer <s.hauer@pengutronix.de> and "
- "Wolfram Sang <w.sang@pengutronix.de>");
+MODULE_AUTHOR("Sascha Hauer, Wolfram Sang <kernel@pengutronix.de>");
MODULE_DESCRIPTION("ADT7411 driver");
MODULE_LICENSE("GPL v2");
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0-only
+
+/*
+ * Copyright (C) 2020 Advanced Micro Devices, Inc.
+ */
+#include <asm/cpu_device_id.h>
+
+#include <linux/bits.h>
+#include <linux/cpu.h>
+#include <linux/cpumask.h>
+#include <linux/delay.h>
+#include <linux/device.h>
+#include <linux/hwmon.h>
+#include <linux/kernel.h>
+#include <linux/kthread.h>
+#include <linux/list.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/processor.h>
+#include <linux/platform_device.h>
+#include <linux/sched.h>
+#include <linux/slab.h>
+#include <linux/topology.h>
+#include <linux/types.h>
+
+#define DRVNAME "amd_energy"
+
+#define ENERGY_PWR_UNIT_MSR 0xC0010299
+#define ENERGY_CORE_MSR 0xC001029A
+#define ENERGY_PKG_MSR 0xC001029B
+
+#define AMD_ENERGY_UNIT_MASK 0x01F00
+#define AMD_ENERGY_MASK 0xFFFFFFFF
+
+struct sensor_accumulator {
+ u64 energy_ctr;
+ u64 prev_value;
+ char label[10];
+};
+
+struct amd_energy_data {
+ struct hwmon_channel_info energy_info;
+ const struct hwmon_channel_info *info[2];
+ struct hwmon_chip_info chip;
+ struct task_struct *wrap_accumulate;
+ /* Lock around the accumulator */
+ struct mutex lock;
+ /* An accumulator for each core and socket */
+ struct sensor_accumulator *accums;
+ /* Energy Status Units */
+ u64 energy_units;
+ int nr_cpus;
+ int nr_socks;
+ int core_id;
+};
+
+static int amd_energy_read_labels(struct device *dev,
+ enum hwmon_sensor_types type,
+ u32 attr, int channel,
+ const char **str)
+{
+ struct amd_energy_data *data = dev_get_drvdata(dev);
+
+ *str = data->accums[channel].label;
+ return 0;
+}
+
+static void get_energy_units(struct amd_energy_data *data)
+{
+ u64 rapl_units;
+
+ rdmsrl_safe(ENERGY_PWR_UNIT_MSR, &rapl_units);
+ data->energy_units = (rapl_units & AMD_ENERGY_UNIT_MASK) >> 8;
+}
+
+static void accumulate_socket_delta(struct amd_energy_data *data,
+ int sock, int cpu)
+{
+ struct sensor_accumulator *s_accum;
+ u64 input;
+
+ mutex_lock(&data->lock);
+ rdmsrl_safe_on_cpu(cpu, ENERGY_PKG_MSR, &input);
+ input &= AMD_ENERGY_MASK;
+
+ s_accum = &data->accums[data->nr_cpus + sock];
+ if (input >= s_accum->prev_value)
+ s_accum->energy_ctr +=
+ input - s_accum->prev_value;
+ else
+ s_accum->energy_ctr += UINT_MAX -
+ s_accum->prev_value + input;
+
+ s_accum->prev_value = input;
+ mutex_unlock(&data->lock);
+}
+
+static void accumulate_core_delta(struct amd_energy_data *data)
+{
+ struct sensor_accumulator *c_accum;
+ u64 input;
+ int cpu;
+
+ mutex_lock(&data->lock);
+ if (data->core_id >= data->nr_cpus)
+ data->core_id = 0;
+
+ cpu = data->core_id;
+
+ if (!cpu_online(cpu))
+ goto out;
+
+ rdmsrl_safe_on_cpu(cpu, ENERGY_CORE_MSR, &input);
+ input &= AMD_ENERGY_MASK;
+
+ c_accum = &data->accums[cpu];
+
+ if (input >= c_accum->prev_value)
+ c_accum->energy_ctr +=
+ input - c_accum->prev_value;
+ else
+ c_accum->energy_ctr += UINT_MAX -
+ c_accum->prev_value + input;
+
+ c_accum->prev_value = input;
+
+out:
+ data->core_id++;
+ mutex_unlock(&data->lock);
+}
+
+static void read_accumulate(struct amd_energy_data *data)
+{
+ int sock;
+
+ for (sock = 0; sock < data->nr_socks; sock++) {
+ int cpu;
+
+ cpu = cpumask_first_and(cpu_online_mask,
+ cpumask_of_node(sock));
+
+ accumulate_socket_delta(data, sock, cpu);
+ }
+
+ accumulate_core_delta(data);
+}
+
+static void amd_add_delta(struct amd_energy_data *data, int ch,
+ int cpu, long *val, bool is_core)
+{
+ struct sensor_accumulator *s_accum, *c_accum;
+ u64 input;
+
+ mutex_lock(&data->lock);
+ if (!is_core) {
+ rdmsrl_safe_on_cpu(cpu, ENERGY_PKG_MSR, &input);
+ input &= AMD_ENERGY_MASK;
+
+ s_accum = &data->accums[ch];
+ if (input >= s_accum->prev_value)
+ input += s_accum->energy_ctr -
+ s_accum->prev_value;
+ else
+ input += UINT_MAX - s_accum->prev_value +
+ s_accum->energy_ctr;
+ } else {
+ rdmsrl_safe_on_cpu(cpu, ENERGY_CORE_MSR, &input);
+ input &= AMD_ENERGY_MASK;
+
+ c_accum = &data->accums[ch];
+ if (input >= c_accum->prev_value)
+ input += c_accum->energy_ctr -
+ c_accum->prev_value;
+ else
+ input += UINT_MAX - c_accum->prev_value +
+ c_accum->energy_ctr;
+ }
+
+ /* Energy consumed = (1/(2^ESU) * RAW * 1000000UL) μJoules */
+ *val = div64_ul(input * 1000000UL, BIT(data->energy_units));
+
+ mutex_unlock(&data->lock);
+}
+
+static int amd_energy_read(struct device *dev,
+ enum hwmon_sensor_types type,
+ u32 attr, int channel, long *val)
+{
+ struct amd_energy_data *data = dev_get_drvdata(dev);
+ int cpu;
+
+ if (channel >= data->nr_cpus) {
+ cpu = cpumask_first_and(cpu_online_mask,
+ cpumask_of_node
+ (channel - data->nr_cpus));
+ amd_add_delta(data, channel, cpu, val, false);
+ } else {
+ cpu = channel;
+ if (!cpu_online(cpu))
+ return -ENODEV;
+
+ amd_add_delta(data, channel, cpu, val, true);
+ }
+
+ return 0;
+}
+
+static umode_t amd_energy_is_visible(const void *_data,
+ enum hwmon_sensor_types type,
+ u32 attr, int channel)
+{
+ return 0444;
+}
+
+static int energy_accumulator(void *p)
+{
+ struct amd_energy_data *data = (struct amd_energy_data *)p;
+
+ while (!kthread_should_stop()) {
+ /*
+ * Ignoring the conditions such as
+ * cpu being offline or rdmsr failure
+ */
+ read_accumulate(data);
+
+ set_current_state(TASK_INTERRUPTIBLE);
+ if (kthread_should_stop())
+ break;
+
+ /*
+ * On a 240W system, with default resolution the
+ * Socket Energy status register may wrap around in
+ * 2^32*15.3 e-6/240 = 273.8041 secs (~4.5 mins)
+ *
+ * let us accumulate for every 100secs
+ */
+ schedule_timeout(msecs_to_jiffies(100000));
+ }
+ return 0;
+}
+
+static const struct hwmon_ops amd_energy_ops = {
+ .is_visible = amd_energy_is_visible,
+ .read = amd_energy_read,
+ .read_string = amd_energy_read_labels,
+};
+
+static int amd_create_sensor(struct device *dev,
+ struct amd_energy_data *data,
+ u8 type, u32 config)
+{
+ struct hwmon_channel_info *info = &data->energy_info;
+ struct sensor_accumulator *accums;
+ int i, num_siblings, cpus, sockets;
+ u32 *s_config;
+
+ /* Identify the number of siblings per core */
+ num_siblings = ((cpuid_ebx(0x8000001e) >> 8) & 0xff) + 1;
+
+ sockets = num_possible_nodes();
+
+ /*
+ * Energy counter register is accessed at core level.
+ * Hence, filterout the siblings.
+ */
+ cpus = num_present_cpus() / num_siblings;
+
+ s_config = devm_kcalloc(dev, cpus + sockets,
+ sizeof(u32), GFP_KERNEL);
+ if (!s_config)
+ return -ENOMEM;
+
+ accums = devm_kcalloc(dev, cpus + sockets,
+ sizeof(struct sensor_accumulator),
+ GFP_KERNEL);
+ if (!accums)
+ return -ENOMEM;
+
+ info->type = type;
+ info->config = s_config;
+
+ data->nr_cpus = cpus;
+ data->nr_socks = sockets;
+ data->accums = accums;
+
+ for (i = 0; i < cpus + sockets; i++) {
+ s_config[i] = config;
+ if (i < cpus)
+ scnprintf(accums[i].label, 10,
+ "Ecore%03u", i);
+ else
+ scnprintf(accums[i].label, 10,
+ "Esocket%u", (i - cpus));
+ }
+
+ return 0;
+}
+
+static int amd_energy_probe(struct platform_device *pdev)
+{
+ struct device *hwmon_dev;
+ struct amd_energy_data *data;
+ struct device *dev = &pdev->dev;
+
+ data = devm_kzalloc(dev,
+ sizeof(struct amd_energy_data), GFP_KERNEL);
+ if (!data)
+ return -ENOMEM;
+
+ data->chip.ops = &amd_energy_ops;
+ data->chip.info = data->info;
+
+ dev_set_drvdata(dev, data);
+ /* Populate per-core energy reporting */
+ data->info[0] = &data->energy_info;
+ amd_create_sensor(dev, data, hwmon_energy,
+ HWMON_E_INPUT | HWMON_E_LABEL);
+
+ mutex_init(&data->lock);
+ get_energy_units(data);
+
+ hwmon_dev = devm_hwmon_device_register_with_info(dev, DRVNAME,
+ data,
+ &data->chip,
+ NULL);
+ if (IS_ERR(hwmon_dev))
+ return PTR_ERR(hwmon_dev);
+
+ data->wrap_accumulate = kthread_run(energy_accumulator, data,
+ "%s", dev_name(hwmon_dev));
+ if (IS_ERR(data->wrap_accumulate))
+ return PTR_ERR(data->wrap_accumulate);
+
+ return PTR_ERR_OR_ZERO(data->wrap_accumulate);
+}
+
+static int amd_energy_remove(struct platform_device *pdev)
+{
+ struct amd_energy_data *data = dev_get_drvdata(&pdev->dev);
+
+ if (data && data->wrap_accumulate)
+ kthread_stop(data->wrap_accumulate);
+
+ return 0;
+}
+
+static const struct platform_device_id amd_energy_ids[] = {
+ { .name = DRVNAME, },
+ {}
+};
+MODULE_DEVICE_TABLE(platform, amd_energy_ids);
+
+static struct platform_driver amd_energy_driver = {
+ .probe = amd_energy_probe,
+ .remove = amd_energy_remove,
+ .id_table = amd_energy_ids,
+ .driver = {
+ .name = DRVNAME,
+ },
+};
+
+static struct platform_device *amd_energy_platdev;
+
+static const struct x86_cpu_id cpu_ids[] __initconst = {
+ X86_MATCH_VENDOR_FAM(AMD, 0x17, NULL),
+ {}
+};
+MODULE_DEVICE_TABLE(x86cpu, cpu_ids);
+
+static int __init amd_energy_init(void)
+{
+ int ret;
+
+ if (!x86_match_cpu(cpu_ids))
+ return -ENODEV;
+
+ ret = platform_driver_register(&amd_energy_driver);
+ if (ret)
+ return ret;
+
+ amd_energy_platdev = platform_device_alloc(DRVNAME, 0);
+ if (!amd_energy_platdev) {
+ platform_driver_unregister(&amd_energy_driver);
+ return -ENOMEM;
+ }
+
+ ret = platform_device_add(amd_energy_platdev);
+ if (ret) {
+ platform_device_put(amd_energy_platdev);
+ platform_driver_unregister(&amd_energy_driver);
+ return ret;
+ }
+
+ return ret;
+}
+
+static void __exit amd_energy_exit(void)
+{
+ platform_device_unregister(amd_energy_platdev);
+ platform_driver_unregister(&amd_energy_driver);
+}
+
+module_init(amd_energy_init);
+module_exit(amd_energy_exit);
+
+MODULE_DESCRIPTION("Driver for AMD Energy reporting from RAPL MSR via HWMON interface");
+MODULE_AUTHOR("Naveen Krishna Chatradhi <nchatrad@amd.com>");
+MODULE_LICENSE("GPL");
*/
static int wait_read(void)
{
+ unsigned long end = jiffies + (APPLESMC_MAX_WAIT * HZ) / USEC_PER_SEC;
u8 status;
int us;
+
for (us = APPLESMC_MIN_WAIT; us < APPLESMC_MAX_WAIT; us <<= 1) {
- udelay(us);
+ usleep_range(us, us * 16);
status = inb(APPLESMC_CMD_PORT);
/* read: wait for smc to settle */
if (status & 0x01)
return 0;
+ /* timeout: give up */
+ if (time_after(jiffies, end))
+ break;
}
pr_warn("wait_read() fail: 0x%02x\n", status);
{
u8 status;
int us;
+ unsigned long end = jiffies + (APPLESMC_MAX_WAIT * HZ) / USEC_PER_SEC;
outb(cmd, port);
for (us = APPLESMC_MIN_WAIT; us < APPLESMC_MAX_WAIT; us <<= 1) {
- udelay(us);
+ usleep_range(us, us * 16);
status = inb(APPLESMC_CMD_PORT);
/* write: wait for smc to settle */
if (status & 0x02)
if (status & 0x04)
return 0;
/* timeout: give up */
- if (us << 1 == APPLESMC_MAX_WAIT)
+ if (time_after(jiffies, end))
break;
/* busy: long wait and resend */
udelay(APPLESMC_RETRY_WAIT);
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Copyright (C) 2020 BAIKAL ELECTRONICS, JSC
+ *
+ * Authors:
+ * Maxim Kaurkin <maxim.kaurkin@baikalelectronics.ru>
+ * Serge Semin <Sergey.Semin@baikalelectronics.ru>
+ *
+ * Baikal-T1 Process, Voltage, Temperature sensor driver
+ */
+
+#include <linux/bitfield.h>
+#include <linux/bitops.h>
+#include <linux/clk.h>
+#include <linux/completion.h>
+#include <linux/device.h>
+#include <linux/hwmon-sysfs.h>
+#include <linux/hwmon.h>
+#include <linux/interrupt.h>
+#include <linux/io.h>
+#include <linux/kernel.h>
+#include <linux/ktime.h>
+#include <linux/limits.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/of.h>
+#include <linux/platform_device.h>
+#include <linux/seqlock.h>
+#include <linux/sysfs.h>
+#include <linux/types.h>
+
+#include "bt1-pvt.h"
+
+/*
+ * For the sake of the code simplification we created the sensors info table
+ * with the sensor names, activation modes, threshold registers base address
+ * and the thresholds bit fields.
+ */
+static const struct pvt_sensor_info pvt_info[] = {
+ PVT_SENSOR_INFO(0, "CPU Core Temperature", hwmon_temp, TEMP, TTHRES),
+ PVT_SENSOR_INFO(0, "CPU Core Voltage", hwmon_in, VOLT, VTHRES),
+ PVT_SENSOR_INFO(1, "CPU Core Low-Vt", hwmon_in, LVT, LTHRES),
+ PVT_SENSOR_INFO(2, "CPU Core High-Vt", hwmon_in, HVT, HTHRES),
+ PVT_SENSOR_INFO(3, "CPU Core Standard-Vt", hwmon_in, SVT, STHRES),
+};
+
+/*
+ * The original translation formulae of the temperature (in degrees of Celsius)
+ * to PVT data and vice-versa are following:
+ * N = 1.8322e-8*(T^4) + 2.343e-5*(T^3) + 8.7018e-3*(T^2) + 3.9269*(T^1) +
+ * 1.7204e2,
+ * T = -1.6743e-11*(N^4) + 8.1542e-8*(N^3) + -1.8201e-4*(N^2) +
+ * 3.1020e-1*(N^1) - 4.838e1,
+ * where T = [-48.380, 147.438]C and N = [0, 1023].
+ * They must be accordingly altered to be suitable for the integer arithmetics.
+ * The technique is called 'factor redistribution', which just makes sure the
+ * multiplications and divisions are made so to have a result of the operations
+ * within the integer numbers limit. In addition we need to translate the
+ * formulae to accept millidegrees of Celsius. Here what they look like after
+ * the alterations:
+ * N = (18322e-20*(T^4) + 2343e-13*(T^3) + 87018e-9*(T^2) + 39269e-3*T +
+ * 17204e2) / 1e4,
+ * T = -16743e-12*(D^4) + 81542e-9*(D^3) - 182010e-6*(D^2) + 310200e-3*D -
+ * 48380,
+ * where T = [-48380, 147438] mC and N = [0, 1023].
+ */
+static const struct pvt_poly poly_temp_to_N = {
+ .total_divider = 10000,
+ .terms = {
+ {4, 18322, 10000, 10000},
+ {3, 2343, 10000, 10},
+ {2, 87018, 10000, 10},
+ {1, 39269, 1000, 1},
+ {0, 1720400, 1, 1}
+ }
+};
+
+static const struct pvt_poly poly_N_to_temp = {
+ .total_divider = 1,
+ .terms = {
+ {4, -16743, 1000, 1},
+ {3, 81542, 1000, 1},
+ {2, -182010, 1000, 1},
+ {1, 310200, 1000, 1},
+ {0, -48380, 1, 1}
+ }
+};
+
+/*
+ * Similar alterations are performed for the voltage conversion equations.
+ * The original formulae are:
+ * N = 1.8658e3*V - 1.1572e3,
+ * V = (N + 1.1572e3) / 1.8658e3,
+ * where V = [0.620, 1.168] V and N = [0, 1023].
+ * After the optimization they looks as follows:
+ * N = (18658e-3*V - 11572) / 10,
+ * V = N * 10^5 / 18658 + 11572 * 10^4 / 18658.
+ */
+static const struct pvt_poly poly_volt_to_N = {
+ .total_divider = 10,
+ .terms = {
+ {1, 18658, 1000, 1},
+ {0, -11572, 1, 1}
+ }
+};
+
+static const struct pvt_poly poly_N_to_volt = {
+ .total_divider = 10,
+ .terms = {
+ {1, 100000, 18658, 1},
+ {0, 115720000, 1, 18658}
+ }
+};
+
+/*
+ * Here is the polynomial calculation function, which performs the
+ * redistributed terms calculations. It's pretty straightforward. We walk
+ * over each degree term up to the free one, and perform the redistributed
+ * multiplication of the term coefficient, its divider (as for the rationale
+ * fraction representation), data power and the rational fraction divider
+ * leftover. Then all of this is collected in a total sum variable, which
+ * value is normalized by the total divider before being returned.
+ */
+static long pvt_calc_poly(const struct pvt_poly *poly, long data)
+{
+ const struct pvt_poly_term *term = poly->terms;
+ long tmp, ret = 0;
+ int deg;
+
+ do {
+ tmp = term->coef;
+ for (deg = 0; deg < term->deg; ++deg)
+ tmp = mult_frac(tmp, data, term->divider);
+ ret += tmp / term->divider_leftover;
+ } while ((term++)->deg);
+
+ return ret / poly->total_divider;
+}
+
+static inline u32 pvt_update(void __iomem *reg, u32 mask, u32 data)
+{
+ u32 old;
+
+ old = readl_relaxed(reg);
+ writel((old & ~mask) | (data & mask), reg);
+
+ return old & mask;
+}
+
+/*
+ * Baikal-T1 PVT mode can be updated only when the controller is disabled.
+ * So first we disable it, then set the new mode together with the controller
+ * getting back enabled. The same concerns the temperature trim and
+ * measurements timeout. If it is necessary the interface mutex is supposed
+ * to be locked at the time the operations are performed.
+ */
+static inline void pvt_set_mode(struct pvt_hwmon *pvt, u32 mode)
+{
+ u32 old;
+
+ mode = FIELD_PREP(PVT_CTRL_MODE_MASK, mode);
+
+ old = pvt_update(pvt->regs + PVT_CTRL, PVT_CTRL_EN, 0);
+ pvt_update(pvt->regs + PVT_CTRL, PVT_CTRL_MODE_MASK | PVT_CTRL_EN,
+ mode | old);
+}
+
+static inline u32 pvt_calc_trim(long temp)
+{
+ temp = clamp_val(temp, 0, PVT_TRIM_TEMP);
+
+ return DIV_ROUND_UP(temp, PVT_TRIM_STEP);
+}
+
+static inline void pvt_set_trim(struct pvt_hwmon *pvt, u32 trim)
+{
+ u32 old;
+
+ trim = FIELD_PREP(PVT_CTRL_TRIM_MASK, trim);
+
+ old = pvt_update(pvt->regs + PVT_CTRL, PVT_CTRL_EN, 0);
+ pvt_update(pvt->regs + PVT_CTRL, PVT_CTRL_TRIM_MASK | PVT_CTRL_EN,
+ trim | old);
+}
+
+static inline void pvt_set_tout(struct pvt_hwmon *pvt, u32 tout)
+{
+ u32 old;
+
+ old = pvt_update(pvt->regs + PVT_CTRL, PVT_CTRL_EN, 0);
+ writel(tout, pvt->regs + PVT_TTIMEOUT);
+ pvt_update(pvt->regs + PVT_CTRL, PVT_CTRL_EN, old);
+}
+
+/*
+ * This driver can optionally provide the hwmon alarms for each sensor the PVT
+ * controller supports. The alarms functionality is made compile-time
+ * configurable due to the hardware interface implementation peculiarity
+ * described further in this comment. So in case if alarms are unnecessary in
+ * your system design it's recommended to have them disabled to prevent the PVT
+ * IRQs being periodically raised to get the data cache/alarms status up to
+ * date.
+ *
+ * Baikal-T1 PVT embedded controller is based on the Analog Bits PVT sensor,
+ * but is equipped with a dedicated control wrapper. It exposes the PVT
+ * sub-block registers space via the APB3 bus. In addition the wrapper provides
+ * a common interrupt vector of the sensors conversion completion events and
+ * threshold value alarms. Alas the wrapper interface hasn't been fully thought
+ * through. There is only one sensor can be activated at a time, for which the
+ * thresholds comparator is enabled right after the data conversion is
+ * completed. Due to this if alarms need to be implemented for all available
+ * sensors we can't just set the thresholds and enable the interrupts. We need
+ * to enable the sensors one after another and let the controller to detect
+ * the alarms by itself at each conversion. This also makes pointless to handle
+ * the alarms interrupts, since in occasion they happen synchronously with
+ * data conversion completion. The best driver design would be to have the
+ * completion interrupts enabled only and keep the converted value in the
+ * driver data cache. This solution is implemented if hwmon alarms are enabled
+ * in this driver. In case if the alarms are disabled, the conversion is
+ * performed on demand at the time a sensors input file is read.
+ */
+
+#if defined(CONFIG_SENSORS_BT1_PVT_ALARMS)
+
+#define pvt_hard_isr NULL
+
+static irqreturn_t pvt_soft_isr(int irq, void *data)
+{
+ const struct pvt_sensor_info *info;
+ struct pvt_hwmon *pvt = data;
+ struct pvt_cache *cache;
+ u32 val, thres_sts, old;
+
+ /*
+ * DVALID bit will be cleared by reading the data. We need to save the
+ * status before the next conversion happens. Threshold events will be
+ * handled a bit later.
+ */
+ thres_sts = readl(pvt->regs + PVT_RAW_INTR_STAT);
+
+ /*
+ * Then lets recharge the PVT interface with the next sampling mode.
+ * Lock the interface mutex to serialize trim, timeouts and alarm
+ * thresholds settings.
+ */
+ cache = &pvt->cache[pvt->sensor];
+ info = &pvt_info[pvt->sensor];
+ pvt->sensor = (pvt->sensor == PVT_SENSOR_LAST) ?
+ PVT_SENSOR_FIRST : (pvt->sensor + 1);
+
+ /*
+ * For some reason we have to mask the interrupt before changing the
+ * mode, otherwise sometimes the temperature mode doesn't get
+ * activated even though the actual mode in the ctrl register
+ * corresponds to one. Then we read the data. By doing so we also
+ * recharge the data conversion. After this the mode corresponding
+ * to the next sensor in the row is set. Finally we enable the
+ * interrupts back.
+ */
+ mutex_lock(&pvt->iface_mtx);
+
+ old = pvt_update(pvt->regs + PVT_INTR_MASK, PVT_INTR_DVALID,
+ PVT_INTR_DVALID);
+
+ val = readl(pvt->regs + PVT_DATA);
+
+ pvt_set_mode(pvt, pvt_info[pvt->sensor].mode);
+
+ pvt_update(pvt->regs + PVT_INTR_MASK, PVT_INTR_DVALID, old);
+
+ mutex_unlock(&pvt->iface_mtx);
+
+ /*
+ * We can now update the data cache with data just retrieved from the
+ * sensor. Lock write-seqlock to make sure the reader has a coherent
+ * data.
+ */
+ write_seqlock(&cache->data_seqlock);
+
+ cache->data = FIELD_GET(PVT_DATA_DATA_MASK, val);
+
+ write_sequnlock(&cache->data_seqlock);
+
+ /*
+ * While PVT core is doing the next mode data conversion, we'll check
+ * whether the alarms were triggered for the current sensor. Note that
+ * according to the documentation only one threshold IRQ status can be
+ * set at a time, that's why if-else statement is utilized.
+ */
+ if ((thres_sts & info->thres_sts_lo) ^ cache->thres_sts_lo) {
+ WRITE_ONCE(cache->thres_sts_lo, thres_sts & info->thres_sts_lo);
+ hwmon_notify_event(pvt->hwmon, info->type, info->attr_min_alarm,
+ info->channel);
+ } else if ((thres_sts & info->thres_sts_hi) ^ cache->thres_sts_hi) {
+ WRITE_ONCE(cache->thres_sts_hi, thres_sts & info->thres_sts_hi);
+ hwmon_notify_event(pvt->hwmon, info->type, info->attr_max_alarm,
+ info->channel);
+ }
+
+ return IRQ_HANDLED;
+}
+
+inline umode_t pvt_limit_is_visible(enum pvt_sensor_type type)
+{
+ return 0644;
+}
+
+inline umode_t pvt_alarm_is_visible(enum pvt_sensor_type type)
+{
+ return 0444;
+}
+
+static int pvt_read_data(struct pvt_hwmon *pvt, enum pvt_sensor_type type,
+ long *val)
+{
+ struct pvt_cache *cache = &pvt->cache[type];
+ unsigned int seq;
+ u32 data;
+
+ do {
+ seq = read_seqbegin(&cache->data_seqlock);
+ data = cache->data;
+ } while (read_seqretry(&cache->data_seqlock, seq));
+
+ if (type == PVT_TEMP)
+ *val = pvt_calc_poly(&poly_N_to_temp, data);
+ else
+ *val = pvt_calc_poly(&poly_N_to_volt, data);
+
+ return 0;
+}
+
+static int pvt_read_limit(struct pvt_hwmon *pvt, enum pvt_sensor_type type,
+ bool is_low, long *val)
+{
+ u32 data;
+
+ /* No need in serialization, since it is just read from MMIO. */
+ data = readl(pvt->regs + pvt_info[type].thres_base);
+
+ if (is_low)
+ data = FIELD_GET(PVT_THRES_LO_MASK, data);
+ else
+ data = FIELD_GET(PVT_THRES_HI_MASK, data);
+
+ if (type == PVT_TEMP)
+ *val = pvt_calc_poly(&poly_N_to_temp, data);
+ else
+ *val = pvt_calc_poly(&poly_N_to_volt, data);
+
+ return 0;
+}
+
+static int pvt_write_limit(struct pvt_hwmon *pvt, enum pvt_sensor_type type,
+ bool is_low, long val)
+{
+ u32 data, limit, mask;
+ int ret;
+
+ if (type == PVT_TEMP) {
+ val = clamp(val, PVT_TEMP_MIN, PVT_TEMP_MAX);
+ data = pvt_calc_poly(&poly_temp_to_N, val);
+ } else {
+ val = clamp(val, PVT_VOLT_MIN, PVT_VOLT_MAX);
+ data = pvt_calc_poly(&poly_volt_to_N, val);
+ }
+
+ /* Serialize limit update, since a part of the register is changed. */
+ ret = mutex_lock_interruptible(&pvt->iface_mtx);
+ if (ret)
+ return ret;
+
+ /* Make sure the upper and lower ranges don't intersect. */
+ limit = readl(pvt->regs + pvt_info[type].thres_base);
+ if (is_low) {
+ limit = FIELD_GET(PVT_THRES_HI_MASK, limit);
+ data = clamp_val(data, PVT_DATA_MIN, limit);
+ data = FIELD_PREP(PVT_THRES_LO_MASK, data);
+ mask = PVT_THRES_LO_MASK;
+ } else {
+ limit = FIELD_GET(PVT_THRES_LO_MASK, limit);
+ data = clamp_val(data, limit, PVT_DATA_MAX);
+ data = FIELD_PREP(PVT_THRES_HI_MASK, data);
+ mask = PVT_THRES_HI_MASK;
+ }
+
+ pvt_update(pvt->regs + pvt_info[type].thres_base, mask, data);
+
+ mutex_unlock(&pvt->iface_mtx);
+
+ return 0;
+}
+
+static int pvt_read_alarm(struct pvt_hwmon *pvt, enum pvt_sensor_type type,
+ bool is_low, long *val)
+{
+ if (is_low)
+ *val = !!READ_ONCE(pvt->cache[type].thres_sts_lo);
+ else
+ *val = !!READ_ONCE(pvt->cache[type].thres_sts_hi);
+
+ return 0;
+}
+
+static const struct hwmon_channel_info *pvt_channel_info[] = {
+ HWMON_CHANNEL_INFO(chip,
+ HWMON_C_REGISTER_TZ | HWMON_C_UPDATE_INTERVAL),
+ HWMON_CHANNEL_INFO(temp,
+ HWMON_T_INPUT | HWMON_T_TYPE | HWMON_T_LABEL |
+ HWMON_T_MIN | HWMON_T_MIN_ALARM |
+ HWMON_T_MAX | HWMON_T_MAX_ALARM |
+ HWMON_T_OFFSET),
+ HWMON_CHANNEL_INFO(in,
+ HWMON_I_INPUT | HWMON_I_LABEL |
+ HWMON_I_MIN | HWMON_I_MIN_ALARM |
+ HWMON_I_MAX | HWMON_I_MAX_ALARM,
+ HWMON_I_INPUT | HWMON_I_LABEL |
+ HWMON_I_MIN | HWMON_I_MIN_ALARM |
+ HWMON_I_MAX | HWMON_I_MAX_ALARM,
+ HWMON_I_INPUT | HWMON_I_LABEL |
+ HWMON_I_MIN | HWMON_I_MIN_ALARM |
+ HWMON_I_MAX | HWMON_I_MAX_ALARM,
+ HWMON_I_INPUT | HWMON_I_LABEL |
+ HWMON_I_MIN | HWMON_I_MIN_ALARM |
+ HWMON_I_MAX | HWMON_I_MAX_ALARM),
+ NULL
+};
+
+#else /* !CONFIG_SENSORS_BT1_PVT_ALARMS */
+
+static irqreturn_t pvt_hard_isr(int irq, void *data)
+{
+ struct pvt_hwmon *pvt = data;
+ struct pvt_cache *cache;
+ u32 val;
+
+ /*
+ * Mask the DVALID interrupt so after exiting from the handler a
+ * repeated conversion wouldn't happen.
+ */
+ pvt_update(pvt->regs + PVT_INTR_MASK, PVT_INTR_DVALID,
+ PVT_INTR_DVALID);
+
+ /*
+ * Nothing special for alarm-less driver. Just read the data, update
+ * the cache and notify a waiter of this event.
+ */
+ val = readl(pvt->regs + PVT_DATA);
+ if (!(val & PVT_DATA_VALID)) {
+ dev_err(pvt->dev, "Got IRQ when data isn't valid\n");
+ return IRQ_HANDLED;
+ }
+
+ cache = &pvt->cache[pvt->sensor];
+
+ WRITE_ONCE(cache->data, FIELD_GET(PVT_DATA_DATA_MASK, val));
+
+ complete(&cache->conversion);
+
+ return IRQ_HANDLED;
+}
+
+#define pvt_soft_isr NULL
+
+inline umode_t pvt_limit_is_visible(enum pvt_sensor_type type)
+{
+ return 0;
+}
+
+inline umode_t pvt_alarm_is_visible(enum pvt_sensor_type type)
+{
+ return 0;
+}
+
+static int pvt_read_data(struct pvt_hwmon *pvt, enum pvt_sensor_type type,
+ long *val)
+{
+ struct pvt_cache *cache = &pvt->cache[type];
+ u32 data;
+ int ret;
+
+ /*
+ * Lock PVT conversion interface until data cache is updated. The
+ * data read procedure is following: set the requested PVT sensor
+ * mode, enable IRQ and conversion, wait until conversion is finished,
+ * then disable conversion and IRQ, and read the cached data.
+ */
+ ret = mutex_lock_interruptible(&pvt->iface_mtx);
+ if (ret)
+ return ret;
+
+ pvt->sensor = type;
+ pvt_set_mode(pvt, pvt_info[type].mode);
+
+ /*
+ * Unmask the DVALID interrupt and enable the sensors conversions.
+ * Do the reverse procedure when conversion is done.
+ */
+ pvt_update(pvt->regs + PVT_INTR_MASK, PVT_INTR_DVALID, 0);
+ pvt_update(pvt->regs + PVT_CTRL, PVT_CTRL_EN, PVT_CTRL_EN);
+
+ wait_for_completion(&cache->conversion);
+
+ pvt_update(pvt->regs + PVT_CTRL, PVT_CTRL_EN, 0);
+ pvt_update(pvt->regs + PVT_INTR_MASK, PVT_INTR_DVALID,
+ PVT_INTR_DVALID);
+
+ data = READ_ONCE(cache->data);
+
+ mutex_unlock(&pvt->iface_mtx);
+
+ if (type == PVT_TEMP)
+ *val = pvt_calc_poly(&poly_N_to_temp, data);
+ else
+ *val = pvt_calc_poly(&poly_N_to_volt, data);
+
+ return 0;
+}
+
+static int pvt_read_limit(struct pvt_hwmon *pvt, enum pvt_sensor_type type,
+ bool is_low, long *val)
+{
+ return -EOPNOTSUPP;
+}
+
+static int pvt_write_limit(struct pvt_hwmon *pvt, enum pvt_sensor_type type,
+ bool is_low, long val)
+{
+ return -EOPNOTSUPP;
+}
+
+static int pvt_read_alarm(struct pvt_hwmon *pvt, enum pvt_sensor_type type,
+ bool is_low, long *val)
+{
+ return -EOPNOTSUPP;
+}
+
+static const struct hwmon_channel_info *pvt_channel_info[] = {
+ HWMON_CHANNEL_INFO(chip,
+ HWMON_C_REGISTER_TZ | HWMON_C_UPDATE_INTERVAL),
+ HWMON_CHANNEL_INFO(temp,
+ HWMON_T_INPUT | HWMON_T_TYPE | HWMON_T_LABEL |
+ HWMON_T_OFFSET),
+ HWMON_CHANNEL_INFO(in,
+ HWMON_I_INPUT | HWMON_I_LABEL,
+ HWMON_I_INPUT | HWMON_I_LABEL,
+ HWMON_I_INPUT | HWMON_I_LABEL,
+ HWMON_I_INPUT | HWMON_I_LABEL),
+ NULL
+};
+
+#endif /* !CONFIG_SENSORS_BT1_PVT_ALARMS */
+
+static inline bool pvt_hwmon_channel_is_valid(enum hwmon_sensor_types type,
+ int ch)
+{
+ switch (type) {
+ case hwmon_temp:
+ if (ch < 0 || ch >= PVT_TEMP_CHS)
+ return false;
+ break;
+ case hwmon_in:
+ if (ch < 0 || ch >= PVT_VOLT_CHS)
+ return false;
+ break;
+ default:
+ break;
+ }
+
+ /* The rest of the types are independent from the channel number. */
+ return true;
+}
+
+static umode_t pvt_hwmon_is_visible(const void *data,
+ enum hwmon_sensor_types type,
+ u32 attr, int ch)
+{
+ if (!pvt_hwmon_channel_is_valid(type, ch))
+ return 0;
+
+ switch (type) {
+ case hwmon_chip:
+ switch (attr) {
+ case hwmon_chip_update_interval:
+ return 0644;
+ }
+ break;
+ case hwmon_temp:
+ switch (attr) {
+ case hwmon_temp_input:
+ case hwmon_temp_type:
+ case hwmon_temp_label:
+ return 0444;
+ case hwmon_temp_min:
+ case hwmon_temp_max:
+ return pvt_limit_is_visible(ch);
+ case hwmon_temp_min_alarm:
+ case hwmon_temp_max_alarm:
+ return pvt_alarm_is_visible(ch);
+ case hwmon_temp_offset:
+ return 0644;
+ }
+ break;
+ case hwmon_in:
+ switch (attr) {
+ case hwmon_in_input:
+ case hwmon_in_label:
+ return 0444;
+ case hwmon_in_min:
+ case hwmon_in_max:
+ return pvt_limit_is_visible(PVT_VOLT + ch);
+ case hwmon_in_min_alarm:
+ case hwmon_in_max_alarm:
+ return pvt_alarm_is_visible(PVT_VOLT + ch);
+ }
+ break;
+ default:
+ break;
+ }
+
+ return 0;
+}
+
+static int pvt_read_trim(struct pvt_hwmon *pvt, long *val)
+{
+ u32 data;
+
+ data = readl(pvt->regs + PVT_CTRL);
+ *val = FIELD_GET(PVT_CTRL_TRIM_MASK, data) * PVT_TRIM_STEP;
+
+ return 0;
+}
+
+static int pvt_write_trim(struct pvt_hwmon *pvt, long val)
+{
+ u32 trim;
+ int ret;
+
+ /*
+ * Serialize trim update, since a part of the register is changed and
+ * the controller is supposed to be disabled during this operation.
+ */
+ ret = mutex_lock_interruptible(&pvt->iface_mtx);
+ if (ret)
+ return ret;
+
+ trim = pvt_calc_trim(val);
+ pvt_set_trim(pvt, trim);
+
+ mutex_unlock(&pvt->iface_mtx);
+
+ return 0;
+}
+
+static int pvt_read_timeout(struct pvt_hwmon *pvt, long *val)
+{
+ unsigned long rate;
+ ktime_t kt;
+ u32 data;
+
+ rate = clk_get_rate(pvt->clks[PVT_CLOCK_REF].clk);
+ if (!rate)
+ return -ENODEV;
+
+ /*
+ * Don't bother with mutex here, since we just read data from MMIO.
+ * We also have to scale the ticks timeout up to compensate the
+ * ms-ns-data translations.
+ */
+ data = readl(pvt->regs + PVT_TTIMEOUT) + 1;
+
+ /*
+ * Calculate ref-clock based delay (Ttotal) between two consecutive
+ * data samples of the same sensor. So we first must calculate the
+ * delay introduced by the internal ref-clock timer (Tref * Fclk).
+ * Then add the constant timeout cuased by each conversion latency
+ * (Tmin). The basic formulae for each conversion is following:
+ * Ttotal = Tref * Fclk + Tmin
+ * Note if alarms are enabled the sensors are polled one after
+ * another, so in order to have the delay being applicable for each
+ * sensor the requested value must be equally redistirbuted.
+ */
+#if defined(CONFIG_SENSORS_BT1_PVT_ALARMS)
+ kt = ktime_set(PVT_SENSORS_NUM * (u64)data, 0);
+ kt = ktime_divns(kt, rate);
+ kt = ktime_add_ns(kt, PVT_SENSORS_NUM * PVT_TOUT_MIN);
+#else
+ kt = ktime_set(data, 0);
+ kt = ktime_divns(kt, rate);
+ kt = ktime_add_ns(kt, PVT_TOUT_MIN);
+#endif
+
+ /* Return the result in msec as hwmon sysfs interface requires. */
+ *val = ktime_to_ms(kt);
+
+ return 0;
+}
+
+static int pvt_write_timeout(struct pvt_hwmon *pvt, long val)
+{
+ unsigned long rate;
+ ktime_t kt;
+ u32 data;
+ int ret;
+
+ rate = clk_get_rate(pvt->clks[PVT_CLOCK_REF].clk);
+ if (!rate)
+ return -ENODEV;
+
+ /*
+ * If alarms are enabled, the requested timeout must be divided
+ * between all available sensors to have the requested delay
+ * applicable to each individual sensor.
+ */
+ kt = ms_to_ktime(val);
+#if defined(CONFIG_SENSORS_BT1_PVT_ALARMS)
+ kt = ktime_divns(kt, PVT_SENSORS_NUM);
+#endif
+
+ /*
+ * Subtract a constant lag, which always persists due to the limited
+ * PVT sampling rate. Make sure the timeout is not negative.
+ */
+ kt = ktime_sub_ns(kt, PVT_TOUT_MIN);
+ if (ktime_to_ns(kt) < 0)
+ kt = ktime_set(0, 0);
+
+ /*
+ * Finally recalculate the timeout in terms of the reference clock
+ * period.
+ */
+ data = ktime_divns(kt * rate, NSEC_PER_SEC);
+
+ /*
+ * Update the measurements delay, but lock the interface first, since
+ * we have to disable PVT in order to have the new delay actually
+ * updated.
+ */
+ ret = mutex_lock_interruptible(&pvt->iface_mtx);
+ if (ret)
+ return ret;
+
+ pvt_set_tout(pvt, data);
+
+ mutex_unlock(&pvt->iface_mtx);
+
+ return 0;
+}
+
+static int pvt_hwmon_read(struct device *dev, enum hwmon_sensor_types type,
+ u32 attr, int ch, long *val)
+{
+ struct pvt_hwmon *pvt = dev_get_drvdata(dev);
+
+ if (!pvt_hwmon_channel_is_valid(type, ch))
+ return -EINVAL;
+
+ switch (type) {
+ case hwmon_chip:
+ switch (attr) {
+ case hwmon_chip_update_interval:
+ return pvt_read_timeout(pvt, val);
+ }
+ break;
+ case hwmon_temp:
+ switch (attr) {
+ case hwmon_temp_input:
+ return pvt_read_data(pvt, ch, val);
+ case hwmon_temp_type:
+ *val = 1;
+ return 0;
+ case hwmon_temp_min:
+ return pvt_read_limit(pvt, ch, true, val);
+ case hwmon_temp_max:
+ return pvt_read_limit(pvt, ch, false, val);
+ case hwmon_temp_min_alarm:
+ return pvt_read_alarm(pvt, ch, true, val);
+ case hwmon_temp_max_alarm:
+ return pvt_read_alarm(pvt, ch, false, val);
+ case hwmon_temp_offset:
+ return pvt_read_trim(pvt, val);
+ }
+ break;
+ case hwmon_in:
+ switch (attr) {
+ case hwmon_in_input:
+ return pvt_read_data(pvt, PVT_VOLT + ch, val);
+ case hwmon_in_min:
+ return pvt_read_limit(pvt, PVT_VOLT + ch, true, val);
+ case hwmon_in_max:
+ return pvt_read_limit(pvt, PVT_VOLT + ch, false, val);
+ case hwmon_in_min_alarm:
+ return pvt_read_alarm(pvt, PVT_VOLT + ch, true, val);
+ case hwmon_in_max_alarm:
+ return pvt_read_alarm(pvt, PVT_VOLT + ch, false, val);
+ }
+ break;
+ default:
+ break;
+ }
+
+ return -EOPNOTSUPP;
+}
+
+static int pvt_hwmon_read_string(struct device *dev,
+ enum hwmon_sensor_types type,
+ u32 attr, int ch, const char **str)
+{
+ if (!pvt_hwmon_channel_is_valid(type, ch))
+ return -EINVAL;
+
+ switch (type) {
+ case hwmon_temp:
+ switch (attr) {
+ case hwmon_temp_label:
+ *str = pvt_info[ch].label;
+ return 0;
+ }
+ break;
+ case hwmon_in:
+ switch (attr) {
+ case hwmon_in_label:
+ *str = pvt_info[PVT_VOLT + ch].label;
+ return 0;
+ }
+ break;
+ default:
+ break;
+ }
+
+ return -EOPNOTSUPP;
+}
+
+static int pvt_hwmon_write(struct device *dev, enum hwmon_sensor_types type,
+ u32 attr, int ch, long val)
+{
+ struct pvt_hwmon *pvt = dev_get_drvdata(dev);
+
+ if (!pvt_hwmon_channel_is_valid(type, ch))
+ return -EINVAL;
+
+ switch (type) {
+ case hwmon_chip:
+ switch (attr) {
+ case hwmon_chip_update_interval:
+ return pvt_write_timeout(pvt, val);
+ }
+ break;
+ case hwmon_temp:
+ switch (attr) {
+ case hwmon_temp_min:
+ return pvt_write_limit(pvt, ch, true, val);
+ case hwmon_temp_max:
+ return pvt_write_limit(pvt, ch, false, val);
+ case hwmon_temp_offset:
+ return pvt_write_trim(pvt, val);
+ }
+ break;
+ case hwmon_in:
+ switch (attr) {
+ case hwmon_in_min:
+ return pvt_write_limit(pvt, PVT_VOLT + ch, true, val);
+ case hwmon_in_max:
+ return pvt_write_limit(pvt, PVT_VOLT + ch, false, val);
+ }
+ break;
+ default:
+ break;
+ }
+
+ return -EOPNOTSUPP;
+}
+
+static const struct hwmon_ops pvt_hwmon_ops = {
+ .is_visible = pvt_hwmon_is_visible,
+ .read = pvt_hwmon_read,
+ .read_string = pvt_hwmon_read_string,
+ .write = pvt_hwmon_write
+};
+
+static const struct hwmon_chip_info pvt_hwmon_info = {
+ .ops = &pvt_hwmon_ops,
+ .info = pvt_channel_info
+};
+
+static void pvt_clear_data(void *data)
+{
+ struct pvt_hwmon *pvt = data;
+#if !defined(CONFIG_SENSORS_BT1_PVT_ALARMS)
+ int idx;
+
+ for (idx = 0; idx < PVT_SENSORS_NUM; ++idx)
+ complete_all(&pvt->cache[idx].conversion);
+#endif
+
+ mutex_destroy(&pvt->iface_mtx);
+}
+
+static struct pvt_hwmon *pvt_create_data(struct platform_device *pdev)
+{
+ struct device *dev = &pdev->dev;
+ struct pvt_hwmon *pvt;
+ int ret, idx;
+
+ pvt = devm_kzalloc(dev, sizeof(*pvt), GFP_KERNEL);
+ if (!pvt)
+ return ERR_PTR(-ENOMEM);
+
+ ret = devm_add_action(dev, pvt_clear_data, pvt);
+ if (ret) {
+ dev_err(dev, "Can't add PVT data clear action\n");
+ return ERR_PTR(ret);
+ }
+
+ pvt->dev = dev;
+ pvt->sensor = PVT_SENSOR_FIRST;
+ mutex_init(&pvt->iface_mtx);
+
+#if defined(CONFIG_SENSORS_BT1_PVT_ALARMS)
+ for (idx = 0; idx < PVT_SENSORS_NUM; ++idx)
+ seqlock_init(&pvt->cache[idx].data_seqlock);
+#else
+ for (idx = 0; idx < PVT_SENSORS_NUM; ++idx)
+ init_completion(&pvt->cache[idx].conversion);
+#endif
+
+ return pvt;
+}
+
+static int pvt_request_regs(struct pvt_hwmon *pvt)
+{
+ struct platform_device *pdev = to_platform_device(pvt->dev);
+ struct resource *res;
+
+ res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+ if (!res) {
+ dev_err(pvt->dev, "Couldn't find PVT memresource\n");
+ return -EINVAL;
+ }
+
+ pvt->regs = devm_ioremap_resource(pvt->dev, res);
+ if (IS_ERR(pvt->regs)) {
+ dev_err(pvt->dev, "Couldn't map PVT registers\n");
+ return PTR_ERR(pvt->regs);
+ }
+
+ return 0;
+}
+
+static void pvt_disable_clks(void *data)
+{
+ struct pvt_hwmon *pvt = data;
+
+ clk_bulk_disable_unprepare(PVT_CLOCK_NUM, pvt->clks);
+}
+
+static int pvt_request_clks(struct pvt_hwmon *pvt)
+{
+ int ret;
+
+ pvt->clks[PVT_CLOCK_APB].id = "pclk";
+ pvt->clks[PVT_CLOCK_REF].id = "ref";
+
+ ret = devm_clk_bulk_get(pvt->dev, PVT_CLOCK_NUM, pvt->clks);
+ if (ret) {
+ dev_err(pvt->dev, "Couldn't get PVT clocks descriptors\n");
+ return ret;
+ }
+
+ ret = clk_bulk_prepare_enable(PVT_CLOCK_NUM, pvt->clks);
+ if (ret) {
+ dev_err(pvt->dev, "Couldn't enable the PVT clocks\n");
+ return ret;
+ }
+
+ ret = devm_add_action_or_reset(pvt->dev, pvt_disable_clks, pvt);
+ if (ret) {
+ dev_err(pvt->dev, "Can't add PVT clocks disable action\n");
+ return ret;
+ }
+
+ return 0;
+}
+
+static void pvt_init_iface(struct pvt_hwmon *pvt)
+{
+ u32 trim, temp;
+
+ /*
+ * Make sure all interrupts and controller are disabled so not to
+ * accidentally have ISR executed before the driver data is fully
+ * initialized. Clear the IRQ status as well.
+ */
+ pvt_update(pvt->regs + PVT_INTR_MASK, PVT_INTR_ALL, PVT_INTR_ALL);
+ pvt_update(pvt->regs + PVT_CTRL, PVT_CTRL_EN, 0);
+ readl(pvt->regs + PVT_CLR_INTR);
+ readl(pvt->regs + PVT_DATA);
+
+ /* Setup default sensor mode, timeout and temperature trim. */
+ pvt_set_mode(pvt, pvt_info[pvt->sensor].mode);
+ pvt_set_tout(pvt, PVT_TOUT_DEF);
+
+ trim = PVT_TRIM_DEF;
+ if (!of_property_read_u32(pvt->dev->of_node,
+ "baikal,pvt-temp-offset-millicelsius", &temp))
+ trim = pvt_calc_trim(temp);
+
+ pvt_set_trim(pvt, trim);
+}
+
+static int pvt_request_irq(struct pvt_hwmon *pvt)
+{
+ struct platform_device *pdev = to_platform_device(pvt->dev);
+ int ret;
+
+ pvt->irq = platform_get_irq(pdev, 0);
+ if (pvt->irq < 0)
+ return pvt->irq;
+
+ ret = devm_request_threaded_irq(pvt->dev, pvt->irq,
+ pvt_hard_isr, pvt_soft_isr,
+#if defined(CONFIG_SENSORS_BT1_PVT_ALARMS)
+ IRQF_SHARED | IRQF_TRIGGER_HIGH |
+ IRQF_ONESHOT,
+#else
+ IRQF_SHARED | IRQF_TRIGGER_HIGH,
+#endif
+ "pvt", pvt);
+ if (ret) {
+ dev_err(pvt->dev, "Couldn't request PVT IRQ\n");
+ return ret;
+ }
+
+ return 0;
+}
+
+static int pvt_create_hwmon(struct pvt_hwmon *pvt)
+{
+ pvt->hwmon = devm_hwmon_device_register_with_info(pvt->dev, "pvt", pvt,
+ &pvt_hwmon_info, NULL);
+ if (IS_ERR(pvt->hwmon)) {
+ dev_err(pvt->dev, "Couldn't create hwmon device\n");
+ return PTR_ERR(pvt->hwmon);
+ }
+
+ return 0;
+}
+
+#if defined(CONFIG_SENSORS_BT1_PVT_ALARMS)
+
+static void pvt_disable_iface(void *data)
+{
+ struct pvt_hwmon *pvt = data;
+
+ mutex_lock(&pvt->iface_mtx);
+ pvt_update(pvt->regs + PVT_CTRL, PVT_CTRL_EN, 0);
+ pvt_update(pvt->regs + PVT_INTR_MASK, PVT_INTR_DVALID,
+ PVT_INTR_DVALID);
+ mutex_unlock(&pvt->iface_mtx);
+}
+
+static int pvt_enable_iface(struct pvt_hwmon *pvt)
+{
+ int ret;
+
+ ret = devm_add_action(pvt->dev, pvt_disable_iface, pvt);
+ if (ret) {
+ dev_err(pvt->dev, "Can't add PVT disable interface action\n");
+ return ret;
+ }
+
+ /*
+ * Enable sensors data conversion and IRQ. We need to lock the
+ * interface mutex since hwmon has just been created and the
+ * corresponding sysfs files are accessible from user-space,
+ * which theoretically may cause races.
+ */
+ mutex_lock(&pvt->iface_mtx);
+ pvt_update(pvt->regs + PVT_INTR_MASK, PVT_INTR_DVALID, 0);
+ pvt_update(pvt->regs + PVT_CTRL, PVT_CTRL_EN, PVT_CTRL_EN);
+ mutex_unlock(&pvt->iface_mtx);
+
+ return 0;
+}
+
+#else /* !CONFIG_SENSORS_BT1_PVT_ALARMS */
+
+static int pvt_enable_iface(struct pvt_hwmon *pvt)
+{
+ return 0;
+}
+
+#endif /* !CONFIG_SENSORS_BT1_PVT_ALARMS */
+
+static int pvt_probe(struct platform_device *pdev)
+{
+ struct pvt_hwmon *pvt;
+ int ret;
+
+ pvt = pvt_create_data(pdev);
+ if (IS_ERR(pvt))
+ return PTR_ERR(pvt);
+
+ ret = pvt_request_regs(pvt);
+ if (ret)
+ return ret;
+
+ ret = pvt_request_clks(pvt);
+ if (ret)
+ return ret;
+
+ pvt_init_iface(pvt);
+
+ ret = pvt_request_irq(pvt);
+ if (ret)
+ return ret;
+
+ ret = pvt_create_hwmon(pvt);
+ if (ret)
+ return ret;
+
+ ret = pvt_enable_iface(pvt);
+ if (ret)
+ return ret;
+
+ return 0;
+}
+
+static const struct of_device_id pvt_of_match[] = {
+ { .compatible = "baikal,bt1-pvt" },
+ { }
+};
+MODULE_DEVICE_TABLE(of, pvt_of_match);
+
+static struct platform_driver pvt_driver = {
+ .probe = pvt_probe,
+ .driver = {
+ .name = "bt1-pvt",
+ .of_match_table = pvt_of_match
+ }
+};
+module_platform_driver(pvt_driver);
+
+MODULE_AUTHOR("Maxim Kaurkin <maxim.kaurkin@baikalelectronics.ru>");
+MODULE_DESCRIPTION("Baikal-T1 PVT driver");
+MODULE_LICENSE("GPL v2");
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2020 BAIKAL ELECTRONICS, JSC
+ *
+ * Baikal-T1 Process, Voltage, Temperature sensor driver
+ */
+#ifndef __HWMON_BT1_PVT_H__
+#define __HWMON_BT1_PVT_H__
+
+#include <linux/completion.h>
+#include <linux/hwmon.h>
+#include <linux/kernel.h>
+#include <linux/mutex.h>
+#include <linux/seqlock.h>
+
+/* Baikal-T1 PVT registers and their bitfields */
+#define PVT_CTRL 0x00
+#define PVT_CTRL_EN BIT(0)
+#define PVT_CTRL_MODE_FLD 1
+#define PVT_CTRL_MODE_MASK GENMASK(3, PVT_CTRL_MODE_FLD)
+#define PVT_CTRL_MODE_TEMP 0x0
+#define PVT_CTRL_MODE_VOLT 0x1
+#define PVT_CTRL_MODE_LVT 0x2
+#define PVT_CTRL_MODE_HVT 0x4
+#define PVT_CTRL_MODE_SVT 0x6
+#define PVT_CTRL_TRIM_FLD 4
+#define PVT_CTRL_TRIM_MASK GENMASK(8, PVT_CTRL_TRIM_FLD)
+#define PVT_DATA 0x04
+#define PVT_DATA_VALID BIT(10)
+#define PVT_DATA_DATA_FLD 0
+#define PVT_DATA_DATA_MASK GENMASK(9, PVT_DATA_DATA_FLD)
+#define PVT_TTHRES 0x08
+#define PVT_VTHRES 0x0C
+#define PVT_LTHRES 0x10
+#define PVT_HTHRES 0x14
+#define PVT_STHRES 0x18
+#define PVT_THRES_LO_FLD 0
+#define PVT_THRES_LO_MASK GENMASK(9, PVT_THRES_LO_FLD)
+#define PVT_THRES_HI_FLD 10
+#define PVT_THRES_HI_MASK GENMASK(19, PVT_THRES_HI_FLD)
+#define PVT_TTIMEOUT 0x1C
+#define PVT_INTR_STAT 0x20
+#define PVT_INTR_MASK 0x24
+#define PVT_RAW_INTR_STAT 0x28
+#define PVT_INTR_DVALID BIT(0)
+#define PVT_INTR_TTHRES_LO BIT(1)
+#define PVT_INTR_TTHRES_HI BIT(2)
+#define PVT_INTR_VTHRES_LO BIT(3)
+#define PVT_INTR_VTHRES_HI BIT(4)
+#define PVT_INTR_LTHRES_LO BIT(5)
+#define PVT_INTR_LTHRES_HI BIT(6)
+#define PVT_INTR_HTHRES_LO BIT(7)
+#define PVT_INTR_HTHRES_HI BIT(8)
+#define PVT_INTR_STHRES_LO BIT(9)
+#define PVT_INTR_STHRES_HI BIT(10)
+#define PVT_INTR_ALL GENMASK(10, 0)
+#define PVT_CLR_INTR 0x2C
+
+/*
+ * PVT sensors-related limits and default values
+ * @PVT_TEMP_MIN: Minimal temperature in millidegrees of Celsius.
+ * @PVT_TEMP_MAX: Maximal temperature in millidegrees of Celsius.
+ * @PVT_TEMP_CHS: Number of temperature hwmon channels.
+ * @PVT_VOLT_MIN: Minimal voltage in mV.
+ * @PVT_VOLT_MAX: Maximal voltage in mV.
+ * @PVT_VOLT_CHS: Number of voltage hwmon channels.
+ * @PVT_DATA_MIN: Minimal PVT raw data value.
+ * @PVT_DATA_MAX: Maximal PVT raw data value.
+ * @PVT_TRIM_MIN: Minimal temperature sensor trim value.
+ * @PVT_TRIM_MAX: Maximal temperature sensor trim value.
+ * @PVT_TRIM_DEF: Default temperature sensor trim value (set a proper value
+ * when one is determined for Baikal-T1 SoC).
+ * @PVT_TRIM_TEMP: Maximum temperature encoded by the trim factor.
+ * @PVT_TRIM_STEP: Temperature stride corresponding to the trim value.
+ * @PVT_TOUT_MIN: Minimal timeout between samples in nanoseconds.
+ * @PVT_TOUT_DEF: Default data measurements timeout. In case if alarms are
+ * activated the PVT IRQ is enabled to be raised after each
+ * conversion in order to have the thresholds checked and the
+ * converted value cached. Too frequent conversions may cause
+ * the system CPU overload. Lets set the 50ms delay between
+ * them by default to prevent this.
+ */
+#define PVT_TEMP_MIN -48380L
+#define PVT_TEMP_MAX 147438L
+#define PVT_TEMP_CHS 1
+#define PVT_VOLT_MIN 620L
+#define PVT_VOLT_MAX 1168L
+#define PVT_VOLT_CHS 4
+#define PVT_DATA_MIN 0
+#define PVT_DATA_MAX (PVT_DATA_DATA_MASK >> PVT_DATA_DATA_FLD)
+#define PVT_TRIM_MIN 0
+#define PVT_TRIM_MAX (PVT_CTRL_TRIM_MASK >> PVT_CTRL_TRIM_FLD)
+#define PVT_TRIM_TEMP 7130
+#define PVT_TRIM_STEP (PVT_TRIM_TEMP / PVT_TRIM_MAX)
+#define PVT_TRIM_DEF 0
+#define PVT_TOUT_MIN (NSEC_PER_SEC / 3000)
+#if defined(CONFIG_SENSORS_BT1_PVT_ALARMS)
+# define PVT_TOUT_DEF 60000
+#else
+# define PVT_TOUT_DEF 0
+#endif
+
+/*
+ * enum pvt_sensor_type - Baikal-T1 PVT sensor types (correspond to each PVT
+ * sampling mode)
+ * @PVT_SENSOR*: helpers to traverse the sensors in loops.
+ * @PVT_TEMP: PVT Temperature sensor.
+ * @PVT_VOLT: PVT Voltage sensor.
+ * @PVT_LVT: PVT Low-Voltage threshold sensor.
+ * @PVT_HVT: PVT High-Voltage threshold sensor.
+ * @PVT_SVT: PVT Standard-Voltage threshold sensor.
+ */
+enum pvt_sensor_type {
+ PVT_SENSOR_FIRST,
+ PVT_TEMP = PVT_SENSOR_FIRST,
+ PVT_VOLT,
+ PVT_LVT,
+ PVT_HVT,
+ PVT_SVT,
+ PVT_SENSOR_LAST = PVT_SVT,
+ PVT_SENSORS_NUM
+};
+
+/*
+ * enum pvt_clock_type - Baikal-T1 PVT clocks.
+ * @PVT_CLOCK_APB: APB clock.
+ * @PVT_CLOCK_REF: PVT reference clock.
+ */
+enum pvt_clock_type {
+ PVT_CLOCK_APB,
+ PVT_CLOCK_REF,
+ PVT_CLOCK_NUM
+};
+
+/*
+ * struct pvt_sensor_info - Baikal-T1 PVT sensor informational structure
+ * @channel: Sensor channel ID.
+ * @label: hwmon sensor label.
+ * @mode: PVT mode corresponding to the channel.
+ * @thres_base: upper and lower threshold values of the sensor.
+ * @thres_sts_lo: low threshold status bitfield.
+ * @thres_sts_hi: high threshold status bitfield.
+ * @type: Sensor type.
+ * @attr_min_alarm: Min alarm attribute ID.
+ * @attr_min_alarm: Max alarm attribute ID.
+ */
+struct pvt_sensor_info {
+ int channel;
+ const char *label;
+ u32 mode;
+ unsigned long thres_base;
+ u32 thres_sts_lo;
+ u32 thres_sts_hi;
+ enum hwmon_sensor_types type;
+ u32 attr_min_alarm;
+ u32 attr_max_alarm;
+};
+
+#define PVT_SENSOR_INFO(_ch, _label, _type, _mode, _thres) \
+ { \
+ .channel = _ch, \
+ .label = _label, \
+ .mode = PVT_CTRL_MODE_ ##_mode, \
+ .thres_base = PVT_ ##_thres, \
+ .thres_sts_lo = PVT_INTR_ ##_thres## _LO, \
+ .thres_sts_hi = PVT_INTR_ ##_thres## _HI, \
+ .type = _type, \
+ .attr_min_alarm = _type## _min, \
+ .attr_max_alarm = _type## _max, \
+ }
+
+/*
+ * struct pvt_cache - PVT sensors data cache
+ * @data: data cache in raw format.
+ * @thres_sts_lo: low threshold status saved on the previous data conversion.
+ * @thres_sts_hi: high threshold status saved on the previous data conversion.
+ * @data_seqlock: cached data seq-lock.
+ * @conversion: data conversion completion.
+ */
+struct pvt_cache {
+ u32 data;
+#if defined(CONFIG_SENSORS_BT1_PVT_ALARMS)
+ seqlock_t data_seqlock;
+ u32 thres_sts_lo;
+ u32 thres_sts_hi;
+#else
+ struct completion conversion;
+#endif
+};
+
+/*
+ * struct pvt_hwmon - Baikal-T1 PVT private data
+ * @dev: device structure of the PVT platform device.
+ * @hwmon: hwmon device structure.
+ * @regs: pointer to the Baikal-T1 PVT registers region.
+ * @irq: PVT events IRQ number.
+ * @clks: Array of the PVT clocks descriptor (APB/ref clocks).
+ * @ref_clk: Pointer to the reference clocks descriptor.
+ * @iface_mtx: Generic interface mutex (used to lock the alarm registers
+ * when the alarms enabled, or the data conversion interface
+ * if alarms are disabled).
+ * @sensor: current PVT sensor the data conversion is being performed for.
+ * @cache: data cache descriptor.
+ */
+struct pvt_hwmon {
+ struct device *dev;
+ struct device *hwmon;
+
+ void __iomem *regs;
+ int irq;
+
+ struct clk_bulk_data clks[PVT_CLOCK_NUM];
+
+ struct mutex iface_mtx;
+ enum pvt_sensor_type sensor;
+ struct pvt_cache cache[PVT_SENSORS_NUM];
+};
+
+/*
+ * struct pvt_poly_term - a term descriptor of the PVT data translation
+ * polynomial
+ * @deg: degree of the term.
+ * @coef: multiplication factor of the term.
+ * @divider: distributed divider per each degree.
+ * @divider_leftover: divider leftover, which couldn't be redistributed.
+ */
+struct pvt_poly_term {
+ unsigned int deg;
+ long coef;
+ long divider;
+ long divider_leftover;
+};
+
+/*
+ * struct pvt_poly - PVT data translation polynomial descriptor
+ * @total_divider: total data divider.
+ * @terms: polynomial terms up to a free one.
+ */
+struct pvt_poly {
+ long total_divider;
+ struct pvt_poly_term terms[];
+};
+
+#endif /* __HWMON_BT1_PVT_H__ */
int channel = to_sensor_dev_attr(devattr)->index;
int ret;
- mutex_lock(&hwmon->hwmon_lock);
+ mutex_lock(&hwmon->da9052->auxadc_lock);
ret = __da9052_read_tsi(dev, channel);
- mutex_unlock(&hwmon->hwmon_lock);
+ mutex_unlock(&hwmon->da9052->auxadc_lock);
if (ret < 0)
return ret;
DMI_MATCH(DMI_PRODUCT_NAME, "Vostro"),
},
},
- {
- .ident = "Dell XPS421",
- .matches = {
- DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
- DMI_MATCH(DMI_PRODUCT_NAME, "XPS L421X"),
- },
- },
{
.ident = "Dell Studio",
.matches = {
},
.driver_data = (void *)&i8k_config_data[DELL_STUDIO],
},
- {
- .ident = "Dell XPS 13",
- .matches = {
- DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
- DMI_MATCH(DMI_PRODUCT_NAME, "XPS13"),
- },
- .driver_data = (void *)&i8k_config_data[DELL_XPS],
- },
{
.ident = "Dell XPS M140",
.matches = {
.driver_data = (void *)&i8k_config_data[DELL_XPS],
},
{
- .ident = "Dell XPS 15 9560",
- .matches = {
- DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
- DMI_MATCH(DMI_PRODUCT_NAME, "XPS 15 9560"),
- },
- },
- {
- .ident = "Dell XPS 15 9570",
+ .ident = "Dell XPS",
.matches = {
DMI_MATCH(DMI_SYS_VENDOR, "Dell Inc."),
- DMI_MATCH(DMI_PRODUCT_NAME, "XPS 15 9570"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "XPS"),
},
},
{ }
st->have_temp_highest = temp_is_valid(buf[SCT_STATUS_TEMP_HIGHEST]);
if (!have_sct_data_table)
- goto skip_sct;
+ goto skip_sct_data;
/* Request and read temperature history table */
memset(buf, '\0', sizeof(st->smartdata));
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Driver for Gateworks System Controller Hardware Monitor module
+ *
+ * Copyright (C) 2020 Gateworks Corporation
+ */
+#include <linux/hwmon.h>
+#include <linux/hwmon-sysfs.h>
+#include <linux/mfd/gsc.h>
+#include <linux/module.h>
+#include <linux/of.h>
+#include <linux/platform_device.h>
+#include <linux/regmap.h>
+#include <linux/slab.h>
+
+#include <linux/platform_data/gsc_hwmon.h>
+
+#define GSC_HWMON_MAX_TEMP_CH 16
+#define GSC_HWMON_MAX_IN_CH 16
+
+#define GSC_HWMON_RESOLUTION 12
+#define GSC_HWMON_VREF 2500
+
+struct gsc_hwmon_data {
+ struct gsc_dev *gsc;
+ struct gsc_hwmon_platform_data *pdata;
+ struct regmap *regmap;
+ const struct gsc_hwmon_channel *temp_ch[GSC_HWMON_MAX_TEMP_CH];
+ const struct gsc_hwmon_channel *in_ch[GSC_HWMON_MAX_IN_CH];
+ u32 temp_config[GSC_HWMON_MAX_TEMP_CH + 1];
+ u32 in_config[GSC_HWMON_MAX_IN_CH + 1];
+ struct hwmon_channel_info temp_info;
+ struct hwmon_channel_info in_info;
+ const struct hwmon_channel_info *info[3];
+ struct hwmon_chip_info chip;
+};
+
+static struct regmap_bus gsc_hwmon_regmap_bus = {
+ .reg_read = gsc_read,
+ .reg_write = gsc_write,
+};
+
+static const struct regmap_config gsc_hwmon_regmap_config = {
+ .reg_bits = 8,
+ .val_bits = 8,
+ .cache_type = REGCACHE_NONE,
+};
+
+static ssize_t pwm_auto_point_temp_show(struct device *dev,
+ struct device_attribute *devattr,
+ char *buf)
+{
+ struct gsc_hwmon_data *hwmon = dev_get_drvdata(dev);
+ struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
+ u8 reg = hwmon->pdata->fan_base + (2 * attr->index);
+ u8 regs[2];
+ int ret;
+
+ ret = regmap_bulk_read(hwmon->regmap, reg, regs, 2);
+ if (ret)
+ return ret;
+
+ ret = regs[0] | regs[1] << 8;
+ return sprintf(buf, "%d\n", ret * 10);
+}
+
+static ssize_t pwm_auto_point_temp_store(struct device *dev,
+ struct device_attribute *devattr,
+ const char *buf, size_t count)
+{
+ struct gsc_hwmon_data *hwmon = dev_get_drvdata(dev);
+ struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
+ u8 reg = hwmon->pdata->fan_base + (2 * attr->index);
+ u8 regs[2];
+ long temp;
+ int err;
+
+ if (kstrtol(buf, 10, &temp))
+ return -EINVAL;
+
+ temp = clamp_val(temp, 0, 10000);
+ temp = DIV_ROUND_CLOSEST(temp, 10);
+
+ regs[0] = temp & 0xff;
+ regs[1] = (temp >> 8) & 0xff;
+ err = regmap_bulk_write(hwmon->regmap, reg, regs, 2);
+ if (err)
+ return err;
+
+ return count;
+}
+
+static ssize_t pwm_auto_point_pwm_show(struct device *dev,
+ struct device_attribute *devattr,
+ char *buf)
+{
+ struct sensor_device_attribute *attr = to_sensor_dev_attr(devattr);
+
+ return sprintf(buf, "%d\n", 255 * (50 + (attr->index * 10)) / 100);
+}
+
+static SENSOR_DEVICE_ATTR_RO(pwm1_auto_point1_pwm, pwm_auto_point_pwm, 0);
+static SENSOR_DEVICE_ATTR_RW(pwm1_auto_point1_temp, pwm_auto_point_temp, 0);
+
+static SENSOR_DEVICE_ATTR_RO(pwm1_auto_point2_pwm, pwm_auto_point_pwm, 1);
+static SENSOR_DEVICE_ATTR_RW(pwm1_auto_point2_temp, pwm_auto_point_temp, 1);
+
+static SENSOR_DEVICE_ATTR_RO(pwm1_auto_point3_pwm, pwm_auto_point_pwm, 2);
+static SENSOR_DEVICE_ATTR_RW(pwm1_auto_point3_temp, pwm_auto_point_temp, 2);
+
+static SENSOR_DEVICE_ATTR_RO(pwm1_auto_point4_pwm, pwm_auto_point_pwm, 3);
+static SENSOR_DEVICE_ATTR_RW(pwm1_auto_point4_temp, pwm_auto_point_temp, 3);
+
+static SENSOR_DEVICE_ATTR_RO(pwm1_auto_point5_pwm, pwm_auto_point_pwm, 4);
+static SENSOR_DEVICE_ATTR_RW(pwm1_auto_point5_temp, pwm_auto_point_temp, 4);
+
+static SENSOR_DEVICE_ATTR_RO(pwm1_auto_point6_pwm, pwm_auto_point_pwm, 5);
+static SENSOR_DEVICE_ATTR_RW(pwm1_auto_point6_temp, pwm_auto_point_temp, 5);
+
+static struct attribute *gsc_hwmon_attributes[] = {
+ &sensor_dev_attr_pwm1_auto_point1_pwm.dev_attr.attr,
+ &sensor_dev_attr_pwm1_auto_point1_temp.dev_attr.attr,
+ &sensor_dev_attr_pwm1_auto_point2_pwm.dev_attr.attr,
+ &sensor_dev_attr_pwm1_auto_point2_temp.dev_attr.attr,
+ &sensor_dev_attr_pwm1_auto_point3_pwm.dev_attr.attr,
+ &sensor_dev_attr_pwm1_auto_point3_temp.dev_attr.attr,
+ &sensor_dev_attr_pwm1_auto_point4_pwm.dev_attr.attr,
+ &sensor_dev_attr_pwm1_auto_point4_temp.dev_attr.attr,
+ &sensor_dev_attr_pwm1_auto_point5_pwm.dev_attr.attr,
+ &sensor_dev_attr_pwm1_auto_point5_temp.dev_attr.attr,
+ &sensor_dev_attr_pwm1_auto_point6_pwm.dev_attr.attr,
+ &sensor_dev_attr_pwm1_auto_point6_temp.dev_attr.attr,
+ NULL
+};
+
+static const struct attribute_group gsc_hwmon_group = {
+ .attrs = gsc_hwmon_attributes,
+};
+__ATTRIBUTE_GROUPS(gsc_hwmon);
+
+static int
+gsc_hwmon_read(struct device *dev, enum hwmon_sensor_types type, u32 attr,
+ int channel, long *val)
+{
+ struct gsc_hwmon_data *hwmon = dev_get_drvdata(dev);
+ const struct gsc_hwmon_channel *ch;
+ int sz, ret;
+ long tmp;
+ u8 buf[3];
+
+ switch (type) {
+ case hwmon_in:
+ ch = hwmon->in_ch[channel];
+ break;
+ case hwmon_temp:
+ ch = hwmon->temp_ch[channel];
+ break;
+ default:
+ return -EOPNOTSUPP;
+ }
+
+ sz = (ch->mode == mode_voltage) ? 3 : 2;
+ ret = regmap_bulk_read(hwmon->regmap, ch->reg, buf, sz);
+ if (ret)
+ return ret;
+
+ tmp = 0;
+ while (sz-- > 0)
+ tmp |= (buf[sz] << (8 * sz));
+
+ switch (ch->mode) {
+ case mode_temperature:
+ if (tmp > 0x8000)
+ tmp -= 0xffff;
+ break;
+ case mode_voltage_raw:
+ tmp = clamp_val(tmp, 0, BIT(GSC_HWMON_RESOLUTION));
+ /* scale based on ref voltage and ADC resolution */
+ tmp *= GSC_HWMON_VREF;
+ tmp >>= GSC_HWMON_RESOLUTION;
+ /* scale based on optional voltage divider */
+ if (ch->vdiv[0] && ch->vdiv[1]) {
+ tmp *= (ch->vdiv[0] + ch->vdiv[1]);
+ tmp /= ch->vdiv[1];
+ }
+ /* adjust by uV offset */
+ tmp += ch->mvoffset;
+ break;
+ case mode_voltage:
+ /* no adjustment needed */
+ break;
+ }
+
+ *val = tmp;
+
+ return 0;
+}
+
+static int
+gsc_hwmon_read_string(struct device *dev, enum hwmon_sensor_types type,
+ u32 attr, int channel, const char **buf)
+{
+ struct gsc_hwmon_data *hwmon = dev_get_drvdata(dev);
+
+ switch (type) {
+ case hwmon_in:
+ *buf = hwmon->in_ch[channel]->name;
+ break;
+ case hwmon_temp:
+ *buf = hwmon->temp_ch[channel]->name;
+ break;
+ default:
+ return -ENOTSUPP;
+ }
+
+ return 0;
+}
+
+static umode_t
+gsc_hwmon_is_visible(const void *_data, enum hwmon_sensor_types type, u32 attr,
+ int ch)
+{
+ return 0444;
+}
+
+static const struct hwmon_ops gsc_hwmon_ops = {
+ .is_visible = gsc_hwmon_is_visible,
+ .read = gsc_hwmon_read,
+ .read_string = gsc_hwmon_read_string,
+};
+
+static struct gsc_hwmon_platform_data *
+gsc_hwmon_get_devtree_pdata(struct device *dev)
+{
+ struct gsc_hwmon_platform_data *pdata;
+ struct gsc_hwmon_channel *ch;
+ struct fwnode_handle *child;
+ struct device_node *fan;
+ int nchannels;
+
+ nchannels = device_get_child_node_count(dev);
+ if (nchannels == 0)
+ return ERR_PTR(-ENODEV);
+
+ pdata = devm_kzalloc(dev,
+ sizeof(*pdata) + nchannels * sizeof(*ch),
+ GFP_KERNEL);
+ if (!pdata)
+ return ERR_PTR(-ENOMEM);
+ ch = (struct gsc_hwmon_channel *)(pdata + 1);
+ pdata->channels = ch;
+ pdata->nchannels = nchannels;
+
+ /* fan controller base address */
+ fan = of_find_compatible_node(dev->parent->of_node, NULL, "gw,gsc-fan");
+ if (fan && of_property_read_u32(fan, "reg", &pdata->fan_base)) {
+ dev_err(dev, "fan node without base\n");
+ return ERR_PTR(-EINVAL);
+ }
+
+ /* allocate structures for channels and count instances of each type */
+ device_for_each_child_node(dev, child) {
+ if (fwnode_property_read_string(child, "label", &ch->name)) {
+ dev_err(dev, "channel without label\n");
+ fwnode_handle_put(child);
+ return ERR_PTR(-EINVAL);
+ }
+ if (fwnode_property_read_u32(child, "reg", &ch->reg)) {
+ dev_err(dev, "channel without reg\n");
+ fwnode_handle_put(child);
+ return ERR_PTR(-EINVAL);
+ }
+ if (fwnode_property_read_u32(child, "gw,mode", &ch->mode)) {
+ dev_err(dev, "channel without mode\n");
+ fwnode_handle_put(child);
+ return ERR_PTR(-EINVAL);
+ }
+ if (ch->mode > mode_max) {
+ dev_err(dev, "invalid channel mode\n");
+ fwnode_handle_put(child);
+ return ERR_PTR(-EINVAL);
+ }
+
+ if (!fwnode_property_read_u32(child,
+ "gw,voltage-offset-microvolt",
+ &ch->mvoffset))
+ ch->mvoffset /= 1000;
+ fwnode_property_read_u32_array(child,
+ "gw,voltage-divider-ohms",
+ ch->vdiv, ARRAY_SIZE(ch->vdiv));
+ ch++;
+ }
+
+ return pdata;
+}
+
+static int gsc_hwmon_probe(struct platform_device *pdev)
+{
+ struct gsc_dev *gsc = dev_get_drvdata(pdev->dev.parent);
+ struct device *dev = &pdev->dev;
+ struct device *hwmon_dev;
+ struct gsc_hwmon_platform_data *pdata = dev_get_platdata(dev);
+ struct gsc_hwmon_data *hwmon;
+ const struct attribute_group **groups;
+ int i, i_in, i_temp;
+
+ if (!pdata) {
+ pdata = gsc_hwmon_get_devtree_pdata(dev);
+ if (IS_ERR(pdata))
+ return PTR_ERR(pdata);
+ }
+
+ hwmon = devm_kzalloc(dev, sizeof(*hwmon), GFP_KERNEL);
+ if (!hwmon)
+ return -ENOMEM;
+ hwmon->gsc = gsc;
+ hwmon->pdata = pdata;
+
+ hwmon->regmap = devm_regmap_init(dev, &gsc_hwmon_regmap_bus,
+ gsc->i2c_hwmon,
+ &gsc_hwmon_regmap_config);
+ if (IS_ERR(hwmon->regmap))
+ return PTR_ERR(hwmon->regmap);
+
+ for (i = 0, i_in = 0, i_temp = 0; i < hwmon->pdata->nchannels; i++) {
+ const struct gsc_hwmon_channel *ch = &pdata->channels[i];
+
+ switch (ch->mode) {
+ case mode_temperature:
+ if (i_temp == GSC_HWMON_MAX_TEMP_CH) {
+ dev_err(gsc->dev, "too many temp channels\n");
+ return -EINVAL;
+ }
+ hwmon->temp_ch[i_temp] = ch;
+ hwmon->temp_config[i_temp] = HWMON_T_INPUT |
+ HWMON_T_LABEL;
+ i_temp++;
+ break;
+ case mode_voltage:
+ case mode_voltage_raw:
+ if (i_in == GSC_HWMON_MAX_IN_CH) {
+ dev_err(gsc->dev, "too many input channels\n");
+ return -EINVAL;
+ }
+ hwmon->in_ch[i_in] = ch;
+ hwmon->in_config[i_in] =
+ HWMON_I_INPUT | HWMON_I_LABEL;
+ i_in++;
+ break;
+ default:
+ dev_err(gsc->dev, "invalid mode: %d\n", ch->mode);
+ return -EINVAL;
+ }
+ }
+
+ /* setup config structures */
+ hwmon->chip.ops = &gsc_hwmon_ops;
+ hwmon->chip.info = hwmon->info;
+ hwmon->info[0] = &hwmon->temp_info;
+ hwmon->info[1] = &hwmon->in_info;
+ hwmon->temp_info.type = hwmon_temp;
+ hwmon->temp_info.config = hwmon->temp_config;
+ hwmon->in_info.type = hwmon_in;
+ hwmon->in_info.config = hwmon->in_config;
+
+ groups = pdata->fan_base ? gsc_hwmon_groups : NULL;
+ hwmon_dev = devm_hwmon_device_register_with_info(dev,
+ KBUILD_MODNAME, hwmon,
+ &hwmon->chip, groups);
+ return PTR_ERR_OR_ZERO(hwmon_dev);
+}
+
+static const struct of_device_id gsc_hwmon_of_match[] = {
+ { .compatible = "gw,gsc-adc", },
+ {}
+};
+
+static struct platform_driver gsc_hwmon_driver = {
+ .driver = {
+ .name = "gsc-hwmon",
+ .of_match_table = gsc_hwmon_of_match,
+ },
+ .probe = gsc_hwmon_probe,
+};
+
+module_platform_driver(gsc_hwmon_driver);
+
+MODULE_AUTHOR("Tim Harvey <tharvey@gateworks.com>");
+MODULE_DESCRIPTION("GSC hardware monitor driver");
+MODULE_LICENSE("GPL v2");
#include <linux/gfp.h>
#include <linux/hwmon.h>
#include <linux/idr.h>
+#include <linux/list.h>
#include <linux/module.h>
#include <linux/pci.h>
#include <linux/slab.h>
const char *name;
struct device dev;
const struct hwmon_chip_info *chip;
-
+ struct list_head tzdata;
struct attribute_group group;
const struct attribute_group **groups;
};
/*
* Thermal zone information
- * In addition to the reference to the hwmon device,
- * also provides the sensor index.
*/
struct hwmon_thermal_data {
+ struct list_head node; /* hwmon tzdata list entry */
struct device *dev; /* Reference to hwmon device */
int index; /* sensor index */
+ struct thermal_zone_device *tzd;/* thermal zone device */
};
static ssize_t
.get_temp = hwmon_thermal_get_temp,
};
+static void hwmon_thermal_remove_sensor(void *data)
+{
+ list_del(data);
+}
+
static int hwmon_thermal_add_sensor(struct device *dev, int index)
{
+ struct hwmon_device *hwdev = to_hwmon_device(dev);
struct hwmon_thermal_data *tdata;
struct thermal_zone_device *tzd;
+ int err;
tdata = devm_kzalloc(dev, sizeof(*tdata), GFP_KERNEL);
if (!tdata)
if (IS_ERR(tzd) && (PTR_ERR(tzd) != -ENODEV))
return PTR_ERR(tzd);
+ err = devm_add_action(dev, hwmon_thermal_remove_sensor, &tdata->node);
+ if (err)
+ return err;
+
+ tdata->tzd = tzd;
+ list_add(&tdata->node, &hwdev->tzdata);
+
return 0;
}
+
+static int hwmon_thermal_register_sensors(struct device *dev)
+{
+ struct hwmon_device *hwdev = to_hwmon_device(dev);
+ const struct hwmon_chip_info *chip = hwdev->chip;
+ const struct hwmon_channel_info **info = chip->info;
+ void *drvdata = dev_get_drvdata(dev);
+ int i;
+
+ for (i = 1; info[i]; i++) {
+ int j;
+
+ if (info[i]->type != hwmon_temp)
+ continue;
+
+ for (j = 0; info[i]->config[j]; j++) {
+ int err;
+
+ if (!(info[i]->config[j] & HWMON_T_INPUT) ||
+ !chip->ops->is_visible(drvdata, hwmon_temp,
+ hwmon_temp_input, j))
+ continue;
+
+ err = hwmon_thermal_add_sensor(dev, j);
+ if (err)
+ return err;
+ }
+ }
+
+ return 0;
+}
+
+static void hwmon_thermal_notify(struct device *dev, int index)
+{
+ struct hwmon_device *hwdev = to_hwmon_device(dev);
+ struct hwmon_thermal_data *tzdata;
+
+ list_for_each_entry(tzdata, &hwdev->tzdata, node) {
+ if (tzdata->index == index) {
+ thermal_zone_device_update(tzdata->tzd,
+ THERMAL_EVENT_UNSPECIFIED);
+ }
+ }
+}
+
#else
-static int hwmon_thermal_add_sensor(struct device *dev, int index)
+static int hwmon_thermal_register_sensors(struct device *dev)
{
return 0;
}
+
+static void hwmon_thermal_notify(struct device *dev, int index) { }
+
#endif /* IS_REACHABLE(CONFIG_THERMAL) && ... */
static int hwmon_attr_base(enum hwmon_sensor_types type)
[hwmon_intrusion] = ARRAY_SIZE(hwmon_intrusion_attr_templates),
};
+int hwmon_notify_event(struct device *dev, enum hwmon_sensor_types type,
+ u32 attr, int channel)
+{
+ char sattr[MAX_SYSFS_ATTR_NAME_LENGTH];
+ const char * const *templates;
+ const char *template;
+ int base;
+
+ if (type >= ARRAY_SIZE(__templates))
+ return -EINVAL;
+ if (attr >= __templates_size[type])
+ return -EINVAL;
+
+ templates = __templates[type];
+ template = templates[attr];
+
+ base = hwmon_attr_base(type);
+
+ scnprintf(sattr, MAX_SYSFS_ATTR_NAME_LENGTH, template, base + channel);
+ sysfs_notify(&dev->kobj, NULL, sattr);
+ kobject_uevent(&dev->kobj, KOBJ_CHANGE);
+
+ if (type == hwmon_temp)
+ hwmon_thermal_notify(dev, channel);
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(hwmon_notify_event);
+
static int hwmon_num_channel_attrs(const struct hwmon_channel_info *info)
{
int i, n;
{
struct hwmon_device *hwdev;
struct device *hdev;
- int i, j, err, id;
+ int i, err, id;
/* Complain about invalid characters in hwmon name attribute */
if (name && (!strlen(name) || strpbrk(name, "-* \t\n")))
if (err)
goto free_hwmon;
+ INIT_LIST_HEAD(&hwdev->tzdata);
+
if (dev && dev->of_node && chip && chip->ops->read &&
chip->info[0]->type == hwmon_chip &&
(chip->info[0]->config[0] & HWMON_C_REGISTER_TZ)) {
- const struct hwmon_channel_info **info = chip->info;
-
- for (i = 1; info[i]; i++) {
- if (info[i]->type != hwmon_temp)
- continue;
-
- for (j = 0; info[i]->config[j]; j++) {
- if (!chip->ops->is_visible(drvdata, hwmon_temp,
- hwmon_temp_input, j))
- continue;
- if (info[i]->config[j] & HWMON_T_INPUT) {
- err = hwmon_thermal_add_sensor(hdev, j);
- if (err) {
- device_unregister(hdev);
- /*
- * Don't worry about hwdev;
- * hwmon_dev_release(), called
- * from device_unregister(),
- * will free it.
- */
- goto ida_remove;
- }
- }
- }
+ err = hwmon_thermal_register_sensors(hdev);
+ if (err) {
+ device_unregister(hdev);
+ /*
+ * Don't worry about hwdev; hwmon_dev_release(), called
+ * from device_unregister(), will free it.
+ */
+ goto ida_remove;
}
}
#define INA226_READ_AVG(reg) (((reg) & INA226_AVG_RD_MASK) >> 9)
#define INA226_SHIFT_AVG(val) ((val) << 9)
+/* bit number of alert functions in Mask/Enable Register */
+#define INA226_SHUNT_OVER_VOLTAGE_BIT 15
+#define INA226_SHUNT_UNDER_VOLTAGE_BIT 14
+#define INA226_BUS_OVER_VOLTAGE_BIT 13
+#define INA226_BUS_UNDER_VOLTAGE_BIT 12
+#define INA226_POWER_OVER_LIMIT_BIT 11
+
+/* bit mask for alert config bits of Mask/Enable Register */
+#define INA226_ALERT_CONFIG_MASK 0xFC00
+#define INA226_ALERT_FUNCTION_FLAG BIT(4)
+
/* common attrs, ina226 attrs and NULL */
#define INA2XX_MAX_ATTRIBUTE_GROUPS 3
ina2xx_get_value(data, attr->index, regval));
}
+static int ina226_reg_to_alert(struct ina2xx_data *data, u8 bit, u16 regval)
+{
+ int reg;
+
+ switch (bit) {
+ case INA226_SHUNT_OVER_VOLTAGE_BIT:
+ case INA226_SHUNT_UNDER_VOLTAGE_BIT:
+ reg = INA2XX_SHUNT_VOLTAGE;
+ break;
+ case INA226_BUS_OVER_VOLTAGE_BIT:
+ case INA226_BUS_UNDER_VOLTAGE_BIT:
+ reg = INA2XX_BUS_VOLTAGE;
+ break;
+ case INA226_POWER_OVER_LIMIT_BIT:
+ reg = INA2XX_POWER;
+ break;
+ default:
+ /* programmer goofed */
+ WARN_ON_ONCE(1);
+ return 0;
+ }
+
+ return ina2xx_get_value(data, reg, regval);
+}
+
+/*
+ * Turns alert limit values into register values.
+ * Opposite of the formula in ina2xx_get_value().
+ */
+static s16 ina226_alert_to_reg(struct ina2xx_data *data, u8 bit, int val)
+{
+ switch (bit) {
+ case INA226_SHUNT_OVER_VOLTAGE_BIT:
+ case INA226_SHUNT_UNDER_VOLTAGE_BIT:
+ val *= data->config->shunt_div;
+ return clamp_val(val, SHRT_MIN, SHRT_MAX);
+ case INA226_BUS_OVER_VOLTAGE_BIT:
+ case INA226_BUS_UNDER_VOLTAGE_BIT:
+ val = (val * 1000) << data->config->bus_voltage_shift;
+ val = DIV_ROUND_CLOSEST(val, data->config->bus_voltage_lsb);
+ return clamp_val(val, 0, SHRT_MAX);
+ case INA226_POWER_OVER_LIMIT_BIT:
+ val = DIV_ROUND_CLOSEST(val, data->power_lsb_uW);
+ return clamp_val(val, 0, USHRT_MAX);
+ default:
+ /* programmer goofed */
+ WARN_ON_ONCE(1);
+ return 0;
+ }
+}
+
+static ssize_t ina226_alert_show(struct device *dev,
+ struct device_attribute *da, char *buf)
+{
+ struct sensor_device_attribute *attr = to_sensor_dev_attr(da);
+ struct ina2xx_data *data = dev_get_drvdata(dev);
+ int regval;
+ int val = 0;
+ int ret;
+
+ mutex_lock(&data->config_lock);
+ ret = regmap_read(data->regmap, INA226_MASK_ENABLE, ®val);
+ if (ret)
+ goto abort;
+
+ if (regval & BIT(attr->index)) {
+ ret = regmap_read(data->regmap, INA226_ALERT_LIMIT, ®val);
+ if (ret)
+ goto abort;
+ val = ina226_reg_to_alert(data, attr->index, regval);
+ }
+
+ ret = snprintf(buf, PAGE_SIZE, "%d\n", val);
+abort:
+ mutex_unlock(&data->config_lock);
+ return ret;
+}
+
+static ssize_t ina226_alert_store(struct device *dev,
+ struct device_attribute *da,
+ const char *buf, size_t count)
+{
+ struct sensor_device_attribute *attr = to_sensor_dev_attr(da);
+ struct ina2xx_data *data = dev_get_drvdata(dev);
+ unsigned long val;
+ int ret;
+
+ ret = kstrtoul(buf, 10, &val);
+ if (ret < 0)
+ return ret;
+
+ /*
+ * Clear all alerts first to avoid accidentally triggering ALERT pin
+ * due to register write sequence. Then, only enable the alert
+ * if the value is non-zero.
+ */
+ mutex_lock(&data->config_lock);
+ ret = regmap_update_bits(data->regmap, INA226_MASK_ENABLE,
+ INA226_ALERT_CONFIG_MASK, 0);
+ if (ret < 0)
+ goto abort;
+
+ ret = regmap_write(data->regmap, INA226_ALERT_LIMIT,
+ ina226_alert_to_reg(data, attr->index, val));
+ if (ret < 0)
+ goto abort;
+
+ if (val != 0) {
+ ret = regmap_update_bits(data->regmap, INA226_MASK_ENABLE,
+ INA226_ALERT_CONFIG_MASK,
+ BIT(attr->index));
+ if (ret < 0)
+ goto abort;
+ }
+
+ ret = count;
+abort:
+ mutex_unlock(&data->config_lock);
+ return ret;
+}
+
+static ssize_t ina226_alarm_show(struct device *dev,
+ struct device_attribute *da, char *buf)
+{
+ struct sensor_device_attribute *attr = to_sensor_dev_attr(da);
+ struct ina2xx_data *data = dev_get_drvdata(dev);
+ int regval;
+ int alarm = 0;
+ int ret;
+
+ ret = regmap_read(data->regmap, INA226_MASK_ENABLE, ®val);
+ if (ret)
+ return ret;
+
+ alarm = (regval & BIT(attr->index)) &&
+ (regval & INA226_ALERT_FUNCTION_FLAG);
+ return snprintf(buf, PAGE_SIZE, "%d\n", alarm);
+}
+
/*
* In order to keep calibration register value fixed, the product
* of current_lsb and shunt_resistor should also be fixed and equal
/* shunt voltage */
static SENSOR_DEVICE_ATTR_RO(in0_input, ina2xx_value, INA2XX_SHUNT_VOLTAGE);
+/* shunt voltage over/under voltage alert setting and alarm */
+static SENSOR_DEVICE_ATTR_RW(in0_crit, ina226_alert,
+ INA226_SHUNT_OVER_VOLTAGE_BIT);
+static SENSOR_DEVICE_ATTR_RW(in0_lcrit, ina226_alert,
+ INA226_SHUNT_UNDER_VOLTAGE_BIT);
+static SENSOR_DEVICE_ATTR_RO(in0_crit_alarm, ina226_alarm,
+ INA226_SHUNT_OVER_VOLTAGE_BIT);
+static SENSOR_DEVICE_ATTR_RO(in0_lcrit_alarm, ina226_alarm,
+ INA226_SHUNT_UNDER_VOLTAGE_BIT);
/* bus voltage */
static SENSOR_DEVICE_ATTR_RO(in1_input, ina2xx_value, INA2XX_BUS_VOLTAGE);
+/* bus voltage over/under voltage alert setting and alarm */
+static SENSOR_DEVICE_ATTR_RW(in1_crit, ina226_alert,
+ INA226_BUS_OVER_VOLTAGE_BIT);
+static SENSOR_DEVICE_ATTR_RW(in1_lcrit, ina226_alert,
+ INA226_BUS_UNDER_VOLTAGE_BIT);
+static SENSOR_DEVICE_ATTR_RO(in1_crit_alarm, ina226_alarm,
+ INA226_BUS_OVER_VOLTAGE_BIT);
+static SENSOR_DEVICE_ATTR_RO(in1_lcrit_alarm, ina226_alarm,
+ INA226_BUS_UNDER_VOLTAGE_BIT);
/* calculated current */
static SENSOR_DEVICE_ATTR_RO(curr1_input, ina2xx_value, INA2XX_CURRENT);
/* calculated power */
static SENSOR_DEVICE_ATTR_RO(power1_input, ina2xx_value, INA2XX_POWER);
+/* over-limit power alert setting and alarm */
+static SENSOR_DEVICE_ATTR_RW(power1_crit, ina226_alert,
+ INA226_POWER_OVER_LIMIT_BIT);
+static SENSOR_DEVICE_ATTR_RO(power1_crit_alarm, ina226_alarm,
+ INA226_POWER_OVER_LIMIT_BIT);
/* shunt resistance */
static SENSOR_DEVICE_ATTR_RW(shunt_resistor, ina2xx_shunt, INA2XX_CALIBRATION);
};
static struct attribute *ina226_attrs[] = {
+ &sensor_dev_attr_in0_crit.dev_attr.attr,
+ &sensor_dev_attr_in0_lcrit.dev_attr.attr,
+ &sensor_dev_attr_in0_crit_alarm.dev_attr.attr,
+ &sensor_dev_attr_in0_lcrit_alarm.dev_attr.attr,
+ &sensor_dev_attr_in1_crit.dev_attr.attr,
+ &sensor_dev_attr_in1_lcrit.dev_attr.attr,
+ &sensor_dev_attr_in1_crit_alarm.dev_attr.attr,
+ &sensor_dev_attr_in1_lcrit_alarm.dev_attr.attr,
+ &sensor_dev_attr_power1_crit.dev_attr.attr,
+ &sensor_dev_attr_power1_crit_alarm.dev_attr.attr,
&sensor_dev_attr_update_interval.dev_attr.attr,
NULL,
};
#include <linux/spi/spi.h>
#include <linux/slab.h>
#include <linux/of_device.h>
-
+#include <linux/acpi.h>
#define DRVNAME "lm70"
MODULE_DEVICE_TABLE(of, lm70_of_ids);
#endif
+#ifdef CONFIG_ACPI
+static const struct acpi_device_id lm70_acpi_ids[] = {
+ {
+ .id = "LM000070",
+ .driver_data = LM70_CHIP_LM70,
+ },
+ {
+ .id = "TMP00121",
+ .driver_data = LM70_CHIP_TMP121,
+ },
+ {
+ .id = "LM000071",
+ .driver_data = LM70_CHIP_LM71,
+ },
+ {
+ .id = "LM000074",
+ .driver_data = LM70_CHIP_LM74,
+ },
+ {},
+};
+MODULE_DEVICE_TABLE(acpi, lm70_acpi_ids);
+#endif
+
static int lm70_probe(struct spi_device *spi)
{
- const struct of_device_id *match;
+ const struct of_device_id *of_match;
struct device *hwmon_dev;
struct lm70 *p_lm70;
int chip;
- match = of_match_device(lm70_of_ids, &spi->dev);
- if (match)
- chip = (int)(uintptr_t)match->data;
- else
- chip = spi_get_device_id(spi)->driver_data;
+ of_match = of_match_device(lm70_of_ids, &spi->dev);
+ if (of_match)
+ chip = (int)(uintptr_t)of_match->data;
+ else {
+#ifdef CONFIG_ACPI
+ const struct acpi_device_id *acpi_match;
+
+ acpi_match = acpi_match_device(lm70_acpi_ids, &spi->dev);
+ if (acpi_match)
+ chip = (int)(uintptr_t)acpi_match->driver_data;
+ else
+#endif
+ chip = spi_get_device_id(spi)->driver_data;
+ }
/* signaling is SPI_MODE_0 */
if (spi->mode & (SPI_CPOL | SPI_CPHA))
.driver = {
.name = "lm70",
.of_match_table = of_match_ptr(lm70_of_ids),
+ .acpi_match_table = ACPI_PTR(lm70_acpi_ids),
},
.id_table = lm70_ids,
.probe = lm70_probe,
/* First check for LM75A */
if (i2c_smbus_read_byte_data(new_client, 7) == LM75A_ID) {
- /* LM75A returns 0xff on unused registers so
- just to be sure we check for that too. */
+ /*
+ * LM75A returns 0xff on unused registers so
+ * just to be sure we check for that too.
+ */
if (i2c_smbus_read_byte_data(new_client, 4) != 0xff
|| i2c_smbus_read_byte_data(new_client, 5) != 0xff
|| i2c_smbus_read_byte_data(new_client, 6) != 0xff)
{
int status;
struct i2c_client *client = to_i2c_client(dev);
+
status = i2c_smbus_read_byte_data(client, LM75_REG_CONF);
if (status < 0) {
dev_dbg(&client->dev, "Can't read config? %d\n", status);
{
int status;
struct i2c_client *client = to_i2c_client(dev);
+
status = i2c_smbus_read_byte_data(client, LM75_REG_CONF);
if (status < 0) {
dev_dbg(&client->dev, "Can't read config? %d\n", status);
/* SPDX-License-Identifier: GPL-2.0-or-later */
/*
- lm75.h - Part of lm_sensors, Linux kernel modules for hardware
- monitoring
- Copyright (c) 2003 Mark M. Hoffman <mhoffman@lightlink.com>
-
-*/
+ * lm75.h - Part of lm_sensors, Linux kernel modules for hardware monitoring
+ * Copyright (c) 2003 Mark M. Hoffman <mhoffman@lightlink.com>
+ */
/*
- This file contains common code for encoding/decoding LM75 type
- temperature readings, which are emulated by many of the chips
- we support. As the user is unlikely to load more than one driver
- which contains this code, we don't worry about the wasted space.
-*/
+ * This file contains common code for encoding/decoding LM75 type
+ * temperature readings, which are emulated by many of the chips
+ * we support. As the user is unlikely to load more than one driver
+ * which contains this code, we don't worry about the wasted space.
+ */
#include <linux/kernel.h>
#define LM75_TEMP_MAX 125000
#define LM75_SHUTDOWN 0x01
-/* TEMP: 0.001C/bit (-55C to +125C)
- REG: (0.5C/bit, two's complement) << 7 */
+/*
+ * TEMP: 0.001C/bit (-55C to +125C)
+ * REG: (0.5C/bit, two's complement) << 7
+ */
static inline u16 LM75_TEMP_TO_REG(long temp)
{
int ntemp = clamp_val(temp, LM75_TEMP_MIN, LM75_TEMP_MAX);
+
ntemp += (ntemp < 0 ? -250 : 250);
return (u16)((ntemp / 500) << 7);
}
static inline int LM75_TEMP_FROM_REG(u16 reg)
{
- /* use integer division instead of equivalent right shift to
- guarantee arithmetic shift and preserve the sign */
+ /*
+ * use integer division instead of equivalent right shift to
+ * guarantee arithmetic shift and preserve the sign
+ */
return ((s16)reg / 128) * 500;
}
* explicitly as max6659, or if its address is not 0x4c.
* These chips lack the remote temperature offset feature.
*
+ * This driver also supports the MAX6654 chip made by Maxim. This chip can
+ * be at 9 different addresses, similar to MAX6680/MAX6681. The MAX6654 is
+ * otherwise similar to MAX6657/MAX6658/MAX6659. Extended range is available
+ * by setting the configuration register accordingly, and is done during
+ * initialization. Extended precision is only available at conversion rates
+ * of 1 Hz and slower. Note that extended precision is not enabled by
+ * default, as this driver initializes all chips to 2 Hz by design.
+ *
* This driver also supports the MAX6646, MAX6647, MAX6648, MAX6649 and
* MAX6692 chips made by Maxim. These are again similar to the LM86,
* but they use unsigned temperature values and can report temperatures
* have address 0x4d.
* MAX6647 has address 0x4e.
* MAX6659 can have address 0x4c, 0x4d or 0x4e.
- * MAX6680 and MAX6681 can have address 0x18, 0x19, 0x1a, 0x29, 0x2a, 0x2b,
- * 0x4c, 0x4d or 0x4e.
+ * MAX6654, MAX6680, and MAX6681 can have address 0x18, 0x19, 0x1a, 0x29,
+ * 0x2a, 0x2b, 0x4c, 0x4d or 0x4e.
* SA56004 can have address 0x48 through 0x4F.
*/
0x4d, 0x4e, 0x4f, I2C_CLIENT_END };
enum chips { lm90, adm1032, lm99, lm86, max6657, max6659, adt7461, max6680,
- max6646, w83l771, max6696, sa56004, g781, tmp451 };
+ max6646, w83l771, max6696, sa56004, g781, tmp451, max6654 };
/*
* The LM90 registers
#define LM90_REG_R_TCRIT_HYST 0x21
#define LM90_REG_W_TCRIT_HYST 0x21
-/* MAX6646/6647/6649/6657/6658/6659/6695/6696 registers */
+/* MAX6646/6647/6649/6654/6657/6658/6659/6695/6696 registers */
#define MAX6657_REG_R_LOCAL_TEMPL 0x11
#define MAX6696_REG_R_STATUS2 0x12
{ "max6646", max6646 },
{ "max6647", max6646 },
{ "max6649", max6646 },
+ { "max6654", max6654 },
{ "max6657", max6657 },
{ "max6658", max6657 },
{ "max6659", max6659 },
.compatible = "dallas,max6649",
.data = (void *)max6646
},
+ {
+ .compatible = "dallas,max6654",
+ .data = (void *)max6654
+ },
{
.compatible = "dallas,max6657",
.data = (void *)max6657
.max_convrate = 6,
.reg_local_ext = MAX6657_REG_R_LOCAL_TEMPL,
},
+ [max6654] = {
+ .alert_alarms = 0x7c,
+ .max_convrate = 7,
+ .reg_local_ext = MAX6657_REG_R_LOCAL_TEMPL,
+ },
[max6657] = {
.flags = LM90_PAUSE_FOR_CONFIG,
.alert_alarms = 0x7c,
&& (config1 & 0x3f) == 0x00
&& convrate <= 0x07) {
name = "max6646";
+ } else
+ /*
+ * The chip_id of the MAX6654 holds the revision of the chip.
+ * The lowest 3 bits of the config1 register are unused and
+ * should return zero when read.
+ */
+ if (chip_id == 0x08
+ && (config1 & 0x07) == 0x00
+ && convrate <= 0x07) {
+ name = "max6654";
}
} else
if (address == 0x4C
if (data->kind == max6680)
config |= 0x18;
+ /*
+ * Put MAX6654 into extended range (0x20, extend minimum range from
+ * 0 degrees to -64 degrees). Note that extended resolution is not
+ * possible on the MAX6654 unless conversion rate is set to 1 Hz or
+ * slower, which is intentionally not done by default.
+ */
+ if (data->kind == max6654)
+ config |= 0x20;
+
/*
* Select external channel 0 for max6695/96
*/
static umode_t nct6775_in_is_visible(struct kobject *kobj,
struct attribute *attr, int index)
{
- struct device *dev = container_of(kobj, struct device, kobj);
+ struct device *dev = kobj_to_dev(kobj);
struct nct6775_data *data = dev_get_drvdata(dev);
int in = index / 5; /* voltage index */
static umode_t nct6775_fan_is_visible(struct kobject *kobj,
struct attribute *attr, int index)
{
- struct device *dev = container_of(kobj, struct device, kobj);
+ struct device *dev = kobj_to_dev(kobj);
struct nct6775_data *data = dev_get_drvdata(dev);
int fan = index / 6; /* fan index */
int nr = index % 6; /* attribute index */
static umode_t nct6775_temp_is_visible(struct kobject *kobj,
struct attribute *attr, int index)
{
- struct device *dev = container_of(kobj, struct device, kobj);
+ struct device *dev = kobj_to_dev(kobj);
struct nct6775_data *data = dev_get_drvdata(dev);
int temp = index / 10; /* temp index */
int nr = index % 10; /* attribute index */
static umode_t nct6775_pwm_is_visible(struct kobject *kobj,
struct attribute *attr, int index)
{
- struct device *dev = container_of(kobj, struct device, kobj);
+ struct device *dev = kobj_to_dev(kobj);
struct nct6775_data *data = dev_get_drvdata(dev);
int pwm = index / 36; /* pwm index */
int nr = index % 36; /* attribute index */
static umode_t nct6775_other_is_visible(struct kobject *kobj,
struct attribute *attr, int index)
{
- struct device *dev = container_of(kobj, struct device, kobj);
+ struct device *dev = kobj_to_dev(kobj);
struct nct6775_data *data = dev_get_drvdata(dev);
if (index == 0 && !data->have_vid)
static umode_t nct7802_temp_is_visible(struct kobject *kobj,
struct attribute *attr, int index)
{
- struct device *dev = container_of(kobj, struct device, kobj);
+ struct device *dev = kobj_to_dev(kobj);
struct nct7802_data *data = dev_get_drvdata(dev);
unsigned int reg;
int err;
static umode_t nct7802_in_is_visible(struct kobject *kobj,
struct attribute *attr, int index)
{
- struct device *dev = container_of(kobj, struct device, kobj);
+ struct device *dev = kobj_to_dev(kobj);
struct nct7802_data *data = dev_get_drvdata(dev);
unsigned int reg;
int err;
static umode_t nct7802_fan_is_visible(struct kobject *kobj,
struct attribute *attr, int index)
{
- struct device *dev = container_of(kobj, struct device, kobj);
+ struct device *dev = kobj_to_dev(kobj);
struct nct7802_data *data = dev_get_drvdata(dev);
int fan = index / 4; /* 4 attributes per fan */
unsigned int reg;
* Copyright (c) 2019 Advantech
* Author: Amy.Shih <amy.shih@advantech.com.tw>
*
+ * Copyright (c) 2020 Advantech
+ * Author: Yuechao Zhao <yuechao.zhao@advantech.com.cn>
+ *
* Supports the following chips:
*
* Chip #vin #fan #pwm #temp #dts chip ID
#include <linux/i2c.h>
#include <linux/mutex.h>
#include <linux/hwmon.h>
+#include <linux/watchdog.h>
#define VENDOR_ID_REG 0x7A /* Any bank */
#define NUVOTON_ID 0x50
#define FANCTL_MAX 4 /* Counted from 1 */
#define TCPU_MAX 8 /* Counted from 1 */
#define TEMP_MAX 4 /* Counted from 1 */
+#define SMI_STS_MAX 10 /* Counted from 1 */
#define VT_ADC_CTRL0_REG 0x20 /* Bank 0 */
#define VT_ADC_CTRL1_REG 0x21 /* Bank 0 */
#define FANCTL1_FMR_REG 0x00 /* Bank 3; 1 reg per channel */
#define FANCTL1_OUT_REG 0x10 /* Bank 3; 1 reg per channel */
+#define WDT_LOCK_REG 0xE0 /* W/O Lock Watchdog Register */
+#define WDT_EN_REG 0xE1 /* R/O Watchdog Enable Register */
+#define WDT_STS_REG 0xE2 /* R/O Watchdog Status Register */
+#define WDT_TIMER_REG 0xE3 /* R/W Watchdog Timer Register */
+#define WDT_SOFT_EN 0x55 /* Enable soft watchdog timer */
+#define WDT_SOFT_DIS 0xAA /* Disable soft watchdog timer */
+
#define VOLT_MONITOR_MODE 0x0
#define THERMAL_DIODE_MODE 0x1
#define THERMISTOR_MODE 0x3
#define ENABLE_TSI BIT(1)
+#define WATCHDOG_TIMEOUT 1 /* 1 minute default timeout */
+
+/*The timeout range is 1-255 minutes*/
+#define MIN_TIMEOUT (1 * 60)
+#define MAX_TIMEOUT (255 * 60)
+
+static int timeout;
+module_param(timeout, int, 0);
+MODULE_PARM_DESC(timeout, "Watchdog timeout in minutes. 1 <= timeout <= 255, default="
+ __MODULE_STRING(WATCHDOG_TIMEOUT) ".");
+
+static bool nowayout = WATCHDOG_NOWAYOUT;
+module_param(nowayout, bool, 0);
+MODULE_PARM_DESC(nowayout, "Watchdog cannot be stopped once started (default="
+ __MODULE_STRING(WATCHDOG_NOWAYOUT) ")");
+
static const unsigned short normal_i2c[] = {
0x2d, 0x2e, I2C_CLIENT_END
};
struct nct7904_data {
struct i2c_client *client;
+ struct watchdog_device wdt;
struct mutex bank_lock;
int bank_sel;
u32 fanin_mask;
struct nct7904_data *data = dev_get_drvdata(dev);
int ret, temp;
unsigned int reg1, reg2, reg3;
+ s8 temps;
switch (attr) {
case hwmon_temp_input:
if (ret < 0)
return ret;
- *val = ret * 1000;
+ temps = ret;
+ *val = temps * 1000;
return 0;
}
.info = nct7904_info,
};
+/*
+ * Watchdog Function
+ */
+static int nct7904_wdt_start(struct watchdog_device *wdt)
+{
+ struct nct7904_data *data = watchdog_get_drvdata(wdt);
+
+ /* Enable soft watchdog timer */
+ return nct7904_write_reg(data, BANK_0, WDT_LOCK_REG, WDT_SOFT_EN);
+}
+
+static int nct7904_wdt_stop(struct watchdog_device *wdt)
+{
+ struct nct7904_data *data = watchdog_get_drvdata(wdt);
+
+ return nct7904_write_reg(data, BANK_0, WDT_LOCK_REG, WDT_SOFT_DIS);
+}
+
+static int nct7904_wdt_set_timeout(struct watchdog_device *wdt,
+ unsigned int timeout)
+{
+ struct nct7904_data *data = watchdog_get_drvdata(wdt);
+ /*
+ * The NCT7904 is very special in watchdog function.
+ * Its minimum unit is minutes. And wdt->timeout needs
+ * to match the actual timeout selected. So, this needs
+ * to be: wdt->timeout = timeout / 60 * 60.
+ * For example, if the user configures a timeout of
+ * 119 seconds, the actual timeout will be 60 seconds.
+ * So, wdt->timeout must then be set to 60 seconds.
+ */
+ wdt->timeout = timeout / 60 * 60;
+
+ return nct7904_write_reg(data, BANK_0, WDT_TIMER_REG,
+ wdt->timeout / 60);
+}
+
+static int nct7904_wdt_ping(struct watchdog_device *wdt)
+{
+ /*
+ * Note:
+ * NCT7904 does not support refreshing WDT_TIMER_REG register when
+ * the watchdog is active. Please disable watchdog before feeding
+ * the watchdog and enable it again.
+ */
+ struct nct7904_data *data = watchdog_get_drvdata(wdt);
+ int ret;
+
+ /* Disable soft watchdog timer */
+ ret = nct7904_write_reg(data, BANK_0, WDT_LOCK_REG, WDT_SOFT_DIS);
+ if (ret < 0)
+ return ret;
+
+ /* feed watchdog */
+ ret = nct7904_write_reg(data, BANK_0, WDT_TIMER_REG, wdt->timeout / 60);
+ if (ret < 0)
+ return ret;
+
+ /* Enable soft watchdog timer */
+ return nct7904_write_reg(data, BANK_0, WDT_LOCK_REG, WDT_SOFT_EN);
+}
+
+static unsigned int nct7904_wdt_get_timeleft(struct watchdog_device *wdt)
+{
+ struct nct7904_data *data = watchdog_get_drvdata(wdt);
+ int ret;
+
+ ret = nct7904_read_reg(data, BANK_0, WDT_TIMER_REG);
+ if (ret < 0)
+ return 0;
+
+ return ret * 60;
+}
+
+static const struct watchdog_info nct7904_wdt_info = {
+ .options = WDIOF_SETTIMEOUT | WDIOF_KEEPALIVEPING |
+ WDIOF_MAGICCLOSE,
+ .identity = "nct7904 watchdog",
+};
+
+static const struct watchdog_ops nct7904_wdt_ops = {
+ .owner = THIS_MODULE,
+ .start = nct7904_wdt_start,
+ .stop = nct7904_wdt_stop,
+ .ping = nct7904_wdt_ping,
+ .set_timeout = nct7904_wdt_set_timeout,
+ .get_timeleft = nct7904_wdt_get_timeleft,
+};
+
static int nct7904_probe(struct i2c_client *client,
const struct i2c_device_id *id)
{
data->fan_mode[i] = ret;
}
+ /* Read all of SMI status register to clear alarms */
+ for (i = 0; i < SMI_STS_MAX; i++) {
+ ret = nct7904_read_reg(data, BANK_0, SMI_STS1_REG + i);
+ if (ret < 0)
+ return ret;
+ }
+
hwmon_dev =
devm_hwmon_device_register_with_info(dev, client->name, data,
&nct7904_chip_info, NULL);
- return PTR_ERR_OR_ZERO(hwmon_dev);
+ ret = PTR_ERR_OR_ZERO(hwmon_dev);
+ if (ret)
+ return ret;
+
+ /* Watchdog initialization */
+ data->wdt.ops = &nct7904_wdt_ops;
+ data->wdt.info = &nct7904_wdt_info;
+
+ data->wdt.timeout = WATCHDOG_TIMEOUT * 60; /* Set default timeout */
+ data->wdt.min_timeout = MIN_TIMEOUT;
+ data->wdt.max_timeout = MAX_TIMEOUT;
+ data->wdt.parent = &client->dev;
+
+ watchdog_init_timeout(&data->wdt, timeout * 60, &client->dev);
+ watchdog_set_nowayout(&data->wdt, nowayout);
+ watchdog_set_drvdata(&data->wdt, data);
+
+ watchdog_stop_on_unregister(&data->wdt);
+
+ return devm_watchdog_register_device(dev, &data->wdt);
}
static const struct i2c_device_id nct7904_id[] = {
This driver can also be built as a module. If so, the module will
be called max16064.
+config SENSORS_MAX16601
+ tristate "Maxim MAX16601"
+ help
+ If you say yes here you get hardware monitoring support for Maxim
+ MAX16601.
+
+ This driver can also be built as a module. If so, the module will
+ be called max16601.
+
config SENSORS_MAX20730
tristate "Maxim MAX20730, MAX20734, MAX20743"
help
obj-$(CONFIG_SENSORS_LTC2978) += ltc2978.o
obj-$(CONFIG_SENSORS_LTC3815) += ltc3815.o
obj-$(CONFIG_SENSORS_MAX16064) += max16064.o
+obj-$(CONFIG_SENSORS_MAX16601) += max16601.o
obj-$(CONFIG_SENSORS_MAX20730) += max20730.o
obj-$(CONFIG_SENSORS_MAX20751) += max20751.o
obj-$(CONFIG_SENSORS_MAX31785) += max31785.o
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Hardware monitoring driver for Maxim MAX16601
+ *
+ * Implementation notes:
+ *
+ * Ths chip supports two rails, VCORE and VSA. Telemetry information for the
+ * two rails is reported in two subsequent I2C addresses. The driver
+ * instantiates a dummy I2C client at the second I2C address to report
+ * information for the VSA rail in a single instance of the driver.
+ * Telemetry for the VSA rail is reported to the PMBus core in PMBus page 2.
+ *
+ * The chip reports input current using two separate methods. The input current
+ * reported with the standard READ_IIN command is derived from the output
+ * current. The first method is reported to the PMBus core with PMBus page 0,
+ * the second method is reported with PMBus page 1.
+ *
+ * The chip supports reading per-phase temperatures and per-phase input/output
+ * currents for VCORE. Telemetry is reported in vendor specific registers.
+ * The driver translates the vendor specific register values to PMBus standard
+ * register values and reports per-phase information in PMBus page 0.
+ *
+ * Copyright 2019, 2020 Google LLC.
+ */
+
+#include <linux/bits.h>
+#include <linux/i2c.h>
+#include <linux/init.h>
+#include <linux/kernel.h>
+#include <linux/module.h>
+
+#include "pmbus.h"
+
+#define REG_SETPT_DVID 0xd1
+#define DAC_10MV_MODE BIT(4)
+#define REG_IOUT_AVG_PK 0xee
+#define REG_IIN_SENSOR 0xf1
+#define REG_TOTAL_INPUT_POWER 0xf2
+#define REG_PHASE_ID 0xf3
+#define CORE_RAIL_INDICATOR BIT(7)
+#define REG_PHASE_REPORTING 0xf4
+
+struct max16601_data {
+ struct pmbus_driver_info info;
+ struct i2c_client *vsa;
+ int iout_avg_pkg;
+};
+
+#define to_max16601_data(x) container_of(x, struct max16601_data, info)
+
+static int max16601_read_byte(struct i2c_client *client, int page, int reg)
+{
+ const struct pmbus_driver_info *info = pmbus_get_driver_info(client);
+ struct max16601_data *data = to_max16601_data(info);
+
+ if (page > 0) {
+ if (page == 2) /* VSA */
+ return i2c_smbus_read_byte_data(data->vsa, reg);
+ return -EOPNOTSUPP;
+ }
+ return -ENODATA;
+}
+
+static int max16601_read_word(struct i2c_client *client, int page, int phase,
+ int reg)
+{
+ const struct pmbus_driver_info *info = pmbus_get_driver_info(client);
+ struct max16601_data *data = to_max16601_data(info);
+ u8 buf[I2C_SMBUS_BLOCK_MAX + 1];
+ int ret;
+
+ switch (page) {
+ case 0: /* VCORE */
+ if (phase == 0xff)
+ return -ENODATA;
+ switch (reg) {
+ case PMBUS_READ_IIN:
+ case PMBUS_READ_IOUT:
+ case PMBUS_READ_TEMPERATURE_1:
+ ret = i2c_smbus_write_byte_data(client, REG_PHASE_ID,
+ phase);
+ if (ret)
+ return ret;
+ ret = i2c_smbus_read_block_data(client,
+ REG_PHASE_REPORTING,
+ buf);
+ if (ret < 0)
+ return ret;
+ if (ret < 6)
+ return -EIO;
+ switch (reg) {
+ case PMBUS_READ_TEMPERATURE_1:
+ return buf[1] << 8 | buf[0];
+ case PMBUS_READ_IOUT:
+ return buf[3] << 8 | buf[2];
+ case PMBUS_READ_IIN:
+ return buf[5] << 8 | buf[4];
+ default:
+ break;
+ }
+ }
+ return -EOPNOTSUPP;
+ case 1: /* VCORE, read IIN/PIN from sensor element */
+ switch (reg) {
+ case PMBUS_READ_IIN:
+ return i2c_smbus_read_word_data(client, REG_IIN_SENSOR);
+ case PMBUS_READ_PIN:
+ return i2c_smbus_read_word_data(client,
+ REG_TOTAL_INPUT_POWER);
+ default:
+ break;
+ }
+ return -EOPNOTSUPP;
+ case 2: /* VSA */
+ switch (reg) {
+ case PMBUS_VIRT_READ_IOUT_MAX:
+ ret = i2c_smbus_read_word_data(data->vsa,
+ REG_IOUT_AVG_PK);
+ if (ret < 0)
+ return ret;
+ if (sign_extend32(ret, 10) >
+ sign_extend32(data->iout_avg_pkg, 10))
+ data->iout_avg_pkg = ret;
+ return data->iout_avg_pkg;
+ case PMBUS_VIRT_RESET_IOUT_HISTORY:
+ return 0;
+ case PMBUS_IOUT_OC_FAULT_LIMIT:
+ case PMBUS_IOUT_OC_WARN_LIMIT:
+ case PMBUS_OT_FAULT_LIMIT:
+ case PMBUS_OT_WARN_LIMIT:
+ case PMBUS_READ_IIN:
+ case PMBUS_READ_IOUT:
+ case PMBUS_READ_TEMPERATURE_1:
+ case PMBUS_STATUS_WORD:
+ return i2c_smbus_read_word_data(data->vsa, reg);
+ default:
+ return -EOPNOTSUPP;
+ }
+ default:
+ return -EOPNOTSUPP;
+ }
+}
+
+static int max16601_write_byte(struct i2c_client *client, int page, u8 reg)
+{
+ const struct pmbus_driver_info *info = pmbus_get_driver_info(client);
+ struct max16601_data *data = to_max16601_data(info);
+
+ if (page == 2) {
+ if (reg == PMBUS_CLEAR_FAULTS)
+ return i2c_smbus_write_byte(data->vsa, reg);
+ return -EOPNOTSUPP;
+ }
+ return -ENODATA;
+}
+
+static int max16601_write_word(struct i2c_client *client, int page, int reg,
+ u16 value)
+{
+ const struct pmbus_driver_info *info = pmbus_get_driver_info(client);
+ struct max16601_data *data = to_max16601_data(info);
+
+ switch (page) {
+ case 0: /* VCORE */
+ return -ENODATA;
+ case 1: /* VCORE IIN/PIN from sensor element */
+ default:
+ return -EOPNOTSUPP;
+ case 2: /* VSA */
+ switch (reg) {
+ case PMBUS_VIRT_RESET_IOUT_HISTORY:
+ data->iout_avg_pkg = 0xfc00;
+ return 0;
+ case PMBUS_IOUT_OC_FAULT_LIMIT:
+ case PMBUS_IOUT_OC_WARN_LIMIT:
+ case PMBUS_OT_FAULT_LIMIT:
+ case PMBUS_OT_WARN_LIMIT:
+ return i2c_smbus_write_word_data(data->vsa, reg, value);
+ default:
+ return -EOPNOTSUPP;
+ }
+ }
+}
+
+static int max16601_identify(struct i2c_client *client,
+ struct pmbus_driver_info *info)
+{
+ int reg;
+
+ reg = i2c_smbus_read_byte_data(client, REG_SETPT_DVID);
+ if (reg < 0)
+ return reg;
+ if (reg & DAC_10MV_MODE)
+ info->vrm_version[0] = vr13;
+ else
+ info->vrm_version[0] = vr12;
+
+ return 0;
+}
+
+static struct pmbus_driver_info max16601_info = {
+ .pages = 3,
+ .format[PSC_VOLTAGE_IN] = linear,
+ .format[PSC_VOLTAGE_OUT] = vid,
+ .format[PSC_CURRENT_IN] = linear,
+ .format[PSC_CURRENT_OUT] = linear,
+ .format[PSC_TEMPERATURE] = linear,
+ .format[PSC_POWER] = linear,
+ .func[0] = PMBUS_HAVE_VIN | PMBUS_HAVE_IIN | PMBUS_HAVE_PIN |
+ PMBUS_HAVE_STATUS_INPUT |
+ PMBUS_HAVE_VOUT | PMBUS_HAVE_STATUS_VOUT |
+ PMBUS_HAVE_IOUT | PMBUS_HAVE_STATUS_IOUT |
+ PMBUS_HAVE_TEMP | PMBUS_HAVE_STATUS_TEMP |
+ PMBUS_HAVE_POUT | PMBUS_PAGE_VIRTUAL | PMBUS_PHASE_VIRTUAL,
+ .func[1] = PMBUS_HAVE_IIN | PMBUS_HAVE_PIN | PMBUS_PAGE_VIRTUAL,
+ .func[2] = PMBUS_HAVE_IIN | PMBUS_HAVE_STATUS_INPUT |
+ PMBUS_HAVE_IOUT | PMBUS_HAVE_STATUS_IOUT |
+ PMBUS_HAVE_TEMP | PMBUS_HAVE_STATUS_TEMP | PMBUS_PAGE_VIRTUAL,
+ .phases[0] = 8,
+ .pfunc[0] = PMBUS_HAVE_IIN | PMBUS_HAVE_IOUT | PMBUS_HAVE_TEMP,
+ .pfunc[1] = PMBUS_HAVE_IIN | PMBUS_HAVE_IOUT,
+ .pfunc[2] = PMBUS_HAVE_IIN | PMBUS_HAVE_IOUT | PMBUS_HAVE_TEMP,
+ .pfunc[3] = PMBUS_HAVE_IIN | PMBUS_HAVE_IOUT,
+ .pfunc[4] = PMBUS_HAVE_IIN | PMBUS_HAVE_IOUT | PMBUS_HAVE_TEMP,
+ .pfunc[5] = PMBUS_HAVE_IIN | PMBUS_HAVE_IOUT,
+ .pfunc[6] = PMBUS_HAVE_IIN | PMBUS_HAVE_IOUT | PMBUS_HAVE_TEMP,
+ .pfunc[7] = PMBUS_HAVE_IIN | PMBUS_HAVE_IOUT,
+ .identify = max16601_identify,
+ .read_byte_data = max16601_read_byte,
+ .read_word_data = max16601_read_word,
+ .write_byte = max16601_write_byte,
+ .write_word_data = max16601_write_word,
+};
+
+static void max16601_remove(void *_data)
+{
+ struct max16601_data *data = _data;
+
+ i2c_unregister_device(data->vsa);
+}
+
+static int max16601_probe(struct i2c_client *client,
+ const struct i2c_device_id *id)
+{
+ struct device *dev = &client->dev;
+ u8 buf[I2C_SMBUS_BLOCK_MAX + 1];
+ struct max16601_data *data;
+ int ret;
+
+ if (!i2c_check_functionality(client->adapter,
+ I2C_FUNC_SMBUS_READ_BYTE_DATA |
+ I2C_FUNC_SMBUS_READ_BLOCK_DATA))
+ return -ENODEV;
+
+ ret = i2c_smbus_read_block_data(client, PMBUS_IC_DEVICE_ID, buf);
+ if (ret < 0)
+ return -ENODEV;
+
+ /* PMBUS_IC_DEVICE_ID is expected to return "MAX16601y.xx" */
+ if (ret < 11 || strncmp(buf, "MAX16601", 8)) {
+ buf[ret] = '\0';
+ dev_err(dev, "Unsupported chip '%s'\n", buf);
+ return -ENODEV;
+ }
+
+ ret = i2c_smbus_read_byte_data(client, REG_PHASE_ID);
+ if (ret < 0)
+ return ret;
+ if (!(ret & CORE_RAIL_INDICATOR)) {
+ dev_err(dev,
+ "Driver must be instantiated on CORE rail I2C address\n");
+ return -ENODEV;
+ }
+
+ data = devm_kzalloc(dev, sizeof(*data), GFP_KERNEL);
+ if (!data)
+ return -ENOMEM;
+
+ data->iout_avg_pkg = 0xfc00;
+ data->vsa = i2c_new_dummy_device(client->adapter, client->addr + 1);
+ if (IS_ERR(data->vsa)) {
+ dev_err(dev, "Failed to register VSA client\n");
+ return PTR_ERR(data->vsa);
+ }
+ ret = devm_add_action_or_reset(dev, max16601_remove, data);
+ if (ret)
+ return ret;
+
+ data->info = max16601_info;
+
+ return pmbus_do_probe(client, id, &data->info);
+}
+
+static const struct i2c_device_id max16601_id[] = {
+ {"max16601", 0},
+ {}
+};
+
+MODULE_DEVICE_TABLE(i2c, max16601_id);
+
+static struct i2c_driver max16601_driver = {
+ .driver = {
+ .name = "max16601",
+ },
+ .probe = max16601_probe,
+ .remove = pmbus_do_remove,
+ .id_table = max16601_id,
+};
+
+module_i2c_driver(max16601_driver);
+
+MODULE_AUTHOR("Guenter Roeck <linux@roeck-us.net>");
+MODULE_DESCRIPTION("PMBus driver for Maxim MAX16601");
+MODULE_LICENSE("GPL v2");
bool has_status_word; /* device uses STATUS_WORD register */
int (*read_status)(struct i2c_client *client, int page);
- u8 currpage;
- u8 currphase; /* current phase, 0xff for all */
+ s16 currpage; /* current page, -1 for unknown/unset */
+ s16 currphase; /* current phase, 0xff for all, -1 for unknown/unset */
};
struct pmbus_debugfs_entry {
if (pdata)
data->flags = pdata->flags;
data->info = info;
- data->currpage = 0xff;
- data->currphase = 0xfe;
+ data->currpage = -1;
+ data->currphase = -1;
ret = pmbus_init_common(client, data, info);
if (ret < 0)
/* Can optionally have an etm node - return if not */
cs_fwnode = fwnode_find_reference(root_fwnode, CTI_DT_CSDEV_ASSOC, 0);
- if (IS_ERR_OR_NULL(cs_fwnode))
+ if (IS_ERR(cs_fwnode))
return 0;
/* allocate memory */
/* associated device ? */
cs_fwnode = fwnode_find_reference(fwnode,
CTI_DT_CSDEV_ASSOC, 0);
- if (!IS_ERR_OR_NULL(cs_fwnode)) {
+ if (!IS_ERR(cs_fwnode)) {
assoc_name = cti_plat_get_csdev_or_node_name(cs_fwnode,
&csdev);
fwnode_handle_put(cs_fwnode);
EXPORT_SYMBOL(i2c_pca_add_numbered_bus);
MODULE_AUTHOR("Ian Campbell <icampbell@arcom.com>, "
- "Wolfram Sang <w.sang@pengutronix.de>");
+ "Wolfram Sang <kernel@pengutronix.de>");
MODULE_DESCRIPTION("I2C-Bus PCA9564/PCA9665 algorithm");
MODULE_LICENSE("GPL");
* @isr_mask: cached copy of local ISR enables.
* @isr_status: cached copy of local ISR status.
* @lock: spinlock for IRQ synchronization.
+ * @isr_mutex: mutex for IRQ thread.
*/
struct altr_i2c_dev {
void __iomem *base;
u32 isr_mask;
u32 isr_status;
spinlock_t lock; /* IRQ synchronization */
+ struct mutex isr_mutex;
};
static void
struct altr_i2c_dev *idev = _dev;
u32 status = idev->isr_status;
+ mutex_lock(&idev->isr_mutex);
if (!idev->msg) {
dev_warn(idev->dev, "unexpected interrupt\n");
altr_i2c_int_clear(idev, ALTR_I2C_ALL_IRQ);
- return IRQ_HANDLED;
+ goto out;
}
read = (idev->msg->flags & I2C_M_RD) != 0;
complete(&idev->msg_complete);
dev_dbg(idev->dev, "Message Complete\n");
}
+out:
+ mutex_unlock(&idev->isr_mutex);
return IRQ_HANDLED;
}
u32 value;
u8 addr = i2c_8bit_addr_from_msg(msg);
+ mutex_lock(&idev->isr_mutex);
idev->msg = msg;
idev->msg_len = msg->len;
idev->buf = msg->buf;
altr_i2c_int_enable(idev, imask, true);
altr_i2c_fill_tx_fifo(idev);
}
+ mutex_unlock(&idev->isr_mutex);
time_left = wait_for_completion_timeout(&idev->msg_complete,
ALTR_I2C_XFER_TIMEOUT);
idev->dev = &pdev->dev;
init_completion(&idev->msg_complete);
spin_lock_init(&idev->lock);
+ mutex_init(&idev->isr_mutex);
ret = device_property_read_u32(idev->dev, "fifo-size",
&idev->fifo_size);
PINCTRL_STATE_DEFAULT);
dev->pinctrl_pins_gpio = pinctrl_lookup_state(dev->pinctrl,
"gpio");
+ if (IS_ERR(dev->pinctrl_pins_default) ||
+ IS_ERR(dev->pinctrl_pins_gpio)) {
+ dev_info(&pdev->dev, "pinctrl states incomplete for recovery\n");
+ return -EINVAL;
+ }
+
+ /*
+ * pins will be taken as GPIO, so we might as well inform pinctrl about
+ * this and move the state to GPIO
+ */
+ pinctrl_select_state(dev->pinctrl, dev->pinctrl_pins_gpio);
+
rinfo->sda_gpiod = devm_gpiod_get(&pdev->dev, "sda", GPIOD_IN);
if (PTR_ERR(rinfo->sda_gpiod) == -EPROBE_DEFER)
return -EPROBE_DEFER;
return -EPROBE_DEFER;
if (IS_ERR(rinfo->sda_gpiod) ||
- IS_ERR(rinfo->scl_gpiod) ||
- IS_ERR(dev->pinctrl_pins_default) ||
- IS_ERR(dev->pinctrl_pins_gpio)) {
+ IS_ERR(rinfo->scl_gpiod)) {
dev_info(&pdev->dev, "recovery information incomplete\n");
if (!IS_ERR(rinfo->sda_gpiod)) {
gpiod_put(rinfo->sda_gpiod);
gpiod_put(rinfo->scl_gpiod);
rinfo->scl_gpiod = NULL;
}
+ pinctrl_select_state(dev->pinctrl, dev->pinctrl_pins_default);
return -EINVAL;
}
+ /* change the state of the pins back to their default state */
+ pinctrl_select_state(dev->pinctrl, dev->pinctrl_pins_default);
+
dev_info(&pdev->dev, "using scl, sda for recovery\n");
rinfo->prepare_recovery = at91_prepare_twi_recovery;
* Mux support by Rodolfo Giometti <giometti@enneenne.com> and
* Michael Lawnick <michael.lawnick.ext@nsn.com>
*
- * Copyright (C) 2013-2017 Wolfram Sang <wsa@the-dreams.de>
+ * Copyright (C) 2013-2017 Wolfram Sang <wsa@kernel.org>
*/
#define pr_fmt(fmt) "i2c-core: " fmt
} else if (ACPI_COMPANION(dev)) {
irq = i2c_acpi_get_irq(client);
}
- if (irq == -EPROBE_DEFER)
- return irq;
+ if (irq == -EPROBE_DEFER) {
+ status = irq;
+ goto put_sync_adapter;
+ }
if (irq < 0)
irq = 0;
*/
if (!driver->id_table &&
!i2c_acpi_match_device(dev->driver->acpi_match_table, client) &&
- !i2c_of_match_device(dev->driver->of_match_table, client))
- return -ENODEV;
+ !i2c_of_match_device(dev->driver->of_match_table, client)) {
+ status = -ENODEV;
+ goto put_sync_adapter;
+ }
if (client->flags & I2C_CLIENT_WAKE) {
int wakeirq;
wakeirq = of_irq_get_byname(dev->of_node, "wakeup");
- if (wakeirq == -EPROBE_DEFER)
- return wakeirq;
+ if (wakeirq == -EPROBE_DEFER) {
+ status = wakeirq;
+ goto put_sync_adapter;
+ }
device_init_wakeup(&client->dev, true);
err_clear_wakeup_irq:
dev_pm_clear_wake_irq(&client->dev);
device_init_wakeup(&client->dev, false);
+put_sync_adapter:
+ if (client->flags & I2C_CLIENT_HOST_NOTIFY)
+ pm_runtime_put_sync(&client->adapter->dev);
+
return status;
}
* Copyright (C) 2008 Jochen Friedrich <jochen@scram.de>
* based on a previous patch from Jon Smirl <jonsmirl@gmail.com>
*
- * Copyright (C) 2013, 2018 Wolfram Sang <wsa@the-dreams.de>
+ * Copyright (C) 2013, 2018 Wolfram Sang <wsa@kernel.org>
*/
#include <dt-bindings/i2c/i2c.h>
err_rollback_available:
device_remove_file(&pdev->dev, &dev_attr_available_masters);
err_rollback:
+ i2c_demux_deactivate_master(priv);
for (j = 0; j < i; j++) {
of_node_put(priv->chan[j].parent_np);
of_changeset_destroy(&priv->chan[j].chgset);
struct i3c_device_info *info)
{
struct i3c_ccc_cmd_dest dest;
- unsigned int expected_len;
struct i3c_ccc_mrl *mrl;
struct i3c_ccc_cmd cmd;
int ret;
if (!(info->bcr & I3C_BCR_IBI_PAYLOAD))
dest.payload.len -= 1;
- expected_len = dest.payload.len;
i3c_ccc_cmd_init(&cmd, true, I3C_CCC_GETMRL, &dest, 1);
ret = i3c_master_send_ccc_cmd_locked(master, &cmd);
if (ret)
goto out;
- if (dest.payload.len != expected_len) {
+ switch (dest.payload.len) {
+ case 3:
+ info->max_ibi_len = mrl->ibi_len;
+ fallthrough;
+ case 2:
+ info->max_read_len = be16_to_cpu(mrl->read_len);
+ break;
+ default:
ret = -EIO;
goto out;
}
- info->max_read_len = be16_to_cpu(mrl->read_len);
-
- if (info->bcr & I3C_BCR_IBI_PAYLOAD)
- info->max_ibi_len = mrl->ibi_len;
-
out:
i3c_ccc_cmd_dest_cleanup(&dest);
st->tx[0] = SCA3000_READ_REG(reg_address_high);
ret = spi_sync_transfer(st->us, xfer, ARRAY_SIZE(xfer));
if (ret) {
- dev_err(get_device(&st->us->dev), "problem reading register");
+ dev_err(&st->us->dev, "problem reading register\n");
return ret;
}
return 0;
}
-static int stm32_adc_dma_request(struct iio_dev *indio_dev)
+static int stm32_adc_dma_request(struct device *dev, struct iio_dev *indio_dev)
{
struct stm32_adc *adc = iio_priv(indio_dev);
struct dma_slave_config config;
int ret;
- adc->dma_chan = dma_request_chan(&indio_dev->dev, "rx");
+ adc->dma_chan = dma_request_chan(dev, "rx");
if (IS_ERR(adc->dma_chan)) {
ret = PTR_ERR(adc->dma_chan);
if (ret != -ENODEV) {
if (ret != -EPROBE_DEFER)
- dev_err(&indio_dev->dev,
+ dev_err(dev,
"DMA channel request failed with %d\n",
ret);
return ret;
if (ret < 0)
return ret;
- ret = stm32_adc_dma_request(indio_dev);
+ ret = stm32_adc_dma_request(dev, indio_dev);
if (ret < 0)
return ret;
struct stm32_dfsdm_dev_data {
int type;
- int (*init)(struct iio_dev *indio_dev);
+ int (*init)(struct device *dev, struct iio_dev *indio_dev);
unsigned int num_channels;
const struct regmap_config *regmap_cfg;
};
}
}
-static int stm32_dfsdm_dma_request(struct iio_dev *indio_dev)
+static int stm32_dfsdm_dma_request(struct device *dev,
+ struct iio_dev *indio_dev)
{
struct stm32_dfsdm_adc *adc = iio_priv(indio_dev);
- adc->dma_chan = dma_request_chan(&indio_dev->dev, "rx");
+ adc->dma_chan = dma_request_chan(dev, "rx");
if (IS_ERR(adc->dma_chan)) {
int ret = PTR_ERR(adc->dma_chan);
&adc->dfsdm->ch_list[ch->channel]);
}
-static int stm32_dfsdm_audio_init(struct iio_dev *indio_dev)
+static int stm32_dfsdm_audio_init(struct device *dev, struct iio_dev *indio_dev)
{
struct iio_chan_spec *ch;
struct stm32_dfsdm_adc *adc = iio_priv(indio_dev);
indio_dev->num_channels = 1;
indio_dev->channels = ch;
- return stm32_dfsdm_dma_request(indio_dev);
+ return stm32_dfsdm_dma_request(dev, indio_dev);
}
-static int stm32_dfsdm_adc_init(struct iio_dev *indio_dev)
+static int stm32_dfsdm_adc_init(struct device *dev, struct iio_dev *indio_dev)
{
struct iio_chan_spec *ch;
struct stm32_dfsdm_adc *adc = iio_priv(indio_dev);
init_completion(&adc->completion);
/* Optionally request DMA */
- ret = stm32_dfsdm_dma_request(indio_dev);
+ ret = stm32_dfsdm_dma_request(dev, indio_dev);
if (ret) {
if (ret != -ENODEV) {
if (ret != -EPROBE_DEFER)
- dev_err(&indio_dev->dev,
+ dev_err(dev,
"DMA channel request failed with %d\n",
ret);
return ret;
}
- dev_dbg(&indio_dev->dev, "No DMA support\n");
+ dev_dbg(dev, "No DMA support\n");
return 0;
}
adc->dfsdm->fl_list[adc->fl_id].sync_mode = val;
adc->dev_data = dev_data;
- ret = dev_data->init(iio);
+ ret = dev_data->init(dev, iio);
if (ret < 0)
return ret;
u8 rx_buf[3];
};
-#define ADS8344_VOLTAGE_CHANNEL(chan, si) \
+#define ADS8344_VOLTAGE_CHANNEL(chan, addr) \
{ \
.type = IIO_VOLTAGE, \
.indexed = 1, \
.channel = chan, \
.info_mask_separate = BIT(IIO_CHAN_INFO_RAW), \
.info_mask_shared_by_type = BIT(IIO_CHAN_INFO_SCALE), \
+ .address = addr, \
}
-#define ADS8344_VOLTAGE_CHANNEL_DIFF(chan1, chan2, si) \
+#define ADS8344_VOLTAGE_CHANNEL_DIFF(chan1, chan2, addr) \
{ \
.type = IIO_VOLTAGE, \
.indexed = 1, \
.differential = 1, \
.info_mask_separate = BIT(IIO_CHAN_INFO_RAW), \
.info_mask_shared_by_type = BIT(IIO_CHAN_INFO_SCALE), \
+ .address = addr, \
}
static const struct iio_chan_spec ads8344_channels[] = {
switch (mask) {
case IIO_CHAN_INFO_RAW:
mutex_lock(&adc->lock);
- *value = ads8344_adc_conversion(adc, channel->scan_index,
+ *value = ads8344_adc_conversion(adc, channel->address,
channel->differential);
mutex_unlock(&adc->lock);
if (*value < 0)
};
static const struct iio_chan_spec atlas_do_channels[] = {
- ATLAS_CONCENTRATION_CHANNEL(0, ATLAS_REG_DO_DATA),
+ {
+ .type = IIO_CONCENTRATION,
+ .address = ATLAS_REG_DO_DATA,
+ .info_mask_separate =
+ BIT(IIO_CHAN_INFO_RAW) | BIT(IIO_CHAN_INFO_SCALE),
+ .scan_index = 0,
+ .scan_type = {
+ .sign = 'u',
+ .realbits = 32,
+ .storagebits = 32,
+ .endianness = IIO_BE,
+ },
+ },
IIO_CHAN_SOFT_TIMESTAMP(1),
{
.type = IIO_TEMP,
return 0;
error_iio_device_register:
+ vf610_dac_exit(info);
clk_disable_unprepare(info->clk);
return ret;
ref_sensor = iio_priv(hw->iio_devs[ST_LSM6DSX_ID_ACC]);
odr = st_lsm6dsx_check_odr(ref_sensor, val, &odr_val);
- if (odr < 0)
- return odr;
+ if (odr < 0) {
+ err = odr;
+ goto release;
+ }
sensor->ext_info.slv_odr = val;
sensor->odr = odr;
break;
}
+release:
iio_device_release_direct_mode(iio_dev);
return err;
if (err)
return err;
- rdma_for_each_port (device, p)
- ib_cache_update(device, p, true);
+ rdma_for_each_port (device, p) {
+ err = ib_cache_update(device, p, true);
+ if (err)
+ return err;
+ }
return 0;
}
has_cap_net_admin = netlink_capable(skb, CAP_NET_ADMIN);
ret = fill_func(msg, has_cap_net_admin, res, port);
-
- rdma_restrack_put(res);
if (ret)
goto err_free;
+ rdma_restrack_put(res);
nlmsg_end(msg, nlh);
ib_device_put(device);
return rdma_nl_unicast(sock_net(skb->sk), msg, NETLINK_CB(skb).portid);
uobj->context = NULL;
/*
- * For DESTROY the usecnt is held write locked, the caller is expected
- * to put it unlock and put the object when done with it. Only DESTROY
- * can remove the IDR handle.
+ * For DESTROY the usecnt is not changed, the caller is expected to
+ * manage it via uobj_put_destroy(). Only DESTROY can remove the IDR
+ * handle.
*/
if (reason != RDMA_REMOVE_DESTROY)
atomic_set(&uobj->usecnt, 0);
/*
* This calls uverbs_destroy_uobject() using the RDMA_REMOVE_DESTROY
* sequence. It should only be used from command callbacks. On success the
- * caller must pair this with rdma_lookup_put_uobject(LOOKUP_WRITE). This
+ * caller must pair this with uobj_put_destroy(). This
* version requires the caller to have already obtained an
* LOOKUP_DESTROY uobject kref.
*/
down_read(&ufile->hw_destroy_rwsem);
+ /*
+ * Once the uobject is destroyed by RDMA_REMOVE_DESTROY then it is left
+ * write locked as the callers put it back with UVERBS_LOOKUP_DESTROY.
+ * This is because any other concurrent thread can still see the object
+ * in the xarray due to RCU. Leaving it locked ensures nothing else will
+ * touch it.
+ */
ret = uverbs_try_lock_object(uobj, UVERBS_LOOKUP_WRITE);
if (ret)
goto out_unlock;
/*
* uobj_get_destroy destroys the HW object and returns a handle to the uobj
* with a NULL object pointer. The caller must pair this with
- * uverbs_put_destroy.
+ * uobj_put_destroy().
*/
struct ib_uobject *__uobj_get_destroy(const struct uverbs_api_object *obj,
u32 id, struct uverbs_attr_bundle *attrs)
uobj = __uobj_get_destroy(obj, id, attrs);
if (IS_ERR(uobj))
return PTR_ERR(uobj);
-
- rdma_lookup_put_uobject(uobj, UVERBS_LOOKUP_WRITE);
+ uobj_put_destroy(uobj);
return 0;
}
struct ib_uobject *uobj;
struct file *filp;
- if (WARN_ON(fd_type->fops->release != &uverbs_uobject_fd_release))
+ if (WARN_ON(fd_type->fops->release != &uverbs_uobject_fd_release &&
+ fd_type->fops->release != &uverbs_async_event_release))
return ERR_PTR(-EINVAL);
new_fd = get_unused_fd_flags(O_CLOEXEC);
void ib_uverbs_init_async_event_file(struct ib_uverbs_async_event_file *ev_file);
void ib_uverbs_free_event_queue(struct ib_uverbs_event_queue *event_queue);
void ib_uverbs_flow_resources_free(struct ib_uflow_resources *uflow_res);
+int uverbs_async_event_release(struct inode *inode, struct file *filp);
int ib_alloc_ucontext(struct uverbs_attr_bundle *attrs);
int ib_init_ucontext(struct uverbs_attr_bundle *attrs);
struct ib_ucq_object *uobj);
void ib_uverbs_release_uevent(struct ib_uevent_object *uobj);
void ib_uverbs_release_file(struct kref *ref);
+void ib_uverbs_async_handler(struct ib_uverbs_async_event_file *async_file,
+ __u64 element, __u64 event,
+ struct list_head *obj_list, u32 *counter);
void ib_uverbs_comp_handler(struct ib_cq *cq, void *cq_context);
void ib_uverbs_cq_event_handler(struct ib_event *event, void *context_ptr);
.owner = THIS_MODULE,
.read = ib_uverbs_async_event_read,
.poll = ib_uverbs_async_event_poll,
- .release = uverbs_uobject_fd_release,
+ .release = uverbs_async_event_release,
.fasync = ib_uverbs_async_event_fasync,
.llseek = no_llseek,
};
kill_fasync(&ev_queue->async_queue, SIGIO, POLL_IN);
}
-static void
-ib_uverbs_async_handler(struct ib_uverbs_async_event_file *async_file,
- __u64 element, __u64 event, struct list_head *obj_list,
- u32 *counter)
+void ib_uverbs_async_handler(struct ib_uverbs_async_event_file *async_file,
+ __u64 element, __u64 event,
+ struct list_head *obj_list, u32 *counter)
{
struct ib_uverbs_event *entry;
unsigned long flags;
*/
mutex_unlock(&uverbs_dev->lists_mutex);
- ib_uverbs_async_handler(READ_ONCE(file->async_file), 0,
- IB_EVENT_DEVICE_FATAL, NULL, NULL);
-
uverbs_destroy_ufile_hw(file, RDMA_REMOVE_DRIVER_REMOVE);
kref_put(&file->ref, ib_uverbs_release_file);
container_of(uobj, struct ib_uverbs_async_event_file, uobj);
ib_unregister_event_handler(&event_file->event_handler);
- ib_uverbs_free_event_queue(&event_file->ev_queue);
+
+ if (why == RDMA_REMOVE_DRIVER_REMOVE)
+ ib_uverbs_async_handler(event_file, 0, IB_EVENT_DEVICE_FATAL,
+ NULL, NULL);
return 0;
}
+int uverbs_async_event_release(struct inode *inode, struct file *filp)
+{
+ struct ib_uverbs_async_event_file *event_file;
+ struct ib_uobject *uobj = filp->private_data;
+ int ret;
+
+ if (!uobj)
+ return uverbs_uobject_fd_release(inode, filp);
+
+ event_file =
+ container_of(uobj, struct ib_uverbs_async_event_file, uobj);
+
+ /*
+ * The async event FD has to deliver IB_EVENT_DEVICE_FATAL even after
+ * disassociation, so cleaning the event list must only happen after
+ * release. The user knows it has reached the end of the event stream
+ * when it sees IB_EVENT_DEVICE_FATAL.
+ */
+ uverbs_uobject_get(uobj);
+ ret = uverbs_uobject_fd_release(inode, filp);
+ ib_uverbs_free_event_queue(&event_file->ev_queue);
+ uverbs_uobject_put(uobj);
+ return ret;
+}
+
DECLARE_UVERBS_NAMED_METHOD(
UVERBS_METHOD_ASYNC_EVENT_ALLOC,
UVERBS_ATTR_FD(UVERBS_ATTR_ASYNC_EVENT_ALLOC_FD_HANDLE,
srqidx = ABORT_RSS_SRQIDX_G(
be32_to_cpu(req->srqidx_status));
if (srqidx) {
- complete_cached_srq_buffers(ep,
- req->srqidx_status);
+ complete_cached_srq_buffers(ep, srqidx);
} else {
/* Hold ep ref until finish_peer_abort() */
c4iw_get_ep(&ep->com);
return 0;
}
- ep->srqe_idx = t4_tcb_get_field32(tcb, TCB_RQ_START_W, TCB_RQ_START_W,
- TCB_RQ_START_S);
+ ep->srqe_idx = t4_tcb_get_field32(tcb, TCB_RQ_START_W, TCB_RQ_START_M,
+ TCB_RQ_START_S);
cleanup:
pr_debug("ep %p tid %u %016x\n", ep, ep->hwtid, ep->srqe_idx);
set_comp_state(pq, cq, info.comp_idx, QUEUED, 0);
pq->state = SDMA_PKT_Q_ACTIVE;
- /* Send the first N packets in the request to buy us some time */
- ret = user_sdma_send_pkts(req, pcount);
- if (unlikely(ret < 0 && ret != -EBUSY))
- goto free_req;
/*
* This is a somewhat blocking send implementation.
struct rtable *rt;
struct neighbour *neigh;
int rc = arpindex;
- struct net_device *netdev = iwdev->netdev;
__be32 dst_ipaddr = htonl(dst_ip);
__be32 src_ipaddr = htonl(src_ip);
return rc;
}
- if (netif_is_bond_slave(netdev))
- netdev = netdev_master_upper_dev_get(netdev);
-
neigh = dst_neigh_lookup(&rt->dst, &dst_ipaddr);
rcu_read_lock();
{
struct neighbour *neigh;
int rc = arpindex;
- struct net_device *netdev = iwdev->netdev;
struct dst_entry *dst;
struct sockaddr_in6 dst_addr;
struct sockaddr_in6 src_addr;
return rc;
}
- if (netif_is_bond_slave(netdev))
- netdev = netdev_master_upper_dev_get(netdev);
-
neigh = dst_neigh_lookup(dst, dst_addr.sin6_addr.in6_u.u6_addr32);
rcu_read_lock();
int arp_index;
arp_index = i40iw_arp_table(iwdev, ip_addr, ipv4, mac_addr, action);
- if (arp_index == -1)
+ if (arp_index < 0)
return;
cqp_request = i40iw_get_cqp_request(&iwdev->cqp, false);
if (!cqp_request)
int send_size;
int header_size;
int spc;
+ int err;
int i;
if (wr->wr.opcode != IB_WR_SEND)
sqp->ud_header.lrh.virtual_lane = 0;
sqp->ud_header.bth.solicited_event = !!(wr->wr.send_flags & IB_SEND_SOLICITED);
- ib_get_cached_pkey(ib_dev, sqp->qp.port, 0, &pkey);
+ err = ib_get_cached_pkey(ib_dev, sqp->qp.port, 0, &pkey);
+ if (err)
+ return err;
sqp->ud_header.bth.pkey = cpu_to_be16(pkey);
if (sqp->qp.mlx4_ib_qp_type == MLX4_IB_QPT_TUN_SMI_OWNER)
sqp->ud_header.bth.destination_qpn = cpu_to_be32(wr->remote_qpn);
}
sqp->ud_header.bth.solicited_event = !!(wr->wr.send_flags & IB_SEND_SOLICITED);
if (!sqp->qp.ibqp.qp_num)
- ib_get_cached_pkey(ib_dev, sqp->qp.port, sqp->pkey_index, &pkey);
+ err = ib_get_cached_pkey(ib_dev, sqp->qp.port, sqp->pkey_index,
+ &pkey);
else
- ib_get_cached_pkey(ib_dev, sqp->qp.port, wr->pkey_index, &pkey);
+ err = ib_get_cached_pkey(ib_dev, sqp->qp.port, wr->pkey_index,
+ &pkey);
+ if (err)
+ return err;
+
sqp->ud_header.bth.pkey = cpu_to_be16(pkey);
sqp->ud_header.bth.destination_qpn = cpu_to_be32(wr->remote_qpn);
sqp->ud_header.bth.psn = cpu_to_be32((sqp->send_psn++) & ((1 << 24) - 1));
if (is_odp_mr(mr)) {
to_ib_umem_odp(mr->umem)->private = mr;
+ init_waitqueue_head(&mr->q_deferred_work);
atomic_set(&mr->num_deferred_work, 0);
err = xa_err(xa_store(&dev->odp_mkeys,
mlx5_base_mkey(mr->mmkey.key), &mr->mmkey,
qib_dev_err(dd,
"Skipping linkcontrol sysfs info, (err %d) port %u\n",
ret, port_num);
- goto bail;
+ goto bail_link;
}
kobject_uevent(&ppd->pport_kobj, KOBJ_ADD);
qib_dev_err(dd,
"Skipping sl2vl sysfs info, (err %d) port %u\n",
ret, port_num);
- goto bail_link;
+ goto bail_sl;
}
kobject_uevent(&ppd->sl2vl_kobj, KOBJ_ADD);
qib_dev_err(dd,
"Skipping diag_counters sysfs info, (err %d) port %u\n",
ret, port_num);
- goto bail_sl;
+ goto bail_diagc;
}
kobject_uevent(&ppd->diagc_kobj, KOBJ_ADD);
qib_dev_err(dd,
"Skipping Congestion Control sysfs info, (err %d) port %u\n",
ret, port_num);
- goto bail_diagc;
+ goto bail_cc;
}
kobject_uevent(&ppd->pport_cc_kobj, KOBJ_ADD);
&cc_table_bin_attr);
kobject_put(&ppd->pport_cc_kobj);
}
+ kobject_put(&ppd->diagc_kobj);
kobject_put(&ppd->sl2vl_kobj);
kobject_put(&ppd->pport_kobj);
}
!(pci_resource_flags(pdev, 1) & IORESOURCE_MEM)) {
dev_err(&pdev->dev, "PCI BAR region not MMIO\n");
ret = -ENOMEM;
- goto err_free_device;
+ goto err_disable_pdev;
}
ret = pci_request_regions(pdev, DRV_NAME);
ip = kmalloc(sizeof(*ip), GFP_KERNEL);
if (!ip)
- return NULL;
+ return ERR_PTR(-ENOMEM);
size = PAGE_ALIGN(size);
if (outbuf) {
ip = rxe_create_mmap_info(rxe, buf_size, udata, buf);
- if (!ip)
+ if (IS_ERR(ip)) {
+ err = PTR_ERR(ip);
goto err1;
+ }
- err = copy_to_user(outbuf, &ip->info, sizeof(ip->info));
- if (err)
+ if (copy_to_user(outbuf, &ip->info, sizeof(ip->info))) {
+ err = -EFAULT;
goto err2;
+ }
spin_lock_bh(&rxe->pending_lock);
list_add(&ip->pending_mmaps, &rxe->pending_mmaps);
err2:
kfree(ip);
err1:
- return -EINVAL;
+ return err;
}
inline void rxe_queue_reset(struct rxe_queue *q)
struct ipoib_rx_buf *rx_ring;
struct ipoib_tx_buf *tx_ring;
+ /* cyclic ring variables for managing tx_ring, for UD only */
unsigned int tx_head;
unsigned int tx_tail;
+ /* cyclic ring variables for counting overall outstanding send WRs */
+ unsigned int global_tx_head;
+ unsigned int global_tx_tail;
struct ib_sge tx_sge[MAX_SKB_FRAGS + 1];
struct ib_ud_wr tx_wr;
struct ib_wc send_wc[MAX_SEND_CQE];
return;
}
- if ((priv->tx_head - priv->tx_tail) == ipoib_sendq_size - 1) {
+ if ((priv->global_tx_head - priv->global_tx_tail) ==
+ ipoib_sendq_size - 1) {
ipoib_dbg(priv, "TX ring 0x%x full, stopping kernel net queue\n",
tx->qp->qp_num);
netif_stop_queue(dev);
} else {
netif_trans_update(dev);
++tx->tx_head;
- ++priv->tx_head;
+ ++priv->global_tx_head;
}
}
netif_tx_lock(dev);
++tx->tx_tail;
- ++priv->tx_tail;
+ ++priv->global_tx_tail;
if (unlikely(netif_queue_stopped(dev) &&
- (priv->tx_head - priv->tx_tail) <= ipoib_sendq_size >> 1 &&
+ ((priv->global_tx_head - priv->global_tx_tail) <=
+ ipoib_sendq_size >> 1) &&
test_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags)))
netif_wake_queue(dev);
dev_kfree_skb_any(tx_req->skb);
netif_tx_lock_bh(p->dev);
++p->tx_tail;
- ++priv->tx_tail;
- if (unlikely(priv->tx_head - priv->tx_tail == ipoib_sendq_size >> 1) &&
+ ++priv->global_tx_tail;
+ if (unlikely((priv->global_tx_head - priv->global_tx_tail) <=
+ ipoib_sendq_size >> 1) &&
netif_queue_stopped(p->dev) &&
test_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags))
netif_wake_queue(p->dev);
dev_kfree_skb_any(tx_req->skb);
++priv->tx_tail;
+ ++priv->global_tx_tail;
if (unlikely(netif_queue_stopped(dev) &&
- ((priv->tx_head - priv->tx_tail) <= ipoib_sendq_size >> 1) &&
+ ((priv->global_tx_head - priv->global_tx_tail) <=
+ ipoib_sendq_size >> 1) &&
test_bit(IPOIB_FLAG_ADMIN_UP, &priv->flags)))
netif_wake_queue(dev);
else
priv->tx_wr.wr.send_flags &= ~IB_SEND_IP_CSUM;
/* increase the tx_head after send success, but use it for queue state */
- if (priv->tx_head - priv->tx_tail == ipoib_sendq_size - 1) {
+ if ((priv->global_tx_head - priv->global_tx_tail) ==
+ ipoib_sendq_size - 1) {
ipoib_dbg(priv, "TX ring full, stopping kernel net queue\n");
netif_stop_queue(dev);
}
rc = priv->tx_head;
++priv->tx_head;
+ ++priv->global_tx_head;
}
return rc;
}
ipoib_dma_unmap_tx(priv, tx_req);
dev_kfree_skb_any(tx_req->skb);
++priv->tx_tail;
+ ++priv->global_tx_tail;
}
for (i = 0; i < ipoib_recvq_size; ++i) {
ipoib_warn(priv, "transmit timeout: latency %d msecs\n",
jiffies_to_msecs(jiffies - dev_trans_start(dev)));
- ipoib_warn(priv, "queue stopped %d, tx_head %u, tx_tail %u\n",
- netif_queue_stopped(dev),
- priv->tx_head, priv->tx_tail);
+ ipoib_warn(priv,
+ "queue stopped %d, tx_head %u, tx_tail %u, global_tx_head %u, global_tx_tail %u\n",
+ netif_queue_stopped(dev), priv->tx_head, priv->tx_tail,
+ priv->global_tx_head, priv->global_tx_tail);
+
/* XXX reset QP, etc. */
}
goto out_rx_ring_cleanup;
}
- /* priv->tx_head, tx_tail & tx_outstanding are already 0 */
+ /* priv->tx_head, tx_tail and global_tx_tail/head are already 0 */
if (ipoib_transport_dev_init(dev, priv->ca)) {
pr_warn("%s: ipoib_transport_dev_init failed\n",
return fasync_helper(fd, file, on, &client->fasync);
}
-static int evdev_flush(struct file *file, fl_owner_t id)
-{
- struct evdev_client *client = file->private_data;
- struct evdev *evdev = client->evdev;
-
- mutex_lock(&evdev->mutex);
-
- if (evdev->exist && !client->revoked)
- input_flush_device(&evdev->handle, file);
-
- mutex_unlock(&evdev->mutex);
- return 0;
-}
-
static void evdev_free(struct device *dev)
{
struct evdev *evdev = container_of(dev, struct evdev, dev);
unsigned int i;
mutex_lock(&evdev->mutex);
+
+ if (evdev->exist && !client->revoked)
+ input_flush_device(&evdev->handle, file);
+
evdev_ungrab(evdev, client);
mutex_unlock(&evdev->mutex);
.compat_ioctl = evdev_ioctl_compat,
#endif
.fasync = evdev_fasync,
- .flush = evdev_flush,
.llseek = no_llseek,
};
0x05, 0x20, 0x00, 0x01, 0x00
};
+/*
+ * This packet is required for Xbox One S (0x045e:0x02ea)
+ * and Xbox One Elite Series 2 (0x045e:0x0b00) pads to
+ * initialize the controller that was previously used in
+ * Bluetooth mode.
+ */
+static const u8 xboxone_s_init[] = {
+ 0x05, 0x20, 0x00, 0x0f, 0x06
+};
+
/*
* This packet is required for the Titanfall 2 Xbox One pads
* (0x0e6f:0x0165) to finish initialization and for Hori pads
XBOXONE_INIT_PKT(0x0e6f, 0x0165, xboxone_hori_init),
XBOXONE_INIT_PKT(0x0f0d, 0x0067, xboxone_hori_init),
XBOXONE_INIT_PKT(0x0000, 0x0000, xboxone_fw2015_init),
+ XBOXONE_INIT_PKT(0x045e, 0x02ea, xboxone_s_init),
+ XBOXONE_INIT_PKT(0x045e, 0x0b00, xboxone_s_init),
XBOXONE_INIT_PKT(0x0e6f, 0x0000, xboxone_pdp_init1),
XBOXONE_INIT_PKT(0x0e6f, 0x0000, xboxone_pdp_init2),
XBOXONE_INIT_PKT(0x24c6, 0x541a, xboxone_rumblebegin_init),
u8 number_of_fingers;
u8 clicked2;
u8 unknown3[16];
- struct tp_finger fingers[0];
+ struct tp_finger fingers[];
};
/**
params->info_type = info_type;
params->event_type = event_type;
- ret = cros_ec_cmd_xfer(ec_dev, msg);
- if (ret < 0) {
- dev_warn(ec_dev->dev, "Transfer error %d/%d: %d\n",
- (int)info_type, (int)event_type, ret);
- } else if (msg->result == EC_RES_INVALID_VERSION) {
+ ret = cros_ec_cmd_xfer_status(ec_dev, msg);
+ if (ret == -ENOTSUPP) {
/* With older ECs we just return 0 for everything */
memset(result, 0, result_size);
ret = 0;
- } else if (msg->result != EC_RES_SUCCESS) {
- dev_warn(ec_dev->dev, "Error getting info %d/%d: %d\n",
- (int)info_type, (int)event_type, msg->result);
- ret = -EPROTO;
+ } else if (ret < 0) {
+ dev_warn(ec_dev->dev, "Transfer error %d/%d: %d\n",
+ (int)info_type, (int)event_type, ret);
} else if (ret != result_size) {
dev_warn(ec_dev->dev, "Wrong size %d/%d: %d != %zu\n",
(int)info_type, (int)event_type,
static struct i2c_driver dir685_tk_i2c_driver = {
.driver = {
- .name = "dlin-dir685-touchkeys",
+ .name = "dlink-dir685-touchkeys",
.of_match_table = of_match_ptr(dir685_tk_of_match),
},
.probe = dir685_tk_probe,
static irqreturn_t axp20x_pek_irq(int irq, void *pwr)
{
- struct input_dev *idev = pwr;
- struct axp20x_pek *axp20x_pek = input_get_drvdata(idev);
+ struct axp20x_pek *axp20x_pek = pwr;
+ struct input_dev *idev = axp20x_pek->input;
+
+ if (!idev)
+ return IRQ_HANDLED;
/*
* The power-button is connected to ground so a falling edge (dbf)
static int axp20x_pek_probe_input_device(struct axp20x_pek *axp20x_pek,
struct platform_device *pdev)
{
- struct axp20x_dev *axp20x = axp20x_pek->axp20x;
struct input_dev *idev;
int error;
- axp20x_pek->irq_dbr = platform_get_irq_byname(pdev, "PEK_DBR");
- if (axp20x_pek->irq_dbr < 0)
- return axp20x_pek->irq_dbr;
- axp20x_pek->irq_dbr = regmap_irq_get_virq(axp20x->regmap_irqc,
- axp20x_pek->irq_dbr);
-
- axp20x_pek->irq_dbf = platform_get_irq_byname(pdev, "PEK_DBF");
- if (axp20x_pek->irq_dbf < 0)
- return axp20x_pek->irq_dbf;
- axp20x_pek->irq_dbf = regmap_irq_get_virq(axp20x->regmap_irqc,
- axp20x_pek->irq_dbf);
-
axp20x_pek->input = devm_input_allocate_device(&pdev->dev);
if (!axp20x_pek->input)
return -ENOMEM;
input_set_drvdata(idev, axp20x_pek);
- error = devm_request_any_context_irq(&pdev->dev, axp20x_pek->irq_dbr,
- axp20x_pek_irq, 0,
- "axp20x-pek-dbr", idev);
- if (error < 0) {
- dev_err(&pdev->dev, "Failed to request dbr IRQ#%d: %d\n",
- axp20x_pek->irq_dbr, error);
- return error;
- }
-
- error = devm_request_any_context_irq(&pdev->dev, axp20x_pek->irq_dbf,
- axp20x_pek_irq, 0,
- "axp20x-pek-dbf", idev);
- if (error < 0) {
- dev_err(&pdev->dev, "Failed to request dbf IRQ#%d: %d\n",
- axp20x_pek->irq_dbf, error);
- return error;
- }
-
error = input_register_device(idev);
if (error) {
dev_err(&pdev->dev, "Can't register input device: %d\n",
return error;
}
- device_init_wakeup(&pdev->dev, true);
-
return 0;
}
axp20x_pek->axp20x = dev_get_drvdata(pdev->dev.parent);
+ axp20x_pek->irq_dbr = platform_get_irq_byname(pdev, "PEK_DBR");
+ if (axp20x_pek->irq_dbr < 0)
+ return axp20x_pek->irq_dbr;
+ axp20x_pek->irq_dbr = regmap_irq_get_virq(
+ axp20x_pek->axp20x->regmap_irqc, axp20x_pek->irq_dbr);
+
+ axp20x_pek->irq_dbf = platform_get_irq_byname(pdev, "PEK_DBF");
+ if (axp20x_pek->irq_dbf < 0)
+ return axp20x_pek->irq_dbf;
+ axp20x_pek->irq_dbf = regmap_irq_get_virq(
+ axp20x_pek->axp20x->regmap_irqc, axp20x_pek->irq_dbf);
+
if (axp20x_pek_should_register_input(axp20x_pek, pdev)) {
error = axp20x_pek_probe_input_device(axp20x_pek, pdev);
if (error)
axp20x_pek->info = (struct axp20x_info *)match->driver_data;
+ error = devm_request_any_context_irq(&pdev->dev, axp20x_pek->irq_dbr,
+ axp20x_pek_irq, 0,
+ "axp20x-pek-dbr", axp20x_pek);
+ if (error < 0) {
+ dev_err(&pdev->dev, "Failed to request dbr IRQ#%d: %d\n",
+ axp20x_pek->irq_dbr, error);
+ return error;
+ }
+
+ error = devm_request_any_context_irq(&pdev->dev, axp20x_pek->irq_dbf,
+ axp20x_pek_irq, 0,
+ "axp20x-pek-dbf", axp20x_pek);
+ if (error < 0) {
+ dev_err(&pdev->dev, "Failed to request dbf IRQ#%d: %d\n",
+ axp20x_pek->irq_dbf, error);
+ return error;
+ }
+
+ device_init_wakeup(&pdev->dev, true);
+
platform_set_drvdata(pdev, axp20x_pek);
return 0;
"LEN005b", /* P50 */
"LEN005e", /* T560 */
"LEN006c", /* T470s */
+ "LEN007a", /* T470s */
"LEN0071", /* T480 */
"LEN0072", /* X1 Carbon Gen 5 (2017) - Elan/ALPS trackpoint */
"LEN0073", /* X1 Carbon G5 (Elantech) */
if (count) {
kfree(attn_data.data);
- attn_data.data = NULL;
+ drvdata->attn_data.data = NULL;
}
if (!kfifo_is_empty(&drvdata->attn_fifo))
if (data->input) {
rmi_driver_set_input_name(rmi_dev, data->input);
if (!rmi_dev->xport->input) {
- if (input_register_device(data->input)) {
+ retval = input_register_device(data->input);
+ if (retval) {
dev_err(dev, "%s: Failed to register input device.\n",
__func__);
goto err_destroy_functions;
DMI_MATCH(DMI_PRODUCT_NAME, "P65xRP"),
},
},
+ {
+ /* Lenovo ThinkPad Twist S230u */
+ .matches = {
+ DMI_MATCH(DMI_SYS_VENDOR, "LENOVO"),
+ DMI_MATCH(DMI_PRODUCT_NAME, "33474HU"),
+ },
+ },
{ }
};
*/
+#include <linux/bits.h>
#include <linux/module.h>
#include <linux/input.h>
#include <linux/interrupt.h>
#define FW_POS_STATE 1
#define FW_POS_TOTAL 2
#define FW_POS_XY 3
+#define FW_POS_TOOL_TYPE 33
#define FW_POS_CHECKSUM 34
#define FW_POS_WIDTH 35
#define FW_POS_PRESSURE 45
{
struct input_dev *input = ts->input;
unsigned int n_fingers;
+ unsigned int tool_type;
u16 finger_state;
int i;
dev_dbg(&ts->client->dev,
"n_fingers: %u, state: %04x\n", n_fingers, finger_state);
+ /* Note: all fingers have the same tool type */
+ tool_type = buf[FW_POS_TOOL_TYPE] & BIT(0) ?
+ MT_TOOL_FINGER : MT_TOOL_PALM;
+
for (i = 0; i < MAX_CONTACT_NUM && n_fingers; i++) {
if (finger_state & 1) {
unsigned int x, y, p, w;
i, x, y, p, w);
input_mt_slot(input, i);
- input_mt_report_slot_state(input, MT_TOOL_FINGER, true);
+ input_mt_report_slot_state(input, tool_type, true);
input_event(input, EV_ABS, ABS_MT_POSITION_X, x);
input_event(input, EV_ABS, ABS_MT_POSITION_Y, y);
input_event(input, EV_ABS, ABS_MT_PRESSURE, p);
input_set_abs_params(ts->input, ABS_MT_POSITION_Y, 0, ts->y_max, 0, 0);
input_set_abs_params(ts->input, ABS_MT_TOUCH_MAJOR, 0, 255, 0, 0);
input_set_abs_params(ts->input, ABS_MT_PRESSURE, 0, 255, 0, 0);
+ input_set_abs_params(ts->input, ABS_MT_TOOL_TYPE,
+ 0, MT_TOOL_PALM, 0, 0);
input_abs_set_res(ts->input, ABS_MT_POSITION_X, ts->x_res);
input_abs_set_res(ts->input, ABS_MT_POSITION_Y, ts->y_res);
input_abs_set_res(ts->input, ABS_MT_TOUCH_MAJOR, 1);
if (reg <= MMS114_MODE_CONTROL && reg + len > MMS114_MODE_CONTROL)
BUG();
- /* Write register: use repeated start */
+ /* Write register */
xfer[0].addr = client->addr;
- xfer[0].flags = I2C_M_TEN | I2C_M_NOSTART;
+ xfer[0].flags = client->flags & I2C_M_TEN;
xfer[0].len = 1;
xfer[0].buf = &buf;
/* Read data */
xfer[1].addr = client->addr;
- xfer[1].flags = I2C_M_RD;
+ xfer[1].flags = (client->flags & I2C_M_TEN) | I2C_M_RD;
xfer[1].len = len;
xfer[1].buf = val;
const void *match_data;
int error;
- if (!i2c_check_functionality(client->adapter,
- I2C_FUNC_PROTOCOL_MANGLING)) {
- dev_err(&client->dev,
- "Need i2c bus that supports protocol mangling\n");
+ if (!i2c_check_functionality(client->adapter, I2C_FUNC_I2C)) {
+ dev_err(&client->dev, "Not supported I2C adapter\n");
return -ENODEV;
}
#endif
#ifdef CONFIG_TOUCHSCREEN_USB_IRTOUCH
+ {USB_DEVICE(0x255e, 0x0001), .driver_info = DEVTYPE_IRTOUCH},
{USB_DEVICE(0x595a, 0x0001), .driver_info = DEVTYPE_IRTOUCH},
{USB_DEVICE(0x6615, 0x0001), .driver_info = DEVTYPE_IRTOUCH},
{USB_DEVICE(0x6615, 0x0012), .driver_info = DEVTYPE_IRTOUCH_HIRES},
return -ENODEV;
list_for_each_entry(p, &acpihid_map, list) {
- if (acpi_dev_hid_uid_match(adev, p->hid, p->uid)) {
+ if (acpi_dev_hid_uid_match(adev, p->hid,
+ p->uid[0] ? p->uid : NULL)) {
if (entry)
*entry = p;
return p->devid;
}
case IVHD_DEV_ACPI_HID: {
u16 devid;
- u8 hid[ACPIHID_HID_LEN] = {0};
- u8 uid[ACPIHID_UID_LEN] = {0};
+ u8 hid[ACPIHID_HID_LEN];
+ u8 uid[ACPIHID_UID_LEN];
int ret;
if (h->type != 0x40) {
break;
}
+ uid[0] = '\0';
switch (e->uidf) {
case UID_NOT_PRESENT:
break;
case UID_IS_CHARACTER:
- memcpy(uid, (u8 *)(&e->uid), ACPIHID_UID_LEN - 1);
- uid[ACPIHID_UID_LEN - 1] = '\0';
+ memcpy(uid, &e->uid, e->uidl);
+ uid[e->uidl] = '\0';
break;
default:
NULL, "%d", group->id);
if (ret) {
ida_simple_remove(&iommu_group_ida, group->id);
- kfree(group);
+ kobject_put(&group->kobj);
return ERR_PTR(ret);
}
return ret;
}
+static bool iommu_is_attach_deferred(struct iommu_domain *domain,
+ struct device *dev)
+{
+ if (domain->ops->is_attach_deferred)
+ return domain->ops->is_attach_deferred(domain, dev);
+
+ return false;
+}
+
/**
* iommu_group_add_device - add a device to an iommu group
* @group: the group into which to add the device (reference should be held)
mutex_lock(&group->mutex);
list_add_tail(&device->list, &group->devices);
- if (group->domain)
+ if (group->domain && !iommu_is_attach_deferred(group->domain, dev))
ret = __iommu_attach_device(group->domain, dev);
mutex_unlock(&group->mutex);
if (ret)
struct device *dev)
{
int ret;
- if ((domain->ops->is_attach_deferred != NULL) &&
- domain->ops->is_attach_deferred(domain, dev))
- return 0;
if (unlikely(domain->ops->attach_dev == NULL))
return -ENODEV;
static void __iommu_detach_device(struct iommu_domain *domain,
struct device *dev)
{
- if ((domain->ops->is_attach_deferred != NULL) &&
- domain->ops->is_attach_deferred(domain, dev))
+ if (iommu_is_attach_deferred(domain, dev))
return;
if (unlikely(domain->ops->detach_dev == NULL))
"(bn 0x%X, sn 0x%X) failed to map driver user space!",
tpci200->info->pdev->bus->number,
tpci200->info->pdev->devfn);
+ res = -ENOMEM;
goto out_release_mem8_space;
}
u8 *data = pulse8->data + 1;
u8 cmd[2];
int err;
- struct tm tm;
time64_t date;
pulse8->vers = 0;
if (err)
return err;
date = (data[0] << 24) | (data[1] << 16) | (data[2] << 8) | data[3];
- time64_to_tm(date, 0, &tm);
- dev_info(pulse8->dev, "Firmware build date %04ld.%02d.%02d %02d:%02d:%02d\n",
- tm.tm_year + 1900, tm.tm_mon + 1, tm.tm_mday,
- tm.tm_hour, tm.tm_min, tm.tm_sec);
+ dev_info(pulse8->dev, "Firmware build date %ptT\n", &date);
dev_dbg(pulse8->dev, "Persistent config:\n");
cmd[0] = MSGCODE_GET_AUTO_ENABLED;
Select this option to enable support for Samsung Exynos Low Power
Audio Subsystem.
+config MFD_GATEWORKS_GSC
+ tristate "Gateworks System Controller"
+ depends on (I2C && OF)
+ select MFD_CORE
+ select REGMAP_I2C
+ select REGMAP_IRQ
+ help
+ Enable support for the Gateworks System Controller (GSC) found
+ on Gateworks Single Board Computers supporting system functions
+ such as push-button monitor, multiple ADC's for voltage and
+ temperature monitoring, fan controller and watchdog monitor.
+ This driver provides common support for accessing the device.
+ Additional drivers must be enabled in order to use the
+ functionality of the device.
+
config MFD_MC13XXX
tristate
depends on (SPI_MASTER || I2C)
obj-$(CONFIG_MFD_BD9571MWV) += bd9571mwv.o
obj-$(CONFIG_MFD_CROS_EC_DEV) += cros_ec_dev.o
obj-$(CONFIG_MFD_EXYNOS_LPASS) += exynos-lpass.o
+obj-$(CONFIG_MFD_GATEWORKS_GSC) += gateworks-gsc.o
obj-$(CONFIG_HTC_PASIC3) += htc-pasic3.o
obj-$(CONFIG_HTC_I2CPLD) += htc-i2cpld.o
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * The Gateworks System Controller (GSC) is a multi-function
+ * device designed for use in Gateworks Single Board Computers.
+ * The control interface is I2C, with an interrupt. The device supports
+ * system functions such as push-button monitoring, multiple ADC's for
+ * voltage and temperature monitoring, fan controller and watchdog monitor.
+ *
+ * Copyright (C) 2020 Gateworks Corporation
+ */
+
+#include <linux/device.h>
+#include <linux/i2c.h>
+#include <linux/interrupt.h>
+#include <linux/mfd/gsc.h>
+#include <linux/module.h>
+#include <linux/mutex.h>
+#include <linux/of.h>
+#include <linux/of_platform.h>
+#include <linux/platform_device.h>
+#include <linux/regmap.h>
+
+#include <asm/unaligned.h>
+
+/*
+ * The GSC suffers from an errata where occasionally during
+ * ADC cycles the chip can NAK I2C transactions. To ensure we have reliable
+ * register access we place retries around register access.
+ */
+#define I2C_RETRIES 3
+
+int gsc_write(void *context, unsigned int reg, unsigned int val)
+{
+ struct i2c_client *client = context;
+ int retry, ret;
+
+ for (retry = 0; retry < I2C_RETRIES; retry++) {
+ ret = i2c_smbus_write_byte_data(client, reg, val);
+ /*
+ * -EAGAIN returned when the i2c host controller is busy
+ * -EIO returned when i2c device is busy
+ */
+ if (ret != -EAGAIN && ret != -EIO)
+ break;
+ }
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(gsc_write);
+
+int gsc_read(void *context, unsigned int reg, unsigned int *val)
+{
+ struct i2c_client *client = context;
+ int retry, ret;
+
+ for (retry = 0; retry < I2C_RETRIES; retry++) {
+ ret = i2c_smbus_read_byte_data(client, reg);
+ /*
+ * -EAGAIN returned when the i2c host controller is busy
+ * -EIO returned when i2c device is busy
+ */
+ if (ret != -EAGAIN && ret != -EIO)
+ break;
+ }
+ *val = ret & 0xff;
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(gsc_read);
+
+/*
+ * gsc_powerdown - API to use GSC to power down board for a specific time
+ *
+ * secs - number of seconds to remain powered off
+ */
+static int gsc_powerdown(struct gsc_dev *gsc, unsigned long secs)
+{
+ int ret;
+ unsigned char regs[4];
+
+ dev_info(&gsc->i2c->dev, "GSC powerdown for %ld seconds\n",
+ secs);
+
+ put_unaligned_le32(secs, regs);
+ ret = regmap_bulk_write(gsc->regmap, GSC_TIME_ADD, regs, 4);
+ if (ret)
+ return ret;
+
+ ret = regmap_update_bits(gsc->regmap, GSC_CTRL_1,
+ BIT(GSC_CTRL_1_SLEEP_ADD),
+ BIT(GSC_CTRL_1_SLEEP_ADD));
+ if (ret)
+ return ret;
+
+ ret = regmap_update_bits(gsc->regmap, GSC_CTRL_1,
+ BIT(GSC_CTRL_1_SLEEP_ACTIVATE) |
+ BIT(GSC_CTRL_1_SLEEP_ENABLE),
+ BIT(GSC_CTRL_1_SLEEP_ACTIVATE) |
+ BIT(GSC_CTRL_1_SLEEP_ENABLE));
+
+
+ return ret;
+}
+
+static ssize_t gsc_show(struct device *dev, struct device_attribute *attr,
+ char *buf)
+{
+ struct gsc_dev *gsc = dev_get_drvdata(dev);
+ const char *name = attr->attr.name;
+ int rz = 0;
+
+ if (strcasecmp(name, "fw_version") == 0)
+ rz = sprintf(buf, "%d\n", gsc->fwver);
+ else if (strcasecmp(name, "fw_crc") == 0)
+ rz = sprintf(buf, "0x%04x\n", gsc->fwcrc);
+ else
+ dev_err(dev, "invalid command: '%s'\n", name);
+
+ return rz;
+}
+
+static ssize_t gsc_store(struct device *dev, struct device_attribute *attr,
+ const char *buf, size_t count)
+{
+ struct gsc_dev *gsc = dev_get_drvdata(dev);
+ const char *name = attr->attr.name;
+ long value;
+
+ if (strcasecmp(name, "powerdown") == 0) {
+ if (kstrtol(buf, 0, &value) == 0)
+ gsc_powerdown(gsc, value);
+ } else {
+ dev_err(dev, "invalid command: '%s\n", name);
+ }
+
+ return count;
+}
+
+static struct device_attribute attr_fwver =
+ __ATTR(fw_version, 0440, gsc_show, NULL);
+static struct device_attribute attr_fwcrc =
+ __ATTR(fw_crc, 0440, gsc_show, NULL);
+static struct device_attribute attr_pwrdown =
+ __ATTR(powerdown, 0220, NULL, gsc_store);
+
+static struct attribute *gsc_attrs[] = {
+ &attr_fwver.attr,
+ &attr_fwcrc.attr,
+ &attr_pwrdown.attr,
+ NULL,
+};
+
+static struct attribute_group attr_group = {
+ .attrs = gsc_attrs,
+};
+
+static const struct of_device_id gsc_of_match[] = {
+ { .compatible = "gw,gsc", },
+ { }
+};
+MODULE_DEVICE_TABLE(of, gsc_of_match);
+
+static struct regmap_bus gsc_regmap_bus = {
+ .reg_read = gsc_read,
+ .reg_write = gsc_write,
+};
+
+static const struct regmap_config gsc_regmap_config = {
+ .reg_bits = 8,
+ .val_bits = 8,
+ .cache_type = REGCACHE_NONE,
+ .max_register = GSC_WP,
+};
+
+static const struct regmap_irq gsc_irqs[] = {
+ REGMAP_IRQ_REG(GSC_IRQ_PB, 0, BIT(GSC_IRQ_PB)),
+ REGMAP_IRQ_REG(GSC_IRQ_KEY_ERASED, 0, BIT(GSC_IRQ_KEY_ERASED)),
+ REGMAP_IRQ_REG(GSC_IRQ_EEPROM_WP, 0, BIT(GSC_IRQ_EEPROM_WP)),
+ REGMAP_IRQ_REG(GSC_IRQ_RESV, 0, BIT(GSC_IRQ_RESV)),
+ REGMAP_IRQ_REG(GSC_IRQ_GPIO, 0, BIT(GSC_IRQ_GPIO)),
+ REGMAP_IRQ_REG(GSC_IRQ_TAMPER, 0, BIT(GSC_IRQ_TAMPER)),
+ REGMAP_IRQ_REG(GSC_IRQ_WDT_TIMEOUT, 0, BIT(GSC_IRQ_WDT_TIMEOUT)),
+ REGMAP_IRQ_REG(GSC_IRQ_SWITCH_HOLD, 0, BIT(GSC_IRQ_SWITCH_HOLD)),
+};
+
+static const struct regmap_irq_chip gsc_irq_chip = {
+ .name = "gateworks-gsc",
+ .irqs = gsc_irqs,
+ .num_irqs = ARRAY_SIZE(gsc_irqs),
+ .num_regs = 1,
+ .status_base = GSC_IRQ_STATUS,
+ .mask_base = GSC_IRQ_ENABLE,
+ .mask_invert = true,
+ .ack_base = GSC_IRQ_STATUS,
+ .ack_invert = true,
+};
+
+static int gsc_probe(struct i2c_client *client)
+{
+ struct device *dev = &client->dev;
+ struct gsc_dev *gsc;
+ struct regmap_irq_chip_data *irq_data;
+ int ret;
+ unsigned int reg;
+
+ gsc = devm_kzalloc(dev, sizeof(*gsc), GFP_KERNEL);
+ if (!gsc)
+ return -ENOMEM;
+
+ gsc->dev = &client->dev;
+ gsc->i2c = client;
+ i2c_set_clientdata(client, gsc);
+
+ gsc->regmap = devm_regmap_init(dev, &gsc_regmap_bus, client,
+ &gsc_regmap_config);
+ if (IS_ERR(gsc->regmap))
+ return PTR_ERR(gsc->regmap);
+
+ if (regmap_read(gsc->regmap, GSC_FW_VER, ®))
+ return -EIO;
+ gsc->fwver = reg;
+
+ regmap_read(gsc->regmap, GSC_FW_CRC, ®);
+ gsc->fwcrc = reg;
+ regmap_read(gsc->regmap, GSC_FW_CRC + 1, ®);
+ gsc->fwcrc |= reg << 8;
+
+ gsc->i2c_hwmon = devm_i2c_new_dummy_device(dev, client->adapter,
+ GSC_HWMON);
+ if (IS_ERR(gsc->i2c_hwmon)) {
+ dev_err(dev, "Failed to allocate I2C device for HWMON\n");
+ return PTR_ERR(gsc->i2c_hwmon);
+ }
+
+ ret = devm_regmap_add_irq_chip(dev, gsc->regmap, client->irq,
+ IRQF_ONESHOT | IRQF_SHARED |
+ IRQF_TRIGGER_FALLING, 0,
+ &gsc_irq_chip, &irq_data);
+ if (ret)
+ return ret;
+
+ dev_info(dev, "Gateworks System Controller v%d: fw 0x%04x\n",
+ gsc->fwver, gsc->fwcrc);
+
+ ret = sysfs_create_group(&dev->kobj, &attr_group);
+ if (ret)
+ dev_err(dev, "failed to create sysfs attrs\n");
+
+ ret = devm_of_platform_populate(dev);
+ if (ret) {
+ sysfs_remove_group(&dev->kobj, &attr_group);
+ return ret;
+ }
+
+ return 0;
+}
+
+static int gsc_remove(struct i2c_client *client)
+{
+ sysfs_remove_group(&client->dev.kobj, &attr_group);
+
+ return 0;
+}
+
+static struct i2c_driver gsc_driver = {
+ .driver = {
+ .name = "gateworks-gsc",
+ .of_match_table = gsc_of_match,
+ },
+ .probe_new = gsc_probe,
+ .remove = gsc_remove,
+};
+module_i2c_driver(gsc_driver);
+
+MODULE_AUTHOR("Tim Harvey <tharvey@gateworks.com>");
+MODULE_DESCRIPTION("I2C Core interface for GSC");
+MODULE_LICENSE("GPL v2");
rtsx_disable_aspm(pcr);
+ /* Fixes DMA transfer timout issue after disabling ASPM on RTS5260 */
+ msleep(1);
+
if (option->ltr_enabled)
rtsx_set_ltr_latency(pcr, option->ltr_active_latency);
down_write(&dev->me_clients_rwsem);
me_cl = __mei_me_cl_by_uuid(dev, uuid);
__mei_me_cl_del(dev, me_cl);
+ mei_me_cl_put(me_cl);
up_write(&dev->me_clients_rwsem);
}
down_write(&dev->me_clients_rwsem);
me_cl = __mei_me_cl_by_uuid_id(dev, uuid, id);
__mei_me_cl_del(dev, me_cl);
+ mei_me_cl_put(me_cl);
up_write(&dev->me_clients_rwsem);
}
struct mmc_request *mrq = &mqrq->brq.mrq;
struct request_queue *q = req->q;
struct mmc_host *host = mq->card->host;
+ enum mmc_issue_type issue_type = mmc_issue_type(mq, req);
unsigned long flags;
bool put_card;
int err;
spin_lock_irqsave(&mq->lock, flags);
- mq->in_flight[mmc_issue_type(mq, req)] -= 1;
+ mq->in_flight[issue_type] -= 1;
put_card = (mmc_tot_in_flight(mq) == 0);
struct mmc_rpmb_data *rpmb = container_of(inode->i_cdev,
struct mmc_rpmb_data, chrdev);
- put_device(&rpmb->dev);
mmc_blk_put(rpmb->md);
+ put_device(&rpmb->dev);
return 0;
}
case MMC_ISSUE_DCMD:
if (host->cqe_ops->cqe_timeout(host, mrq, &recovery_needed)) {
if (recovery_needed)
- __mmc_cqe_recovery_notifier(mq);
+ mmc_cqe_recovery_notifier(mrq);
return BLK_EH_RESET_TIMER;
}
- /* No timeout (XXX: huh? comment doesn't make much sense) */
- blk_mq_complete_request(req);
+ /* The request has gone already */
return BLK_EH_DONE;
default:
/* Timeout is handled by mmc core */
struct mmc_card *card = mq->card;
struct mmc_host *host = card->host;
unsigned long flags;
- int ret;
+ bool ignore_tout;
spin_lock_irqsave(&mq->lock, flags);
-
- if (mq->recovery_needed || !mq->use_cqe || host->hsq_enabled)
- ret = BLK_EH_RESET_TIMER;
- else
- ret = mmc_cqe_timed_out(req);
-
+ ignore_tout = mq->recovery_needed || !mq->use_cqe || host->hsq_enabled;
spin_unlock_irqrestore(&mq->lock, flags);
- return ret;
+ return ignore_tout ? BLK_EH_RESET_TIMER : mmc_cqe_timed_out(req);
}
static void mmc_mq_recovery_handler(struct work_struct *work)
if (ret) {
dev_err(&pdev->dev, "Failed to get irq for data line\n");
- return ret;
+ goto free_host;
}
mutex_init(&host->cmd_mutex);
dev_set_drvdata(&pdev->dev, host);
mmc_add_host(mmc);
return 0;
+
+free_host:
+ mmc_free_host(mmc);
+ return ret;
}
static int alcor_pci_sdmmc_drv_remove(struct platform_device *pdev)
}
static const struct sdhci_acpi_slot sdhci_acpi_slot_amd_emmc = {
- .chip = &sdhci_acpi_chip_amd,
- .caps = MMC_CAP_8_BIT_DATA | MMC_CAP_NONREMOVABLE,
- .quirks = SDHCI_QUIRK_32BIT_DMA_ADDR | SDHCI_QUIRK_32BIT_DMA_SIZE |
- SDHCI_QUIRK_32BIT_ADMA_SIZE,
+ .chip = &sdhci_acpi_chip_amd,
+ .caps = MMC_CAP_8_BIT_DATA | MMC_CAP_NONREMOVABLE,
+ .quirks = SDHCI_QUIRK_32BIT_DMA_ADDR |
+ SDHCI_QUIRK_32BIT_DMA_SIZE |
+ SDHCI_QUIRK_32BIT_ADMA_SIZE,
+ .quirks2 = SDHCI_QUIRK2_BROKEN_64_BIT_DMA,
.probe_slot = sdhci_acpi_emmc_amd_probe_slot,
};
#define SDHCI_GLI_9750_DRIVING_2 GENMASK(27, 26)
#define GLI_9750_DRIVING_1_VALUE 0xFFF
#define GLI_9750_DRIVING_2_VALUE 0x3
+#define SDHCI_GLI_9750_SEL_1 BIT(29)
+#define SDHCI_GLI_9750_SEL_2 BIT(31)
+#define SDHCI_GLI_9750_ALL_RST (BIT(24)|BIT(25)|BIT(28)|BIT(30))
#define SDHCI_GLI_9750_PLL 0x864
#define SDHCI_GLI_9750_PLL_TX2_INV BIT(23)
GLI_9750_DRIVING_1_VALUE);
driving_value |= FIELD_PREP(SDHCI_GLI_9750_DRIVING_2,
GLI_9750_DRIVING_2_VALUE);
+ driving_value &= ~(SDHCI_GLI_9750_SEL_1|SDHCI_GLI_9750_SEL_2|SDHCI_GLI_9750_ALL_RST);
+ driving_value |= SDHCI_GLI_9750_SEL_2;
sdhci_writel(host, driving_value, SDHCI_GLI_9750_DRIVING);
sw_ctrl_value &= ~SDHCI_GLI_9750_SW_CTRL_4;
return value;
}
+#ifdef CONFIG_PM_SLEEP
+static int sdhci_pci_gli_resume(struct sdhci_pci_chip *chip)
+{
+ struct sdhci_pci_slot *slot = chip->slots[0];
+
+ pci_free_irq_vectors(slot->chip->pdev);
+ gli_pcie_enable_msi(slot);
+
+ return sdhci_pci_resume_host(chip);
+}
+#endif
+
static const struct sdhci_ops sdhci_gl9755_ops = {
.set_clock = sdhci_set_clock,
.enable_dma = sdhci_pci_enable_dma,
.quirks2 = SDHCI_QUIRK2_BROKEN_DDR50,
.probe_slot = gli_probe_slot_gl9755,
.ops = &sdhci_gl9755_ops,
+#ifdef CONFIG_PM_SLEEP
+ .resume = sdhci_pci_gli_resume,
+#endif
};
static const struct sdhci_ops sdhci_gl9750_ops = {
.quirks2 = SDHCI_QUIRK2_BROKEN_DDR50,
.probe_slot = gli_probe_slot_gl9750,
.ops = &sdhci_gl9750_ops,
+#ifdef CONFIG_PM_SLEEP
+ .resume = sdhci_pci_gli_resume,
+#endif
};
mmc_hostname(mmc), host->version);
}
- if (host->quirks & SDHCI_QUIRK_BROKEN_CQE)
- mmc->caps2 &= ~MMC_CAP2_CQE;
-
if (host->quirks & SDHCI_QUIRK_FORCE_DMA)
host->flags |= SDHCI_USE_SDMA;
else if (!(host->caps & SDHCI_CAN_DO_SDMA))
struct mmc_host *mmc = host->mmc;
int ret;
+ if ((mmc->caps2 & MMC_CAP2_CQE) &&
+ (host->quirks & SDHCI_QUIRK_BROKEN_CQE)) {
+ mmc->caps2 &= ~MMC_CAP2_CQE;
+ mmc->cqe_ops = NULL;
+ }
+
host->complete_wq = alloc_workqueue("sdhci", flags, 0);
if (!host->complete_wq)
return -ENOMEM;
buffer in a flash partition where it can be read back at some
later point.
+config MTD_PSTORE
+ tristate "Log panic/oops to an MTD buffer based on pstore"
+ depends on PSTORE_BLK
+ help
+ This enables panic and oops messages to be logged to a circular
+ buffer in a flash partition where it can be read back as files after
+ mounting pstore filesystem.
+
+ If unsure, say N.
+
config MTD_SWAP
tristate "Swap on MTD device support"
depends on MTD && SWAP
obj-$(CONFIG_SSFDC) += ssfdc.o
obj-$(CONFIG_SM_FTL) += sm_ftl.o
obj-$(CONFIG_MTD_OOPS) += mtdoops.o
+obj-$(CONFIG_MTD_PSTORE) += mtdpstore.o
obj-$(CONFIG_MTD_SWAP) += mtdswap.o
nftl-objs := nftlcore.o nftlmount.o
config.id = -1;
config.dev = &mtd->dev;
- config.name = mtd->name;
+ config.name = dev_name(&mtd->dev);
config.owner = THIS_MODULE;
config.reg_read = mtd_nvmem_reg_read;
config.size = mtd->size;
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0
+
+#define dev_fmt(fmt) "mtdoops-pstore: " fmt
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/pstore_blk.h>
+#include <linux/mtd/mtd.h>
+#include <linux/bitops.h>
+
+static struct mtdpstore_context {
+ int index;
+ struct pstore_blk_config info;
+ struct pstore_device_info dev;
+ struct mtd_info *mtd;
+ unsigned long *rmmap; /* removed bit map */
+ unsigned long *usedmap; /* used bit map */
+ /*
+ * used for panic write
+ * As there are no block_isbad for panic case, we should keep this
+ * status before panic to ensure panic_write not failed.
+ */
+ unsigned long *badmap; /* bad block bit map */
+} oops_cxt;
+
+static int mtdpstore_block_isbad(struct mtdpstore_context *cxt, loff_t off)
+{
+ int ret;
+ struct mtd_info *mtd = cxt->mtd;
+ u64 blknum;
+
+ off = ALIGN_DOWN(off, mtd->erasesize);
+ blknum = div_u64(off, mtd->erasesize);
+
+ if (test_bit(blknum, cxt->badmap))
+ return true;
+ ret = mtd_block_isbad(mtd, off);
+ if (ret < 0) {
+ dev_err(&mtd->dev, "mtd_block_isbad failed, aborting\n");
+ return ret;
+ } else if (ret > 0) {
+ set_bit(blknum, cxt->badmap);
+ return true;
+ }
+ return false;
+}
+
+static inline int mtdpstore_panic_block_isbad(struct mtdpstore_context *cxt,
+ loff_t off)
+{
+ struct mtd_info *mtd = cxt->mtd;
+ u64 blknum;
+
+ off = ALIGN_DOWN(off, mtd->erasesize);
+ blknum = div_u64(off, mtd->erasesize);
+ return test_bit(blknum, cxt->badmap);
+}
+
+static inline void mtdpstore_mark_used(struct mtdpstore_context *cxt,
+ loff_t off)
+{
+ struct mtd_info *mtd = cxt->mtd;
+ u64 zonenum = div_u64(off, cxt->info.kmsg_size);
+
+ dev_dbg(&mtd->dev, "mark zone %llu used\n", zonenum);
+ set_bit(zonenum, cxt->usedmap);
+}
+
+static inline void mtdpstore_mark_unused(struct mtdpstore_context *cxt,
+ loff_t off)
+{
+ struct mtd_info *mtd = cxt->mtd;
+ u64 zonenum = div_u64(off, cxt->info.kmsg_size);
+
+ dev_dbg(&mtd->dev, "mark zone %llu unused\n", zonenum);
+ clear_bit(zonenum, cxt->usedmap);
+}
+
+static inline void mtdpstore_block_mark_unused(struct mtdpstore_context *cxt,
+ loff_t off)
+{
+ struct mtd_info *mtd = cxt->mtd;
+ u32 zonecnt = mtd->erasesize / cxt->info.kmsg_size;
+ u64 zonenum;
+
+ off = ALIGN_DOWN(off, mtd->erasesize);
+ zonenum = div_u64(off, cxt->info.kmsg_size);
+ while (zonecnt > 0) {
+ dev_dbg(&mtd->dev, "mark zone %llu unused\n", zonenum);
+ clear_bit(zonenum, cxt->usedmap);
+ zonenum++;
+ zonecnt--;
+ }
+}
+
+static inline int mtdpstore_is_used(struct mtdpstore_context *cxt, loff_t off)
+{
+ u64 zonenum = div_u64(off, cxt->info.kmsg_size);
+ u64 blknum = div_u64(off, cxt->mtd->erasesize);
+
+ if (test_bit(blknum, cxt->badmap))
+ return true;
+ return test_bit(zonenum, cxt->usedmap);
+}
+
+static int mtdpstore_block_is_used(struct mtdpstore_context *cxt,
+ loff_t off)
+{
+ struct mtd_info *mtd = cxt->mtd;
+ u32 zonecnt = mtd->erasesize / cxt->info.kmsg_size;
+ u64 zonenum;
+
+ off = ALIGN_DOWN(off, mtd->erasesize);
+ zonenum = div_u64(off, cxt->info.kmsg_size);
+ while (zonecnt > 0) {
+ if (test_bit(zonenum, cxt->usedmap))
+ return true;
+ zonenum++;
+ zonecnt--;
+ }
+ return false;
+}
+
+static int mtdpstore_is_empty(struct mtdpstore_context *cxt, char *buf,
+ size_t size)
+{
+ struct mtd_info *mtd = cxt->mtd;
+ size_t sz;
+ int i;
+
+ sz = min_t(uint32_t, size, mtd->writesize / 4);
+ for (i = 0; i < sz; i++) {
+ if (buf[i] != (char)0xFF)
+ return false;
+ }
+ return true;
+}
+
+static void mtdpstore_mark_removed(struct mtdpstore_context *cxt, loff_t off)
+{
+ struct mtd_info *mtd = cxt->mtd;
+ u64 zonenum = div_u64(off, cxt->info.kmsg_size);
+
+ dev_dbg(&mtd->dev, "mark zone %llu removed\n", zonenum);
+ set_bit(zonenum, cxt->rmmap);
+}
+
+static void mtdpstore_block_clear_removed(struct mtdpstore_context *cxt,
+ loff_t off)
+{
+ struct mtd_info *mtd = cxt->mtd;
+ u32 zonecnt = mtd->erasesize / cxt->info.kmsg_size;
+ u64 zonenum;
+
+ off = ALIGN_DOWN(off, mtd->erasesize);
+ zonenum = div_u64(off, cxt->info.kmsg_size);
+ while (zonecnt > 0) {
+ clear_bit(zonenum, cxt->rmmap);
+ zonenum++;
+ zonecnt--;
+ }
+}
+
+static int mtdpstore_block_is_removed(struct mtdpstore_context *cxt,
+ loff_t off)
+{
+ struct mtd_info *mtd = cxt->mtd;
+ u32 zonecnt = mtd->erasesize / cxt->info.kmsg_size;
+ u64 zonenum;
+
+ off = ALIGN_DOWN(off, mtd->erasesize);
+ zonenum = div_u64(off, cxt->info.kmsg_size);
+ while (zonecnt > 0) {
+ if (test_bit(zonenum, cxt->rmmap))
+ return true;
+ zonenum++;
+ zonecnt--;
+ }
+ return false;
+}
+
+static int mtdpstore_erase_do(struct mtdpstore_context *cxt, loff_t off)
+{
+ struct mtd_info *mtd = cxt->mtd;
+ struct erase_info erase;
+ int ret;
+
+ off = ALIGN_DOWN(off, cxt->mtd->erasesize);
+ dev_dbg(&mtd->dev, "try to erase off 0x%llx\n", off);
+ erase.len = cxt->mtd->erasesize;
+ erase.addr = off;
+ ret = mtd_erase(cxt->mtd, &erase);
+ if (!ret)
+ mtdpstore_block_clear_removed(cxt, off);
+ else
+ dev_err(&mtd->dev, "erase of region [0x%llx, 0x%llx] on \"%s\" failed\n",
+ (unsigned long long)erase.addr,
+ (unsigned long long)erase.len, cxt->info.device);
+ return ret;
+}
+
+/*
+ * called while removing file
+ *
+ * Avoiding over erasing, do erase block only when the whole block is unused.
+ * If the block contains valid log, do erase lazily on flush_removed() when
+ * unregister.
+ */
+static ssize_t mtdpstore_erase(size_t size, loff_t off)
+{
+ struct mtdpstore_context *cxt = &oops_cxt;
+
+ if (mtdpstore_block_isbad(cxt, off))
+ return -EIO;
+
+ mtdpstore_mark_unused(cxt, off);
+
+ /* If the block still has valid data, mtdpstore do erase lazily */
+ if (likely(mtdpstore_block_is_used(cxt, off))) {
+ mtdpstore_mark_removed(cxt, off);
+ return 0;
+ }
+
+ /* all zones are unused, erase it */
+ return mtdpstore_erase_do(cxt, off);
+}
+
+/*
+ * What is security for mtdpstore?
+ * As there is no erase for panic case, we should ensure at least one zone
+ * is writable. Otherwise, panic write will fail.
+ * If zone is used, write operation will return -ENOMSG, which means that
+ * pstore/blk will try one by one until gets an empty zone. So, it is not
+ * needed to ensure the next zone is empty, but at least one.
+ */
+static int mtdpstore_security(struct mtdpstore_context *cxt, loff_t off)
+{
+ int ret = 0, i;
+ struct mtd_info *mtd = cxt->mtd;
+ u32 zonenum = (u32)div_u64(off, cxt->info.kmsg_size);
+ u32 zonecnt = (u32)div_u64(cxt->mtd->size, cxt->info.kmsg_size);
+ u32 blkcnt = (u32)div_u64(cxt->mtd->size, cxt->mtd->erasesize);
+ u32 erasesize = cxt->mtd->erasesize;
+
+ for (i = 0; i < zonecnt; i++) {
+ u32 num = (zonenum + i) % zonecnt;
+
+ /* found empty zone */
+ if (!test_bit(num, cxt->usedmap))
+ return 0;
+ }
+
+ /* If there is no any empty zone, we have no way but to do erase */
+ while (blkcnt--) {
+ div64_u64_rem(off + erasesize, cxt->mtd->size, (u64 *)&off);
+
+ if (mtdpstore_block_isbad(cxt, off))
+ continue;
+
+ ret = mtdpstore_erase_do(cxt, off);
+ if (!ret) {
+ mtdpstore_block_mark_unused(cxt, off);
+ break;
+ }
+ }
+
+ if (ret)
+ dev_err(&mtd->dev, "all blocks bad!\n");
+ dev_dbg(&mtd->dev, "end security\n");
+ return ret;
+}
+
+static ssize_t mtdpstore_write(const char *buf, size_t size, loff_t off)
+{
+ struct mtdpstore_context *cxt = &oops_cxt;
+ struct mtd_info *mtd = cxt->mtd;
+ size_t retlen;
+ int ret;
+
+ if (mtdpstore_block_isbad(cxt, off))
+ return -ENOMSG;
+
+ /* zone is used, please try next one */
+ if (mtdpstore_is_used(cxt, off))
+ return -ENOMSG;
+
+ dev_dbg(&mtd->dev, "try to write off 0x%llx size %zu\n", off, size);
+ ret = mtd_write(cxt->mtd, off, size, &retlen, (u_char *)buf);
+ if (ret < 0 || retlen != size) {
+ dev_err(&mtd->dev, "write failure at %lld (%zu of %zu written), err %d\n",
+ off, retlen, size, ret);
+ return -EIO;
+ }
+ mtdpstore_mark_used(cxt, off);
+
+ mtdpstore_security(cxt, off);
+ return retlen;
+}
+
+static inline bool mtdpstore_is_io_error(int ret)
+{
+ return ret < 0 && !mtd_is_bitflip(ret) && !mtd_is_eccerr(ret);
+}
+
+/*
+ * All zones will be read as pstore/blk will read zone one by one when do
+ * recover.
+ */
+static ssize_t mtdpstore_read(char *buf, size_t size, loff_t off)
+{
+ struct mtdpstore_context *cxt = &oops_cxt;
+ struct mtd_info *mtd = cxt->mtd;
+ size_t retlen, done;
+ int ret;
+
+ if (mtdpstore_block_isbad(cxt, off))
+ return -ENOMSG;
+
+ dev_dbg(&mtd->dev, "try to read off 0x%llx size %zu\n", off, size);
+ for (done = 0, retlen = 0; done < size; done += retlen) {
+ retlen = 0;
+
+ ret = mtd_read(cxt->mtd, off + done, size - done, &retlen,
+ (u_char *)buf + done);
+ if (mtdpstore_is_io_error(ret)) {
+ dev_err(&mtd->dev, "read failure at %lld (%zu of %zu read), err %d\n",
+ off + done, retlen, size - done, ret);
+ /* the zone may be broken, try next one */
+ return -ENOMSG;
+ }
+
+ /*
+ * ECC error. The impact on log data is so small. Maybe we can
+ * still read it and try to understand. So mtdpstore just hands
+ * over what it gets and user can judge whether the data is
+ * valid or not.
+ */
+ if (mtd_is_eccerr(ret)) {
+ dev_err(&mtd->dev, "ecc error at %lld (%zu of %zu read), err %d\n",
+ off + done, retlen, size - done, ret);
+ /* driver may not set retlen when ecc error */
+ retlen = retlen == 0 ? size - done : retlen;
+ }
+ }
+
+ if (mtdpstore_is_empty(cxt, buf, size))
+ mtdpstore_mark_unused(cxt, off);
+ else
+ mtdpstore_mark_used(cxt, off);
+
+ mtdpstore_security(cxt, off);
+ return retlen;
+}
+
+static ssize_t mtdpstore_panic_write(const char *buf, size_t size, loff_t off)
+{
+ struct mtdpstore_context *cxt = &oops_cxt;
+ struct mtd_info *mtd = cxt->mtd;
+ size_t retlen;
+ int ret;
+
+ if (mtdpstore_panic_block_isbad(cxt, off))
+ return -ENOMSG;
+
+ /* zone is used, please try next one */
+ if (mtdpstore_is_used(cxt, off))
+ return -ENOMSG;
+
+ ret = mtd_panic_write(cxt->mtd, off, size, &retlen, (u_char *)buf);
+ if (ret < 0 || size != retlen) {
+ dev_err(&mtd->dev, "panic write failure at %lld (%zu of %zu read), err %d\n",
+ off, retlen, size, ret);
+ return -EIO;
+ }
+ mtdpstore_mark_used(cxt, off);
+
+ return retlen;
+}
+
+static void mtdpstore_notify_add(struct mtd_info *mtd)
+{
+ int ret;
+ struct mtdpstore_context *cxt = &oops_cxt;
+ struct pstore_blk_config *info = &cxt->info;
+ unsigned long longcnt;
+
+ if (!strcmp(mtd->name, info->device))
+ cxt->index = mtd->index;
+
+ if (mtd->index != cxt->index || cxt->index < 0)
+ return;
+
+ dev_dbg(&mtd->dev, "found matching MTD device %s\n", mtd->name);
+
+ if (mtd->size < info->kmsg_size * 2) {
+ dev_err(&mtd->dev, "MTD partition %d not big enough\n",
+ mtd->index);
+ return;
+ }
+ /*
+ * kmsg_size must be aligned to 4096 Bytes, which is limited by
+ * psblk. The default value of kmsg_size is 64KB. If kmsg_size
+ * is larger than erasesize, some errors will occur since mtdpsotre
+ * is designed on it.
+ */
+ if (mtd->erasesize < info->kmsg_size) {
+ dev_err(&mtd->dev, "eraseblock size of MTD partition %d too small\n",
+ mtd->index);
+ return;
+ }
+ if (unlikely(info->kmsg_size % mtd->writesize)) {
+ dev_err(&mtd->dev, "record size %lu KB must align to write size %d KB\n",
+ info->kmsg_size / 1024,
+ mtd->writesize / 1024);
+ return;
+ }
+
+ longcnt = BITS_TO_LONGS(div_u64(mtd->size, info->kmsg_size));
+ cxt->rmmap = kcalloc(longcnt, sizeof(long), GFP_KERNEL);
+ cxt->usedmap = kcalloc(longcnt, sizeof(long), GFP_KERNEL);
+
+ longcnt = BITS_TO_LONGS(div_u64(mtd->size, mtd->erasesize));
+ cxt->badmap = kcalloc(longcnt, sizeof(long), GFP_KERNEL);
+
+ cxt->dev.total_size = mtd->size;
+ /* just support dmesg right now */
+ cxt->dev.flags = PSTORE_FLAGS_DMESG;
+ cxt->dev.read = mtdpstore_read;
+ cxt->dev.write = mtdpstore_write;
+ cxt->dev.erase = mtdpstore_erase;
+ cxt->dev.panic_write = mtdpstore_panic_write;
+
+ ret = register_pstore_device(&cxt->dev);
+ if (ret) {
+ dev_err(&mtd->dev, "mtd%d register to psblk failed\n",
+ mtd->index);
+ return;
+ }
+ cxt->mtd = mtd;
+ dev_info(&mtd->dev, "Attached to MTD device %d\n", mtd->index);
+}
+
+static int mtdpstore_flush_removed_do(struct mtdpstore_context *cxt,
+ loff_t off, size_t size)
+{
+ struct mtd_info *mtd = cxt->mtd;
+ u_char *buf;
+ int ret;
+ size_t retlen;
+ struct erase_info erase;
+
+ buf = kmalloc(mtd->erasesize, GFP_KERNEL);
+ if (!buf)
+ return -ENOMEM;
+
+ /* 1st. read to cache */
+ ret = mtd_read(mtd, off, mtd->erasesize, &retlen, buf);
+ if (mtdpstore_is_io_error(ret))
+ goto free;
+
+ /* 2nd. erase block */
+ erase.len = mtd->erasesize;
+ erase.addr = off;
+ ret = mtd_erase(mtd, &erase);
+ if (ret)
+ goto free;
+
+ /* 3rd. write back */
+ while (size) {
+ unsigned int zonesize = cxt->info.kmsg_size;
+
+ /* there is valid data on block, write back */
+ if (mtdpstore_is_used(cxt, off)) {
+ ret = mtd_write(mtd, off, zonesize, &retlen, buf);
+ if (ret)
+ dev_err(&mtd->dev, "write failure at %lld (%zu of %u written), err %d\n",
+ off, retlen, zonesize, ret);
+ }
+
+ off += zonesize;
+ size -= min_t(unsigned int, zonesize, size);
+ }
+
+free:
+ kfree(buf);
+ return ret;
+}
+
+/*
+ * What does mtdpstore_flush_removed() do?
+ * When user remove any log file on pstore filesystem, mtdpstore should do
+ * something to ensure log file removed. If the whole block is no longer used,
+ * it's nice to erase the block. However if the block still contains valid log,
+ * what mtdpstore can do is to erase and write the valid log back.
+ */
+static int mtdpstore_flush_removed(struct mtdpstore_context *cxt)
+{
+ struct mtd_info *mtd = cxt->mtd;
+ int ret;
+ loff_t off;
+ u32 blkcnt = (u32)div_u64(mtd->size, mtd->erasesize);
+
+ for (off = 0; blkcnt > 0; blkcnt--, off += mtd->erasesize) {
+ ret = mtdpstore_block_isbad(cxt, off);
+ if (ret)
+ continue;
+
+ ret = mtdpstore_block_is_removed(cxt, off);
+ if (!ret)
+ continue;
+
+ ret = mtdpstore_flush_removed_do(cxt, off, mtd->erasesize);
+ if (ret)
+ return ret;
+ }
+ return 0;
+}
+
+static void mtdpstore_notify_remove(struct mtd_info *mtd)
+{
+ struct mtdpstore_context *cxt = &oops_cxt;
+
+ if (mtd->index != cxt->index || cxt->index < 0)
+ return;
+
+ mtdpstore_flush_removed(cxt);
+
+ unregister_pstore_device(&cxt->dev);
+ kfree(cxt->badmap);
+ kfree(cxt->usedmap);
+ kfree(cxt->rmmap);
+ cxt->mtd = NULL;
+ cxt->index = -1;
+}
+
+static struct mtd_notifier mtdpstore_notifier = {
+ .add = mtdpstore_notify_add,
+ .remove = mtdpstore_notify_remove,
+};
+
+static int __init mtdpstore_init(void)
+{
+ int ret;
+ struct mtdpstore_context *cxt = &oops_cxt;
+ struct pstore_blk_config *info = &cxt->info;
+
+ ret = pstore_blk_get_config(info);
+ if (unlikely(ret))
+ return ret;
+
+ if (strlen(info->device) == 0) {
+ pr_err("mtd device must be supplied (device name is empty)\n");
+ return -EINVAL;
+ }
+ if (!info->kmsg_size) {
+ pr_err("no backend enabled (kmsg_size is 0)\n");
+ return -EINVAL;
+ }
+
+ /* Setup the MTD device to use */
+ ret = kstrtoint((char *)info->device, 0, &cxt->index);
+ if (ret)
+ cxt->index = -1;
+
+ register_mtd_user(&mtdpstore_notifier);
+ return 0;
+}
+module_init(mtdpstore_init);
+
+static void __exit mtdpstore_exit(void)
+{
+ unregister_mtd_user(&mtdpstore_notifier);
+}
+module_exit(mtdpstore_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("WeiXiong Liao <liaoweixiong@allwinnertech.com>");
+MODULE_DESCRIPTION("MTD backend for pstore/blk");
flash_dma_writel(ctrl, FLASH_DMA_ERROR_STATUS, 0);
}
- if (has_edu(ctrl))
+ if (has_edu(ctrl)) {
ctrl->edu_config = edu_readl(ctrl, EDU_CONFIG);
- else {
edu_writel(ctrl, EDU_CONFIG, ctrl->edu_config);
edu_readl(ctrl, EDU_CONFIG);
brcmnand_edu_init(ctrl);
mtd->oobavail = ret;
+ /* Propagate ECC information to mtd_info */
+ mtd->ecc_strength = nand->eccreq.strength;
+ mtd->ecc_step_size = nand->eccreq.step_size;
+
return 0;
err_cleanup_nanddev:
{
struct ubi_device *ubi = s->private;
- if (*pos == 0)
- return SEQ_START_TOKEN;
-
if (*pos < ubi->peb_count)
return pos;
{
struct ubi_device *ubi = s->private;
- if (v == SEQ_START_TOKEN)
- return pos;
(*pos)++;
if (*pos < ubi->peb_count)
int err;
/* If this is the start, print a header */
- if (iter == SEQ_START_TOKEN) {
- seq_puts(s,
- "physical_block_number\terase_count\tblock_status\tread_status\n");
- return 0;
- }
+ if (*block_number == 0)
+ seq_puts(s, "physical_block_number\terase_count\n");
err = ubi_io_is_bad(ubi, *block_number);
if (err)
oiph = skb_network_header(skb);
skb_reset_network_header(skb);
- if (family == AF_INET)
+ if (!IS_ENABLED(CONFIG_IPV6) || family == AF_INET)
err = IP_ECN_decapsulate(oiph, skb);
-#if IS_ENABLED(CONFIG_IPV6)
else
err = IP6_ECN_decapsulate(oiph, skb);
-#endif
if (unlikely(err)) {
if (log_ecn_error) {
- if (family == AF_INET)
+ if (!IS_ENABLED(CONFIG_IPV6) || family == AF_INET)
net_info_ratelimited("non-ECT from %pI4 "
"with TOS=%#x\n",
&((struct iphdr *)oiph)->saddr,
((struct iphdr *)oiph)->tos);
-#if IS_ENABLED(CONFIG_IPV6)
else
net_info_ratelimited("non-ECT from %pI6\n",
&((struct ipv6hdr *)oiph)->saddr);
-#endif
}
if (err > 1) {
++bareudp->dev->stats.rx_frame_errors;
return err;
}
-#if IS_ENABLED(CONFIG_IPV6)
static int bareudp6_xmit_skb(struct sk_buff *skb, struct net_device *dev,
struct bareudp_dev *bareudp,
const struct ip_tunnel_info *info)
dst_release(dst);
return err;
}
-#endif
static netdev_tx_t bareudp_xmit(struct sk_buff *skb, struct net_device *dev)
{
}
rcu_read_lock();
-#if IS_ENABLED(CONFIG_IPV6)
- if (info->mode & IP_TUNNEL_INFO_IPV6)
+ if (IS_ENABLED(CONFIG_IPV6) && info->mode & IP_TUNNEL_INFO_IPV6)
err = bareudp6_xmit_skb(skb, dev, bareudp, info);
else
-#endif
err = bareudp_xmit_skb(skb, dev, bareudp, info);
rcu_read_unlock();
use_cache = ip_tunnel_dst_cache_usable(skb, info);
- if (ip_tunnel_info_af(info) == AF_INET) {
+ if (!IS_ENABLED(CONFIG_IPV6) || ip_tunnel_info_af(info) == AF_INET) {
struct rtable *rt;
__be32 saddr;
ip_rt_put(rt);
info->key.u.ipv4.src = saddr;
-#if IS_ENABLED(CONFIG_IPV6)
} else if (ip_tunnel_info_af(info) == AF_INET6) {
struct dst_entry *dst;
struct in6_addr saddr;
dst_release(dst);
info->key.u.ipv6.src = saddr;
-#endif
} else {
return -EINVAL;
}
err = kobject_init_and_add(&slave->kobj, &slave_ktype,
&(slave->dev->dev.kobj), "bonding_slave");
- if (err)
+ if (err) {
+ kobject_put(&slave->kobj);
return err;
+ }
for (a = slave_attrs; *a; ++a) {
err = sysfs_create_file(&slave->kobj, &((*a)->attr));
u32 id, rev;
addr = devm_platform_ioremap_resource(pdev, 0);
+ if (IS_ERR(addr))
+ return PTR_ERR(addr);
+
irq = platform_get_irq(pdev, 0);
- if (IS_ERR(addr) || irq < 0)
+ if (irq < 0)
return -EINVAL;
id = readl(addr + IFI_CANFD_IP_ID);
addr = devm_platform_ioremap_resource(pdev, 0);
if (IS_ERR(addr)) {
- err = -EBUSY;
+ err = PTR_ERR(addr);
goto exit;
}
priv->regs = devm_platform_ioremap_resource(pdev, 0);
if (IS_ERR(priv->regs))
- return -ENOMEM;
+ return PTR_ERR(priv->regs);
dev = b53_switch_alloc(&pdev->dev, &b53_srab_ops, priv);
if (!dev)
}
module_exit(dsa_loop_exit);
+MODULE_SOFTDEP("pre: dsa_loop_bdinfo");
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Florian Fainelli");
MODULE_DESCRIPTION("DSA loopback driver");
mt7530_write(priv, MT7530_PVC_P(port),
PORT_SPEC_TAG);
- /* Disable auto learning on the cpu port */
- mt7530_set(priv, MT7530_PSC_P(port), SA_DIS);
-
- /* Unknown unicast frame fordwarding to the cpu port */
- mt7530_set(priv, MT7530_MFC, UNU_FFP(BIT(port)));
+ /* Unknown multicast frame forwarding to the cpu port */
+ mt7530_rmw(priv, MT7530_MFC, UNM_FFP_MASK, UNM_FFP(BIT(port)));
/* Set CPU port number */
if (priv->id == ID_MT7621)
/* Enable and reset MIB counters */
mt7530_mib_reset(ds);
- mt7530_clear(priv, MT7530_MFC, UNU_FFP_MASK);
-
for (i = 0; i < MT7530_NUM_PORTS; i++) {
/* Disable forwarding by default on all ports */
mt7530_rmw(priv, MT7530_PCR_P(i), PCR_MATRIX_MASK,
#define MT7530_MFC 0x10
#define BC_FFP(x) (((x) & 0xff) << 24)
#define UNM_FFP(x) (((x) & 0xff) << 16)
+#define UNM_FFP_MASK UNM_FFP(~0)
#define UNU_FFP(x) (((x) & 0xff) << 8)
#define UNU_FFP_MASK UNU_FFP(~0)
#define CPU_EN BIT(7)
const struct switchdev_obj_port_vlan *vlan)
{
struct ocelot *ocelot = ds->priv;
+ u16 flags = vlan->flags;
u16 vid;
int err;
+ if (dsa_is_cpu_port(ds, port))
+ flags &= ~BRIDGE_VLAN_INFO_UNTAGGED;
+
for (vid = vlan->vid_begin; vid <= vlan->vid_end; vid++) {
err = ocelot_vlan_add(ocelot, port, vid,
- vlan->flags & BRIDGE_VLAN_INFO_PVID,
- vlan->flags & BRIDGE_VLAN_INFO_UNTAGGED);
+ flags & BRIDGE_VLAN_INFO_PVID,
+ flags & BRIDGE_VLAN_INFO_UNTAGGED);
if (err) {
dev_err(ds->dev, "Failed to add VLAN %d to port %d: %d\n",
vid, port, err);
struct ocelot *ocelot = &felix->ocelot;
phy_interface_t *port_phy_modes;
resource_size_t switch_base;
+ struct resource res;
int port, i, err;
ocelot->num_phys_ports = num_phys_ports;
for (i = 0; i < TARGET_MAX; i++) {
struct regmap *target;
- struct resource *res;
if (!felix->info->target_io_res[i].name)
continue;
- res = &felix->info->target_io_res[i];
- res->flags = IORESOURCE_MEM;
- res->start += switch_base;
- res->end += switch_base;
+ memcpy(&res, &felix->info->target_io_res[i], sizeof(res));
+ res.flags = IORESOURCE_MEM;
+ res.start += switch_base;
+ res.end += switch_base;
- target = ocelot_regmap_init(ocelot, res);
+ target = ocelot_regmap_init(ocelot, &res);
if (IS_ERR(target)) {
dev_err(ocelot->dev,
"Failed to map device memory space\n");
for (port = 0; port < num_phys_ports; port++) {
struct ocelot_port *ocelot_port;
void __iomem *port_regs;
- struct resource *res;
ocelot_port = devm_kzalloc(ocelot->dev,
sizeof(struct ocelot_port),
return -ENOMEM;
}
- res = &felix->info->port_io_res[port];
- res->flags = IORESOURCE_MEM;
- res->start += switch_base;
- res->end += switch_base;
+ memcpy(&res, &felix->info->port_io_res[port], sizeof(res));
+ res.flags = IORESOURCE_MEM;
+ res.start += switch_base;
+ res.end += switch_base;
- port_regs = devm_ioremap_resource(ocelot->dev, res);
+ port_regs = devm_ioremap_resource(ocelot->dev, &res);
if (IS_ERR(port_regs)) {
dev_err(ocelot->dev,
"failed to map registers for port %d\n", port);
/* Platform-specific information */
struct felix_info {
- struct resource *target_io_res;
- struct resource *port_io_res;
- struct resource *imdio_res;
+ const struct resource *target_io_res;
+ const struct resource *port_io_res;
+ const struct resource *imdio_res;
const struct reg_field *regfields;
const u32 *const *map;
const struct ocelot_ops *ops;
[GCB] = vsc9959_gcb_regmap,
};
-/* Addresses are relative to the PCI device's base address and
- * will be fixed up at ioremap time.
- */
-static struct resource vsc9959_target_io_res[] = {
+/* Addresses are relative to the PCI device's base address */
+static const struct resource vsc9959_target_io_res[] = {
[ANA] = {
.start = 0x0280000,
.end = 0x028ffff,
},
};
-static struct resource vsc9959_port_io_res[] = {
+static const struct resource vsc9959_port_io_res[] = {
{
.start = 0x0100000,
.end = 0x010ffff,
/* Port MAC 0 Internal MDIO bus through which the SerDes acting as an
* SGMII/QSGMII MAC PCS can be found.
*/
-static struct resource vsc9959_imdio_res = {
+static const struct resource vsc9959_imdio_res = {
.start = 0x8030,
.end = 0x8040,
.name = "imdio",
struct device *dev = ocelot->dev;
resource_size_t imdio_base;
void __iomem *imdio_regs;
- struct resource *res;
+ struct resource res;
struct enetc_hw *hw;
struct mii_bus *bus;
int port;
imdio_base = pci_resource_start(felix->pdev,
felix->info->imdio_pci_bar);
- res = felix->info->imdio_res;
- res->flags = IORESOURCE_MEM;
- res->start += imdio_base;
- res->end += imdio_base;
+ memcpy(&res, felix->info->imdio_res, sizeof(res));
+ res.flags = IORESOURCE_MEM;
+ res.start += imdio_base;
+ res.end += imdio_base;
- imdio_regs = devm_ioremap_resource(dev, res);
+ imdio_regs = devm_ioremap_resource(dev, &res);
if (IS_ERR(imdio_regs)) {
dev_err(dev, "failed to map internal MDIO registers\n");
return PTR_ERR(imdio_regs);
int i;
unsigned short data;
- for (i = 0; i < 6; i++)
+ for (i = 0; i < 3; i++)
{
reset_and_select_srom(dev);
data = read_srom(dev, i + EnetAddressOffset/2, SROMAddressBits);
select BCM7XXX_PHY
select MDIO_BCM_UNIMAC
select DIMLIB
+ select BROADCOM_PHY if ARCH_BCM2835
help
This driver supports the built-in Ethernet MACs found in the
Broadcom BCM7xxx Set Top Box family chipset.
int i, intr_process, rc, tmo_count;
struct input *req = msg;
u32 *data = msg;
- __le32 *resp_len;
u8 *valid;
u16 cp_ring_id, len = 0;
struct hwrm_err_output *resp = bp->hwrm_cmd_resp_addr;
u16 max_req_len = BNXT_HWRM_MAX_REQ_LEN;
struct hwrm_short_input short_input = {0};
u32 doorbell_offset = BNXT_GRCPF_REG_CHIMP_COMM_TRIGGER;
- u8 *resp_addr = (u8 *)bp->hwrm_cmd_resp_addr;
u32 bar_offset = BNXT_GRCPF_REG_CHIMP_COMM;
u16 dst = BNXT_HWRM_CHNL_CHIMP;
bar_offset = BNXT_GRCPF_REG_KONG_COMM;
doorbell_offset = BNXT_GRCPF_REG_KONG_COMM_TRIGGER;
resp = bp->hwrm_cmd_kong_resp_addr;
- resp_addr = (u8 *)bp->hwrm_cmd_kong_resp_addr;
}
memset(resp, 0, PAGE_SIZE);
tmo_count = HWRM_SHORT_TIMEOUT_COUNTER;
timeout = timeout - HWRM_SHORT_MIN_TIMEOUT * HWRM_SHORT_TIMEOUT_COUNTER;
tmo_count += DIV_ROUND_UP(timeout, HWRM_MIN_TIMEOUT);
- resp_len = (__le32 *)(resp_addr + HWRM_RESP_LEN_OFFSET);
if (intr_process) {
u16 seq_id = bp->hwrm_intr_seq_id;
le16_to_cpu(req->req_type));
return -EBUSY;
}
- len = (le32_to_cpu(*resp_len) & HWRM_RESP_LEN_MASK) >>
- HWRM_RESP_LEN_SFT;
- valid = resp_addr + len - 1;
+ len = le16_to_cpu(resp->resp_len);
+ valid = ((u8 *)resp) + len - 1;
} else {
int j;
*/
if (test_bit(BNXT_STATE_FW_FATAL_COND, &bp->state))
return -EBUSY;
- len = (le32_to_cpu(*resp_len) & HWRM_RESP_LEN_MASK) >>
- HWRM_RESP_LEN_SFT;
+ len = le16_to_cpu(resp->resp_len);
if (len)
break;
/* on first few passes, just barely sleep */
}
/* Last byte of resp contains valid bit */
- valid = resp_addr + len - 1;
+ valid = ((u8 *)resp) + len - 1;
for (j = 0; j < HWRM_VALID_BIT_DELAY_USEC; j++) {
/* make sure we read from updated DMA memory */
dma_rmb();
bnxt_free_skbs(bp);
/* Save ring stats before shutdown */
- if (bp->bnapi)
+ if (bp->bnapi && irq_re_init)
bnxt_get_ring_stats(bp, &bp->net_stats_prev);
if (irq_re_init) {
bnxt_free_irq(bp);
#define HWRM_CMD_TIMEOUT (bp->hwrm_cmd_timeout)
#define HWRM_RESET_TIMEOUT ((HWRM_CMD_TIMEOUT) * 4)
#define HWRM_COREDUMP_TIMEOUT ((HWRM_CMD_TIMEOUT) * 12)
-#define HWRM_RESP_ERR_CODE_MASK 0xffff
-#define HWRM_RESP_LEN_OFFSET 4
-#define HWRM_RESP_LEN_MASK 0xffff0000
-#define HWRM_RESP_LEN_SFT 16
-#define HWRM_RESP_VALID_MASK 0xff000000
#define BNXT_HWRM_REQ_MAX_SIZE 128
#define BNXT_HWRM_REQS_PER_PAGE (BNXT_PAGE_SIZE / \
BNXT_HWRM_REQ_MAX_SIZE)
bnxt_hwrm_fw_set_time(bp);
- if (bnxt_find_nvram_item(dev, BNX_DIR_TYPE_UPDATE,
- BNX_DIR_ORDINAL_FIRST, BNX_DIR_EXT_NONE,
- &index, &item_len, NULL) != 0) {
+ rc = bnxt_find_nvram_item(dev, BNX_DIR_TYPE_UPDATE,
+ BNX_DIR_ORDINAL_FIRST, BNX_DIR_EXT_NONE,
+ &index, &item_len, NULL);
+ if (rc) {
netdev_err(dev, "PKG update area not created in nvram\n");
- return -ENOBUFS;
+ return rc;
}
rc = request_firmware(&fw, filename, &dev->dev);
depends on QUICC_ENGINE && PPC32
select FSL_PQ_MDIO
select PHYLIB
+ select FIXED_PHY
---help---
This driver supports the Gigabit Ethernet mode of the QUICC Engine,
which is available on some Freescale SOCs.
depends on HAS_DMA
select FSL_PQ_MDIO
select PHYLIB
+ select FIXED_PHY
select CRC32
---help---
This driver supports the Gigabit TSEC on the MPC83xx, MPC85xx,
tristate "DPAA Ethernet"
depends on FSL_DPAA && FSL_FMAN
select PHYLIB
+ select FIXED_PHY
select FSL_FMAN_MAC
---help---
Data Path Acceleration Architecture Ethernet driver,
}
/* Do this here, so we can be verbose early */
- SET_NETDEV_DEV(net_dev, dev);
+ SET_NETDEV_DEV(net_dev, dev->parent);
dev_set_drvdata(dev, net_dev);
priv = netdev_priv(net_dev);
for (i = 1; i < DPAA2_ETH_MAX_SG_ENTRIES; i++) {
addr = dpaa2_sg_get_addr(&sgt[i]);
sg_vaddr = dpaa2_iova_to_virt(priv->iommu_domain, addr);
- dma_unmap_page(dev, addr, DPAA2_ETH_RX_BUF_SIZE,
+ dma_unmap_page(dev, addr, priv->rx_buf_size,
DMA_BIDIRECTIONAL);
free_pages((unsigned long)sg_vaddr, 0);
/* Get the address and length from the S/G entry */
sg_addr = dpaa2_sg_get_addr(sge);
sg_vaddr = dpaa2_iova_to_virt(priv->iommu_domain, sg_addr);
- dma_unmap_page(dev, sg_addr, DPAA2_ETH_RX_BUF_SIZE,
+ dma_unmap_page(dev, sg_addr, priv->rx_buf_size,
DMA_BIDIRECTIONAL);
sg_length = dpaa2_sg_get_len(sge);
(page_address(page) - page_address(head_page));
skb_add_rx_frag(skb, i - 1, head_page, page_offset,
- sg_length, DPAA2_ETH_RX_BUF_SIZE);
+ sg_length, priv->rx_buf_size);
}
if (dpaa2_sg_is_final(sge))
for (i = 0; i < count; i++) {
vaddr = dpaa2_iova_to_virt(priv->iommu_domain, buf_array[i]);
- dma_unmap_page(dev, buf_array[i], DPAA2_ETH_RX_BUF_SIZE,
+ dma_unmap_page(dev, buf_array[i], priv->rx_buf_size,
DMA_BIDIRECTIONAL);
free_pages((unsigned long)vaddr, 0);
}
break;
case XDP_REDIRECT:
dma_unmap_page(priv->net_dev->dev.parent, addr,
- DPAA2_ETH_RX_BUF_SIZE, DMA_BIDIRECTIONAL);
+ priv->rx_buf_size, DMA_BIDIRECTIONAL);
ch->buf_count--;
xdp.data_hard_start = vaddr;
err = xdp_do_redirect(priv->net_dev, &xdp, xdp_prog);
trace_dpaa2_rx_fd(priv->net_dev, fd);
vaddr = dpaa2_iova_to_virt(priv->iommu_domain, addr);
- dma_sync_single_for_cpu(dev, addr, DPAA2_ETH_RX_BUF_SIZE,
+ dma_sync_single_for_cpu(dev, addr, priv->rx_buf_size,
DMA_BIDIRECTIONAL);
fas = dpaa2_get_fas(vaddr, false);
return;
}
- dma_unmap_page(dev, addr, DPAA2_ETH_RX_BUF_SIZE,
+ dma_unmap_page(dev, addr, priv->rx_buf_size,
DMA_BIDIRECTIONAL);
skb = build_linear_skb(ch, fd, vaddr);
} else if (fd_format == dpaa2_fd_sg) {
WARN_ON(priv->xdp_prog);
- dma_unmap_page(dev, addr, DPAA2_ETH_RX_BUF_SIZE,
+ dma_unmap_page(dev, addr, priv->rx_buf_size,
DMA_BIDIRECTIONAL);
skb = build_frag_skb(priv, ch, buf_data);
free_pages((unsigned long)vaddr, 0);
if (!page)
goto err_alloc;
- addr = dma_map_page(dev, page, 0, DPAA2_ETH_RX_BUF_SIZE,
+ addr = dma_map_page(dev, page, 0, priv->rx_buf_size,
DMA_BIDIRECTIONAL);
if (unlikely(dma_mapping_error(dev, addr)))
goto err_map;
/* tracing point */
trace_dpaa2_eth_buf_seed(priv->net_dev,
page, DPAA2_ETH_RX_BUF_RAW_SIZE,
- addr, DPAA2_ETH_RX_BUF_SIZE,
+ addr, priv->rx_buf_size,
bpid);
}
int mfl, linear_mfl;
mfl = DPAA2_ETH_L2_MAX_FRM(mtu);
- linear_mfl = DPAA2_ETH_RX_BUF_SIZE - DPAA2_ETH_RX_HWA_SIZE -
+ linear_mfl = priv->rx_buf_size - DPAA2_ETH_RX_HWA_SIZE -
dpaa2_eth_rx_head_room(priv) - XDP_PACKET_HEADROOM;
if (mfl > linear_mfl) {
else
rx_buf_align = DPAA2_ETH_RX_BUF_ALIGN;
+ /* We need to ensure that the buffer size seen by WRIOP is a multiple
+ * of 64 or 256 bytes depending on the WRIOP version.
+ */
+ priv->rx_buf_size = ALIGN_DOWN(DPAA2_ETH_RX_BUF_SIZE, rx_buf_align);
+
/* tx buffer */
buf_layout.private_data_size = DPAA2_ETH_SWA_SIZE;
buf_layout.pass_timestamp = true;
pools_params.num_dpbp = 1;
pools_params.pools[0].dpbp_id = priv->dpbp_dev->obj_desc.id;
pools_params.pools[0].backup_pool = 0;
- pools_params.pools[0].buffer_size = DPAA2_ETH_RX_BUF_SIZE;
+ pools_params.pools[0].buffer_size = priv->rx_buf_size;
err = dpni_set_pools(priv->mc_io, 0, priv->mc_token, &pools_params);
if (err) {
dev_err(dev, "dpni_set_pools() failed\n");
u16 tx_data_offset;
struct fsl_mc_device *dpbp_dev;
+ u16 rx_buf_size;
u16 bpid;
struct iommu_domain *iommu_domain;
static int update_cls_rule(struct net_device *net_dev,
struct ethtool_rx_flow_spec *new_fs,
- int location)
+ unsigned int location)
{
struct dpaa2_eth_priv *priv = netdev_priv(net_dev);
struct dpaa2_eth_cls_rule *rule;
#include <soc/fsl/qe/ucc.h>
#include <soc/fsl/qe/ucc_fast.h>
#include <asm/machdep.h>
+#include <net/sch_generic.h>
#include "ucc_geth.h"
static void ugeth_quiesce(struct ucc_geth_private *ugeth)
{
- /* Prevent any further xmits, plus detach the device. */
- netif_device_detach(ugeth->ndev);
-
- /* Wait for any current xmits to finish. */
- netif_tx_disable(ugeth->ndev);
+ /* Prevent any further xmits */
+ netif_tx_stop_all_queues(ugeth->ndev);
/* Disable the interrupt to avoid NAPI rescheduling. */
disable_irq(ugeth->ug_info->uf_info.irq);
{
napi_enable(&ugeth->napi);
enable_irq(ugeth->ug_info->uf_info.irq);
- netif_device_attach(ugeth->ndev);
+
+ /* allow to xmit again */
+ netif_tx_wake_all_queues(ugeth->ndev);
+ __netdev_watchdog_up(ugeth->ndev);
}
/* Called every time the controller might need to be made
the PHY
config HNS
- tristate "Hisilicon Network Subsystem Support (Framework)"
+ tristate
---help---
This selects the framework support for Hisilicon Network Subsystem. It
is needed by any driver which provides HNS acceleration engine or make
#define MGMT_MSG_TIMEOUT 5000
+#define SET_FUNC_PORT_MGMT_TIMEOUT 25000
+
#define mgmt_to_pfhwdev(pf_mgmt) \
container_of(pf_mgmt, struct hinic_pfhwdev, pf_to_mgmt)
u8 *buf_in, u16 in_size,
u8 *buf_out, u16 *out_size,
enum mgmt_direction_type direction,
- u16 resp_msg_id)
+ u16 resp_msg_id, u32 timeout)
{
struct hinic_hwif *hwif = pf_to_mgmt->hwif;
struct pci_dev *pdev = hwif->pdev;
struct hinic_recv_msg *recv_msg;
struct completion *recv_done;
+ unsigned long timeo;
u16 msg_id;
int err;
goto unlock_sync_msg;
}
- if (!wait_for_completion_timeout(recv_done,
- msecs_to_jiffies(MGMT_MSG_TIMEOUT))) {
+ timeo = msecs_to_jiffies(timeout ? timeout : MGMT_MSG_TIMEOUT);
+
+ if (!wait_for_completion_timeout(recv_done, timeo)) {
dev_err(&pdev->dev, "MGMT timeout, MSG id = %d\n", msg_id);
err = -ETIMEDOUT;
goto unlock_sync_msg;
{
struct hinic_hwif *hwif = pf_to_mgmt->hwif;
struct pci_dev *pdev = hwif->pdev;
+ u32 timeout = 0;
if (sync != HINIC_MGMT_MSG_SYNC) {
dev_err(&pdev->dev, "Invalid MGMT msg type\n");
return -EINVAL;
}
+ if (cmd == HINIC_PORT_CMD_SET_FUNC_STATE)
+ timeout = SET_FUNC_PORT_MGMT_TIMEOUT;
+
return msg_to_mgmt_sync(pf_to_mgmt, mod, cmd, buf_in, in_size,
buf_out, out_size, MGMT_DIRECT_SEND,
- MSG_NOT_RESP);
+ MSG_NOT_RESP, timeout);
}
/**
{
struct hinic_dev *nic_dev = netdev_priv(netdev);
unsigned int flags;
- int err;
down(&nic_dev->mgmt_lock);
up(&nic_dev->mgmt_lock);
- err = hinic_port_set_func_state(nic_dev, HINIC_FUNC_PORT_DISABLE);
- if (err) {
- netif_err(nic_dev, drv, netdev,
- "Failed to set func port state\n");
- nic_dev->flags |= (flags & HINIC_INTF_UP);
- return err;
- }
+ hinic_port_set_state(nic_dev, HINIC_PORT_DISABLE);
- err = hinic_port_set_state(nic_dev, HINIC_PORT_DISABLE);
- if (err) {
- netif_err(nic_dev, drv, netdev, "Failed to set port state\n");
- nic_dev->flags |= (flags & HINIC_INTF_UP);
- return err;
- }
+ hinic_port_set_func_state(nic_dev, HINIC_FUNC_PORT_DISABLE);
if (nic_dev->flags & HINIC_RSS_ENABLE) {
hinic_rss_deinit(nic_dev);
dev_err(dev, "Error %ld in VERSION_EXCHG_RSP\n", rc);
break;
}
- dev_info(dev, "Partner protocol version is %d\n",
- crq->version_exchange_rsp.version);
- if (be16_to_cpu(crq->version_exchange_rsp.version) <
- ibmvnic_version)
- ibmvnic_version =
+ ibmvnic_version =
be16_to_cpu(crq->version_exchange_rsp.version);
+ dev_info(dev, "Partner protocol version is %d\n",
+ ibmvnic_version);
send_cap_queries(adapter);
break;
case QUERY_CAPABILITY_RSP:
(port->first_rxq >> MVPP2_CLS_OVERSIZE_RXQ_LOW_BITS));
val = mvpp2_read(port->priv, MVPP2_CLS_SWFWD_PCTRL_REG);
- val |= MVPP2_CLS_SWFWD_PCTRL_MASK(port->id);
+ val &= ~MVPP2_CLS_SWFWD_PCTRL_MASK(port->id);
mvpp2_write(port->priv, MVPP2_CLS_SWFWD_PCTRL_REG, val);
}
hw->irq_name = devm_kmalloc_array(&hw->pdev->dev, num_vec, NAME_SIZE,
GFP_KERNEL);
- if (!hw->irq_name)
+ if (!hw->irq_name) {
+ err = -ENOMEM;
goto err_free_netdev;
+ }
hw->affinity_mask = devm_kcalloc(&hw->pdev->dev, num_vec,
sizeof(cpumask_var_t), GFP_KERNEL);
- if (!hw->affinity_mask)
+ if (!hw->affinity_mask) {
+ err = -ENOMEM;
goto err_free_netdev;
+ }
err = pci_alloc_irq_vectors(hw->pdev, num_vec, num_vec, PCI_IRQ_MSIX);
if (err < 0) {
pep->base = devm_platform_ioremap_resource(pdev, 0);
if (IS_ERR(pep->base)) {
- err = -ENOMEM;
+ err = PTR_ERR(pep->base);
goto err_netdev;
}
if (err) {
mlx4_err(dev, "Failed to retrieve required operation: %d\n",
err);
- return;
+ goto out;
}
MLX4_GET(modifier, outbox, GET_OP_REQ_MODIFIER_OFFSET);
MLX4_GET(token, outbox, GET_OP_REQ_TOKEN_OFFSET);
config MLX5_TC_CT
bool "MLX5 TC connection tracking offload support"
- depends on MLX5_CORE_EN && NET_SWITCHDEV && NF_FLOW_TABLE && NET_ACT_CT && NET_TC_SKB_EXT
+ depends on MLX5_ESWITCH && NF_FLOW_TABLE && NET_ACT_CT && NET_TC_SKB_EXT
default y
help
Say Y here if you want to support offloading connection tracking rules
static void mlx5_free_cmd_msg(struct mlx5_core_dev *dev,
struct mlx5_cmd_msg *msg);
+static bool opcode_allowed(struct mlx5_cmd *cmd, u16 opcode)
+{
+ if (cmd->allowed_opcode == CMD_ALLOWED_OPCODE_ALL)
+ return true;
+
+ return cmd->allowed_opcode == opcode;
+}
+
static void cmd_work_handler(struct work_struct *work)
{
struct mlx5_cmd_work_ent *ent = container_of(work, struct mlx5_cmd_work_ent, work);
int alloc_ret;
int cmd_mode;
+ complete(&ent->handling);
sem = ent->page_queue ? &cmd->pages_sem : &cmd->sem;
down(sem);
if (!ent->page_queue) {
/* Skip sending command to fw if internal error */
if (pci_channel_offline(dev->pdev) ||
- dev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR) {
+ dev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR ||
+ cmd->state != MLX5_CMDIF_STATE_UP ||
+ !opcode_allowed(&dev->cmd, ent->op)) {
u8 status = 0;
u32 drv_synd;
struct mlx5_cmd *cmd = &dev->cmd;
int err;
+ if (!wait_for_completion_timeout(&ent->handling, timeout) &&
+ cancel_work_sync(&ent->work)) {
+ ent->ret = -ECANCELED;
+ goto out_err;
+ }
if (cmd->mode == CMD_MODE_POLLING || ent->polling) {
wait_for_completion(&ent->done);
} else if (!wait_for_completion_timeout(&ent->done, timeout)) {
mlx5_cmd_comp_handler(dev, 1UL << ent->idx, true);
}
+out_err:
err = ent->ret;
if (err == -ETIMEDOUT) {
mlx5_core_warn(dev, "%s(0x%x) timeout. Will cause a leak of a command resource\n",
mlx5_command_str(msg_to_opcode(ent->in)),
msg_to_opcode(ent->in));
+ } else if (err == -ECANCELED) {
+ mlx5_core_warn(dev, "%s(0x%x) canceled on out of queue timeout.\n",
+ mlx5_command_str(msg_to_opcode(ent->in)),
+ msg_to_opcode(ent->in));
}
mlx5_core_dbg(dev, "err %d, delivery status %s(%d)\n",
err, deliv_status_to_str(ent->status), ent->status);
ent->token = token;
ent->polling = force_polling;
+ init_completion(&ent->handling);
if (!callback)
init_completion(&ent->done);
err = wait_func(dev, ent);
if (err == -ETIMEDOUT)
goto out;
+ if (err == -ECANCELED)
+ goto out_free;
ds = ent->ts2 - ent->ts1;
op = MLX5_GET(mbox_in, in->first.data, opcode);
mlx5_cmdif_debugfs_init(dev);
}
+void mlx5_cmd_allowed_opcode(struct mlx5_core_dev *dev, u16 opcode)
+{
+ struct mlx5_cmd *cmd = &dev->cmd;
+ int i;
+
+ for (i = 0; i < cmd->max_reg_cmds; i++)
+ down(&cmd->sem);
+ down(&cmd->pages_sem);
+
+ cmd->allowed_opcode = opcode;
+
+ up(&cmd->pages_sem);
+ for (i = 0; i < cmd->max_reg_cmds; i++)
+ up(&cmd->sem);
+}
+
static void mlx5_cmd_change_mod(struct mlx5_core_dev *dev, int mode)
{
struct mlx5_cmd *cmd = &dev->cmd;
int err;
u8 status = 0;
u32 drv_synd;
+ u16 opcode;
u8 token;
+ opcode = MLX5_GET(mbox_in, in, opcode);
if (pci_channel_offline(dev->pdev) ||
- dev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR) {
- u16 opcode = MLX5_GET(mbox_in, in, opcode);
-
+ dev->state == MLX5_DEVICE_STATE_INTERNAL_ERROR ||
+ dev->cmd.state != MLX5_CMDIF_STATE_UP ||
+ !opcode_allowed(&dev->cmd, opcode)) {
err = mlx5_internal_err_ret_value(dev, opcode, &drv_synd, &status);
MLX5_SET(mbox_out, out, status, status);
MLX5_SET(mbox_out, out, syndrome, drv_synd);
goto err_free_page;
}
+ cmd->state = MLX5_CMDIF_STATE_DOWN;
cmd->checksum_disabled = 1;
cmd->max_reg_cmds = (1 << cmd->log_sz) - 1;
cmd->bitmask = (1UL << cmd->max_reg_cmds) - 1;
mlx5_core_dbg(dev, "descriptor at dma 0x%llx\n", (unsigned long long)(cmd->dma));
cmd->mode = CMD_MODE_POLLING;
+ cmd->allowed_opcode = CMD_ALLOWED_OPCODE_ALL;
create_msg_cache(dev);
dma_pool_destroy(cmd->pool);
}
EXPORT_SYMBOL(mlx5_cmd_cleanup);
+
+void mlx5_cmd_set_state(struct mlx5_core_dev *dev,
+ enum mlx5_cmdif_state cmdif_state)
+{
+ dev->cmd.state = cmdif_state;
+}
+EXPORT_SYMBOL(mlx5_cmd_set_state);
void mlx5e_build_default_indir_rqt(u32 *indirection_rqt, int len,
int num_channels);
-void mlx5e_set_tx_cq_mode_params(struct mlx5e_params *params,
- u8 cq_period_mode);
-void mlx5e_set_rx_cq_mode_params(struct mlx5e_params *params,
- u8 cq_period_mode);
+
+void mlx5e_reset_tx_moderation(struct mlx5e_params *params, u8 cq_period_mode);
+void mlx5e_reset_rx_moderation(struct mlx5e_params *params, u8 cq_period_mode);
+void mlx5e_set_tx_cq_mode_params(struct mlx5e_params *params, u8 cq_period_mode);
+void mlx5e_set_rx_cq_mode_params(struct mlx5e_params *params, u8 cq_period_mode);
+
void mlx5e_set_rq_type(struct mlx5_core_dev *mdev, struct mlx5e_params *params);
void mlx5e_init_rq_type_params(struct mlx5_core_dev *mdev,
struct mlx5e_params *params);
int mlx5e_create_indirect_rqt(struct mlx5e_priv *priv);
int mlx5e_create_indirect_tirs(struct mlx5e_priv *priv, bool inner_ttc);
-void mlx5e_destroy_indirect_tirs(struct mlx5e_priv *priv, bool inner_ttc);
+void mlx5e_destroy_indirect_tirs(struct mlx5e_priv *priv);
int mlx5e_create_direct_rqts(struct mlx5e_priv *priv, struct mlx5e_tir *tirs);
void mlx5e_destroy_direct_rqts(struct mlx5e_priv *priv, struct mlx5e_tir *tirs);
*_policy = MLX5_GET(pplm_reg, _buf, fec_override_admin_##link); \
} while (0)
-#define MLX5E_FEC_OVERRIDE_ADMIN_50G_POLICY(buf, policy, write, link) \
- do { \
- u16 *__policy = &(policy); \
- bool _write = (write); \
- \
- if (_write && *__policy) \
- *__policy = find_first_bit((u_long *)__policy, \
- sizeof(u16) * BITS_PER_BYTE);\
- MLX5E_FEC_OVERRIDE_ADMIN_POLICY(buf, *__policy, _write, link); \
- if (!_write && *__policy) \
- *__policy = 1 << *__policy; \
+#define MLX5E_FEC_OVERRIDE_ADMIN_50G_POLICY(buf, policy, write, link) \
+ do { \
+ unsigned long policy_long; \
+ u16 *__policy = &(policy); \
+ bool _write = (write); \
+ \
+ policy_long = *__policy; \
+ if (_write && *__policy) \
+ *__policy = find_first_bit(&policy_long, \
+ sizeof(policy_long) * BITS_PER_BYTE);\
+ MLX5E_FEC_OVERRIDE_ADMIN_POLICY(buf, *__policy, _write, link); \
+ if (!_write && *__policy) \
+ *__policy = 1 << *__policy; \
} while (0)
/* get/set FEC admin field for a given speed */
struct netlink_ext_ack *extack)
{
struct mlx5_tc_ct_priv *ct_priv = mlx5_tc_ct_get_ct_priv(priv);
+ struct flow_rule *rule = flow_cls_offload_flow_rule(f);
struct flow_dissector_key_ct *mask, *key;
bool trk, est, untrk, unest, new;
u32 ctstate = 0, ctstate_mask = 0;
u16 ct_state, ct_state_mask;
struct flow_match_ct match;
- if (!flow_rule_match_key(f->rule, FLOW_DISSECTOR_KEY_CT))
+ if (!flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_CT))
return 0;
if (!ct_priv) {
return -EOPNOTSUPP;
}
- flow_rule_match_ct(f->rule, &match);
+ flow_rule_match_ct(rule, &match);
key = match.key;
mask = match.mask;
struct flow_cls_offload *f,
struct netlink_ext_ack *extack)
{
- if (!flow_rule_match_key(f->rule, FLOW_DISSECTOR_KEY_CT))
+ struct flow_rule *rule = flow_cls_offload_flow_rule(f);
+
+ if (!flow_rule_match_key(rule, FLOW_DISSECTOR_KEY_CT))
return 0;
NL_SET_ERR_MSG_MOD(extack, "mlx5 tc ct offload isn't enabled.");
struct mlx5e_ktls_offload_context_tx *tx_priv =
mlx5e_get_ktls_tx_priv_ctx(tls_ctx);
- mlx5_ktls_destroy_key(priv->mdev, tx_priv->key_id);
mlx5e_destroy_tis(priv->mdev, tx_priv->tisn);
+ mlx5_ktls_destroy_key(priv->mdev, tx_priv->key_id);
kvfree(tx_priv);
}
struct dim_cq_moder *rx_moder, *tx_moder;
struct mlx5_core_dev *mdev = priv->mdev;
struct mlx5e_channels new_channels = {};
+ bool reset_rx, reset_tx;
int err = 0;
- bool reset;
if (!MLX5_CAP_GEN(mdev, cq_moderation))
return -EOPNOTSUPP;
}
/* we are opened */
- reset = (!!coal->use_adaptive_rx_coalesce != priv->channels.params.rx_dim_enabled) ||
- (!!coal->use_adaptive_tx_coalesce != priv->channels.params.tx_dim_enabled);
+ reset_rx = !!coal->use_adaptive_rx_coalesce != priv->channels.params.rx_dim_enabled;
+ reset_tx = !!coal->use_adaptive_tx_coalesce != priv->channels.params.tx_dim_enabled;
- if (!reset) {
+ if (!reset_rx && !reset_tx) {
mlx5e_set_priv_channels_coalesce(priv, coal);
priv->channels.params = new_channels.params;
goto out;
}
+ if (reset_rx) {
+ u8 mode = MLX5E_GET_PFLAG(&new_channels.params,
+ MLX5E_PFLAG_RX_CQE_BASED_MODER);
+
+ mlx5e_reset_rx_moderation(&new_channels.params, mode);
+ }
+ if (reset_tx) {
+ u8 mode = MLX5E_GET_PFLAG(&new_channels.params,
+ MLX5E_PFLAG_TX_CQE_BASED_MODER);
+
+ mlx5e_reset_tx_moderation(&new_channels.params, mode);
+ }
+
err = mlx5e_safe_switch_channels(priv, &new_channels, NULL, NULL);
out:
static int get_fec_supported_advertised(struct mlx5_core_dev *dev,
struct ethtool_link_ksettings *link_ksettings)
{
- u_long active_fec = 0;
+ unsigned long active_fec_long;
+ u32 active_fec;
u32 bitn;
int err;
- err = mlx5e_get_fec_mode(dev, (u32 *)&active_fec, NULL);
+ err = mlx5e_get_fec_mode(dev, &active_fec, NULL);
if (err)
return (err == -EOPNOTSUPP) ? 0 : err;
MLX5E_ADVERTISE_SUPPORTED_FEC(MLX5E_FEC_LLRS_272_257_1,
ETHTOOL_LINK_MODE_FEC_LLRS_BIT);
+ active_fec_long = active_fec;
/* active fec is a bit set, find out which bit is set and
* advertise the corresponding ethtool bit
*/
- bitn = find_first_bit(&active_fec, sizeof(u32) * BITS_PER_BYTE);
+ bitn = find_first_bit(&active_fec_long, sizeof(active_fec_long) * BITS_PER_BYTE);
if (bitn < ARRAY_SIZE(pplm_fec_2_ethtool_linkmodes))
__set_bit(pplm_fec_2_ethtool_linkmodes[bitn],
link_ksettings->link_modes.advertising);
{
struct mlx5e_priv *priv = netdev_priv(netdev);
struct mlx5_core_dev *mdev = priv->mdev;
- u16 fec_configured = 0;
- u32 fec_active = 0;
+ u16 fec_configured;
+ u32 fec_active;
int err;
err = mlx5e_get_fec_mode(mdev, &fec_active, &fec_configured);
if (err)
return err;
- fecparam->active_fec = pplm2ethtool_fec((u_long)fec_active,
- sizeof(u32) * BITS_PER_BYTE);
+ fecparam->active_fec = pplm2ethtool_fec((unsigned long)fec_active,
+ sizeof(unsigned long) * BITS_PER_BYTE);
if (!fecparam->active_fec)
return -EOPNOTSUPP;
- fecparam->fec = pplm2ethtool_fec((u_long)fec_configured,
- sizeof(u16) * BITS_PER_BYTE);
+ fecparam->fec = pplm2ethtool_fec((unsigned long)fec_configured,
+ sizeof(unsigned long) * BITS_PER_BYTE);
return 0;
}
mlx5_core_modify_tir(mdev, priv->indir_tir[tt].tirn, in, inlen);
}
- if (!mlx5e_tunnel_inner_ft_supported(priv->mdev))
+ /* Verify inner tirs resources allocated */
+ if (!priv->inner_indir_tir[0].tirn)
return;
for (tt = 0; tt < MLX5E_NUM_INDIR_TIRS; tt++) {
return err;
}
-void mlx5e_destroy_indirect_tirs(struct mlx5e_priv *priv, bool inner_ttc)
+void mlx5e_destroy_indirect_tirs(struct mlx5e_priv *priv)
{
int i;
for (i = 0; i < MLX5E_NUM_INDIR_TIRS; i++)
mlx5e_destroy_tir(priv->mdev, &priv->indir_tir[i]);
- if (!inner_ttc || !mlx5e_tunnel_inner_ft_supported(priv->mdev))
+ /* Verify inner tirs resources allocated */
+ if (!priv->inner_indir_tir[0].tirn)
return;
for (i = 0; i < MLX5E_NUM_INDIR_TIRS; i++)
DIM_CQ_PERIOD_MODE_START_FROM_EQE;
}
-void mlx5e_set_tx_cq_mode_params(struct mlx5e_params *params, u8 cq_period_mode)
+void mlx5e_reset_tx_moderation(struct mlx5e_params *params, u8 cq_period_mode)
{
if (params->tx_dim_enabled) {
u8 dim_period_mode = mlx5_to_net_dim_cq_period_mode(cq_period_mode);
} else {
params->tx_cq_moderation = mlx5e_get_def_tx_moderation(cq_period_mode);
}
-
- MLX5E_SET_PFLAG(params, MLX5E_PFLAG_TX_CQE_BASED_MODER,
- params->tx_cq_moderation.cq_period_mode ==
- MLX5_CQ_PERIOD_MODE_START_FROM_CQE);
}
-void mlx5e_set_rx_cq_mode_params(struct mlx5e_params *params, u8 cq_period_mode)
+void mlx5e_reset_rx_moderation(struct mlx5e_params *params, u8 cq_period_mode)
{
if (params->rx_dim_enabled) {
u8 dim_period_mode = mlx5_to_net_dim_cq_period_mode(cq_period_mode);
} else {
params->rx_cq_moderation = mlx5e_get_def_rx_moderation(cq_period_mode);
}
+}
+void mlx5e_set_tx_cq_mode_params(struct mlx5e_params *params, u8 cq_period_mode)
+{
+ mlx5e_reset_tx_moderation(params, cq_period_mode);
+ MLX5E_SET_PFLAG(params, MLX5E_PFLAG_TX_CQE_BASED_MODER,
+ params->tx_cq_moderation.cq_period_mode ==
+ MLX5_CQ_PERIOD_MODE_START_FROM_CQE);
+}
+
+void mlx5e_set_rx_cq_mode_params(struct mlx5e_params *params, u8 cq_period_mode)
+{
+ mlx5e_reset_rx_moderation(params, cq_period_mode);
MLX5E_SET_PFLAG(params, MLX5E_PFLAG_RX_CQE_BASED_MODER,
params->rx_cq_moderation.cq_period_mode ==
MLX5_CQ_PERIOD_MODE_START_FROM_CQE);
err_destroy_direct_tirs:
mlx5e_destroy_direct_tirs(priv, priv->direct_tir);
err_destroy_indirect_tirs:
- mlx5e_destroy_indirect_tirs(priv, true);
+ mlx5e_destroy_indirect_tirs(priv);
err_destroy_direct_rqts:
mlx5e_destroy_direct_rqts(priv, priv->direct_tir);
err_destroy_indirect_rqts:
mlx5e_destroy_direct_tirs(priv, priv->xsk_tir);
mlx5e_destroy_direct_rqts(priv, priv->xsk_tir);
mlx5e_destroy_direct_tirs(priv, priv->direct_tir);
- mlx5e_destroy_indirect_tirs(priv, true);
+ mlx5e_destroy_indirect_tirs(priv);
mlx5e_destroy_direct_rqts(priv, priv->direct_tir);
mlx5e_destroy_rqt(priv, &priv->indir_rqt);
mlx5e_close_drop_rq(&priv->drop_rq);
return netdev->netdev_ops == &mlx5e_netdev_ops_uplink_rep;
}
-bool mlx5e_eswitch_rep(struct net_device *netdev)
+bool mlx5e_eswitch_vf_rep(struct net_device *netdev)
{
- if (netdev->netdev_ops == &mlx5e_netdev_ops_rep ||
- netdev->netdev_ops == &mlx5e_netdev_ops_uplink_rep)
- return true;
-
- return false;
+ return netdev->netdev_ops == &mlx5e_netdev_ops_rep;
}
static void mlx5e_build_rep_params(struct net_device *netdev)
err_destroy_direct_tirs:
mlx5e_destroy_direct_tirs(priv, priv->direct_tir);
err_destroy_indirect_tirs:
- mlx5e_destroy_indirect_tirs(priv, false);
+ mlx5e_destroy_indirect_tirs(priv);
err_destroy_direct_rqts:
mlx5e_destroy_direct_rqts(priv, priv->direct_tir);
err_destroy_indirect_rqts:
mlx5e_destroy_rep_root_ft(priv);
mlx5e_destroy_ttc_table(priv, &priv->fs.ttc);
mlx5e_destroy_direct_tirs(priv, priv->direct_tir);
- mlx5e_destroy_indirect_tirs(priv, false);
+ mlx5e_destroy_indirect_tirs(priv);
mlx5e_destroy_direct_rqts(priv, priv->direct_tir);
mlx5e_destroy_rqt(priv, &priv->indir_rqt);
mlx5e_close_drop_rq(&priv->drop_rq);
void mlx5e_rep_queue_neigh_stats_work(struct mlx5e_priv *priv);
-bool mlx5e_eswitch_rep(struct net_device *netdev);
+bool mlx5e_eswitch_vf_rep(struct net_device *netdev);
bool mlx5e_eswitch_uplink_rep(struct net_device *netdev);
+static inline bool mlx5e_eswitch_rep(struct net_device *netdev)
+{
+ return mlx5e_eswitch_vf_rep(netdev) ||
+ mlx5e_eswitch_uplink_rep(netdev);
+}
#else /* CONFIG_MLX5_ESWITCH */
static inline bool mlx5e_is_uplink_rep(struct mlx5e_priv *priv) { return false; }
flow_rule_match_meta(rule, &match);
if (match.mask->ingress_ifindex != 0xFFFFFFFF) {
NL_SET_ERR_MSG_MOD(extack, "Unsupported ingress ifindex mask");
- return -EINVAL;
+ return -EOPNOTSUPP;
}
ingress_dev = __dev_get_by_index(dev_net(filter_dev),
if (!ingress_dev) {
NL_SET_ERR_MSG_MOD(extack,
"Can't find the ingress port to match on");
- return -EINVAL;
+ return -ENOENT;
}
if (ingress_dev != filter_dev) {
NL_SET_ERR_MSG_MOD(extack,
"Can't match on the ingress filter port");
- return -EINVAL;
+ return -EOPNOTSUPP;
}
return 0;
return true;
}
+static bool same_port_devs(struct mlx5e_priv *priv, struct mlx5e_priv *peer_priv)
+{
+ return priv->mdev == peer_priv->mdev;
+}
+
static bool same_hw_devs(struct mlx5e_priv *priv, struct mlx5e_priv *peer_priv)
{
struct mlx5_core_dev *fmdev, *pmdev;
}
-static bool is_merged_eswitch_dev(struct mlx5e_priv *priv,
+static bool is_merged_eswitch_vfs(struct mlx5e_priv *priv,
struct net_device *peer_netdev)
{
struct mlx5e_priv *peer_priv;
peer_priv = netdev_priv(peer_netdev);
return (MLX5_CAP_ESW(priv->mdev, merged_eswitch) &&
- mlx5e_eswitch_rep(priv->netdev) &&
- mlx5e_eswitch_rep(peer_netdev) &&
+ mlx5e_eswitch_vf_rep(priv->netdev) &&
+ mlx5e_eswitch_vf_rep(peer_netdev) &&
same_hw_devs(priv, peer_priv));
}
-
-
bool mlx5e_encap_take(struct mlx5e_encap_entry *e)
{
return refcount_inc_not_zero(&e->refcnt);
return err;
}
+static bool same_hw_reps(struct mlx5e_priv *priv,
+ struct net_device *peer_netdev)
+{
+ struct mlx5e_priv *peer_priv;
+
+ peer_priv = netdev_priv(peer_netdev);
+
+ return mlx5e_eswitch_rep(priv->netdev) &&
+ mlx5e_eswitch_rep(peer_netdev) &&
+ same_hw_devs(priv, peer_priv);
+}
+
+static bool is_lag_dev(struct mlx5e_priv *priv,
+ struct net_device *peer_netdev)
+{
+ return ((mlx5_lag_is_sriov(priv->mdev) ||
+ mlx5_lag_is_multipath(priv->mdev)) &&
+ same_hw_reps(priv, peer_netdev));
+}
+
bool mlx5e_is_valid_eswitch_fwd_dev(struct mlx5e_priv *priv,
struct net_device *out_dev)
{
- if (is_merged_eswitch_dev(priv, out_dev))
+ if (is_merged_eswitch_vfs(priv, out_dev))
+ return true;
+
+ if (is_lag_dev(priv, out_dev))
return true;
return mlx5e_eswitch_rep(out_dev) &&
- same_hw_devs(priv, netdev_priv(out_dev));
+ same_port_devs(priv, netdev_priv(out_dev));
}
static bool is_duplicated_output_device(struct net_device *dev,
if (!mlx5e_is_valid_eswitch_fwd_dev(priv, out_dev)) {
NL_SET_ERR_MSG_MOD(extack,
"devices are not on same switch HW, can't offload forwarding");
- netdev_warn(priv->netdev,
- "devices %s %s not on same switch HW, can't offload forwarding\n",
- priv->netdev->name,
- out_dev->name);
return -EOPNOTSUPP;
}
dpkts = cur_stats.rx_packets - rpriv->prev_vf_vport_stats.rx_packets;
dbytes = cur_stats.rx_bytes - rpriv->prev_vf_vport_stats.rx_bytes;
rpriv->prev_vf_vport_stats = cur_stats;
- flow_stats_update(&ma->stats, dpkts, dbytes, jiffies,
+ flow_stats_update(&ma->stats, dbytes, dpkts, jiffies,
FLOW_ACTION_HW_STATS_DELAYED);
}
void mlx5e_free_txqsq_descs(struct mlx5e_txqsq *sq)
{
struct mlx5e_tx_wqe_info *wi;
+ u32 dma_fifo_cc, nbytes = 0;
+ u16 ci, sqcc, npkts = 0;
struct sk_buff *skb;
- u32 dma_fifo_cc;
- u16 sqcc;
- u16 ci;
int i;
sqcc = sq->cc;
}
dev_kfree_skb_any(skb);
+ npkts++;
+ nbytes += wi->num_bytes;
sqcc += wi->num_wqebbs;
}
sq->dma_fifo_cc = dma_fifo_cc;
sq->cc = sqcc;
+
+ netdev_tx_completed_queue(sq->txq, npkts, nbytes);
}
#ifdef CONFIG_MLX5_CORE_IPOIB
.nent = MLX5_NUM_CMD_EQE,
.mask[0] = 1ull << MLX5_EVENT_TYPE_CMD,
};
+ mlx5_cmd_allowed_opcode(dev, MLX5_CMD_OP_CREATE_EQ);
err = setup_async_eq(dev, &table->cmd_eq, ¶m, "cmd");
if (err)
goto err1;
mlx5_cmd_use_events(dev);
+ mlx5_cmd_allowed_opcode(dev, CMD_ALLOWED_OPCODE_ALL);
param = (struct mlx5_eq_param) {
.irq_index = 0,
mlx5_cmd_use_polling(dev);
cleanup_async_eq(dev, &table->cmd_eq, "cmd");
err1:
+ mlx5_cmd_allowed_opcode(dev, CMD_ALLOWED_OPCODE_ALL);
mlx5_eq_notifier_unregister(dev, &table->cq_err_nb);
return err;
}
events->dev = dev;
dev->priv.events = events;
events->wq = create_singlethread_workqueue("mlx5_events");
- if (!events->wq)
+ if (!events->wq) {
+ kfree(events);
return -ENOMEM;
+ }
INIT_WORK(&events->pcie_core_work, mlx5_pcie_event);
return 0;
if (node->del_hw_func)
node->del_hw_func(node);
if (parent_node) {
- /* Only root namespace doesn't have parent and we just
- * need to free its node.
- */
down_write_ref_node(parent_node, locked);
list_del_init(&node->list);
- if (node->del_sw_func)
- node->del_sw_func(node);
- up_write_ref_node(parent_node, locked);
- } else {
- kfree(node);
}
+ node->del_sw_func(node);
+ if (parent_node)
+ up_write_ref_node(parent_node, locked);
node = NULL;
}
if (!node && parent_node)
fs_get_obj(ft, node);
rhltable_destroy(&ft->fgs_hash);
- fs_get_obj(prio, ft->node.parent);
- prio->num_ft--;
+ if (ft->node.parent) {
+ fs_get_obj(prio, ft->node.parent);
+ prio->num_ft--;
+ }
kfree(ft);
}
return 0;
}
+static void del_sw_root_ns(struct fs_node *node)
+{
+ struct mlx5_flow_root_namespace *root_ns;
+ struct mlx5_flow_namespace *ns;
+
+ fs_get_obj(ns, node);
+ root_ns = container_of(ns, struct mlx5_flow_root_namespace, ns);
+ mutex_destroy(&root_ns->chain_lock);
+ kfree(node);
+}
+
static struct mlx5_flow_root_namespace
*create_root_ns(struct mlx5_flow_steering *steering,
enum fs_flow_table_type table_type)
ns = &root_ns->ns;
fs_init_namespace(ns);
mutex_init(&root_ns->chain_lock);
- tree_init_node(&ns->node, NULL, NULL);
+ tree_init_node(&ns->node, NULL, del_sw_root_ns);
tree_add_node(&ns->node, NULL);
return root_ns;
err_destroy_direct_tirs:
mlx5e_destroy_direct_tirs(priv, priv->direct_tir);
err_destroy_indirect_tirs:
- mlx5e_destroy_indirect_tirs(priv, true);
+ mlx5e_destroy_indirect_tirs(priv);
err_destroy_direct_rqts:
mlx5e_destroy_direct_rqts(priv, priv->direct_tir);
err_destroy_indirect_rqts:
{
mlx5i_destroy_flow_steering(priv);
mlx5e_destroy_direct_tirs(priv, priv->direct_tir);
- mlx5e_destroy_indirect_tirs(priv, true);
+ mlx5e_destroy_indirect_tirs(priv);
mlx5e_destroy_direct_rqts(priv, priv->direct_tir);
mlx5e_destroy_rqt(priv, &priv->indir_rqt);
mlx5e_close_drop_rq(&priv->drop_rq);
goto err_cmd_cleanup;
}
+ mlx5_cmd_set_state(dev, MLX5_CMDIF_STATE_UP);
+
err = mlx5_core_enable_hca(dev, 0);
if (err) {
mlx5_core_err(dev, "enable hca failed\n");
err_disable_hca:
mlx5_core_disable_hca(dev, 0);
err_cmd_cleanup:
+ mlx5_cmd_set_state(dev, MLX5_CMDIF_STATE_DOWN);
mlx5_cmd_cleanup(dev);
return err;
}
mlx5_reclaim_startup_pages(dev);
mlx5_core_disable_hca(dev, 0);
+ mlx5_cmd_set_state(dev, MLX5_CMDIF_STATE_DOWN);
mlx5_cmd_cleanup(dev);
return 0;
err = mlx5_function_setup(dev, boot);
if (err)
- goto out;
+ goto err_function;
if (boot) {
err = mlx5_init_once(dev);
mlx5_cleanup_once(dev);
function_teardown:
mlx5_function_teardown(dev, boot);
+err_function:
dev->state = MLX5_DEVICE_STATE_INTERNAL_ERROR;
mutex_unlock(&dev->intf_state_mutex);
mlx5_pci_disable_device(dev);
}
+static int mlx5_suspend(struct pci_dev *pdev, pm_message_t state)
+{
+ struct mlx5_core_dev *dev = pci_get_drvdata(pdev);
+
+ mlx5_unload_one(dev, false);
+
+ return 0;
+}
+
+static int mlx5_resume(struct pci_dev *pdev)
+{
+ struct mlx5_core_dev *dev = pci_get_drvdata(pdev);
+
+ return mlx5_load_one(dev, false);
+}
+
static const struct pci_device_id mlx5_core_pci_table[] = {
{ PCI_VDEVICE(MELLANOX, PCI_DEVICE_ID_MELLANOX_CONNECTIB) },
{ PCI_VDEVICE(MELLANOX, 0x1012), MLX5_PCI_DEV_IS_VF}, /* Connect-IB VF */
.id_table = mlx5_core_pci_table,
.probe = init_one,
.remove = remove_one,
+ .suspend = mlx5_suspend,
+ .resume = mlx5_resume,
.shutdown = shutdown,
.err_handler = &mlx5_err_handler,
.sriov_configure = mlx5_core_sriov_configure,
mlxsw_sp_port_remove(mlxsw_sp, i);
mlxsw_sp_cpu_port_remove(mlxsw_sp);
kfree(mlxsw_sp->ports);
+ mlxsw_sp->ports = NULL;
}
static int mlxsw_sp_ports_create(struct mlxsw_sp *mlxsw_sp)
mlxsw_sp_cpu_port_remove(mlxsw_sp);
err_cpu_port_create:
kfree(mlxsw_sp->ports);
+ mlxsw_sp->ports = NULL;
return err;
}
return mlxsw_core_res_get(mlxsw_core, local_ports_in_x_res_id);
}
+static struct mlxsw_sp_port *
+mlxsw_sp_port_get_by_local_port(struct mlxsw_sp *mlxsw_sp, u8 local_port)
+{
+ if (mlxsw_sp->ports && mlxsw_sp->ports[local_port])
+ return mlxsw_sp->ports[local_port];
+ return NULL;
+}
+
static int mlxsw_sp_port_split(struct mlxsw_core *mlxsw_core, u8 local_port,
unsigned int count,
struct netlink_ext_ack *extack)
int i;
int err;
- mlxsw_sp_port = mlxsw_sp->ports[local_port];
+ mlxsw_sp_port = mlxsw_sp_port_get_by_local_port(mlxsw_sp, local_port);
if (!mlxsw_sp_port) {
dev_err(mlxsw_sp->bus_info->dev, "Port number \"%d\" does not exist\n",
local_port);
int offset;
int i;
- mlxsw_sp_port = mlxsw_sp->ports[local_port];
+ mlxsw_sp_port = mlxsw_sp_port_get_by_local_port(mlxsw_sp, local_port);
if (!mlxsw_sp_port) {
dev_err(mlxsw_sp->bus_info->dev, "Port number \"%d\" does not exist\n",
local_port);
if (mlxsw_sx_port_created(mlxsw_sx, i))
mlxsw_sx_port_remove(mlxsw_sx, i);
kfree(mlxsw_sx->ports);
+ mlxsw_sx->ports = NULL;
}
static int mlxsw_sx_ports_create(struct mlxsw_sx *mlxsw_sx)
if (mlxsw_sx_port_created(mlxsw_sx, i))
mlxsw_sx_port_remove(mlxsw_sx, i);
kfree(mlxsw_sx->ports);
+ mlxsw_sx->ports = NULL;
return err;
}
u8 module, width;
int err;
+ if (!mlxsw_sx->ports || !mlxsw_sx->ports[local_port]) {
+ dev_err(mlxsw_sx->bus_info->dev, "Port number \"%d\" does not exist\n",
+ local_port);
+ return -EINVAL;
+ }
+
if (new_type == DEVLINK_PORT_TYPE_AUTO)
return -EOPNOTSUPP;
if (unlikely(ret)) {
netif_err(priv, probe, ndev, "Error %d initializing card encx24j600 card\n",
ret);
- goto out_free;
+ goto out_stop;
}
eidled = encx24j600_read_reg(priv, EIDLED);
out_unregister:
unregister_netdev(priv->ndev);
+out_stop:
+ kthread_stop(priv->kworker_task);
out_free:
free_netdev(ndev);
struct encx24j600_priv *priv = dev_get_drvdata(&spi->dev);
unregister_netdev(priv->ndev);
+ kthread_stop(priv->kworker_task);
free_netdev(priv->ndev);
unsigned long ageing_clock_t)
{
unsigned long ageing_jiffies = clock_t_to_jiffies(ageing_clock_t);
- u32 ageing_time = jiffies_to_msecs(ageing_jiffies) / 1000;
+ u32 ageing_time = jiffies_to_msecs(ageing_jiffies);
ocelot_set_ageing_time(ocelot, ageing_time);
}
goto err_free_alink;
alink->prio_map = kzalloc(abm->prio_map_len, GFP_KERNEL);
- if (!alink->prio_map)
+ if (!alink->prio_map) {
+ err = -ENOMEM;
goto err_free_alink;
+ }
/* This is a multi-host app, make sure MAC/PHY is up, but don't
* make the MAC/PHY state follow the state of any of the ports.
ctx_id = be32_to_cpu(sub_flow->meta.host_ctx_id);
priv->stats[ctx_id].pkts += pkts;
priv->stats[ctx_id].bytes += bytes;
- max_t(u64, priv->stats[ctx_id].used, used);
+ priv->stats[ctx_id].used = max_t(u64, used,
+ priv->stats[ctx_id].used);
}
}
dev_info(ionic->dev, "FW Up: restarting LIFs\n");
ionic_init_devinfo(ionic);
+ ionic_port_init(ionic);
err = ionic_qcqs_alloc(lif);
if (err)
goto err_out;
if (is_zero_ether_addr(ctx.comp.lif_getattr.mac))
return 0;
- if (!ether_addr_equal(ctx.comp.lif_getattr.mac, netdev->dev_addr)) {
+ if (!is_zero_ether_addr(netdev->dev_addr)) {
+ /* If the netdev mac is non-zero and doesn't match the default
+ * device address, it was set by something earlier and we're
+ * likely here again after a fw-upgrade reset. We need to be
+ * sure the netdev mac is in our filter list.
+ */
+ if (!ether_addr_equal(ctx.comp.lif_getattr.mac,
+ netdev->dev_addr))
+ ionic_lif_addr(lif, netdev->dev_addr, true);
+ } else {
+ /* Update the netdev mac with the device's mac */
memcpy(addr.sa_data, ctx.comp.lif_getattr.mac, netdev->addr_len);
addr.sa_family = AF_INET;
err = eth_prepare_mac_addr_change(netdev, &addr);
return 0;
}
- if (!is_zero_ether_addr(netdev->dev_addr)) {
- netdev_dbg(lif->netdev, "deleting station MAC addr %pM\n",
- netdev->dev_addr);
- ionic_lif_addr(lif, netdev->dev_addr, false);
- }
-
eth_commit_mac_addr_change(netdev, &addr);
}
size_t sz;
int err;
- if (idev->port_info)
- return 0;
-
- idev->port_info_sz = ALIGN(sizeof(*idev->port_info), PAGE_SIZE);
- idev->port_info = dma_alloc_coherent(ionic->dev, idev->port_info_sz,
- &idev->port_info_pa,
- GFP_KERNEL);
if (!idev->port_info) {
- dev_err(ionic->dev, "Failed to allocate port info, aborting\n");
- return -ENOMEM;
+ idev->port_info_sz = ALIGN(sizeof(*idev->port_info), PAGE_SIZE);
+ idev->port_info = dma_alloc_coherent(ionic->dev,
+ idev->port_info_sz,
+ &idev->port_info_pa,
+ GFP_KERNEL);
+ if (!idev->port_info) {
+ dev_err(ionic->dev, "Failed to allocate port info\n");
+ return -ENOMEM;
+ }
}
sz = min(sizeof(ident->port.config), sizeof(idev->dev_cmd_regs->data));
ahw->diag_cnt = 0;
ret = qlcnic_alloc_mbx_args(&cmd, adapter, QLCNIC_CMD_INTRPT_TEST);
if (ret)
- goto fail_diag_irq;
+ goto fail_mbx_args;
if (adapter->flags & QLCNIC_MSIX_ENABLED)
intrpt_id = ahw->intr_tbl[0].id;
done:
qlcnic_free_mbx_args(&cmd);
+
+fail_mbx_args:
qlcnic_83xx_diag_free_res(netdev, drv_sds_rings);
fail_diag_irq:
RTL_R32(tp, EPHYAR) & EPHYAR_DATA_MASK : ~0;
}
+static void r8168fp_adjust_ocp_cmd(struct rtl8169_private *tp, u32 *cmd, int type)
+{
+ /* based on RTL8168FP_OOBMAC_BASE in vendor driver */
+ if (tp->mac_version == RTL_GIGA_MAC_VER_52 && type == ERIAR_OOB)
+ *cmd |= 0x7f0 << 18;
+}
+
DECLARE_RTL_COND(rtl_eriar_cond)
{
return RTL_R32(tp, ERIAR) & ERIAR_FLAG;
static void _rtl_eri_write(struct rtl8169_private *tp, int addr, u32 mask,
u32 val, int type)
{
+ u32 cmd = ERIAR_WRITE_CMD | type | mask | addr;
+
BUG_ON((addr & 3) || (mask == 0));
RTL_W32(tp, ERIDR, val);
- RTL_W32(tp, ERIAR, ERIAR_WRITE_CMD | type | mask | addr);
+ r8168fp_adjust_ocp_cmd(tp, &cmd, type);
+ RTL_W32(tp, ERIAR, cmd);
rtl_udelay_loop_wait_low(tp, &rtl_eriar_cond, 100, 100);
}
static u32 _rtl_eri_read(struct rtl8169_private *tp, int addr, int type)
{
- RTL_W32(tp, ERIAR, ERIAR_READ_CMD | type | ERIAR_MASK_1111 | addr);
+ u32 cmd = ERIAR_READ_CMD | type | ERIAR_MASK_1111 | addr;
+
+ r8168fp_adjust_ocp_cmd(tp, &cmd, type);
+ RTL_W32(tp, ERIAR, cmd);
return rtl_udelay_loop_wait_high(tp, &rtl_eriar_cond, 100, 100) ?
RTL_R32(tp, ERIDR) : ~0;
{ 0x7cf, 0x348, RTL_GIGA_MAC_VER_07 },
{ 0x7cf, 0x248, RTL_GIGA_MAC_VER_07 },
{ 0x7cf, 0x340, RTL_GIGA_MAC_VER_13 },
+ /* RTL8401, reportedly works if treated as RTL8101e */
+ { 0x7cf, 0x240, RTL_GIGA_MAC_VER_13 },
{ 0x7cf, 0x343, RTL_GIGA_MAC_VER_10 },
{ 0x7cf, 0x342, RTL_GIGA_MAC_VER_16 },
{ 0x7c8, 0x348, RTL_GIGA_MAC_VER_09 },
ip = netdev_priv(dev);
ip->dma_dev = pdev->dev.parent;
ip->regs = devm_platform_ioremap_resource(pdev, 0);
- if (!ip->regs) {
- err = -ENOMEM;
+ if (IS_ERR(ip->regs)) {
+ err = PTR_ERR(ip->regs);
goto out_free;
}
ip->ssram = devm_platform_ioremap_resource(pdev, 1);
- if (!ip->ssram) {
- err = -ENOMEM;
+ if (IS_ERR(ip->ssram)) {
+ err = PTR_ERR(ip->ssram);
goto out_free;
}
retval = smsc911x_init(dev);
if (retval < 0)
- goto out_disable_resources;
+ goto out_init_fail;
netif_carrier_off(dev);
retval = smsc911x_mii_init(pdev, dev);
if (retval) {
SMSC_WARN(pdata, probe, "Error %i initialising mii", retval);
- goto out_disable_resources;
+ goto out_init_fail;
}
retval = register_netdev(dev);
if (retval) {
SMSC_WARN(pdata, probe, "Error %i registering device", retval);
- goto out_disable_resources;
+ goto out_init_fail;
} else {
SMSC_TRACE(pdata, probe,
"Network interface: \"%s\"", dev->name);
return 0;
-out_disable_resources:
+out_init_fail:
pm_runtime_put(&pdev->dev);
pm_runtime_disable(&pdev->dev);
+out_disable_resources:
(void)smsc911x_disable_resources(pdev);
out_enable_resources_fail:
smsc911x_free_resources(pdev);
/* Enable PTP clock */
regmap_read(gmac->nss_common, NSS_COMMON_CLK_GATE, &val);
val |= NSS_COMMON_CLK_GATE_PTP_EN(gmac->id);
+ switch (gmac->phy_mode) {
+ case PHY_INTERFACE_MODE_RGMII:
+ val |= NSS_COMMON_CLK_GATE_RGMII_RX_EN(gmac->id) |
+ NSS_COMMON_CLK_GATE_RGMII_TX_EN(gmac->id);
+ break;
+ case PHY_INTERFACE_MODE_SGMII:
+ val |= NSS_COMMON_CLK_GATE_GMII_RX_EN(gmac->id) |
+ NSS_COMMON_CLK_GATE_GMII_TX_EN(gmac->id);
+ break;
+ default:
+ /* We don't get here; the switch above will have errored out */
+ unreachable();
+ }
regmap_write(gmac->nss_common, NSS_COMMON_CLK_GATE, val);
if (gmac->phy_mode == PHY_INTERFACE_MODE_SGMII) {
unsigned int value;
};
+struct ethqos_emac_driver_data {
+ const struct ethqos_emac_por *por;
+ unsigned int num_por;
+};
+
struct qcom_ethqos {
struct platform_device *pdev;
void __iomem *rgmii_base;
{ .offset = RGMII_IO_MACRO_CONFIG2, .value = 0x00002060 },
};
+static const struct ethqos_emac_driver_data emac_v2_3_0_data = {
+ .por = emac_v2_3_0_por,
+ .num_por = ARRAY_SIZE(emac_v2_3_0_por),
+};
+
static int ethqos_dll_configure(struct qcom_ethqos *ethqos)
{
unsigned int val;
struct device_node *np = pdev->dev.of_node;
struct plat_stmmacenet_data *plat_dat;
struct stmmac_resources stmmac_res;
+ const struct ethqos_emac_driver_data *data;
struct qcom_ethqos *ethqos;
struct resource *res;
int ret;
goto err_mem;
}
- ethqos->por = of_device_get_match_data(&pdev->dev);
+ data = of_device_get_match_data(&pdev->dev);
+ ethqos->por = data->por;
+ ethqos->num_por = data->num_por;
ethqos->rgmii_clk = devm_clk_get(&pdev->dev, "rgmii");
if (IS_ERR(ethqos->rgmii_clk)) {
}
static const struct of_device_id qcom_ethqos_match[] = {
- { .compatible = "qcom,qcs404-ethqos", .data = &emac_v2_3_0_por},
+ { .compatible = "qcom,qcs404-ethqos", .data = &emac_v2_3_0_data},
{ }
};
MODULE_DEVICE_TABLE(of, qcom_ethqos_match);
config.rx_filter = HWTSTAMP_FILTER_PTP_V2_EVENT;
ptp_v2 = PTP_TCR_TSVER2ENA;
snap_type_sel = PTP_TCR_SNAPTYPSEL_1;
- ts_event_en = PTP_TCR_TSEVNTENA;
+ if (priv->synopsys_id != DWMAC_CORE_5_10)
+ ts_event_en = PTP_TCR_TSEVNTENA;
ptp_over_ipv4_udp = PTP_TCR_TSIPV4ENA;
ptp_over_ipv6_udp = PTP_TCR_TSIPV6ENA;
ptp_over_ethernet = PTP_TCR_TSIPENA;
return ret;
}
- netif_device_attach(ndev);
-
mutex_lock(&priv->lock);
stmmac_reset_queues_param(priv);
phylink_mac_change(priv->phylink, true);
+ netif_device_attach(ndev);
+
return 0;
}
EXPORT_SYMBOL_GPL(stmmac_resume);
cas_cacheline_size)) {
dev_err(&pdev->dev, "Could not set PCI cache "
"line size\n");
- goto err_write_cacheline;
+ goto err_out_free_res;
}
}
#endif
err_out_free_res:
pci_release_regions(pdev);
-err_write_cacheline:
/* Try to restore it in case the error occurred after we
* set it.
*/
config TI_CPSW
tristate "TI CPSW Switch Support"
depends on ARCH_DAVINCI || ARCH_OMAP2PLUS || COMPILE_TEST
+ depends on TI_CPTS || !TI_CPTS
select TI_DAVINCI_MDIO
select MFD_SYSCON
select PAGE_POOL
tristate "TI CPSW Switch Support with switchdev"
depends on ARCH_DAVINCI || ARCH_OMAP2PLUS || COMPILE_TEST
depends on NET_SWITCHDEV
+ depends on TI_CPTS || !TI_CPTS
select PAGE_POOL
select TI_DAVINCI_MDIO
select MFD_SYSCON
will be called cpsw_new.
config TI_CPTS
- bool "TI Common Platform Time Sync (CPTS) Support"
- depends on TI_CPSW || TI_KEYSTONE_NETCP || TI_CPSW_SWITCHDEV || COMPILE_TEST
+ tristate "TI Common Platform Time Sync (CPTS) Support"
+ depends on ARCH_OMAP2PLUS || ARCH_KEYSTONE || COMPILE_TEST
depends on COMMON_CLK
- depends on POSIX_TIMERS
+ depends on PTP_1588_CLOCK
---help---
This driver supports the Common Platform Time Sync unit of
the CPSW Ethernet Switch and Keystone 2 1g/10g Switch Subsystem.
The unit can time stamp PTP UDP/IPv4 and Layer 2 packets, and the
driver offers a PTP Hardware Clock.
-config TI_CPTS_MOD
- tristate
- depends on TI_CPTS
- depends on PTP_1588_CLOCK
- default y if TI_CPSW=y || TI_KEYSTONE_NETCP=y || TI_CPSW_SWITCHDEV=y
- default m
-
config TI_K3_AM65_CPSW_NUSS
tristate "TI K3 AM654x/J721E CPSW Ethernet driver"
depends on ARCH_K3 && OF && TI_K3_UDMA_GLUE_LAYER
select TI_DAVINCI_MDIO
depends on OF
depends on KEYSTONE_NAVIGATOR_DMA && KEYSTONE_NAVIGATOR_QMSS
+ depends on TI_CPTS || !TI_CPTS
---help---
This driver supports TI's Keystone NETCP Core.
ti_davinci_emac-y := davinci_emac.o davinci_cpdma.o
obj-$(CONFIG_TI_DAVINCI_MDIO) += davinci_mdio.o
obj-$(CONFIG_TI_CPSW_PHY_SEL) += cpsw-phy-sel.o
-obj-$(CONFIG_TI_CPTS_MOD) += cpts.o
+obj-$(CONFIG_TI_CPTS) += cpts.o
obj-$(CONFIG_TI_CPSW) += ti_cpsw.o
ti_cpsw-y := cpsw.o davinci_cpdma.o cpsw_ale.o cpsw_priv.o cpsw_sl.o cpsw_ethtool.o
obj-$(CONFIG_TI_CPSW_SWITCHDEV) += ti_cpsw_new.o
ale_params.nu_switch_ale = true;
common->ale = cpsw_ale_create(&ale_params);
- if (!common->ale) {
+ if (IS_ERR(common->ale)) {
dev_err(dev, "error initializing ale engine\n");
+ ret = PTR_ERR(common->ale);
goto err_of_clear;
}
struct cpsw_common *cpsw = dev_get_drvdata(dev);
int i;
+ rtnl_lock();
+
for (i = 0; i < cpsw->data.slaves; i++)
if (cpsw->slaves[i].ndev)
if (netif_running(cpsw->slaves[i].ndev))
cpsw_ndo_stop(cpsw->slaves[i].ndev);
+ rtnl_unlock();
+
/* Select sleep pin state */
pinctrl_pm_select_sleep_state(dev);
ale = devm_kzalloc(params->dev, sizeof(*ale), GFP_KERNEL);
if (!ale)
- return NULL;
+ return ERR_PTR(-ENOMEM);
ale->p0_untag_vid_mask =
devm_kmalloc_array(params->dev, BITS_TO_LONGS(VLAN_N_VID),
ale_params.ale_ports = CPSW_ALE_PORTS_NUM;
cpsw->ale = cpsw_ale_create(&ale_params);
- if (!cpsw->ale) {
+ if (IS_ERR(cpsw->ale)) {
dev_err(dev, "error initializing ale engine\n");
- return -ENODEV;
+ return PTR_ERR(cpsw->ale);
}
dma_params.dev = dev;
ale_params.nu_switch_ale = true;
}
gbe_dev->ale = cpsw_ale_create(&ale_params);
- if (!gbe_dev->ale) {
+ if (IS_ERR(gbe_dev->ale)) {
dev_err(gbe_dev->dev, "error initializing ale engine\n");
- ret = -ENODEV;
+ ret = PTR_ERR(gbe_dev->ale);
goto free_sec_ports;
} else {
dev_dbg(gbe_dev->dev, "Created a gbe ale engine\n");
{
struct bpqdev *bpq;
- list_for_each_entry_rcu(bpq, &bpq_devices, bpq_list) {
+ list_for_each_entry_rcu(bpq, &bpq_devices, bpq_list,
+ lockdep_rtnl_is_held()) {
if (bpq->ethdev == dev)
return bpq->axdev;
}
while (count < budget) {
struct gsi_trans *trans;
+ count++;
trans = gsi_channel_poll_one(channel);
if (!trans)
break;
/* assert(which < trans->tre_count); */
/* Set the page information for the buffer. We also need to fill in
- * the DMA address for the buffer (something dma_map_sg() normally
- * does).
+ * the DMA address and length for the buffer (something dma_map_sg()
+ * normally does).
*/
sg = &trans->sgl[which];
sg_set_buf(sg, buf, size);
sg_dma_address(sg) = addr;
+ sg_dma_len(sg) = sg->length;
info = &trans->info[which];
info->opcode = opcode;
void ipa_cmd_tag_process_add(struct gsi_trans *trans)
{
- ipa_cmd_register_write_add(trans, 0, 0, 0, true);
-#if 1
- /* Reference these functions to avoid a compile error */
- (void)ipa_cmd_ip_packet_init_add;
- (void)ipa_cmd_ip_tag_status_add;
- (void) ipa_cmd_transfer_add;
-#else
struct ipa *ipa = container_of(trans->gsi, struct ipa, gsi);
- struct gsi_endpoint *endpoint;
+ struct ipa_endpoint *endpoint;
endpoint = ipa->name_map[IPA_ENDPOINT_AP_LAN_RX];
- ipa_cmd_ip_packet_init_add(trans, endpoint->endpoint_id);
+ ipa_cmd_register_write_add(trans, 0, 0, 0, true);
+ ipa_cmd_ip_packet_init_add(trans, endpoint->endpoint_id);
ipa_cmd_ip_tag_status_add(trans, 0xcba987654321);
-
ipa_cmd_transfer_add(trans, 4);
-#endif
}
/* Returns the number of commands required for the tag process */
* @clock_on: Whether IPA clock is on
* @notified: Whether modem has been notified of clock state
* @disabled: Whether setup ready interrupt handling is disabled
- * @mutex mutex: Motex protecting ready interrupt/shutdown interlock
+ * @mutex: Mutex protecting ready-interrupt/shutdown interlock
* @panic_notifier: Panic notifier structure
*/
struct ipa_smp2p {
return -EINVAL;
cnt = &nsim_dev->trap_data->trap_policers_cnt_arr[policer->id - 1];
- *p_drops = *cnt;
- *cnt += jiffies % 64;
+ *p_drops = (*cnt)++;
return 0;
}
else
val |= BCM54XX_SHD_SCR3_DLLAPD_DIS;
- if (phydev->dev_flags & PHY_BRCM_DIS_TXCRXC_NOENRGY)
- val |= BCM54XX_SHD_SCR3_TRDDAPD;
+ if (phydev->dev_flags & PHY_BRCM_DIS_TXCRXC_NOENRGY) {
+ if (BRCM_PHY_MODEL(phydev) == PHY_ID_BCM54810)
+ val |= BCM54810_SHD_SCR3_TRDDAPD;
+ else
+ val |= BCM54XX_SHD_SCR3_TRDDAPD;
+ }
if (orig != val)
bcm_phy_write_shadow(phydev, BCM54XX_SHD_SCR3, val);
u64 *stats;
int nstats;
bool pkg_init;
+ /* PHY address within the package. */
+ u8 addr;
/* For multiple port PHYs; the MDIO address of the base PHY in the
* package.
*/
#define MSCC_MAC_PAUSE_CFG_STATE_PAUSE_STATE BIT(0)
#define MSCC_MAC_PAUSE_CFG_STATE_MAC_TX_PAUSE_GEN BIT(4)
-#define MSCC_PROC_0_IP_1588_TOP_CFG_STAT_MODE_CTL 0x2
-#define MSCC_PROC_0_IP_1588_TOP_CFG_STAT_MODE_CTL_PROTOCOL_MODE(x) (x)
-#define MSCC_PROC_0_IP_1588_TOP_CFG_STAT_MODE_CTL_PROTOCOL_MODE_M GENMASK(2, 0)
+#define MSCC_PROC_IP_1588_TOP_CFG_STAT_MODE_CTL 0x2
+#define MSCC_PROC_IP_1588_TOP_CFG_STAT_MODE_CTL_PROTOCOL_MODE(x) (x)
+#define MSCC_PROC_IP_1588_TOP_CFG_STAT_MODE_CTL_PROTOCOL_MODE_M GENMASK(2, 0)
#endif /* _MSCC_PHY_LINE_MAC_H_ */
/* Must be called with mdio_lock taken */
static int __vsc8584_macsec_init(struct phy_device *phydev)
{
+ struct vsc8531_private *priv = phydev->priv;
+ enum macsec_bank proc_bank;
u32 val;
vsc8584_macsec_block_init(phydev, MACSEC_INGR);
val |= MSCC_FCBUF_ENA_CFG_TX_ENA | MSCC_FCBUF_ENA_CFG_RX_ENA;
vsc8584_macsec_phy_write(phydev, FC_BUFFER, MSCC_FCBUF_ENA_CFG, val);
- val = vsc8584_macsec_phy_read(phydev, IP_1588,
- MSCC_PROC_0_IP_1588_TOP_CFG_STAT_MODE_CTL);
- val &= ~MSCC_PROC_0_IP_1588_TOP_CFG_STAT_MODE_CTL_PROTOCOL_MODE_M;
- val |= MSCC_PROC_0_IP_1588_TOP_CFG_STAT_MODE_CTL_PROTOCOL_MODE(4);
- vsc8584_macsec_phy_write(phydev, IP_1588,
- MSCC_PROC_0_IP_1588_TOP_CFG_STAT_MODE_CTL, val);
+ proc_bank = (priv->addr < 2) ? PROC_0 : PROC_2;
+
+ val = vsc8584_macsec_phy_read(phydev, proc_bank,
+ MSCC_PROC_IP_1588_TOP_CFG_STAT_MODE_CTL);
+ val &= ~MSCC_PROC_IP_1588_TOP_CFG_STAT_MODE_CTL_PROTOCOL_MODE_M;
+ val |= MSCC_PROC_IP_1588_TOP_CFG_STAT_MODE_CTL_PROTOCOL_MODE(4);
+ vsc8584_macsec_phy_write(phydev, proc_bank,
+ MSCC_PROC_IP_1588_TOP_CFG_STAT_MODE_CTL, val);
return 0;
}
FC_BUFFER = 0x04,
HOST_MAC = 0x05,
LINE_MAC = 0x06,
- IP_1588 = 0x0e,
+ PROC_0 = 0x0e,
+ PROC_2 = 0x0f,
MACSEC_INGR = 0x38,
MACSEC_EGR = 0x3c,
};
else
vsc8531->base_addr = phydev->mdio.addr - addr;
+ vsc8531->addr = addr;
+
/* Some parts of the init sequence are identical for every PHY in the
* package. Some parts are modifying the GPIO register bank which is a
* set of registers that are affecting all PHYs, a few resetting the
else
vsc8531->base_addr = phydev->mdio.addr - addr;
+ vsc8531->addr = addr;
+
/* Some parts of the init sequence are identical for every PHY in the
* package. Some parts are modifying the GPIO register bank which is a
* set of registers that are affecting all PHYs, a few resetting the
/* Restart autonegotiation so the new modes get sent to the
* link partner.
*/
- ret = phy_restart_aneg(phydev);
- if (ret < 0)
- return ret;
+ if (phydev->autoneg == AUTONEG_ENABLE) {
+ ret = phy_restart_aneg(phydev);
+ if (ret < 0)
+ return ret;
+ }
}
return 0;
const struct sfp_upstream_ops *ops)
{
struct sfp_bus *bus;
- int ret;
+ int ret = 0;
if (phydev->mdio.dev.fwnode) {
bus = sfp_bus_find_fwnode(phydev->mdio.dev.fwnode);
ret = sfp_bus_add_upstream(bus, phydev, ops);
sfp_bus_put(bus);
}
- return 0;
+ return ret;
}
EXPORT_SYMBOL(phy_sfp_probe);
if (!skb)
goto out;
+ if (skb->pkt_type != PACKET_HOST)
+ goto abort;
+
if (!pskb_may_pull(skb, sizeof(struct pppoe_hdr)))
goto abort;
.driver_info = 0,
},
-/* Microsoft Surface 3 dock (based on Realtek RTL8153) */
+/* Microsoft Surface Ethernet Adapter (based on Realtek RTL8153) */
{
USB_DEVICE_AND_INTERFACE_INFO(MICROSOFT_VENDOR_ID, 0x07c6, USB_CLASS_COMM,
USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE),
.driver_info = 0,
},
- /* TP-LINK UE300 USB 3.0 Ethernet Adapters (based on Realtek RTL8153) */
+/* Microsoft Surface Ethernet Adapter (based on Realtek RTL8153B) */
+{
+ USB_DEVICE_AND_INTERFACE_INFO(MICROSOFT_VENDOR_ID, 0x0927, USB_CLASS_COMM,
+ USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE),
+ .driver_info = 0,
+},
+
+/* TP-LINK UE300 USB 3.0 Ethernet Adapters (based on Realtek RTL8153) */
{
USB_DEVICE_AND_INTERFACE_INFO(TPLINK_VENDOR_ID, 0x0601, USB_CLASS_COMM,
USB_CDC_SUBCLASS_ETHERNET, USB_CDC_PROTO_NONE),
if (!
(serial->out_endp =
hso_get_ep(interface, USB_ENDPOINT_XFER_BULK, USB_DIR_OUT))) {
- dev_err(&interface->dev, "Failed to find BULK IN ep\n");
+ dev_err(&interface->dev, "Failed to find BULK OUT ep\n");
goto exit2;
}
{QMI_FIXED_INTF(0x1bbb, 0x0203, 2)}, /* Alcatel L800MA */
{QMI_FIXED_INTF(0x2357, 0x0201, 4)}, /* TP-LINK HSUPA Modem MA180 */
{QMI_FIXED_INTF(0x2357, 0x9000, 4)}, /* TP-LINK MA260 */
+ {QMI_QUIRK_SET_DTR(0x1bc7, 0x1031, 3)}, /* Telit LE910C1-EUX */
{QMI_QUIRK_SET_DTR(0x1bc7, 0x1040, 2)}, /* Telit LE922A */
{QMI_QUIRK_SET_DTR(0x1bc7, 0x1050, 2)}, /* Telit FN980 */
{QMI_FIXED_INTF(0x1bc7, 0x1100, 3)}, /* Telit ME910 */
{REALTEK_USB_DEVICE(VENDOR_ID_REALTEK, 0x8153)},
{REALTEK_USB_DEVICE(VENDOR_ID_MICROSOFT, 0x07ab)},
{REALTEK_USB_DEVICE(VENDOR_ID_MICROSOFT, 0x07c6)},
+ {REALTEK_USB_DEVICE(VENDOR_ID_MICROSOFT, 0x0927)},
{REALTEK_USB_DEVICE(VENDOR_ID_SAMSUNG, 0xa101)},
{REALTEK_USB_DEVICE(VENDOR_ID_LENOVO, 0x304f)},
{REALTEK_USB_DEVICE(VENDOR_ID_LENOVO, 0x3062)},
break;
} while (rq->vq->num_free);
if (virtqueue_kick_prepare(rq->vq) && virtqueue_notify(rq->vq)) {
- u64_stats_update_begin(&rq->stats.syncp);
+ unsigned long flags;
+
+ flags = u64_stats_update_begin_irqsave(&rq->stats.syncp);
rq->stats.kicks++;
- u64_stats_update_end(&rq->stats.syncp);
+ u64_stats_update_end_irqrestore(&rq->stats.syncp, flags);
}
return !oom;
};
enum counter_values {
- COUNTER_BITS_TOTAL = 2048,
+ COUNTER_BITS_TOTAL = 8192,
COUNTER_REDUNDANT_BITS = BITS_PER_LONG,
COUNTER_WINDOW_SIZE = COUNTER_BITS_TOTAL - COUNTER_REDUNDANT_BITS
};
if (unlikely(!keypair))
return NULL;
+ spin_lock_init(&keypair->receiving_counter.lock);
keypair->internal_id = atomic64_inc_return(&keypair_counter);
keypair->entry.type = INDEX_HASHTABLE_KEYPAIR;
keypair->entry.peer = peer;
memzero_explicit(output, BLAKE2S_HASH_SIZE + 1);
}
-static void symmetric_key_init(struct noise_symmetric_key *key)
-{
- spin_lock_init(&key->counter.receive.lock);
- atomic64_set(&key->counter.counter, 0);
- memset(key->counter.receive.backtrack, 0,
- sizeof(key->counter.receive.backtrack));
- key->birthdate = ktime_get_coarse_boottime_ns();
- key->is_valid = true;
-}
-
static void derive_keys(struct noise_symmetric_key *first_dst,
struct noise_symmetric_key *second_dst,
const u8 chaining_key[NOISE_HASH_LEN])
{
+ u64 birthdate = ktime_get_coarse_boottime_ns();
kdf(first_dst->key, second_dst->key, NULL, NULL,
NOISE_SYMMETRIC_KEY_LEN, NOISE_SYMMETRIC_KEY_LEN, 0, 0,
chaining_key);
- symmetric_key_init(first_dst);
- symmetric_key_init(second_dst);
+ first_dst->birthdate = second_dst->birthdate = birthdate;
+ first_dst->is_valid = second_dst->is_valid = true;
}
static bool __must_check mix_dh(u8 chaining_key[NOISE_HASH_LEN],
u8 e[NOISE_PUBLIC_KEY_LEN];
u8 ephemeral_private[NOISE_PUBLIC_KEY_LEN];
u8 static_private[NOISE_PUBLIC_KEY_LEN];
+ u8 preshared_key[NOISE_SYMMETRIC_KEY_LEN];
down_read(&wg->static_identity.lock);
memcpy(chaining_key, handshake->chaining_key, NOISE_HASH_LEN);
memcpy(ephemeral_private, handshake->ephemeral_private,
NOISE_PUBLIC_KEY_LEN);
+ memcpy(preshared_key, handshake->preshared_key,
+ NOISE_SYMMETRIC_KEY_LEN);
up_read(&handshake->lock);
if (state != HANDSHAKE_CREATED_INITIATION)
goto fail;
/* psk */
- mix_psk(chaining_key, hash, key, handshake->preshared_key);
+ mix_psk(chaining_key, hash, key, preshared_key);
/* {} */
if (!message_decrypt(NULL, src->encrypted_nothing,
memzero_explicit(chaining_key, NOISE_HASH_LEN);
memzero_explicit(ephemeral_private, NOISE_PUBLIC_KEY_LEN);
memzero_explicit(static_private, NOISE_PUBLIC_KEY_LEN);
+ memzero_explicit(preshared_key, NOISE_SYMMETRIC_KEY_LEN);
up_read(&wg->static_identity.lock);
return ret_peer;
}
#include <linux/mutex.h>
#include <linux/kref.h>
-union noise_counter {
- struct {
- u64 counter;
- unsigned long backtrack[COUNTER_BITS_TOTAL / BITS_PER_LONG];
- spinlock_t lock;
- } receive;
- atomic64_t counter;
+struct noise_replay_counter {
+ u64 counter;
+ spinlock_t lock;
+ unsigned long backtrack[COUNTER_BITS_TOTAL / BITS_PER_LONG];
};
struct noise_symmetric_key {
u8 key[NOISE_SYMMETRIC_KEY_LEN];
- union noise_counter counter;
u64 birthdate;
bool is_valid;
};
struct noise_keypair {
struct index_hashtable_entry entry;
struct noise_symmetric_key sending;
+ atomic64_t sending_counter;
struct noise_symmetric_key receiving;
+ struct noise_replay_counter receiving_counter;
__le32 remote_index;
bool i_am_the_initiator;
struct kref refcount;
return real_protocol && skb->protocol == real_protocol;
}
-static inline void wg_reset_packet(struct sk_buff *skb)
+static inline void wg_reset_packet(struct sk_buff *skb, bool encapsulating)
{
+ u8 l4_hash = skb->l4_hash;
+ u8 sw_hash = skb->sw_hash;
+ u32 hash = skb->hash;
skb_scrub_packet(skb, true);
memset(&skb->headers_start, 0,
offsetof(struct sk_buff, headers_end) -
offsetof(struct sk_buff, headers_start));
+ if (encapsulating) {
+ skb->l4_hash = l4_hash;
+ skb->sw_hash = sw_hash;
+ skb->hash = hash;
+ }
skb->queue_mapping = 0;
skb->nohdr = 0;
skb->peeked = 0;
}
}
-static bool decrypt_packet(struct sk_buff *skb, struct noise_symmetric_key *key)
+static bool decrypt_packet(struct sk_buff *skb, struct noise_keypair *keypair)
{
struct scatterlist sg[MAX_SKB_FRAGS + 8];
struct sk_buff *trailer;
unsigned int offset;
int num_frags;
- if (unlikely(!key))
+ if (unlikely(!keypair))
return false;
- if (unlikely(!READ_ONCE(key->is_valid) ||
- wg_birthdate_has_expired(key->birthdate, REJECT_AFTER_TIME) ||
- key->counter.receive.counter >= REJECT_AFTER_MESSAGES)) {
- WRITE_ONCE(key->is_valid, false);
+ if (unlikely(!READ_ONCE(keypair->receiving.is_valid) ||
+ wg_birthdate_has_expired(keypair->receiving.birthdate, REJECT_AFTER_TIME) ||
+ keypair->receiving_counter.counter >= REJECT_AFTER_MESSAGES)) {
+ WRITE_ONCE(keypair->receiving.is_valid, false);
return false;
}
if (!chacha20poly1305_decrypt_sg_inplace(sg, skb->len, NULL, 0,
PACKET_CB(skb)->nonce,
- key->key))
+ keypair->receiving.key))
return false;
/* Another ugly situation of pushing and pulling the header so as to
}
/* This is RFC6479, a replay detection bitmap algorithm that avoids bitshifts */
-static bool counter_validate(union noise_counter *counter, u64 their_counter)
+static bool counter_validate(struct noise_replay_counter *counter, u64 their_counter)
{
unsigned long index, index_current, top, i;
bool ret = false;
- spin_lock_bh(&counter->receive.lock);
+ spin_lock_bh(&counter->lock);
- if (unlikely(counter->receive.counter >= REJECT_AFTER_MESSAGES + 1 ||
+ if (unlikely(counter->counter >= REJECT_AFTER_MESSAGES + 1 ||
their_counter >= REJECT_AFTER_MESSAGES))
goto out;
++their_counter;
if (unlikely((COUNTER_WINDOW_SIZE + their_counter) <
- counter->receive.counter))
+ counter->counter))
goto out;
index = their_counter >> ilog2(BITS_PER_LONG);
- if (likely(their_counter > counter->receive.counter)) {
- index_current = counter->receive.counter >> ilog2(BITS_PER_LONG);
+ if (likely(their_counter > counter->counter)) {
+ index_current = counter->counter >> ilog2(BITS_PER_LONG);
top = min_t(unsigned long, index - index_current,
COUNTER_BITS_TOTAL / BITS_PER_LONG);
for (i = 1; i <= top; ++i)
- counter->receive.backtrack[(i + index_current) &
+ counter->backtrack[(i + index_current) &
((COUNTER_BITS_TOTAL / BITS_PER_LONG) - 1)] = 0;
- counter->receive.counter = their_counter;
+ counter->counter = their_counter;
}
index &= (COUNTER_BITS_TOTAL / BITS_PER_LONG) - 1;
ret = !test_and_set_bit(their_counter & (BITS_PER_LONG - 1),
- &counter->receive.backtrack[index]);
+ &counter->backtrack[index]);
out:
- spin_unlock_bh(&counter->receive.lock);
+ spin_unlock_bh(&counter->lock);
return ret;
}
if (unlikely(state != PACKET_STATE_CRYPTED))
goto next;
- if (unlikely(!counter_validate(&keypair->receiving.counter,
+ if (unlikely(!counter_validate(&keypair->receiving_counter,
PACKET_CB(skb)->nonce))) {
net_dbg_ratelimited("%s: Packet has invalid nonce %llu (max %llu)\n",
peer->device->dev->name,
PACKET_CB(skb)->nonce,
- keypair->receiving.counter.receive.counter);
+ keypair->receiving_counter.counter);
goto next;
}
if (unlikely(wg_socket_endpoint_from_skb(&endpoint, skb)))
goto next;
- wg_reset_packet(skb);
+ wg_reset_packet(skb, false);
wg_packet_consume_data_done(peer, skb, &endpoint);
free = false;
struct sk_buff *skb;
while ((skb = ptr_ring_consume_bh(&queue->ring)) != NULL) {
- enum packet_state state = likely(decrypt_packet(skb,
- &PACKET_CB(skb)->keypair->receiving)) ?
+ enum packet_state state =
+ likely(decrypt_packet(skb, PACKET_CB(skb)->keypair)) ?
PACKET_STATE_CRYPTED : PACKET_STATE_DEAD;
wg_queue_enqueue_per_peer_napi(skb, state);
if (need_resched())
#ifdef DEBUG
bool __init wg_packet_counter_selftest(void)
{
+ struct noise_replay_counter *counter;
unsigned int test_num = 0, i;
- union noise_counter counter;
bool success = true;
-#define T_INIT do { \
- memset(&counter, 0, sizeof(union noise_counter)); \
- spin_lock_init(&counter.receive.lock); \
+ counter = kmalloc(sizeof(*counter), GFP_KERNEL);
+ if (unlikely(!counter)) {
+ pr_err("nonce counter self-test malloc: FAIL\n");
+ return false;
+ }
+
+#define T_INIT do { \
+ memset(counter, 0, sizeof(*counter)); \
+ spin_lock_init(&counter->lock); \
} while (0)
#define T_LIM (COUNTER_WINDOW_SIZE + 1)
#define T(n, v) do { \
++test_num; \
- if (counter_validate(&counter, n) != (v)) { \
+ if (counter_validate(counter, n) != (v)) { \
pr_err("nonce counter self-test %u: FAIL\n", \
test_num); \
success = false; \
if (success)
pr_info("nonce counter self-tests: pass\n");
+ kfree(counter);
return success;
}
#endif
rcu_read_lock_bh();
keypair = rcu_dereference_bh(peer->keypairs.current_keypair);
send = keypair && READ_ONCE(keypair->sending.is_valid) &&
- (atomic64_read(&keypair->sending.counter.counter) > REKEY_AFTER_MESSAGES ||
+ (atomic64_read(&keypair->sending_counter) > REKEY_AFTER_MESSAGES ||
(keypair->i_am_the_initiator &&
wg_birthdate_has_expired(keypair->sending.birthdate, REKEY_AFTER_TIME)));
rcu_read_unlock_bh();
struct sk_buff *trailer;
int num_frags;
+ /* Force hash calculation before encryption so that flow analysis is
+ * consistent over the inner packet.
+ */
+ skb_get_hash(skb);
+
/* Calculate lengths. */
padding_len = calculate_skb_padding(skb);
trailer_len = padding_len + noise_encrypted_len(0);
skb_list_walk_safe(first, skb, next) {
if (likely(encrypt_packet(skb,
PACKET_CB(first)->keypair))) {
- wg_reset_packet(skb);
+ wg_reset_packet(skb, true);
} else {
state = PACKET_STATE_DEAD;
break;
void wg_packet_send_staged_packets(struct wg_peer *peer)
{
- struct noise_symmetric_key *key;
struct noise_keypair *keypair;
struct sk_buff_head packets;
struct sk_buff *skb;
rcu_read_unlock_bh();
if (unlikely(!keypair))
goto out_nokey;
- key = &keypair->sending;
- if (unlikely(!READ_ONCE(key->is_valid)))
+ if (unlikely(!READ_ONCE(keypair->sending.is_valid)))
goto out_nokey;
- if (unlikely(wg_birthdate_has_expired(key->birthdate,
+ if (unlikely(wg_birthdate_has_expired(keypair->sending.birthdate,
REJECT_AFTER_TIME)))
goto out_invalid;
*/
PACKET_CB(skb)->ds = ip_tunnel_ecn_encap(0, ip_hdr(skb), skb);
PACKET_CB(skb)->nonce =
- atomic64_inc_return(&key->counter.counter) - 1;
+ atomic64_inc_return(&keypair->sending_counter) - 1;
if (unlikely(PACKET_CB(skb)->nonce >= REJECT_AFTER_MESSAGES))
goto out_invalid;
}
return;
out_invalid:
- WRITE_ONCE(key->is_valid, false);
+ WRITE_ONCE(keypair->sending.is_valid, false);
out_nokey:
wg_noise_keypair_put(keypair, false);
iwl_trans->cfg = &iwl_ax101_cfg_quz_hr;
else if (iwl_trans->cfg == &iwl_ax201_cfg_qu_hr)
iwl_trans->cfg = &iwl_ax201_cfg_quz_hr;
+ else if (iwl_trans->cfg == &killer1650s_2ax_cfg_qu_b0_hr_b0)
+ iwl_trans->cfg = &iwl_ax1650s_cfg_quz_hr;
+ else if (iwl_trans->cfg == &killer1650i_2ax_cfg_qu_b0_hr_b0)
+ iwl_trans->cfg = &iwl_ax1650i_cfg_quz_hr;
}
#endif
goto out;
}
- {
- SHASH_DESC_ON_STACK(desc, tfm);
-
- desc->tfm = tfm;
-
- ret = crypto_shash_digest(desc, fw->image, image_size,
- hash_data);
- shash_desc_zero(desc);
- }
+ ret = crypto_shash_tfm_digest(tfm, fw->image, image_size, hash_data);
crypto_free_shash(tfm);
if (ret) {
memcpy(atr_res->gbi, atr_req->gbi, gb_len);
r = nfc_set_remote_general_bytes(hdev->ndev, atr_res->gbi,
gb_len);
- if (r < 0)
+ if (r < 0) {
+ kfree_skb(skb);
return r;
+ }
}
info->dep_info.curr_nfc_dep_pni = 0;
while (nvme_cqe_pending(nvmeq)) {
found++;
+ /*
+ * load-load control dependency between phase and the rest of
+ * the cqe requires a full read memory barrier
+ */
+ dma_rmb();
nvme_handle_cqe(nvmeq, nvmeq->cq_head);
nvme_update_cq_head(nvmeq);
}
/*
* Called only on a device that has been disabled and after all other threads
- * that can check this device's completion queues have synced. This is the
- * last chance for the driver to see a natural completion before
- * nvme_cancel_request() terminates all incomplete requests.
+ * that can check this device's completion queues have synced, except
+ * nvme_poll(). This is the last chance for the driver to see a natural
+ * completion before nvme_cancel_request() terminates all incomplete requests.
*/
static void nvme_reap_pending_cqes(struct nvme_dev *dev)
{
int i;
- for (i = dev->ctrl.queue_count - 1; i > 0; i--)
+ for (i = dev->ctrl.queue_count - 1; i > 0; i--) {
+ spin_lock(&dev->queues[i].cq_poll_lock);
nvme_process_cq(&dev->queues[i]);
+ spin_unlock(&dev->queues[i].cq_poll_lock);
+ }
}
static int nvme_cmb_qdepth(struct nvme_dev *dev, int nr_io_queues,
{
int err = -EPERM;
- if (!capable(CAP_SYS_ADMIN))
+ if (!perfmon_capable())
return -EPERM;
if (test_and_set_bit_lock(0, &buffer_opened))
if (!attr->exclude_kernel)
reg |= BIT(SYS_PMSCR_EL1_E1SPE_SHIFT);
- if (IS_ENABLED(CONFIG_PID_IN_CONTEXTIDR) && capable(CAP_SYS_ADMIN))
+ if (IS_ENABLED(CONFIG_PID_IN_CONTEXTIDR) && perfmon_capable())
reg |= BIT(SYS_PMSCR_EL1_CX_SHIFT);
return reg;
return -EOPNOTSUPP;
reg = arm_spe_event_to_pmscr(event);
- if (!capable(CAP_SYS_ADMIN) &&
+ if (!perfmon_capable() &&
(reg & (BIT(SYS_PMSCR_EL1_PA_SHIFT) |
BIT(SYS_PMSCR_EL1_CX_SHIFT) |
BIT(SYS_PMSCR_EL1_PCT_SHIFT))))
static const char * const i2c0_groups[] = {
"uart0_rx_mfp",
"uart0_tx_mfp",
- "i2c0_mfp_mfp",
+ "i2c0_mfp",
};
static const char * const i2c1_groups[] = {
.direction_output = byt_gpio_direction_output,
.get = byt_gpio_get,
.set = byt_gpio_set,
+ .set_config = gpiochip_generic_config,
.dbg_show = byt_gpio_dbg_show,
};
struct chv_pinctrl *pctrl = gpiochip_get_data(gc);
struct irq_chip *chip = irq_desc_get_chip(desc);
unsigned long pending;
+ unsigned long flags;
u32 intr_line;
chained_irq_enter(chip, desc);
+ raw_spin_lock_irqsave(&chv_lock, flags);
pending = readl(pctrl->regs + CHV_INTSTAT);
+ raw_spin_unlock_irqrestore(&chv_lock, flags);
+
for_each_set_bit(intr_line, &pending, pctrl->community->nirqs) {
unsigned int irq, offset;
#include "pinctrl-intel.h"
-#define SPT_PAD_OWN 0x020
-#define SPT_PADCFGLOCK 0x0a0
-#define SPT_HOSTSW_OWN 0x0d0
-#define SPT_GPI_IS 0x100
-#define SPT_GPI_IE 0x120
+#define SPT_PAD_OWN 0x020
+#define SPT_H_PADCFGLOCK 0x090
+#define SPT_LP_PADCFGLOCK 0x0a0
+#define SPT_HOSTSW_OWN 0x0d0
+#define SPT_GPI_IS 0x100
+#define SPT_GPI_IE 0x120
#define SPT_COMMUNITY(b, s, e) \
{ \
.barno = (b), \
.padown_offset = SPT_PAD_OWN, \
- .padcfglock_offset = SPT_PADCFGLOCK, \
+ .padcfglock_offset = SPT_LP_PADCFGLOCK, \
.hostown_offset = SPT_HOSTSW_OWN, \
.is_offset = SPT_GPI_IS, \
.ie_offset = SPT_GPI_IE, \
{ \
.barno = (b), \
.padown_offset = SPT_PAD_OWN, \
- .padcfglock_offset = SPT_PADCFGLOCK, \
+ .padcfglock_offset = SPT_H_PADCFGLOCK, \
.hostown_offset = SPT_HOSTSW_OWN, \
.is_offset = SPT_GPI_IS, \
.ie_offset = SPT_GPI_IE, \
case MTK_PIN_CONFIG_PU_ADV:
case MTK_PIN_CONFIG_PD_ADV:
if (hw->soc->adv_pull_get) {
- bool pullup;
-
pullup = param == MTK_PIN_CONFIG_PU_ADV;
err = hw->soc->adv_pull_get(hw, desc, pullup, &ret);
} else
pol = msm_readl_intr_cfg(pctrl, g);
pol ^= BIT(g->intr_polarity_bit);
- msm_writel_intr_cfg(val, pctrl, g);
+ msm_writel_intr_cfg(pol, pctrl, g);
val2 = msm_readl_io(pctrl, g) & BIT(g->in_bit);
intstat = msm_readl_intr_status(pctrl, g);
module_put(gc->owner);
}
+static int msm_gpio_irq_set_affinity(struct irq_data *d,
+ const struct cpumask *dest, bool force)
+{
+ struct gpio_chip *gc = irq_data_get_irq_chip_data(d);
+ struct msm_pinctrl *pctrl = gpiochip_get_data(gc);
+
+ if (d->parent_data && test_bit(d->hwirq, pctrl->skip_wake_irqs))
+ return irq_chip_set_affinity_parent(d, dest, force);
+
+ return 0;
+}
+
+static int msm_gpio_irq_set_vcpu_affinity(struct irq_data *d, void *vcpu_info)
+{
+ struct gpio_chip *gc = irq_data_get_irq_chip_data(d);
+ struct msm_pinctrl *pctrl = gpiochip_get_data(gc);
+
+ if (d->parent_data && test_bit(d->hwirq, pctrl->skip_wake_irqs))
+ return irq_chip_set_vcpu_affinity_parent(d, vcpu_info);
+
+ return 0;
+}
+
static void msm_gpio_irq_handler(struct irq_desc *desc)
{
struct gpio_chip *gc = irq_desc_get_handler_data(desc);
pctrl->irq_chip.irq_set_wake = msm_gpio_irq_set_wake;
pctrl->irq_chip.irq_request_resources = msm_gpio_irq_reqres;
pctrl->irq_chip.irq_release_resources = msm_gpio_irq_relres;
+ pctrl->irq_chip.irq_set_affinity = msm_gpio_irq_set_affinity;
+ pctrl->irq_chip.irq_set_vcpu_affinity = msm_gpio_irq_set_vcpu_affinity;
np = of_parse_phandle(pctrl->dev->of_node, "wakeup-parent", 0);
if (np) {
.record_size = 0x40000,
.console_size = 0x20000,
.ftrace_size = 0x20000,
- .dump_oops = 1,
+ .max_reason = KMSG_DUMP_OOPS,
};
static struct platform_device chromeos_ramoops = {
return 0;
}
-struct linear_range {
+struct bd70528_linear_range {
int min;
int step;
int vals;
int low_sel;
};
-static const struct linear_range current_limit_ranges[] = {
+static const struct bd70528_linear_range current_limit_ranges[] = {
{
.min = 5,
.step = 1,
* voltage for low temperatures. The driver currently only reads
* the charge current at room temperature. We do set both though.
*/
-static const struct linear_range warm_charge_curr[] = {
+static const struct bd70528_linear_range warm_charge_curr[] = {
{
.min = 10,
.step = 10,
#define MAX_WARM_CHG_CURR_SEL 0x1f
#define MIN_CHG_CURR_SEL 0x0
-static int find_value_for_selector_low(const struct linear_range *r,
+static int find_value_for_selector_low(const struct bd70528_linear_range *r,
int selectors, unsigned int sel,
unsigned int *val)
{
* I guess it is enough if we use voltage/current which is closest (below)
* the requested?
*/
-static int find_selector_for_value_low(const struct linear_range *r,
+static int find_selector_for_value_low(const struct bd70528_linear_range *r,
int selectors, unsigned int val,
unsigned int *sel, bool *found)
{
rmcd_error("pinned %ld out of %ld pages",
pinned, nr_pages);
ret = -EFAULT;
+ /*
+ * Set nr_pages up to mean "how many pages to unpin, in
+ * the error handler:
+ */
+ nr_pages = pinned;
goto err_pg;
}
.list_voltage = regulator_list_voltage_linear_range,
};
-static const struct regulator_linear_range pg86x_buck1_ranges[] = {
+static const struct linear_range pg86x_buck1_ranges[] = {
REGULATOR_LINEAR_RANGE( 0, 0, 10, 0),
REGULATOR_LINEAR_RANGE(1000000, 11, 34, 25000),
REGULATOR_LINEAR_RANGE(1600000, 35, 47, 50000),
};
-static const struct regulator_linear_range pg86x_buck2_ranges[] = {
+static const struct linear_range pg86x_buck2_ranges[] = {
REGULATOR_LINEAR_RANGE( 0, 0, 15, 0),
REGULATOR_LINEAR_RANGE(1000000, 16, 39, 25000),
REGULATOR_LINEAR_RANGE(1600000, 40, 52, 50000),
}
/* Ranges are sorted in ascending order. */
-static const struct regulator_linear_range buck1_volt_range[] = {
+static const struct linear_range buck1_volt_range[] = {
REGULATOR_LINEAR_RANGE(600000, 0, 0x4f, 12500),
REGULATOR_LINEAR_RANGE(1600000, 0x50, 0x54, 50000),
};
/* BUCK 2~5 have same ranges. */
-static const struct regulator_linear_range buck2_5_volt_range[] = {
+static const struct linear_range buck2_5_volt_range[] = {
REGULATOR_LINEAR_RANGE(600000, 0, 0x4f, 12500),
REGULATOR_LINEAR_RANGE(1600000, 0x50, 0x72, 50000),
};
# SPDX-License-Identifier: GPL-2.0-only
menuconfig REGULATOR
bool "Voltage and Current Regulator Support"
+ select LINEAR_RANGES
help
Generic Voltage and Current Regulator support.
Exynos5420/Exynos5800 SoCs to control various voltages.
It includes support for control of voltage and ramp speed.
+config REGULATOR_MAX77826
+ tristate "Maxim 77826 regulator"
+ depends on I2C
+ select REGMAP_I2C
+ help
+ This driver controls a Maxim 77826 regulator via I2C bus.
+ The regulator include 15 LDOs, BUCK and BUCK BOOST regulator.
+ It includes support for control of output voltage. This
+ regulator is found on the Samsung Galaxy S5 (klte) smartphone.
+
config REGULATOR_MC13XXX_CORE
tristate
obj-$(CONFIG_REGULATOR_MAX77686) += max77686-regulator.o
obj-$(CONFIG_REGULATOR_MAX77693) += max77693-regulator.o
obj-$(CONFIG_REGULATOR_MAX77802) += max77802-regulator.o
+obj-$(CONFIG_REGULATOR_MAX77826) += max77826-regulator.o
obj-$(CONFIG_REGULATOR_MC13783) += mc13783-regulator.o
obj-$(CONFIG_REGULATOR_MC13892) += mc13892-regulator.o
obj-$(CONFIG_REGULATOR_MC13XXX_CORE) += mc13xxx-regulator-core.o
1350000,
};
-static const unsigned int ldo_sdio_voltages[] = {
- 1160000,
- 1050000,
- 1100000,
- 1500000,
- 1800000,
- 2200000,
- 2910000,
- 3050000,
-};
-
static const unsigned int fixed_1200000_voltage[] = {
1200000,
};
2050000,
};
-static const unsigned int fixed_3300000_voltage[] = {
- 3300000,
-};
-
static const unsigned int ldo_vana_voltages[] = {
1050000,
1075000,
2600000, /* Duplicated in Vaudio and IsoUicc Control register. */
};
-static const unsigned int ldo_vdmic_voltages[] = {
- 1800000,
- 1900000,
- 2000000,
- 2850000,
-};
-
static DEFINE_MUTEX(shared_mode_mutex);
static struct ab8500_shared_mode ldo_anamic1_shared;
static struct ab8500_shared_mode ldo_anamic2_shared;
.val_bits = 8,
};
-static const struct regulator_linear_range act8865_voltage_ranges[] = {
+static const struct linear_range act8865_voltage_ranges[] = {
REGULATOR_LINEAR_RANGE(600000, 0, 23, 25000),
REGULATOR_LINEAR_RANGE(1200000, 24, 47, 50000),
REGULATOR_LINEAR_RANGE(2400000, 48, 63, 100000),
};
-static const struct regulator_linear_range act8600_sudcdc_voltage_ranges[] = {
+static const struct linear_range act8600_sudcdc_voltage_ranges[] = {
REGULATOR_LINEAR_RANGE(3000000, 0, 63, 0),
REGULATOR_LINEAR_RANGE(3000000, 64, 159, 100000),
REGULATOR_LINEAR_RANGE(12600000, 160, 191, 200000),
u32 op_mode[ACT8945A_ID_MAX];
};
-static const struct regulator_linear_range act8945a_voltage_ranges[] = {
+static const struct linear_range act8945a_voltage_ranges[] = {
REGULATOR_LINEAR_RANGE(600000, 0, 23, 25000),
REGULATOR_LINEAR_RANGE(1200000, 24, 47, 50000),
REGULATOR_LINEAR_RANGE(2400000, 48, 63, 100000),
.set_bypass = regulator_set_bypass_regmap,
};
-static const struct regulator_linear_range arizona_ldo1_hc_ranges[] = {
+static const struct linear_range arizona_ldo1_hc_ranges[] = {
REGULATOR_LINEAR_RANGE(900000, 0, 0x6, 50000),
REGULATOR_LINEAR_RANGE(1800000, 0x7, 0x7, 0),
};
.set_bypass = arizona_micsupp_set_bypass,
};
-static const struct regulator_linear_range arizona_micsupp_ranges[] = {
+static const struct linear_range arizona_micsupp_ranges[] = {
REGULATOR_LINEAR_RANGE(1700000, 0, 0x1e, 50000),
REGULATOR_LINEAR_RANGE(3300000, 0x1f, 0x1f, 0),
};
.owner = THIS_MODULE,
};
-static const struct regulator_linear_range arizona_micsupp_ext_ranges[] = {
+static const struct linear_range arizona_micsupp_ext_ranges[] = {
REGULATOR_LINEAR_RANGE(900000, 0, 0x14, 25000),
REGULATOR_LINEAR_RANGE(1500000, 0x15, 0x27, 100000),
};
.map_voltage = regulator_map_voltage_linear_range,
};
-static const struct regulator_linear_range as3711_sd_ranges[] = {
+static const struct linear_range as3711_sd_ranges[] = {
REGULATOR_LINEAR_RANGE(612500, 0x1, 0x40, 12500),
REGULATOR_LINEAR_RANGE(1425000, 0x41, 0x70, 25000),
REGULATOR_LINEAR_RANGE(2650000, 0x71, 0x7f, 50000),
};
-static const struct regulator_linear_range as3711_aldo_ranges[] = {
+static const struct linear_range as3711_aldo_ranges[] = {
REGULATOR_LINEAR_RANGE(1200000, 0, 0xf, 50000),
REGULATOR_LINEAR_RANGE(1800000, 0x10, 0x1f, 100000),
};
-static const struct regulator_linear_range as3711_dldo_ranges[] = {
+static const struct linear_range as3711_dldo_ranges[] = {
REGULATOR_LINEAR_RANGE(900000, 0, 0x10, 50000),
REGULATOR_LINEAR_RANGE(1750000, 0x20, 0x3f, 50000),
};
.set_bypass = regulator_set_bypass_regmap,
};
-static const struct regulator_linear_range as3722_ldo_ranges[] = {
+static const struct linear_range as3722_ldo_ranges[] = {
REGULATOR_LINEAR_RANGE(0, 0x00, 0x00, 0),
REGULATOR_LINEAR_RANGE(825000, 0x01, 0x24, 25000),
REGULATOR_LINEAR_RANGE(1725000, 0x40, 0x7F, 25000),
return false;
}
-static const struct regulator_linear_range as3722_sd2345_ranges[] = {
+static const struct linear_range as3722_sd2345_ranges[] = {
REGULATOR_LINEAR_RANGE(0, 0x00, 0x00, 0),
REGULATOR_LINEAR_RANGE(612500, 0x01, 0x40, 12500),
REGULATOR_LINEAR_RANGE(1425000, 0x41, 0x70, 25000),
.is_enabled = regulator_is_enabled_regmap,
};
-static const struct regulator_linear_range axp20x_ldo4_ranges[] = {
+static const struct linear_range axp20x_ldo4_ranges[] = {
REGULATOR_LINEAR_RANGE(1250000,
AXP20X_LDO4_V_OUT_1250mV_START,
AXP20X_LDO4_V_OUT_1250mV_END,
};
/* DCDC ranges shared with AXP813 */
-static const struct regulator_linear_range axp803_dcdc234_ranges[] = {
+static const struct linear_range axp803_dcdc234_ranges[] = {
REGULATOR_LINEAR_RANGE(500000,
AXP803_DCDC234_500mV_START,
AXP803_DCDC234_500mV_END,
20000),
};
-static const struct regulator_linear_range axp803_dcdc5_ranges[] = {
+static const struct linear_range axp803_dcdc5_ranges[] = {
REGULATOR_LINEAR_RANGE(800000,
AXP803_DCDC5_800mV_START,
AXP803_DCDC5_800mV_END,
20000),
};
-static const struct regulator_linear_range axp803_dcdc6_ranges[] = {
+static const struct linear_range axp803_dcdc6_ranges[] = {
REGULATOR_LINEAR_RANGE(600000,
AXP803_DCDC6_600mV_START,
AXP803_DCDC6_600mV_END,
};
/* AXP806's CLDO2 and AXP809's DLDO1 share the same range */
-static const struct regulator_linear_range axp803_dldo2_ranges[] = {
+static const struct linear_range axp803_dldo2_ranges[] = {
REGULATOR_LINEAR_RANGE(700000,
AXP803_DLDO2_700mV_START,
AXP803_DLDO2_700mV_END,
AXP_DESC_FIXED(AXP803, RTC_LDO, "rtc-ldo", "ips", 3000),
};
-static const struct regulator_linear_range axp806_dcdca_ranges[] = {
+static const struct linear_range axp806_dcdca_ranges[] = {
REGULATOR_LINEAR_RANGE(600000,
AXP806_DCDCA_600mV_START,
AXP806_DCDCA_600mV_END,
20000),
};
-static const struct regulator_linear_range axp806_dcdcd_ranges[] = {
+static const struct linear_range axp806_dcdcd_ranges[] = {
REGULATOR_LINEAR_RANGE(600000,
AXP806_DCDCD_600mV_START,
AXP806_DCDCD_600mV_END,
AXP806_PWR_OUT_CTRL2, AXP806_PWR_OUT_SW_MASK),
};
-static const struct regulator_linear_range axp809_dcdc4_ranges[] = {
+static const struct linear_range axp809_dcdc4_ranges[] = {
REGULATOR_LINEAR_RANGE(600000,
AXP809_DCDC4_600mV_START,
AXP809_DCDC4_600mV_END,
};
/* DCDC group CSR: supported voltages in microvolts */
-static const struct regulator_linear_range dcdc_csr_ranges[] = {
+static const struct linear_range dcdc_csr_ranges[] = {
REGULATOR_LINEAR_RANGE(860000, 2, 50, 10000),
REGULATOR_LINEAR_RANGE(1360000, 51, 55, 20000),
REGULATOR_LINEAR_RANGE(900000, 56, 63, 0),
};
/* DCDC group IOSR1: supported voltages in microvolts */
-static const struct regulator_linear_range dcdc_iosr1_ranges[] = {
+static const struct linear_range dcdc_iosr1_ranges[] = {
REGULATOR_LINEAR_RANGE(860000, 2, 51, 10000),
REGULATOR_LINEAR_RANGE(1500000, 52, 52, 0),
REGULATOR_LINEAR_RANGE(1800000, 53, 53, 0),
};
/* DCDC group SDSR1: supported voltages in microvolts */
-static const struct regulator_linear_range dcdc_sdsr1_ranges[] = {
+static const struct linear_range dcdc_sdsr1_ranges[] = {
REGULATOR_LINEAR_RANGE(860000, 2, 50, 10000),
REGULATOR_LINEAR_RANGE(1340000, 51, 51, 0),
REGULATOR_LINEAR_RANGE(900000, 52, 63, 0),
u8 n_voltages;
const unsigned int *volt_table;
u8 n_linear_ranges;
- const struct regulator_linear_range *linear_ranges;
+ const struct linear_range *linear_ranges;
};
#define BCM590XX_REG_TABLE(_name, _table) \
#define BUCK_RAMPRATE_125MV 1
#define BUCK_RAMP_MAX 250
-static const struct regulator_linear_range bd70528_buck1_volts[] = {
+static const struct linear_range bd70528_buck1_volts[] = {
REGULATOR_LINEAR_RANGE(1200000, 0x00, 0x1, 600000),
REGULATOR_LINEAR_RANGE(2750000, 0x2, 0xf, 50000),
};
-static const struct regulator_linear_range bd70528_buck2_volts[] = {
+static const struct linear_range bd70528_buck2_volts[] = {
REGULATOR_LINEAR_RANGE(1200000, 0x00, 0x1, 300000),
REGULATOR_LINEAR_RANGE(1550000, 0x2, 0xd, 50000),
REGULATOR_LINEAR_RANGE(3000000, 0xe, 0xf, 300000),
};
-static const struct regulator_linear_range bd70528_buck3_volts[] = {
+static const struct linear_range bd70528_buck3_volts[] = {
REGULATOR_LINEAR_RANGE(800000, 0x00, 0xd, 50000),
REGULATOR_LINEAR_RANGE(1800000, 0xe, 0xf, 0),
};
/* All LDOs have same voltage ranges */
-static const struct regulator_linear_range bd70528_ldo_volts[] = {
+static const struct linear_range bd70528_ldo_volts[] = {
REGULATOR_LINEAR_RANGE(1650000, 0x0, 0x07, 50000),
REGULATOR_LINEAR_RANGE(2100000, 0x8, 0x0f, 100000),
REGULATOR_LINEAR_RANGE(2850000, 0x10, 0x19, 50000),
},
};
-static const struct regulator_linear_range bd71828_buck1267_volts[] = {
+static const struct linear_range bd71828_buck1267_volts[] = {
REGULATOR_LINEAR_RANGE(500000, 0x00, 0xef, 6250),
REGULATOR_LINEAR_RANGE(2000000, 0xf0, 0xff, 0),
};
-static const struct regulator_linear_range bd71828_buck3_volts[] = {
+static const struct linear_range bd71828_buck3_volts[] = {
REGULATOR_LINEAR_RANGE(1200000, 0x00, 0x0f, 50000),
REGULATOR_LINEAR_RANGE(2000000, 0x10, 0x1f, 0),
};
-static const struct regulator_linear_range bd71828_buck4_volts[] = {
+static const struct linear_range bd71828_buck4_volts[] = {
REGULATOR_LINEAR_RANGE(1000000, 0x00, 0x1f, 25000),
REGULATOR_LINEAR_RANGE(1800000, 0x20, 0x3f, 0),
};
-static const struct regulator_linear_range bd71828_buck5_volts[] = {
+static const struct linear_range bd71828_buck5_volts[] = {
REGULATOR_LINEAR_RANGE(2500000, 0x00, 0x0f, 50000),
REGULATOR_LINEAR_RANGE(3300000, 0x10, 0x1f, 0),
};
-static const struct regulator_linear_range bd71828_ldo_volts[] = {
+static const struct linear_range bd71828_ldo_volts[] = {
REGULATOR_LINEAR_RANGE(800000, 0x00, 0x31, 50000),
REGULATOR_LINEAR_RANGE(3300000, 0x32, 0x3f, 0),
};
BUCK_RAMPRATE_MASK, ramp_value << 6);
}
-/* Bucks 1 to 4 support DVS. PWM mode is used when voltage is changed.
+/*
+ * On BD71837 (not on BD71847, BD71850, ...)
+ * Bucks 1 to 4 support DVS. PWM mode is used when voltage is changed.
* Bucks 5 to 8 and LDOs can use PFM and must be disabled when voltage
* is changed. Hence we return -EBUSY for these if voltage is changed
* when BUCK/LDO is enabled.
+ *
+ * On BD71847, BD71850, ... The LDO voltage can be changed when LDO is
+ * enabled. But if voltage is increased the LDO power-good monitoring
+ * must be disabled for the duration of changing + 1mS to ensure voltage
+ * has reached the higher level before HW does next under voltage detection
+ * cycle.
*/
-static int bd718xx_set_voltage_sel_restricted(struct regulator_dev *rdev,
+static int bd71837_set_voltage_sel_restricted(struct regulator_dev *rdev,
unsigned int sel)
{
if (regulator_is_enabled_regmap(rdev))
return regulator_set_voltage_sel_regmap(rdev, sel);
}
+static void voltage_change_done(struct regulator_dev *rdev, unsigned int sel,
+ unsigned int *mask)
+{
+ int ret;
+
+ if (*mask) {
+ /*
+ * Let's allow scheduling as we use I2C anyways. We just need to
+ * guarantee minimum of 1ms sleep - it shouldn't matter if we
+ * exceed it due to the scheduling.
+ */
+ msleep(1);
+ /*
+ * Note for next hacker. The PWRGOOD should not be masked on
+ * BD71847 so we will just unconditionally enable detection
+ * when voltage is set.
+ * If someone want's to disable PWRGOOD he must implement
+ * caching and restoring the old value here. I am not
+ * aware of such use-cases so for the sake of the simplicity
+ * we just always enable PWRGOOD here.
+ */
+ ret = regmap_update_bits(rdev->regmap, BD718XX_REG_MVRFLTMASK2,
+ *mask, 0);
+ if (ret)
+ dev_err(&rdev->dev,
+ "Failed to re-enable voltage monitoring (%d)\n",
+ ret);
+ }
+}
+
+static int voltage_change_prepare(struct regulator_dev *rdev, unsigned int sel,
+ unsigned int *mask)
+{
+ int ret;
+
+ *mask = 0;
+ if (regulator_is_enabled_regmap(rdev)) {
+ int now, new;
+
+ now = rdev->desc->ops->get_voltage_sel(rdev);
+ if (now < 0)
+ return now;
+
+ now = rdev->desc->ops->list_voltage(rdev, now);
+ if (now < 0)
+ return now;
+
+ new = rdev->desc->ops->list_voltage(rdev, sel);
+ if (new < 0)
+ return new;
+
+ /*
+ * If we increase LDO voltage when LDO is enabled we need to
+ * disable the power-good detection until voltage has reached
+ * the new level. According to HW colleagues the maximum time
+ * it takes is 1000us. I assume that on systems with light load
+ * this might be less - and we could probably use DT to give
+ * system specific delay value if performance matters.
+ *
+ * Well, knowing we use I2C here and can add scheduling delays
+ * I don't think it is worth the hassle and I just add fixed
+ * 1ms sleep here (and allow scheduling). If this turns out to
+ * be a problem we can change it to delay and make the delay
+ * time configurable.
+ */
+ if (new > now) {
+ int ldo_offset = rdev->desc->id - BD718XX_LDO1;
+
+ *mask = BD718XX_LDO1_VRMON80 << ldo_offset;
+ ret = regmap_update_bits(rdev->regmap,
+ BD718XX_REG_MVRFLTMASK2,
+ *mask, *mask);
+ if (ret) {
+ dev_err(&rdev->dev,
+ "Failed to stop voltage monitoring\n");
+ return ret;
+ }
+ }
+ }
+
+ return 0;
+}
+
+static int bd718xx_set_voltage_sel_restricted(struct regulator_dev *rdev,
+ unsigned int sel)
+{
+ int ret;
+ int mask;
+
+ ret = voltage_change_prepare(rdev, sel, &mask);
+ if (ret)
+ return ret;
+
+ ret = regulator_set_voltage_sel_regmap(rdev, sel);
+ voltage_change_done(rdev, sel, &mask);
+
+ return ret;
+}
+
static int bd718xx_set_voltage_sel_pickable_restricted(
struct regulator_dev *rdev, unsigned int sel)
+{
+ int ret;
+ int mask;
+
+ ret = voltage_change_prepare(rdev, sel, &mask);
+ if (ret)
+ return ret;
+
+ ret = regulator_set_voltage_sel_pickable_regmap(rdev, sel);
+ voltage_change_done(rdev, sel, &mask);
+
+ return ret;
+}
+
+static int bd71837_set_voltage_sel_pickable_restricted(
+ struct regulator_dev *rdev, unsigned int sel)
{
if (regulator_is_enabled_regmap(rdev))
return -EBUSY;
.list_voltage = regulator_list_voltage_pickable_linear_range,
.set_voltage_sel = bd718xx_set_voltage_sel_pickable_restricted,
.get_voltage_sel = regulator_get_voltage_sel_pickable_regmap,
+
+};
+
+static const struct regulator_ops bd71837_pickable_range_ldo_ops = {
+ .enable = regulator_enable_regmap,
+ .disable = regulator_disable_regmap,
+ .is_enabled = regulator_is_enabled_regmap,
+ .list_voltage = regulator_list_voltage_pickable_linear_range,
+ .set_voltage_sel = bd71837_set_voltage_sel_pickable_restricted,
+ .get_voltage_sel = regulator_get_voltage_sel_pickable_regmap,
};
static const struct regulator_ops bd718xx_pickable_range_buck_ops = {
.disable = regulator_disable_regmap,
.is_enabled = regulator_is_enabled_regmap,
.list_voltage = regulator_list_voltage_pickable_linear_range,
- .set_voltage_sel = bd718xx_set_voltage_sel_pickable_restricted,
+ .set_voltage_sel = regulator_set_voltage_sel_pickable_regmap,
+ .get_voltage_sel = regulator_get_voltage_sel_pickable_regmap,
+ .set_voltage_time_sel = regulator_set_voltage_time_sel,
+};
+
+static const struct regulator_ops bd71837_pickable_range_buck_ops = {
+ .enable = regulator_enable_regmap,
+ .disable = regulator_disable_regmap,
+ .is_enabled = regulator_is_enabled_regmap,
+ .list_voltage = regulator_list_voltage_pickable_linear_range,
+ .set_voltage_sel = bd71837_set_voltage_sel_pickable_restricted,
.get_voltage_sel = regulator_get_voltage_sel_pickable_regmap,
.set_voltage_time_sel = regulator_set_voltage_time_sel,
};
+static const struct regulator_ops bd71837_ldo_regulator_ops = {
+ .enable = regulator_enable_regmap,
+ .disable = regulator_disable_regmap,
+ .is_enabled = regulator_is_enabled_regmap,
+ .list_voltage = regulator_list_voltage_linear_range,
+ .set_voltage_sel = bd71837_set_voltage_sel_restricted,
+ .get_voltage_sel = regulator_get_voltage_sel_regmap,
+};
+
static const struct regulator_ops bd718xx_ldo_regulator_ops = {
.enable = regulator_enable_regmap,
.disable = regulator_disable_regmap,
.get_voltage_sel = regulator_get_voltage_sel_regmap,
};
+static const struct regulator_ops bd71837_ldo_regulator_nolinear_ops = {
+ .enable = regulator_enable_regmap,
+ .disable = regulator_disable_regmap,
+ .is_enabled = regulator_is_enabled_regmap,
+ .list_voltage = regulator_list_voltage_table,
+ .set_voltage_sel = bd71837_set_voltage_sel_restricted,
+ .get_voltage_sel = regulator_get_voltage_sel_regmap,
+};
+
static const struct regulator_ops bd718xx_ldo_regulator_nolinear_ops = {
.enable = regulator_enable_regmap,
.disable = regulator_disable_regmap,
.disable = regulator_disable_regmap,
.is_enabled = regulator_is_enabled_regmap,
.list_voltage = regulator_list_voltage_linear_range,
- .set_voltage_sel = bd718xx_set_voltage_sel_restricted,
+ .set_voltage_sel = regulator_set_voltage_sel_regmap,
+ .get_voltage_sel = regulator_get_voltage_sel_regmap,
+ .set_voltage_time_sel = regulator_set_voltage_time_sel,
+};
+
+static const struct regulator_ops bd71837_buck_regulator_ops = {
+ .enable = regulator_enable_regmap,
+ .disable = regulator_disable_regmap,
+ .is_enabled = regulator_is_enabled_regmap,
+ .list_voltage = regulator_list_voltage_linear_range,
+ .set_voltage_sel = bd71837_set_voltage_sel_restricted,
.get_voltage_sel = regulator_get_voltage_sel_regmap,
.set_voltage_time_sel = regulator_set_voltage_time_sel,
};
static const struct regulator_ops bd718xx_buck_regulator_nolinear_ops = {
+ .enable = regulator_enable_regmap,
+ .disable = regulator_disable_regmap,
+ .is_enabled = regulator_is_enabled_regmap,
+ .list_voltage = regulator_list_voltage_table,
+ .map_voltage = regulator_map_voltage_ascend,
+ .set_voltage_sel = regulator_set_voltage_sel_regmap,
+ .get_voltage_sel = regulator_get_voltage_sel_regmap,
+ .set_voltage_time_sel = regulator_set_voltage_time_sel,
+};
+
+static const struct regulator_ops bd71837_buck_regulator_nolinear_ops = {
.enable = regulator_enable_regmap,
.disable = regulator_disable_regmap,
.is_enabled = regulator_is_enabled_regmap,
* BD71847 BUCK1/2
* 0.70 to 1.30V (10mV step)
*/
-static const struct regulator_linear_range bd718xx_dvs_buck_volts[] = {
+static const struct linear_range bd718xx_dvs_buck_volts[] = {
REGULATOR_LINEAR_RANGE(700000, 0x00, 0x3C, 10000),
REGULATOR_LINEAR_RANGE(1300000, 0x3D, 0x3F, 0),
};
* and
* 0.675 to 1.325 (range 1)
*/
-static const struct regulator_linear_range bd71837_buck5_volts[] = {
+static const struct linear_range bd71837_buck5_volts[] = {
/* Ranges when VOLT_SEL bit is 0 */
REGULATOR_LINEAR_RANGE(700000, 0x00, 0x03, 100000),
REGULATOR_LINEAR_RANGE(1050000, 0x04, 0x05, 50000),
/*
* BD71847 BUCK3
*/
-static const struct regulator_linear_range bd71847_buck3_volts[] = {
+static const struct linear_range bd71847_buck3_volts[] = {
/* Ranges when VOLT_SEL bits are 00 */
REGULATOR_LINEAR_RANGE(700000, 0x00, 0x03, 100000),
REGULATOR_LINEAR_RANGE(1050000, 0x04, 0x05, 50000),
0x0, 0x0, 0x0, 0x40, 0x80, 0x80, 0x80
};
-static const struct regulator_linear_range bd71847_buck4_volts[] = {
+static const struct linear_range bd71847_buck4_volts[] = {
REGULATOR_LINEAR_RANGE(3000000, 0x00, 0x03, 100000),
REGULATOR_LINEAR_RANGE(2600000, 0x00, 0x03, 100000),
};
* BUCK6
* 3.0V to 3.3V (step 100mV)
*/
-static const struct regulator_linear_range bd71837_buck6_volts[] = {
+static const struct linear_range bd71837_buck6_volts[] = {
REGULATOR_LINEAR_RANGE(3000000, 0x00, 0x03, 100000),
};
* BUCK8
* 0.8V to 1.40V (step 10mV)
*/
-static const struct regulator_linear_range bd718xx_4th_nodvs_buck_volts[] = {
+static const struct linear_range bd718xx_4th_nodvs_buck_volts[] = {
REGULATOR_LINEAR_RANGE(800000, 0x00, 0x3C, 10000),
};
* LDO1
* 3.0 to 3.3V (100mV step)
*/
-static const struct regulator_linear_range bd718xx_ldo1_volts[] = {
+static const struct linear_range bd718xx_ldo1_volts[] = {
REGULATOR_LINEAR_RANGE(3000000, 0x00, 0x03, 100000),
REGULATOR_LINEAR_RANGE(1600000, 0x00, 0x03, 100000),
};
* LDO3
* 1.8 to 3.3V (100mV step)
*/
-static const struct regulator_linear_range bd718xx_ldo3_volts[] = {
+static const struct linear_range bd718xx_ldo3_volts[] = {
REGULATOR_LINEAR_RANGE(1800000, 0x00, 0x0F, 100000),
};
* LDO4
* 0.9 to 1.8V (100mV step)
*/
-static const struct regulator_linear_range bd718xx_ldo4_volts[] = {
+static const struct linear_range bd718xx_ldo4_volts[] = {
REGULATOR_LINEAR_RANGE(900000, 0x00, 0x09, 100000),
};
* LDO5 for BD71837
* 1.8 to 3.3V (100mV step)
*/
-static const struct regulator_linear_range bd71837_ldo5_volts[] = {
+static const struct linear_range bd71837_ldo5_volts[] = {
REGULATOR_LINEAR_RANGE(1800000, 0x00, 0x0F, 100000),
};
* LDO5 for BD71837
* 1.8 to 3.3V (100mV step)
*/
-static const struct regulator_linear_range bd71847_ldo5_volts[] = {
+static const struct linear_range bd71847_ldo5_volts[] = {
REGULATOR_LINEAR_RANGE(1800000, 0x00, 0x0F, 100000),
REGULATOR_LINEAR_RANGE(800000, 0x00, 0x0F, 100000),
};
* LDO6
* 0.9 to 1.8V (100mV step)
*/
-static const struct regulator_linear_range bd718xx_ldo6_volts[] = {
+static const struct linear_range bd718xx_ldo6_volts[] = {
REGULATOR_LINEAR_RANGE(900000, 0x00, 0x09, 100000),
};
* LDO7
* 1.8 to 3.3V (100mV step)
*/
-static const struct regulator_linear_range bd71837_ldo7_volts[] = {
+static const struct linear_range bd71837_ldo7_volts[] = {
REGULATOR_LINEAR_RANGE(1800000, 0x00, 0x0F, 100000),
};
.of_match = of_match_ptr("BUCK5"),
.regulators_node = of_match_ptr("regulators"),
.id = BD718XX_BUCK5,
- .ops = &bd718xx_pickable_range_buck_ops,
+ .ops = &bd71837_pickable_range_buck_ops,
.type = REGULATOR_VOLTAGE,
.n_voltages = BD71837_BUCK5_VOLTAGE_NUM,
.linear_ranges = bd71837_buck5_volts,
.of_match = of_match_ptr("BUCK6"),
.regulators_node = of_match_ptr("regulators"),
.id = BD718XX_BUCK6,
- .ops = &bd718xx_buck_regulator_ops,
+ .ops = &bd71837_buck_regulator_ops,
.type = REGULATOR_VOLTAGE,
.n_voltages = BD71837_BUCK6_VOLTAGE_NUM,
.linear_ranges = bd71837_buck6_volts,
.of_match = of_match_ptr("BUCK7"),
.regulators_node = of_match_ptr("regulators"),
.id = BD718XX_BUCK7,
- .ops = &bd718xx_buck_regulator_nolinear_ops,
+ .ops = &bd71837_buck_regulator_nolinear_ops,
.type = REGULATOR_VOLTAGE,
.volt_table = &bd718xx_3rd_nodvs_buck_volts[0],
.n_voltages = ARRAY_SIZE(bd718xx_3rd_nodvs_buck_volts),
.of_match = of_match_ptr("BUCK8"),
.regulators_node = of_match_ptr("regulators"),
.id = BD718XX_BUCK8,
- .ops = &bd718xx_buck_regulator_ops,
+ .ops = &bd71837_buck_regulator_ops,
.type = REGULATOR_VOLTAGE,
.n_voltages = BD718XX_4TH_NODVS_BUCK_VOLTAGE_NUM,
.linear_ranges = bd718xx_4th_nodvs_buck_volts,
.of_match = of_match_ptr("LDO1"),
.regulators_node = of_match_ptr("regulators"),
.id = BD718XX_LDO1,
- .ops = &bd718xx_pickable_range_ldo_ops,
+ .ops = &bd71837_pickable_range_ldo_ops,
.type = REGULATOR_VOLTAGE,
.n_voltages = BD718XX_LDO1_VOLTAGE_NUM,
.linear_ranges = bd718xx_ldo1_volts,
.of_match = of_match_ptr("LDO2"),
.regulators_node = of_match_ptr("regulators"),
.id = BD718XX_LDO2,
- .ops = &bd718xx_ldo_regulator_nolinear_ops,
+ .ops = &bd71837_ldo_regulator_nolinear_ops,
.type = REGULATOR_VOLTAGE,
.volt_table = &ldo_2_volts[0],
.vsel_reg = BD718XX_REG_LDO2_VOLT,
.of_match = of_match_ptr("LDO3"),
.regulators_node = of_match_ptr("regulators"),
.id = BD718XX_LDO3,
- .ops = &bd718xx_ldo_regulator_ops,
+ .ops = &bd71837_ldo_regulator_ops,
.type = REGULATOR_VOLTAGE,
.n_voltages = BD718XX_LDO3_VOLTAGE_NUM,
.linear_ranges = bd718xx_ldo3_volts,
.of_match = of_match_ptr("LDO4"),
.regulators_node = of_match_ptr("regulators"),
.id = BD718XX_LDO4,
- .ops = &bd718xx_ldo_regulator_ops,
+ .ops = &bd71837_ldo_regulator_ops,
.type = REGULATOR_VOLTAGE,
.n_voltages = BD718XX_LDO4_VOLTAGE_NUM,
.linear_ranges = bd718xx_ldo4_volts,
.of_match = of_match_ptr("LDO5"),
.regulators_node = of_match_ptr("regulators"),
.id = BD718XX_LDO5,
- .ops = &bd718xx_ldo_regulator_ops,
+ .ops = &bd71837_ldo_regulator_ops,
.type = REGULATOR_VOLTAGE,
.n_voltages = BD71837_LDO5_VOLTAGE_NUM,
.linear_ranges = bd71837_ldo5_volts,
.of_match = of_match_ptr("LDO6"),
.regulators_node = of_match_ptr("regulators"),
.id = BD718XX_LDO6,
- .ops = &bd718xx_ldo_regulator_ops,
+ .ops = &bd71837_ldo_regulator_ops,
.type = REGULATOR_VOLTAGE,
.n_voltages = BD718XX_LDO6_VOLTAGE_NUM,
.linear_ranges = bd718xx_ldo6_volts,
.of_match = of_match_ptr("LDO7"),
.regulators_node = of_match_ptr("regulators"),
.id = BD718XX_LDO7,
- .ops = &bd718xx_ldo_regulator_ops,
+ .ops = &bd71837_ldo_regulator_ops,
.type = REGULATOR_VOLTAGE,
.n_voltages = BD71837_LDO7_VOLTAGE_NUM,
.linear_ranges = bd71837_ldo7_volts,
return done;
}
-static int regulator_balance_voltage(struct regulator_dev *rdev,
- suspend_state_t state)
+int regulator_do_balance_voltage(struct regulator_dev *rdev,
+ suspend_state_t state, bool skip_coupled)
{
struct regulator_dev **c_rdevs;
struct regulator_dev *best_rdev;
struct coupling_desc *c_desc = &rdev->coupling_desc;
- struct regulator_coupler *coupler = c_desc->coupler;
int i, ret, n_coupled, best_min_uV, best_max_uV, best_c_rdev;
unsigned int delta, best_delta;
unsigned long c_rdev_done = 0;
bool best_c_rdev_done;
c_rdevs = c_desc->coupled_rdevs;
- n_coupled = c_desc->n_coupled;
-
- /*
- * If system is in a state other than PM_SUSPEND_ON, don't check
- * other coupled regulators.
- */
- if (state != PM_SUSPEND_ON)
- n_coupled = 1;
-
- if (c_desc->n_resolved < n_coupled) {
- rdev_err(rdev, "Not all coupled regulators registered\n");
- return -EPERM;
- }
-
- /* Invoke custom balancer for customized couplers */
- if (coupler && coupler->balance_voltage)
- return coupler->balance_voltage(coupler, rdev, state);
+ n_coupled = skip_coupled ? 1 : c_desc->n_coupled;
/*
* Find the best possible voltage change on each loop. Leave the loop
return ret;
}
+static int regulator_balance_voltage(struct regulator_dev *rdev,
+ suspend_state_t state)
+{
+ struct coupling_desc *c_desc = &rdev->coupling_desc;
+ struct regulator_coupler *coupler = c_desc->coupler;
+ bool skip_coupled = false;
+
+ /*
+ * If system is in a state other than PM_SUSPEND_ON, don't check
+ * other coupled regulators.
+ */
+ if (state != PM_SUSPEND_ON)
+ skip_coupled = true;
+
+ if (c_desc->n_resolved < c_desc->n_coupled) {
+ rdev_err(rdev, "Not all coupled regulators registered\n");
+ return -EPERM;
+ }
+
+ /* Invoke custom balancer for customized couplers */
+ if (coupler && coupler->balance_voltage)
+ return coupler->balance_voltage(coupler, rdev, state);
+
+ return regulator_do_balance_voltage(rdev, state, skip_coupled);
+}
+
/**
* regulator_set_voltage - set regulator output voltage
* @regulator: regulator source
int regulator_allow_bypass(struct regulator *regulator, bool enable)
{
struct regulator_dev *rdev = regulator->rdev;
+ const char *name = rdev_get_name(rdev);
int ret = 0;
if (!rdev->desc->ops->set_bypass)
rdev->bypass_count++;
if (rdev->bypass_count == rdev->open_count) {
+ trace_regulator_bypass_enable(name);
+
ret = rdev->desc->ops->set_bypass(rdev, enable);
if (ret != 0)
rdev->bypass_count--;
+ else
+ trace_regulator_bypass_enable_complete(name);
}
} else if (!enable && regulator->bypass) {
rdev->bypass_count--;
if (rdev->bypass_count != rdev->open_count) {
+ trace_regulator_bypass_disable(name);
+
ret = rdev->desc->ops->set_bypass(rdev, enable);
if (ret != 0)
rdev->bypass_count++;
+ else
+ trace_regulator_bypass_disable_complete(name);
}
}
seq_printf(s, "%*s%-*s ",
(level + 1) * 3 + 1, "",
30 - (level + 1) * 3,
+ consumer->supply_name ? consumer->supply_name :
consumer->dev ? dev_name(consumer->dev) : "deviceless");
switch (rdev->desc->type) {
return ret;
}
-static const struct regulator_linear_range da9034_ldo12_ranges[] = {
+static const struct linear_range da9034_ldo12_ranges[] = {
REGULATOR_LINEAR_RANGE(1700000, 0, 7, 50000),
REGULATOR_LINEAR_RANGE(2700000, 8, 15, 50000),
};
goto out;
}
- info->is_enabled = 0;
+ info->is_enabled = false;
out:
return ret;
}
unsigned int r_val;
int range;
unsigned int val;
- int ret, i;
- unsigned int voltages_in_range = 0;
+ int ret;
+ unsigned int voltages = 0;
+ const struct linear_range *r = rdev->desc->linear_ranges;
- if (!rdev->desc->linear_ranges)
+ if (!r)
return -EINVAL;
ret = regmap_read(rdev->regmap, rdev->desc->vsel_reg, &val);
if (range < 0)
return -EINVAL;
- for (i = 0; i < range; i++)
- voltages_in_range += (rdev->desc->linear_ranges[i].max_sel -
- rdev->desc->linear_ranges[i].min_sel) + 1;
+ voltages = linear_range_values_in_range_array(r, range);
- return val + voltages_in_range;
+ return val + voltages;
}
EXPORT_SYMBOL_GPL(regulator_get_voltage_sel_pickable_regmap);
unsigned int voltages_in_range = 0;
for (i = 0; i < rdev->desc->n_linear_ranges; i++) {
- voltages_in_range = (rdev->desc->linear_ranges[i].max_sel -
- rdev->desc->linear_ranges[i].min_sel) + 1;
+ const struct linear_range *r;
+
+ r = &rdev->desc->linear_ranges[i];
+ voltages_in_range = linear_range_values_in_range(r);
+
if (sel < voltages_in_range)
break;
sel -= voltages_in_range;
int regulator_map_voltage_linear_range(struct regulator_dev *rdev,
int min_uV, int max_uV)
{
- const struct regulator_linear_range *range;
+ const struct linear_range *range;
int ret = -EINVAL;
+ unsigned int sel;
+ bool found;
int voltage, i;
if (!rdev->desc->n_linear_ranges) {
}
for (i = 0; i < rdev->desc->n_linear_ranges; i++) {
- int linear_max_uV;
-
range = &rdev->desc->linear_ranges[i];
- linear_max_uV = range->min_uV +
- (range->max_sel - range->min_sel) * range->uV_step;
- if (!(min_uV <= linear_max_uV && max_uV >= range->min_uV))
+ ret = linear_range_get_selector_high(range, min_uV, &sel,
+ &found);
+ if (ret)
continue;
-
- if (min_uV <= range->min_uV)
- min_uV = range->min_uV;
-
- /* range->uV_step == 0 means fixed voltage range */
- if (range->uV_step == 0) {
- ret = 0;
- } else {
- ret = DIV_ROUND_UP(min_uV - range->min_uV,
- range->uV_step);
- if (ret < 0)
- return ret;
- }
-
- ret += range->min_sel;
+ ret = sel;
/*
* Map back into a voltage to verify we're still in bounds.
* If we are not, then continue checking rest of the ranges.
*/
- voltage = rdev->desc->ops->list_voltage(rdev, ret);
+ voltage = rdev->desc->ops->list_voltage(rdev, sel);
if (voltage >= min_uV && voltage <= max_uV)
break;
}
int regulator_map_voltage_pickable_linear_range(struct regulator_dev *rdev,
int min_uV, int max_uV)
{
- const struct regulator_linear_range *range;
+ const struct linear_range *range;
int ret = -EINVAL;
int voltage, i;
unsigned int selector = 0;
for (i = 0; i < rdev->desc->n_linear_ranges; i++) {
int linear_max_uV;
+ bool found;
+ unsigned int sel;
range = &rdev->desc->linear_ranges[i];
- linear_max_uV = range->min_uV +
- (range->max_sel - range->min_sel) * range->uV_step;
+ linear_max_uV = linear_range_get_max_value(range);
- if (!(min_uV <= linear_max_uV && max_uV >= range->min_uV)) {
- selector += (range->max_sel - range->min_sel + 1);
+ if (!(min_uV <= linear_max_uV && max_uV >= range->min)) {
+ selector += linear_range_values_in_range(range);
continue;
}
- if (min_uV <= range->min_uV)
- min_uV = range->min_uV;
-
- /* range->uV_step == 0 means fixed voltage range */
- if (range->uV_step == 0) {
- ret = 0;
- } else {
- ret = DIV_ROUND_UP(min_uV - range->min_uV,
- range->uV_step);
- if (ret < 0)
- return ret;
+ ret = linear_range_get_selector_high(range, min_uV, &sel,
+ &found);
+ if (ret) {
+ selector += linear_range_values_in_range(range);
+ continue;
}
- ret += selector;
+ ret = selector + sel;
voltage = rdev->desc->ops->list_voltage(rdev, ret);
* exit but retry until we have checked all ranges.
*/
if (voltage < min_uV || voltage > max_uV)
- selector += (range->max_sel - range->min_sel + 1);
+ selector += linear_range_values_in_range(range);
else
break;
}
int regulator_list_voltage_pickable_linear_range(struct regulator_dev *rdev,
unsigned int selector)
{
- const struct regulator_linear_range *range;
+ const struct linear_range *range;
int i;
unsigned int all_sels = 0;
}
for (i = 0; i < rdev->desc->n_linear_ranges; i++) {
- unsigned int sels_in_range;
+ unsigned int sel_indexes;
range = &rdev->desc->linear_ranges[i];
- sels_in_range = range->max_sel - range->min_sel;
+ sel_indexes = linear_range_values_in_range(range) - 1;
- if (all_sels + sels_in_range >= selector) {
+ if (all_sels + sel_indexes >= selector) {
selector -= all_sels;
- return range->min_uV + (range->uV_step * selector);
+ /*
+ * As we see here, pickable ranges work only as
+ * long as the first selector for each pickable
+ * range is 0, and the each subsequent range for
+ * this 'pick' follow immediately at next unused
+ * selector (Eg. there is no gaps between ranges).
+ * I think this is fine but it probably should be
+ * documented. OTOH, whole pickable range stuff
+ * might benefit from some documentation
+ */
+ return range->min + (range->step * selector);
}
- all_sels += (sels_in_range + 1);
+ all_sels += (sel_indexes + 1);
}
return -EINVAL;
int regulator_desc_list_voltage_linear_range(const struct regulator_desc *desc,
unsigned int selector)
{
- const struct regulator_linear_range *range;
- int i;
-
- if (!desc->n_linear_ranges) {
- BUG_ON(!desc->n_linear_ranges);
- return -EINVAL;
- }
-
- for (i = 0; i < desc->n_linear_ranges; i++) {
- range = &desc->linear_ranges[i];
-
- if (!(selector >= range->min_sel &&
- selector <= range->max_sel))
- continue;
+ unsigned int val;
+ int ret;
- selector -= range->min_sel;
+ BUG_ON(!desc->n_linear_ranges);
- return range->min_uV + (range->uV_step * selector);
- }
+ ret = linear_range_get_value_array(desc->linear_ranges,
+ desc->n_linear_ranges, selector,
+ &val);
+ if (ret)
+ return ret;
- return -EINVAL;
+ return val;
}
EXPORT_SYMBOL_GPL(regulator_desc_list_voltage_linear_range);
};
/* Ranges are sorted in ascending order. */
-static const struct regulator_linear_range ldo_audio_volt_range[] = {
+static const struct linear_range ldo_audio_volt_range[] = {
REGULATOR_LINEAR_RANGE(2800000, 0, 3, 50000),
REGULATOR_LINEAR_RANGE(3000000, 4, 7, 100000),
};
* _id - LDO id name string
* _match - of match name string
* n_volt - number of votages available
- * volt_ranges - array of regulator_linear_range
+ * volt_ranges - array of linear_range
* vstep - voltage increase in each linear step in uV
* vreg - voltage select register
* vmask - voltage select mask
.set_voltage_sel = regulator_set_voltage_sel_regmap,
};
-static const struct regulator_linear_range lochnagar_micvdd_ranges[] = {
+static const struct linear_range lochnagar_micvdd_ranges[] = {
REGULATOR_LINEAR_RANGE(1000000, 0, 0xC, 50000),
REGULATOR_LINEAR_RANGE(1700000, 0xD, 0x1F, 100000),
};
.set_voltage_sel = regulator_set_voltage_sel_regmap,
};
-static const struct regulator_linear_range lochnagar_vddcore_ranges[] = {
+static const struct linear_range lochnagar_vddcore_ranges[] = {
REGULATOR_LINEAR_RANGE(600000, 0x8, 0x41, 12500),
};
static const struct lp873x_regulator regulators[];
-static const struct regulator_linear_range buck0_buck1_ranges[] = {
+static const struct linear_range buck0_buck1_ranges[] = {
REGULATOR_LINEAR_RANGE(0, 0x0, 0x13, 0),
REGULATOR_LINEAR_RANGE(700000, 0x14, 0x17, 10000),
REGULATOR_LINEAR_RANGE(735000, 0x18, 0x9d, 5000),
REGULATOR_LINEAR_RANGE(1420000, 0x9e, 0xff, 20000),
};
-static const struct regulator_linear_range ldo0_ldo1_ranges[] = {
+static const struct linear_range ldo0_ldo1_ranges[] = {
REGULATOR_LINEAR_RANGE(800000, 0x0, 0x19, 100000),
};
static const struct lp87565_regulator regulators[];
-static const struct regulator_linear_range buck0_1_2_3_ranges[] = {
+static const struct linear_range buck0_1_2_3_ranges[] = {
REGULATOR_LINEAR_RANGE(600000, 0xA, 0x17, 10000),
REGULATOR_LINEAR_RANGE(735000, 0x18, 0x9d, 5000),
REGULATOR_LINEAR_RANGE(1420000, 0x9e, 0xff, 20000),
};
/* BUCK 1 ~ 4 voltage ranges */
-static const struct regulator_linear_range buck_volt_ranges[] = {
+static const struct linear_range buck_volt_ranges[] = {
REGULATOR_LINEAR_RANGE(500000, 0, 0, 0),
REGULATOR_LINEAR_RANGE(800000, 1, 25, 50000),
};
0x0, 0x1, 0x2, 0x3
};
-static const struct regulator_linear_range max77651_sbb1_volt_ranges[] = {
+static const struct linear_range max77651_sbb1_volt_ranges[] = {
/* range index 0 */
REGULATOR_LINEAR_RANGE(2400000, 0x00, 0x0f, 50000),
/* range index 1 */
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0-or-later
+//
+// max77826-regulator.c - regulator driver for Maxim MAX77826
+//
+// Author: Iskren Chernev <iskren.chernev@gmail.com>
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/init.h>
+#include <linux/err.h>
+#include <linux/of.h>
+#include <linux/of_device.h>
+#include <linux/platform_device.h>
+#include <linux/regulator/driver.h>
+#include <linux/regulator/of_regulator.h>
+#include <linux/i2c.h>
+#include <linux/regmap.h>
+
+enum max77826_registers {
+ MAX77826_REG_INT_SRC = 0x00,
+ MAX77826_REG_SYS_INT,
+ MAX77826_REG_INT1,
+ MAX77826_REG_INT2,
+ MAX77826_REG_BB_INT,
+ MAX77826_REG_INT_SRC_M,
+ MAX77826_REG_TOPSYS_INT_M,
+ MAX77826_REG_INT1_M,
+ MAX77826_REG_INT2_M,
+ MAX77826_REG_BB_INT_M,
+ MAX77826_REG_TOPSYS_STAT,
+ MAX77826_REG_STAT1,
+ MAX77826_REG_STAT2,
+ MAX77826_REG_BB_STAT,
+ /* 0x0E - 0x0F: Reserved */
+ MAX77826_REG_LDO_OPMD1 = 0x10,
+ MAX77826_REG_LDO_OPMD2,
+ MAX77826_REG_LDO_OPMD3,
+ MAX77826_REG_LDO_OPMD4,
+ MAX77826_REG_B_BB_OPMD,
+ /* 0x15 - 0x1F: Reserved */
+ MAX77826_REG_LDO1_CFG = 0x20,
+ MAX77826_REG_LDO2_CFG,
+ MAX77826_REG_LDO3_CFG,
+ MAX77826_REG_LDO4_CFG,
+ MAX77826_REG_LDO5_CFG,
+ MAX77826_REG_LDO6_CFG,
+ MAX77826_REG_LDO7_CFG,
+ MAX77826_REG_LDO8_CFG,
+ MAX77826_REG_LDO9_CFG,
+ MAX77826_REG_LDO10_CFG,
+ MAX77826_REG_LDO11_CFG,
+ MAX77826_REG_LDO12_CFG,
+ MAX77826_REG_LDO13_CFG,
+ MAX77826_REG_LDO14_CFG,
+ MAX77826_REG_LDO15_CFG,
+ /* 0x2F: Reserved */
+ MAX77826_REG_BUCK_CFG = 0x30,
+ MAX77826_REG_BUCK_VOUT,
+ MAX77826_REG_BB_CFG,
+ MAX77826_REG_BB_VOUT,
+ /* 0x34 - 0x3F: Reserved */
+ MAX77826_REG_BUCK_SS_FREQ = 0x40,
+ MAX77826_REG_UVLO_FALL,
+ /* 0x42 - 0xCE: Reserved */
+ MAX77826_REG_DEVICE_ID = 0xCF,
+};
+
+enum max77826_regulators {
+ MAX77826_LDO1 = 0,
+ MAX77826_LDO2,
+ MAX77826_LDO3,
+ MAX77826_LDO4,
+ MAX77826_LDO5,
+ MAX77826_LDO6,
+ MAX77826_LDO7,
+ MAX77826_LDO8,
+ MAX77826_LDO9,
+ MAX77826_LDO10,
+ MAX77826_LDO11,
+ MAX77826_LDO12,
+ MAX77826_LDO13,
+ MAX77826_LDO14,
+ MAX77826_LDO15,
+ MAX77826_BUCK,
+ MAX77826_BUCKBOOST,
+ MAX77826_MAX_REGULATORS,
+};
+
+#define MAX77826_MASK_LDO 0x7f
+#define MAX77826_MASK_BUCK 0xff
+#define MAX77826_MASK_BUCKBOOST 0x7f
+#define MAX77826_BUCK_RAMP_DELAY 12500
+
+/* values in mV */
+/* for LDO1-3 */
+#define MAX77826_NMOS_LDO_VOLT_MIN 600000
+#define MAX77826_NMOS_LDO_VOLT_MAX 2187500
+#define MAX77826_NMOS_LDO_VOLT_STEP 12500
+
+/* for LDO4-15 */
+#define MAX77826_PMOS_LDO_VOLT_MIN 800000
+#define MAX77826_PMOS_LDO_VOLT_MAX 3975000
+#define MAX77826_PMOS_LDO_VOLT_STEP 25000
+
+/* for BUCK */
+#define MAX77826_BUCK_VOLT_MIN 500000
+#define MAX77826_BUCK_VOLT_MAX 1800000
+#define MAX77826_BUCK_VOLT_STEP 6250
+
+/* for BUCKBOOST */
+#define MAX77826_BUCKBOOST_VOLT_MIN 2600000
+#define MAX77826_BUCKBOOST_VOLT_MAX 4187500
+#define MAX77826_BUCKBOOST_VOLT_STEP 12500
+#define MAX77826_VOLT_RANGE(_type) \
+ ((MAX77826_ ## _type ## _VOLT_MAX - \
+ MAX77826_ ## _type ## _VOLT_MIN) / \
+ MAX77826_ ## _type ## _VOLT_STEP + 1)
+
+#define MAX77826_LDO(_id, _type) \
+ [MAX77826_LDO ## _id] = { \
+ .id = MAX77826_LDO ## _id, \
+ .name = "LDO"#_id, \
+ .of_match = of_match_ptr("LDO"#_id), \
+ .regulators_node = "regulators", \
+ .ops = &max77826_most_ops, \
+ .min_uV = MAX77826_ ## _type ## _LDO_VOLT_MIN, \
+ .uV_step = MAX77826_ ## _type ## _LDO_VOLT_STEP, \
+ .n_voltages = MAX77826_VOLT_RANGE(_type ## _LDO), \
+ .enable_reg = MAX77826_REG_LDO_OPMD1 + (_id - 1) / 4, \
+ .enable_mask = BIT(((_id - 1) % 4) * 2 + 1), \
+ .vsel_reg = MAX77826_REG_LDO1_CFG + (_id - 1), \
+ .vsel_mask = MAX77826_MASK_LDO, \
+ .owner = THIS_MODULE, \
+ }
+
+#define MAX77826_BUCK(_idx, _id, _ops) \
+ [MAX77826_ ## _id] = { \
+ .id = MAX77826_ ## _id, \
+ .name = #_id, \
+ .of_match = of_match_ptr(#_id), \
+ .regulators_node = "regulators", \
+ .ops = &_ops, \
+ .min_uV = MAX77826_ ## _id ## _VOLT_MIN, \
+ .uV_step = MAX77826_ ## _id ## _VOLT_STEP, \
+ .n_voltages = MAX77826_VOLT_RANGE(_id), \
+ .enable_reg = MAX77826_REG_B_BB_OPMD, \
+ .enable_mask = BIT(_idx * 2 + 1), \
+ .vsel_reg = MAX77826_REG_BUCK_VOUT + _idx * 2, \
+ .vsel_mask = MAX77826_MASK_ ## _id, \
+ .owner = THIS_MODULE, \
+ }
+
+
+
+struct max77826_regulator_info {
+ struct regmap *regmap;
+ struct regulator_desc *rdesc;
+};
+
+static const struct regmap_config max77826_regmap_config = {
+ .reg_bits = 8,
+ .val_bits = 8,
+ .max_register = MAX77826_REG_DEVICE_ID,
+};
+
+static int max77826_set_voltage_time_sel(struct regulator_dev *,
+ unsigned int old_selector,
+ unsigned int new_selector);
+
+static const struct regulator_ops max77826_most_ops = {
+ .enable = regulator_enable_regmap,
+ .disable = regulator_disable_regmap,
+ .is_enabled = regulator_is_enabled_regmap,
+ .list_voltage = regulator_list_voltage_linear,
+ .map_voltage = regulator_map_voltage_linear,
+ .get_voltage_sel = regulator_get_voltage_sel_regmap,
+ .set_voltage_sel = regulator_set_voltage_sel_regmap,
+};
+
+static const struct regulator_ops max77826_buck_ops = {
+ .enable = regulator_enable_regmap,
+ .disable = regulator_disable_regmap,
+ .is_enabled = regulator_is_enabled_regmap,
+ .list_voltage = regulator_list_voltage_linear,
+ .map_voltage = regulator_map_voltage_linear,
+ .get_voltage_sel = regulator_get_voltage_sel_regmap,
+ .set_voltage_sel = regulator_set_voltage_sel_regmap,
+ .set_voltage_time_sel = max77826_set_voltage_time_sel,
+};
+
+static struct regulator_desc max77826_regulators_desc[] = {
+ MAX77826_LDO(1, NMOS),
+ MAX77826_LDO(2, NMOS),
+ MAX77826_LDO(3, NMOS),
+ MAX77826_LDO(4, PMOS),
+ MAX77826_LDO(5, PMOS),
+ MAX77826_LDO(6, PMOS),
+ MAX77826_LDO(7, PMOS),
+ MAX77826_LDO(8, PMOS),
+ MAX77826_LDO(9, PMOS),
+ MAX77826_LDO(10, PMOS),
+ MAX77826_LDO(11, PMOS),
+ MAX77826_LDO(12, PMOS),
+ MAX77826_LDO(13, PMOS),
+ MAX77826_LDO(14, PMOS),
+ MAX77826_LDO(15, PMOS),
+ MAX77826_BUCK(0, BUCK, max77826_buck_ops),
+ MAX77826_BUCK(1, BUCKBOOST, max77826_most_ops),
+};
+
+static int max77826_set_voltage_time_sel(struct regulator_dev *rdev,
+ unsigned int old_selector,
+ unsigned int new_selector)
+{
+ if (new_selector > old_selector) {
+ return DIV_ROUND_UP(MAX77826_BUCK_VOLT_STEP *
+ (new_selector - old_selector),
+ MAX77826_BUCK_RAMP_DELAY);
+ }
+
+ return 0;
+}
+
+static int max77826_read_device_id(struct regmap *regmap, struct device *dev)
+{
+ unsigned int device_id;
+ int res;
+
+ res = regmap_read(regmap, MAX77826_REG_DEVICE_ID, &device_id);
+ if (!res)
+ dev_dbg(dev, "DEVICE_ID: 0x%x\n", device_id);
+
+ return res;
+}
+
+static int max77826_i2c_probe(struct i2c_client *client)
+{
+ struct device *dev = &client->dev;
+ struct max77826_regulator_info *info;
+ struct regulator_config config = {};
+ struct regulator_dev *rdev;
+ struct regmap *regmap;
+ int i;
+
+ info = devm_kzalloc(dev, sizeof(struct max77826_regulator_info),
+ GFP_KERNEL);
+ if (!info)
+ return -ENOMEM;
+
+ info->rdesc = max77826_regulators_desc;
+ regmap = devm_regmap_init_i2c(client, &max77826_regmap_config);
+ if (IS_ERR(regmap)) {
+ dev_err(dev, "Failed to allocate regmap!\n");
+ return PTR_ERR(regmap);
+ }
+
+ info->regmap = regmap;
+ i2c_set_clientdata(client, info);
+
+ config.dev = dev;
+ config.regmap = regmap;
+ config.driver_data = info;
+
+ for (i = 0; i < MAX77826_MAX_REGULATORS; i++) {
+ rdev = devm_regulator_register(dev,
+ &max77826_regulators_desc[i],
+ &config);
+ if (IS_ERR(rdev)) {
+ dev_err(dev, "Failed to register regulator!\n");
+ return PTR_ERR(rdev);
+ }
+ }
+
+ return max77826_read_device_id(regmap, dev);
+}
+
+static const struct of_device_id max77826_of_match[] = {
+ { .compatible = "maxim,max77826" },
+ { /* sentinel */ }
+};
+MODULE_DEVICE_TABLE(of, max77826_of_match);
+
+static const struct i2c_device_id max77826_id[] = {
+ { "max77826-regulator" },
+ { /* sentinel */ }
+};
+MODULE_DEVICE_TABLE(i2c, max77826_id);
+
+static struct i2c_driver max77826_regulator_driver = {
+ .driver = {
+ .name = "max77826",
+ .of_match_table = of_match_ptr(max77826_of_match),
+ },
+ .probe_new = max77826_i2c_probe,
+ .id_table = max77826_id,
+};
+module_i2c_driver(max77826_regulator_driver);
+
+MODULE_AUTHOR("Iskren Chernev <iskren.chernev@gmail.com>");
+MODULE_DESCRIPTION("MAX77826 PMIC regulator driver");
+MODULE_LICENSE("GPL");
unsigned int buck2_idx;
};
+static const unsigned int charger_current_table[] = {
+ 90000, 380000, 475000, 550000, 570000, 600000, 700000, 800000,
+};
+
static int max8998_get_enable_register(struct regulator_dev *rdev,
int *reg, int *shift)
{
*reg = MAX8998_REG_CHGR2;
*shift = 7 - (ldo - MAX8998_ESAFEOUT1);
break;
+ case MAX8998_CHARGER:
+ *reg = MAX8998_REG_CHGR2;
+ *shift = 0;
+ break;
default:
return -EINVAL;
}
return val & (1 << shift);
}
+static int max8998_ldo_is_enabled_inverted(struct regulator_dev *rdev)
+{
+ return (!max8998_ldo_is_enabled(rdev));
+}
+
static int max8998_ldo_enable(struct regulator_dev *rdev)
{
struct max8998_data *max8998 = rdev_get_drvdata(rdev);
return 0;
}
+static int max8998_set_current_limit(struct regulator_dev *rdev,
+ int min_uA, int max_uA)
+{
+ struct max8998_data *max8998 = rdev_get_drvdata(rdev);
+ struct i2c_client *i2c = max8998->iodev->i2c;
+ unsigned int n_currents = rdev->desc->n_current_limits;
+ int i, sel = -1;
+
+ if (n_currents == 0)
+ return -EINVAL;
+
+ if (rdev->desc->curr_table) {
+ const unsigned int *curr_table = rdev->desc->curr_table;
+ bool ascend = curr_table[n_currents - 1] > curr_table[0];
+
+ /* search for closest to maximum */
+ if (ascend) {
+ for (i = n_currents - 1; i >= 0; i--) {
+ if (min_uA <= curr_table[i] &&
+ curr_table[i] <= max_uA) {
+ sel = i;
+ break;
+ }
+ }
+ } else {
+ for (i = 0; i < n_currents; i++) {
+ if (min_uA <= curr_table[i] &&
+ curr_table[i] <= max_uA) {
+ sel = i;
+ break;
+ }
+ }
+ }
+ }
+
+ if (sel < 0)
+ return -EINVAL;
+
+ sel <<= ffs(rdev->desc->csel_mask) - 1;
+
+ return max8998_update_reg(i2c, rdev->desc->csel_reg,
+ sel, rdev->desc->csel_mask);
+}
+
+int max8998_get_current_limit(struct regulator_dev *rdev)
+{
+ struct max8998_data *max8998 = rdev_get_drvdata(rdev);
+ struct i2c_client *i2c = max8998->iodev->i2c;
+ u8 val;
+ int ret;
+
+ ret = max8998_read_reg(i2c, rdev->desc->csel_reg, &val);
+ if (ret != 0)
+ return ret;
+
+ val &= rdev->desc->csel_mask;
+ val >>= ffs(rdev->desc->csel_mask) - 1;
+
+ if (rdev->desc->curr_table) {
+ if (val >= rdev->desc->n_current_limits)
+ return -EINVAL;
+
+ return rdev->desc->curr_table[val];
+ }
+
+ return -EINVAL;
+}
+
static const struct regulator_ops max8998_ldo_ops = {
.list_voltage = regulator_list_voltage_linear,
.map_voltage = regulator_map_voltage_linear,
.set_voltage_time_sel = max8998_set_voltage_buck_time_sel,
};
+static const struct regulator_ops max8998_charger_ops = {
+ .set_current_limit = max8998_set_current_limit,
+ .get_current_limit = max8998_get_current_limit,
+ .is_enabled = max8998_ldo_is_enabled_inverted,
+ /* Swapped as register is inverted */
+ .enable = max8998_ldo_disable,
+ .disable = max8998_ldo_enable,
+};
+
static const struct regulator_ops max8998_others_ops = {
.is_enabled = max8998_ldo_is_enabled,
.enable = max8998_ldo_enable,
.owner = THIS_MODULE, \
}
+#define MAX8998_CURRENT_REG(_name, _ops, _table, _reg, _mask) \
+ { \
+ .name = #_name, \
+ .id = MAX8998_##_name, \
+ .ops = _ops, \
+ .curr_table = _table, \
+ .n_current_limits = ARRAY_SIZE(_table), \
+ .csel_reg = _reg, \
+ .csel_mask = _mask, \
+ .type = REGULATOR_CURRENT, \
+ .owner = THIS_MODULE, \
+ }
+
#define MAX8998_OTHERS_REG(_name, _id) \
{ \
.name = #_name, \
MAX8998_OTHERS_REG(ENVICHG, MAX8998_ENVICHG),
MAX8998_OTHERS_REG(ESAFEOUT1, MAX8998_ESAFEOUT1),
MAX8998_OTHERS_REG(ESAFEOUT2, MAX8998_ESAFEOUT2),
+ MAX8998_CURRENT_REG(CHARGER, &max8998_charger_ops,
+ charger_current_table, MAX8998_REG_CHGR1, 0x7),
};
static int max8998_pmic_dt_parse_dvs_gpio(struct max8998_dev *iodev,
};
MODULE_DEVICE_TABLE(of, mcp16502_ids);
-static const struct regulator_linear_range b1l12_ranges[] = {
+static const struct linear_range b1l12_ranges[] = {
REGULATOR_LINEAR_RANGE(1200000, VDD_LOW_SEL, VDD_HIGH_SEL, 50000),
};
-static const struct regulator_linear_range b234_ranges[] = {
+static const struct linear_range b234_ranges[] = {
REGULATOR_LINEAR_RANGE(600000, VDD_LOW_SEL, VDD_HIGH_SEL, 25000),
};
return val;
}
-static const struct regulator_linear_range mp8859_dcdc_ranges[] = {
+static const struct linear_range mp8859_dcdc_ranges[] = {
REGULATOR_LINEAR_RANGE(0, VOL_MIN_IDX, VOL_MAX_IDX, 10000),
};
.modeset_mask = _modeset_mask, \
}
-static const struct regulator_linear_range buck_volt_range1[] = {
+static const struct linear_range buck_volt_range1[] = {
REGULATOR_LINEAR_RANGE(700000, 0, 0x7f, 6250),
};
-static const struct regulator_linear_range buck_volt_range2[] = {
+static const struct linear_range buck_volt_range2[] = {
REGULATOR_LINEAR_RANGE(1400000, 0, 0x7f, 12500),
};
-static const struct regulator_linear_range buck_volt_range3[] = {
+static const struct linear_range buck_volt_range3[] = {
REGULATOR_LINEAR_RANGE(500000, 0, 0x3f, 50000),
};
.qi = BIT(15), \
}
-static const struct regulator_linear_range buck_volt_range1[] = {
+static const struct linear_range buck_volt_range1[] = {
REGULATOR_LINEAR_RANGE(500000, 0, 0x7f, 6250),
};
-static const struct regulator_linear_range buck_volt_range2[] = {
+static const struct linear_range buck_volt_range2[] = {
REGULATOR_LINEAR_RANGE(500000, 0, 0x7f, 12500),
};
-static const struct regulator_linear_range buck_volt_range3[] = {
+static const struct linear_range buck_volt_range3[] = {
REGULATOR_LINEAR_RANGE(500000, 0, 0x3f, 50000),
};
-static const struct regulator_linear_range buck_volt_range4[] = {
+static const struct linear_range buck_volt_range4[] = {
REGULATOR_LINEAR_RANGE(1000000, 0, 0x7f, 12500),
};
.modeset_mask = _modeset_mask, \
}
-static const struct regulator_linear_range buck_volt_range1[] = {
+static const struct linear_range buck_volt_range1[] = {
REGULATOR_LINEAR_RANGE(600000, 0, 0xfe, 6250),
};
-static const struct regulator_linear_range buck_volt_range2[] = {
+static const struct linear_range buck_volt_range2[] = {
REGULATOR_LINEAR_RANGE(600000, 0, 0xfe, 6250),
};
-static const struct regulator_linear_range buck_volt_range3[] = {
+static const struct linear_range buck_volt_range3[] = {
REGULATOR_LINEAR_RANGE(1200000, 0, 0x3c, 25000),
};
.qi = BIT(15), \
}
-static const struct regulator_linear_range buck_volt_range1[] = {
+static const struct linear_range buck_volt_range1[] = {
REGULATOR_LINEAR_RANGE(700000, 0, 0x7f, 6250),
};
-static const struct regulator_linear_range buck_volt_range2[] = {
+static const struct linear_range buck_volt_range2[] = {
REGULATOR_LINEAR_RANGE(800000, 0, 0x7f, 6250),
};
-static const struct regulator_linear_range buck_volt_range3[] = {
+static const struct linear_range buck_volt_range3[] = {
REGULATOR_LINEAR_RANGE(1500000, 0, 0x1f, 20000),
};
#include <linux/of_platform.h>
#include <linux/regulator/of_regulator.h>
-static const struct regulator_linear_range smps_low_ranges[] = {
+static const struct linear_range smps_low_ranges[] = {
REGULATOR_LINEAR_RANGE(0, 0x0, 0x0, 0),
REGULATOR_LINEAR_RANGE(500000, 0x1, 0x6, 0),
REGULATOR_LINEAR_RANGE(510000, 0x7, 0x79, 10000),
REGULATOR_LINEAR_RANGE(1650000, 0x7A, 0x7f, 0),
};
-static const struct regulator_linear_range smps_high_ranges[] = {
+static const struct linear_range smps_high_ranges[] = {
REGULATOR_LINEAR_RANGE(0, 0x0, 0x0, 0),
REGULATOR_LINEAR_RANGE(1000000, 0x1, 0x6, 0),
REGULATOR_LINEAR_RANGE(1020000, 0x7, 0x79, 20000),
struct rpmh_vreg_hw_data {
enum rpmh_regulator_type regulator_type;
const struct regulator_ops *ops;
- const struct regulator_linear_range voltage_range;
+ const struct linear_range voltage_range;
int n_voltages;
int hpm_min_load_uA;
const int *pmic_mode_map;
RPMH_VREG("ldo10", "ldo%s10", &pmic5_pldo, "vdd-l2-l10"),
RPMH_VREG("ldo11", "ldo%s11", &pmic5_nldo, "vdd-l1-l8-l11"),
RPMH_VREG("ldo12", "ldo%s12", &pmic5_pldo_lv, "vdd-l7-l12-l14-l15"),
- RPMH_VREG("ldo13", "ldo%s13", &pmic5_pldo, "vdd-l13-l6-l17"),
+ RPMH_VREG("ldo13", "ldo%s13", &pmic5_pldo, "vdd-l13-l16-l17"),
RPMH_VREG("ldo14", "ldo%s14", &pmic5_pldo_lv, "vdd-l7-l12-l14-l15"),
RPMH_VREG("ldo15", "ldo%s15", &pmic5_pldo_lv, "vdd-l7-l12-l14-l15"),
- RPMH_VREG("ldo16", "ldo%s16", &pmic5_pldo, "vdd-l13-l6-l17"),
- RPMH_VREG("ldo17", "ldo%s17", &pmic5_pldo, "vdd-l13-l6-l17"),
+ RPMH_VREG("ldo16", "ldo%s16", &pmic5_pldo, "vdd-l13-l16-l17"),
+ RPMH_VREG("ldo17", "ldo%s17", &pmic5_pldo, "vdd-l13-l16-l17"),
RPMH_VREG("ldo18", "ldo%s18", &pmic5_nldo, "vdd-l3-l4-l5-l18"),
{},
};
RPMH_VREG("ldo5", "ldo%s5", &pmic5_pldo, "vdd-l4-l5-l6"),
RPMH_VREG("ldo6", "ldo%s6", &pmic5_pldo, "vdd-l4-l5-l6"),
RPMH_VREG("ldo7", "ldo%s7", &pmic5_pldo, "vdd-l7-l11"),
- RPMH_VREG("ldo8", "ldo%s8", &pmic5_pldo_lv, "vdd-l1-l8-l11"),
+ RPMH_VREG("ldo8", "ldo%s8", &pmic5_pldo_lv, "vdd-l1-l8"),
RPMH_VREG("ldo9", "ldo%s9", &pmic5_pldo, "vdd-l9-l10"),
RPMH_VREG("ldo10", "ldo%s10", &pmic5_pldo, "vdd-l9-l10"),
RPMH_VREG("ldo11", "ldo%s11", &pmic5_pldo, "vdd-l7-l11"),
/*
* Physically available PMIC regulator voltage ranges
*/
-static const struct regulator_linear_range pldo_ranges[] = {
+static const struct linear_range pldo_ranges[] = {
REGULATOR_LINEAR_RANGE( 750000, 0, 59, 12500),
REGULATOR_LINEAR_RANGE(1500000, 60, 123, 25000),
REGULATOR_LINEAR_RANGE(3100000, 124, 160, 50000),
};
-static const struct regulator_linear_range nldo_ranges[] = {
+static const struct linear_range nldo_ranges[] = {
REGULATOR_LINEAR_RANGE( 750000, 0, 63, 12500),
};
-static const struct regulator_linear_range nldo1200_ranges[] = {
+static const struct linear_range nldo1200_ranges[] = {
REGULATOR_LINEAR_RANGE( 375000, 0, 59, 6250),
REGULATOR_LINEAR_RANGE( 750000, 60, 123, 12500),
};
-static const struct regulator_linear_range smps_ranges[] = {
+static const struct linear_range smps_ranges[] = {
REGULATOR_LINEAR_RANGE( 375000, 0, 29, 12500),
REGULATOR_LINEAR_RANGE( 750000, 30, 89, 12500),
REGULATOR_LINEAR_RANGE(1500000, 90, 153, 25000),
};
-static const struct regulator_linear_range ftsmps_ranges[] = {
+static const struct linear_range ftsmps_ranges[] = {
REGULATOR_LINEAR_RANGE( 350000, 0, 6, 50000),
REGULATOR_LINEAR_RANGE( 700000, 7, 63, 12500),
REGULATOR_LINEAR_RANGE(1500000, 64, 100, 50000),
};
-static const struct regulator_linear_range smb208_ranges[] = {
+static const struct linear_range smb208_ranges[] = {
REGULATOR_LINEAR_RANGE( 375000, 0, 29, 12500),
REGULATOR_LINEAR_RANGE( 750000, 30, 89, 12500),
REGULATOR_LINEAR_RANGE(1500000, 90, 153, 25000),
REGULATOR_LINEAR_RANGE(3100000, 154, 234, 25000),
};
-static const struct regulator_linear_range ncp_ranges[] = {
+static const struct linear_range ncp_ranges[] = {
REGULATOR_LINEAR_RANGE(1500000, 0, 31, 50000),
};
.supports_force_mode_bypass = false,
};
-static const struct qcom_rpm_reg pm8921_ftsmps = {
- .desc.linear_ranges = ftsmps_ranges,
- .desc.n_linear_ranges = ARRAY_SIZE(ftsmps_ranges),
- .desc.n_voltages = 101,
- .desc.ops = &uV_ops,
- .parts = &rpm8960_smps_parts,
- .supports_force_mode_auto = true,
- .supports_force_mode_bypass = false,
-};
-
static const struct qcom_rpm_reg pm8921_ncp = {
.desc.linear_ranges = ncp_ranges,
.desc.n_linear_ranges = ARRAY_SIZE(ncp_ranges),
};
static const struct regulator_desc pma8084_hfsmps = {
- .linear_ranges = (struct regulator_linear_range[]) {
+ .linear_ranges = (struct linear_range[]) {
REGULATOR_LINEAR_RANGE(375000, 0, 95, 12500),
REGULATOR_LINEAR_RANGE(1550000, 96, 158, 25000),
},
};
static const struct regulator_desc pma8084_ftsmps = {
- .linear_ranges = (struct regulator_linear_range[]) {
+ .linear_ranges = (struct linear_range[]) {
REGULATOR_LINEAR_RANGE(350000, 0, 184, 5000),
REGULATOR_LINEAR_RANGE(1280000, 185, 261, 10000),
},
};
static const struct regulator_desc pma8084_pldo = {
- .linear_ranges = (struct regulator_linear_range[]) {
+ .linear_ranges = (struct linear_range[]) {
REGULATOR_LINEAR_RANGE( 750000, 0, 63, 12500),
REGULATOR_LINEAR_RANGE(1550000, 64, 126, 25000),
REGULATOR_LINEAR_RANGE(3100000, 127, 163, 50000),
};
static const struct regulator_desc pma8084_nldo = {
- .linear_ranges = (struct regulator_linear_range[]) {
+ .linear_ranges = (struct linear_range[]) {
REGULATOR_LINEAR_RANGE(750000, 0, 63, 12500),
},
.n_linear_ranges = 1,
};
static const struct regulator_desc pm8x41_hfsmps = {
- .linear_ranges = (struct regulator_linear_range[]) {
+ .linear_ranges = (struct linear_range[]) {
REGULATOR_LINEAR_RANGE( 375000, 0, 95, 12500),
REGULATOR_LINEAR_RANGE(1575000, 96, 158, 25000),
},
};
static const struct regulator_desc pm8841_ftsmps = {
- .linear_ranges = (struct regulator_linear_range[]) {
+ .linear_ranges = (struct linear_range[]) {
REGULATOR_LINEAR_RANGE(350000, 0, 184, 5000),
REGULATOR_LINEAR_RANGE(1280000, 185, 261, 10000),
},
};
static const struct regulator_desc pm8941_boost = {
- .linear_ranges = (struct regulator_linear_range[]) {
+ .linear_ranges = (struct linear_range[]) {
REGULATOR_LINEAR_RANGE(4000000, 0, 30, 50000),
},
.n_linear_ranges = 1,
};
static const struct regulator_desc pm8941_pldo = {
- .linear_ranges = (struct regulator_linear_range[]) {
+ .linear_ranges = (struct linear_range[]) {
REGULATOR_LINEAR_RANGE( 750000, 0, 63, 12500),
REGULATOR_LINEAR_RANGE(1550000, 64, 126, 25000),
REGULATOR_LINEAR_RANGE(3100000, 127, 163, 50000),
};
static const struct regulator_desc pm8941_nldo = {
- .linear_ranges = (struct regulator_linear_range[]) {
+ .linear_ranges = (struct linear_range[]) {
REGULATOR_LINEAR_RANGE(750000, 0, 63, 12500),
},
.n_linear_ranges = 1,
};
static const struct regulator_desc pm8916_pldo = {
- .linear_ranges = (struct regulator_linear_range[]) {
+ .linear_ranges = (struct linear_range[]) {
REGULATOR_LINEAR_RANGE(750000, 0, 208, 12500),
},
.n_linear_ranges = 1,
};
static const struct regulator_desc pm8916_nldo = {
- .linear_ranges = (struct regulator_linear_range[]) {
+ .linear_ranges = (struct linear_range[]) {
REGULATOR_LINEAR_RANGE(375000, 0, 93, 12500),
},
.n_linear_ranges = 1,
};
static const struct regulator_desc pm8916_buck_lvo_smps = {
- .linear_ranges = (struct regulator_linear_range[]) {
+ .linear_ranges = (struct linear_range[]) {
REGULATOR_LINEAR_RANGE(375000, 0, 95, 12500),
REGULATOR_LINEAR_RANGE(750000, 96, 127, 25000),
},
};
static const struct regulator_desc pm8916_buck_hvo_smps = {
- .linear_ranges = (struct regulator_linear_range[]) {
+ .linear_ranges = (struct linear_range[]) {
REGULATOR_LINEAR_RANGE(1550000, 0, 31, 25000),
},
.n_linear_ranges = 1,
};
static const struct regulator_desc pm8950_hfsmps = {
- .linear_ranges = (struct regulator_linear_range[]) {
+ .linear_ranges = (struct linear_range[]) {
REGULATOR_LINEAR_RANGE(375000, 0, 95, 12500),
REGULATOR_LINEAR_RANGE(1550000, 96, 127, 25000),
},
};
static const struct regulator_desc pm8950_ftsmps2p5 = {
- .linear_ranges = (struct regulator_linear_range[]) {
+ .linear_ranges = (struct linear_range[]) {
REGULATOR_LINEAR_RANGE(80000, 0, 255, 5000),
REGULATOR_LINEAR_RANGE(160000, 256, 460, 10000),
},
};
static const struct regulator_desc pm8950_ult_nldo = {
- .linear_ranges = (struct regulator_linear_range[]) {
+ .linear_ranges = (struct linear_range[]) {
REGULATOR_LINEAR_RANGE(375000, 0, 202, 12500),
},
.n_linear_ranges = 1,
};
static const struct regulator_desc pm8950_ult_pldo = {
- .linear_ranges = (struct regulator_linear_range[]) {
+ .linear_ranges = (struct linear_range[]) {
REGULATOR_LINEAR_RANGE(1750000, 0, 127, 12500),
},
.n_linear_ranges = 1,
};
static const struct regulator_desc pm8950_pldo_lv = {
- .linear_ranges = (struct regulator_linear_range[]) {
+ .linear_ranges = (struct linear_range[]) {
REGULATOR_LINEAR_RANGE(1500000, 0, 16, 25000),
},
.n_linear_ranges = 1,
};
static const struct regulator_desc pm8950_pldo = {
- .linear_ranges = (struct regulator_linear_range[]) {
+ .linear_ranges = (struct linear_range[]) {
REGULATOR_LINEAR_RANGE(975000, 0, 164, 12500),
},
.n_linear_ranges = 1,
static const struct regulator_desc pm8994_hfsmps = {
- .linear_ranges = (struct regulator_linear_range[]) {
+ .linear_ranges = (struct linear_range[]) {
REGULATOR_LINEAR_RANGE( 375000, 0, 95, 12500),
REGULATOR_LINEAR_RANGE(1550000, 96, 158, 25000),
},
};
static const struct regulator_desc pm8994_ftsmps = {
- .linear_ranges = (struct regulator_linear_range[]) {
+ .linear_ranges = (struct linear_range[]) {
REGULATOR_LINEAR_RANGE(350000, 0, 199, 5000),
REGULATOR_LINEAR_RANGE(700000, 200, 349, 10000),
},
};
static const struct regulator_desc pm8994_nldo = {
- .linear_ranges = (struct regulator_linear_range[]) {
+ .linear_ranges = (struct linear_range[]) {
REGULATOR_LINEAR_RANGE(750000, 0, 63, 12500),
},
.n_linear_ranges = 1,
};
static const struct regulator_desc pm8994_pldo = {
- .linear_ranges = (struct regulator_linear_range[]) {
+ .linear_ranges = (struct linear_range[]) {
REGULATOR_LINEAR_RANGE( 750000, 0, 63, 12500),
REGULATOR_LINEAR_RANGE(1550000, 64, 126, 25000),
REGULATOR_LINEAR_RANGE(3100000, 127, 163, 50000),
};
static const struct regulator_desc pmi8994_ftsmps = {
- .linear_ranges = (struct regulator_linear_range[]) {
+ .linear_ranges = (struct linear_range[]) {
REGULATOR_LINEAR_RANGE(350000, 0, 199, 5000),
REGULATOR_LINEAR_RANGE(700000, 200, 349, 10000),
},
};
static const struct regulator_desc pmi8994_hfsmps = {
- .linear_ranges = (struct regulator_linear_range[]) {
+ .linear_ranges = (struct linear_range[]) {
REGULATOR_LINEAR_RANGE(350000, 0, 80, 12500),
REGULATOR_LINEAR_RANGE(700000, 81, 141, 25000),
},
};
static const struct regulator_desc pmi8994_bby = {
- .linear_ranges = (struct regulator_linear_range[]) {
+ .linear_ranges = (struct linear_range[]) {
REGULATOR_LINEAR_RANGE(3000000, 0, 44, 50000),
},
.n_linear_ranges = 1,
};
static const struct regulator_desc pmi8994_boost = {
- .linear_ranges = (struct regulator_linear_range[]) {
+ .linear_ranges = (struct linear_range[]) {
REGULATOR_LINEAR_RANGE(4000000, 0, 30, 50000),
},
.n_linear_ranges = 1,
};
static const struct regulator_desc pm8998_ftsmps = {
- .linear_ranges = (struct regulator_linear_range[]) {
+ .linear_ranges = (struct linear_range[]) {
REGULATOR_LINEAR_RANGE(320000, 0, 258, 4000),
},
.n_linear_ranges = 1,
};
static const struct regulator_desc pm8998_hfsmps = {
- .linear_ranges = (struct regulator_linear_range[]) {
+ .linear_ranges = (struct linear_range[]) {
REGULATOR_LINEAR_RANGE(320000, 0, 215, 8000),
},
.n_linear_ranges = 1,
};
static const struct regulator_desc pm8998_nldo = {
- .linear_ranges = (struct regulator_linear_range[]) {
+ .linear_ranges = (struct linear_range[]) {
REGULATOR_LINEAR_RANGE(312000, 0, 127, 8000),
},
.n_linear_ranges = 1,
};
static const struct regulator_desc pm8998_pldo = {
- .linear_ranges = (struct regulator_linear_range[]) {
+ .linear_ranges = (struct linear_range[]) {
REGULATOR_LINEAR_RANGE(1664000, 0, 255, 8000),
},
.n_linear_ranges = 1,
};
static const struct regulator_desc pm8998_pldo_lv = {
- .linear_ranges = (struct regulator_linear_range[]) {
+ .linear_ranges = (struct linear_range[]) {
REGULATOR_LINEAR_RANGE(1256000, 0, 127, 8000),
},
.n_linear_ranges = 1,
};
static const struct regulator_desc pmi8998_bob = {
- .linear_ranges = (struct regulator_linear_range[]) {
+ .linear_ranges = (struct linear_range[]) {
REGULATOR_LINEAR_RANGE(1824000, 0, 83, 32000),
},
.n_linear_ranges = 1,
};
static const struct regulator_desc pms405_hfsmps3 = {
- .linear_ranges = (struct regulator_linear_range[]) {
+ .linear_ranges = (struct linear_range[]) {
REGULATOR_LINEAR_RANGE(320000, 0, 215, 8000),
},
.n_linear_ranges = 1,
};
static const struct regulator_desc pms405_nldo300 = {
- .linear_ranges = (struct regulator_linear_range[]) {
+ .linear_ranges = (struct linear_range[]) {
REGULATOR_LINEAR_RANGE(312000, 0, 127, 8000),
},
.n_linear_ranges = 1,
};
static const struct regulator_desc pms405_nldo1200 = {
- .linear_ranges = (struct regulator_linear_range[]) {
+ .linear_ranges = (struct linear_range[]) {
REGULATOR_LINEAR_RANGE(312000, 0, 127, 8000),
},
.n_linear_ranges = 1,
};
static const struct regulator_desc pms405_pldo50 = {
- .linear_ranges = (struct regulator_linear_range[]) {
+ .linear_ranges = (struct linear_range[]) {
REGULATOR_LINEAR_RANGE(1664000, 0, 128, 16000),
},
.n_linear_ranges = 1,
};
static const struct regulator_desc pms405_pldo150 = {
- .linear_ranges = (struct regulator_linear_range[]) {
+ .linear_ranges = (struct linear_range[]) {
REGULATOR_LINEAR_RANGE(1664000, 0, 128, 16000),
},
.n_linear_ranges = 1,
};
static const struct regulator_desc pms405_pldo600 = {
- .linear_ranges = (struct regulator_linear_range[]) {
+ .linear_ranges = (struct linear_range[]) {
REGULATOR_LINEAR_RANGE(1256000, 0, 98, 8000),
},
.n_linear_ranges = 1,
RK808_BUCK4_CONFIG_REG,
};
-static const struct regulator_linear_range rk808_ldo3_voltage_ranges[] = {
+static const struct linear_range rk808_ldo3_voltage_ranges[] = {
REGULATOR_LINEAR_RANGE(800000, 0, 13, 100000),
REGULATOR_LINEAR_RANGE(2500000, 15, 15, 0),
};
#define RK809_BUCK5_SEL_CNT (8)
-static const struct regulator_linear_range rk809_buck5_voltage_ranges[] = {
+static const struct linear_range rk809_buck5_voltage_ranges[] = {
REGULATOR_LINEAR_RANGE(1500000, 0, 0, 0),
REGULATOR_LINEAR_RANGE(1800000, 1, 3, 200000),
REGULATOR_LINEAR_RANGE(2800000, 4, 5, 200000),
#define RK817_BUCK1_SEL_CNT (RK817_BUCK1_SEL0 + RK817_BUCK1_SEL1 + 1)
#define RK817_BUCK3_SEL_CNT (RK817_BUCK1_SEL0 + RK817_BUCK3_SEL1 + 1)
-static const struct regulator_linear_range rk817_buck1_voltage_ranges[] = {
+static const struct linear_range rk817_buck1_voltage_ranges[] = {
REGULATOR_LINEAR_RANGE(RK817_BUCK1_MIN0, 0,
RK817_BUCK1_SEL0, RK817_BUCK1_STP0),
REGULATOR_LINEAR_RANGE(RK817_BUCK1_MIN1, RK817_BUCK1_SEL0 + 1,
RK817_BUCK1_SEL_CNT, RK817_BUCK1_STP1),
};
-static const struct regulator_linear_range rk817_buck3_voltage_ranges[] = {
+static const struct linear_range rk817_buck3_voltage_ranges[] = {
REGULATOR_LINEAR_RANGE(RK817_BUCK1_MIN0, 0,
RK817_BUCK1_SEL0, RK817_BUCK1_STP0),
REGULATOR_LINEAR_RANGE(RK817_BUCK1_MIN1, RK817_BUCK1_SEL0 + 1,
.set_suspend_disable = rk808_set_suspend_disable,
};
-static const struct regulator_linear_range rk805_buck_1_2_voltage_ranges[] = {
+static const struct linear_range rk805_buck_1_2_voltage_ranges[] = {
REGULATOR_LINEAR_RANGE(712500, 0, 59, 12500),
REGULATOR_LINEAR_RANGE(1800000, 60, 62, 200000),
REGULATOR_LINEAR_RANGE(2300000, 63, 63, 0),
}
/* voltage range for s2mps15 LDO 3, 5, 15, 16, 18, 20, 23 and 27 */
-static const struct regulator_linear_range s2mps15_ldo_voltage_ranges1[] = {
+static const struct linear_range s2mps15_ldo_voltage_ranges1[] = {
REGULATOR_LINEAR_RANGE(1000000, 0xc, 0x38, 25000),
};
/* voltage range for s2mps15 LDO 2, 6, 14, 17, 19, 21, 24 and 25 */
-static const struct regulator_linear_range s2mps15_ldo_voltage_ranges2[] = {
+static const struct linear_range s2mps15_ldo_voltage_ranges2[] = {
REGULATOR_LINEAR_RANGE(1800000, 0x0, 0x3f, 25000),
};
/* voltage range for s2mps15 LDO 4, 11, 12, 13, 22 and 26 */
-static const struct regulator_linear_range s2mps15_ldo_voltage_ranges3[] = {
+static const struct linear_range s2mps15_ldo_voltage_ranges3[] = {
REGULATOR_LINEAR_RANGE(700000, 0x0, 0x34, 12500),
};
/* voltage range for s2mps15 LDO 7, 8, 9 and 10 */
-static const struct regulator_linear_range s2mps15_ldo_voltage_ranges4[] = {
+static const struct linear_range s2mps15_ldo_voltage_ranges4[] = {
REGULATOR_LINEAR_RANGE(700000, 0x10, 0x20, 25000),
};
/* voltage range for s2mps15 LDO 1 */
-static const struct regulator_linear_range s2mps15_ldo_voltage_ranges5[] = {
+static const struct linear_range s2mps15_ldo_voltage_ranges5[] = {
REGULATOR_LINEAR_RANGE(500000, 0x0, 0x20, 12500),
};
/* voltage range for s2mps15 BUCK 1, 2, 3, 4, 5, 6 and 7 */
-static const struct regulator_linear_range s2mps15_buck_voltage_ranges1[] = {
+static const struct linear_range s2mps15_buck_voltage_ranges1[] = {
REGULATOR_LINEAR_RANGE(500000, 0x20, 0xc0, 6250),
};
/* voltage range for s2mps15 BUCK 8, 9 and 10 */
-static const struct regulator_linear_range s2mps15_buck_voltage_ranges2[] = {
+static const struct linear_range s2mps15_buck_voltage_ranges2[] = {
REGULATOR_LINEAR_RANGE(1000000, 0x20, 0x78, 12500),
};
.is_enabled = regulator_is_enabled_regmap,
};
-static const struct regulator_linear_range sky81452_reg_ranges[] = {
+static const struct linear_range sky81452_reg_ranges[] = {
REGULATOR_LINEAR_RANGE(4500000, 0, 14, 250000),
REGULATOR_LINEAR_RANGE(9000000, 15, 31, 1000000),
};
/* Ramp delay worst case is (2250uV/uS) */
#define PMIC_RAMP_DELAY 2200
-static const struct regulator_linear_range buck1_ranges[] = {
+static const struct linear_range buck1_ranges[] = {
REGULATOR_LINEAR_RANGE(725000, 0, 4, 0),
REGULATOR_LINEAR_RANGE(725000, 5, 36, 25000),
REGULATOR_LINEAR_RANGE(1500000, 37, 63, 0),
};
-static const struct regulator_linear_range buck2_ranges[] = {
+static const struct linear_range buck2_ranges[] = {
REGULATOR_LINEAR_RANGE(1000000, 0, 17, 0),
REGULATOR_LINEAR_RANGE(1050000, 18, 19, 0),
REGULATOR_LINEAR_RANGE(1100000, 20, 21, 0),
REGULATOR_LINEAR_RANGE(1500000, 36, 63, 0),
};
-static const struct regulator_linear_range buck3_ranges[] = {
+static const struct linear_range buck3_ranges[] = {
REGULATOR_LINEAR_RANGE(1000000, 0, 19, 0),
REGULATOR_LINEAR_RANGE(1100000, 20, 23, 0),
REGULATOR_LINEAR_RANGE(1200000, 24, 27, 0),
REGULATOR_LINEAR_RANGE(3400000, 56, 63, 0),
};
-static const struct regulator_linear_range buck4_ranges[] = {
+static const struct linear_range buck4_ranges[] = {
REGULATOR_LINEAR_RANGE(600000, 0, 27, 25000),
REGULATOR_LINEAR_RANGE(1300000, 28, 29, 0),
REGULATOR_LINEAR_RANGE(1350000, 30, 31, 0),
REGULATOR_LINEAR_RANGE(3900000, 61, 63, 0),
};
-static const struct regulator_linear_range ldo1_ranges[] = {
+static const struct linear_range ldo1_ranges[] = {
REGULATOR_LINEAR_RANGE(1700000, 0, 7, 0),
REGULATOR_LINEAR_RANGE(1700000, 8, 24, 100000),
REGULATOR_LINEAR_RANGE(3300000, 25, 31, 0),
};
-static const struct regulator_linear_range ldo2_ranges[] = {
+static const struct linear_range ldo2_ranges[] = {
REGULATOR_LINEAR_RANGE(1700000, 0, 7, 0),
REGULATOR_LINEAR_RANGE(1700000, 8, 24, 100000),
REGULATOR_LINEAR_RANGE(3300000, 25, 30, 0),
};
-static const struct regulator_linear_range ldo3_ranges[] = {
+static const struct linear_range ldo3_ranges[] = {
REGULATOR_LINEAR_RANGE(1700000, 0, 7, 0),
REGULATOR_LINEAR_RANGE(1700000, 8, 24, 100000),
REGULATOR_LINEAR_RANGE(3300000, 25, 30, 0),
REGULATOR_LINEAR_RANGE(500000, 31, 31, 0),
};
-static const struct regulator_linear_range ldo5_ranges[] = {
+static const struct linear_range ldo5_ranges[] = {
REGULATOR_LINEAR_RANGE(1700000, 0, 7, 0),
REGULATOR_LINEAR_RANGE(1700000, 8, 30, 100000),
REGULATOR_LINEAR_RANGE(3900000, 31, 31, 0),
};
-static const struct regulator_linear_range ldo6_ranges[] = {
+static const struct linear_range ldo6_ranges[] = {
REGULATOR_LINEAR_RANGE(900000, 0, 24, 100000),
REGULATOR_LINEAR_RANGE(3300000, 25, 31, 0),
};
unsigned int decay_mask;
};
-static const struct regulator_linear_range tps65086_10mv_ranges[] = {
+static const struct linear_range tps65086_10mv_ranges[] = {
REGULATOR_LINEAR_RANGE(0, 0x0, 0x0, 0),
REGULATOR_LINEAR_RANGE(410000, 0x1, 0x7F, 10000),
};
-static const struct regulator_linear_range tps65086_buck126_25mv_ranges[] = {
+static const struct linear_range tps65086_buck126_25mv_ranges[] = {
REGULATOR_LINEAR_RANGE(0, 0x0, 0x0, 0),
REGULATOR_LINEAR_RANGE(1000000, 0x1, 0x18, 0),
REGULATOR_LINEAR_RANGE(1025000, 0x19, 0x7F, 25000),
};
-static const struct regulator_linear_range tps65086_buck345_25mv_ranges[] = {
+static const struct linear_range tps65086_buck345_25mv_ranges[] = {
REGULATOR_LINEAR_RANGE(0, 0x0, 0x0, 0),
REGULATOR_LINEAR_RANGE(425000, 0x1, 0x7F, 25000),
};
-static const struct regulator_linear_range tps65086_ldoa1_ranges[] = {
+static const struct linear_range tps65086_ldoa1_ranges[] = {
REGULATOR_LINEAR_RANGE(1350000, 0x0, 0x0, 0),
REGULATOR_LINEAR_RANGE(1500000, 0x1, 0x7, 100000),
REGULATOR_LINEAR_RANGE(2300000, 0x8, 0xB, 100000),
REGULATOR_LINEAR_RANGE(3300000, 0xE, 0xE, 0),
};
-static const struct regulator_linear_range tps65086_ldoa23_ranges[] = {
+static const struct linear_range tps65086_ldoa23_ranges[] = {
REGULATOR_LINEAR_RANGE(700000, 0x0, 0xD, 50000),
REGULATOR_LINEAR_RANGE(1400000, 0xE, 0xF, 100000),
};
2800000, 3000000, 3100000, 3300000,
};
-static const struct regulator_linear_range tps65217_uv1_ranges[] = {
+static const struct linear_range tps65217_uv1_ranges[] = {
REGULATOR_LINEAR_RANGE(900000, 0, 24, 25000),
REGULATOR_LINEAR_RANGE(1550000, 25, 52, 50000),
REGULATOR_LINEAR_RANGE(3000000, 53, 55, 100000),
REGULATOR_LINEAR_RANGE(3300000, 56, 63, 0),
};
-static const struct regulator_linear_range tps65217_uv2_ranges[] = {
+static const struct linear_range tps65217_uv2_ranges[] = {
REGULATOR_LINEAR_RANGE(1500000, 0, 8, 50000),
REGULATOR_LINEAR_RANGE(2000000, 9, 13, 100000),
REGULATOR_LINEAR_RANGE(2450000, 14, 31, 50000),
.bypass_mask = _sm, \
} \
-static const struct regulator_linear_range dcdc1_dcdc2_ranges[] = {
+static const struct linear_range dcdc1_dcdc2_ranges[] = {
REGULATOR_LINEAR_RANGE(850000, 0x0, 0x32, 10000),
REGULATOR_LINEAR_RANGE(1375000, 0x33, 0x3f, 25000),
};
-static const struct regulator_linear_range ldo1_dcdc3_ranges[] = {
+static const struct linear_range ldo1_dcdc3_ranges[] = {
REGULATOR_LINEAR_RANGE(900000, 0x0, 0x1a, 25000),
REGULATOR_LINEAR_RANGE(1600000, 0x1b, 0x3f, 50000),
};
-static const struct regulator_linear_range dcdc4_ranges[] = {
+static const struct linear_range dcdc4_ranges[] = {
REGULATOR_LINEAR_RANGE(1175000, 0x0, 0xf, 25000),
REGULATOR_LINEAR_RANGE(1600000, 0x10, 0x34, 50000),
};
.n_linear_ranges = ARRAY_SIZE(_lr), \
}
-static const struct regulator_linear_range tps65912_dcdc_ranges[] = {
+static const struct linear_range tps65912_dcdc_ranges[] = {
REGULATOR_LINEAR_RANGE(500000, 0x0, 0x3f, 50000),
};
-static const struct regulator_linear_range tps65912_ldo_ranges[] = {
+static const struct linear_range tps65912_ldo_ranges[] = {
REGULATOR_LINEAR_RANGE(800000, 0x0, 0x20, 25000),
REGULATOR_LINEAR_RANGE(1650000, 0x21, 0x3c, 50000),
REGULATOR_LINEAR_RANGE(3100000, 0x3d, 0x3f, 100000),
{
struct tps80031_regulator *ri = rdev_get_drvdata(rdev);
struct device *parent = to_tps80031_dev(rdev);
- int ret = -EIO;
+ int ret;
uint8_t ctrl1 = 0;
uint8_t ctrl3 = 0;
{
struct tps80031_regulator *ri = rdev_get_drvdata(rdev);
struct device *parent = to_tps80031_dev(rdev);
- int ret = 0;
+ int ret;
if (ri->config_flags & TPS80031_VBUS_DISCHRG_EN_PDN) {
ret = tps80031_write(parent, TPS80031_SLAVE_ID2,
case TPS80031_REGULATOR_LDOUSB:
if (ri->config_flags & (TPS80031_USBLDO_INPUT_VSYS |
TPS80031_USBLDO_INPUT_PMID)) {
- unsigned val = 0;
+ unsigned val;
+
if (ri->config_flags & TPS80031_USBLDO_INPUT_VSYS)
val = MISC2_LDOUSB_IN_VSYS;
else
};
/* 600mV to 1450mV in 12.5 mV steps */
-static const struct regulator_linear_range VDD1_ranges[] = {
+static const struct linear_range VDD1_ranges[] = {
REGULATOR_LINEAR_RANGE(600000, 0, 68, 12500)
};
/* 600mV to 1450mV in 12.5 mV steps, everything above = 1500mV */
-static const struct regulator_linear_range VDD2_ranges[] = {
+static const struct linear_range VDD2_ranges[] = {
REGULATOR_LINEAR_RANGE(600000, 0, 68, 12500),
REGULATOR_LINEAR_RANGE(1500000, 69, 69, 12500)
};
};
/*----------------------------------------------------------------------*/
-static const struct regulator_linear_range twl6030ldo_linear_range[] = {
+static const struct linear_range twl6030ldo_linear_range[] = {
REGULATOR_LINEAR_RANGE(0, 0, 0, 0),
REGULATOR_LINEAR_RANGE(1000000, 1, 24, 100000),
REGULATOR_LINEAR_RANGE(2750000, 31, 31, 0),
* BUCKV specifics
*/
-static const struct regulator_linear_range wm831x_buckv_ranges[] = {
+static const struct linear_range wm831x_buckv_ranges[] = {
REGULATOR_LINEAR_RANGE(600000, 0, 0x7, 0),
REGULATOR_LINEAR_RANGE(600000, 0x8, 0x68, 12500),
};
* General purpose LDOs
*/
-static const struct regulator_linear_range wm831x_gp_ldo_ranges[] = {
+static const struct linear_range wm831x_gp_ldo_ranges[] = {
REGULATOR_LINEAR_RANGE(900000, 0, 14, 50000),
REGULATOR_LINEAR_RANGE(1700000, 15, 31, 100000),
};
* Analogue LDOs
*/
-static const struct regulator_linear_range wm831x_aldo_ranges[] = {
+static const struct linear_range wm831x_aldo_ranges[] = {
REGULATOR_LINEAR_RANGE(1000000, 0, 12, 50000),
REGULATOR_LINEAR_RANGE(1700000, 13, 31, 100000),
};
return 0;
}
-static const struct regulator_linear_range wm8350_ldo_ranges[] = {
+static const struct linear_range wm8350_ldo_ranges[] = {
REGULATOR_LINEAR_RANGE(900000, 0, 15, 50000),
REGULATOR_LINEAR_RANGE(1800000, 16, 31, 100000),
};
#include <linux/regulator/driver.h>
#include <linux/mfd/wm8400-private.h>
-static const struct regulator_linear_range wm8400_ldo_ranges[] = {
+static const struct linear_range wm8400_ldo_ranges[] = {
REGULATOR_LINEAR_RANGE(900000, 0, 14, 50000),
REGULATOR_LINEAR_RANGE(1700000, 15, 31, 100000),
};
ism->smcd = smcd_alloc_dev(&pdev->dev, dev_name(&pdev->dev), &ism_ops,
ISM_NR_DMBS);
- if (!ism->smcd)
+ if (!ism->smcd) {
+ ret = -ENOMEM;
goto err_resource;
+ }
ism->smcd->priv = ism;
ret = ism_dev_init(ism);
return -EINVAL;
}
- ql_log(ql_log_info, vha, 0x70d6,
- "port speed:%d\n", ha->link_data_rate);
-
return scnprintf(buf, PAGE_SIZE, "%s\n", spd[ha->link_data_rate]);
}
dev_dbg(dev, "scsi resume: %d\n", err);
if (err == 0) {
+ bool was_runtime_suspended;
+
+ was_runtime_suspended = pm_runtime_suspended(dev);
+
pm_runtime_disable(dev);
err = pm_runtime_set_active(dev);
pm_runtime_enable(dev);
*/
if (!err && scsi_is_sdev_device(dev)) {
struct scsi_device *sdev = to_scsi_device(dev);
-
- blk_set_runtime_active(sdev->request_queue);
+ if (was_runtime_suspended)
+ blk_post_runtime_resume(sdev->request_queue, 0);
+ else
+ blk_set_runtime_active(sdev->request_queue);
}
}
spin_unlock_irqrestore(&client->lock, flags);
}
- mbox_send_message(client->chan, pkt);
+ err = mbox_send_message(client->chan, pkt);
+ if (err < 0)
+ return err;
/* We can send next packet immediately, so just call txdone. */
mbox_client_txdone(client->chan, 0);
help
general driver for SPI controller core from DesignWare
+if SPI_DESIGNWARE
+
+config SPI_DW_DMA
+ bool "DMA support for DW SPI controller"
+
config SPI_DW_PCI
tristate "PCI interface driver for DW SPI core"
- depends on SPI_DESIGNWARE && PCI
-
-config SPI_DW_MID_DMA
- bool "DMA support for DW SPI controller on Intel MID platform"
- depends on SPI_DW_PCI && DW_DMAC_PCI
+ depends on PCI
config SPI_DW_MMIO
tristate "Memory-mapped io interface driver for DW SPI core"
- depends on SPI_DESIGNWARE
+ depends on HAS_IOMEM
+
+endif
config SPI_DLN2
tristate "Diolan DLN-2 USB SPI adapter"
config SPI_UNIPHIER
tristate "Socionext UniPhier SPI Controller"
depends on (ARCH_UNIPHIER || COMPILE_TEST) && OF
+ depends on HAS_IOMEM
help
This enables a driver for the Socionext UniPhier SoC SCSSI SPI controller.
help
Enables Xilinx GQSPI controller driver for Zynq UltraScale+ MPSoC.
+config SPI_AMD
+ tristate "AMD SPI controller"
+ depends on SPI_MASTER || COMPILE_TEST
+ help
+ Enables SPI controller driver for AMD SoC.
+
#
# Add new SPI master controllers in alphabetical order above this line
#
obj-$(CONFIG_SPI_DAVINCI) += spi-davinci.o
obj-$(CONFIG_SPI_DLN2) += spi-dln2.o
obj-$(CONFIG_SPI_DESIGNWARE) += spi-dw.o
+spi-dw-y := spi-dw-core.o
+spi-dw-$(CONFIG_SPI_DW_DMA) += spi-dw-dma.o
obj-$(CONFIG_SPI_DW_MMIO) += spi-dw-mmio.o
-obj-$(CONFIG_SPI_DW_PCI) += spi-dw-midpci.o
-spi-dw-midpci-objs := spi-dw-pci.o spi-dw-mid.o
+obj-$(CONFIG_SPI_DW_PCI) += spi-dw-pci.o
obj-$(CONFIG_SPI_EFM32) += spi-efm32.o
obj-$(CONFIG_SPI_EP93XX) += spi-ep93xx.o
obj-$(CONFIG_SPI_FALCON) += spi-falcon.o
obj-$(CONFIG_SPI_XTENSA_XTFPGA) += spi-xtensa-xtfpga.o
obj-$(CONFIG_SPI_ZYNQ_QSPI) += spi-zynq-qspi.o
obj-$(CONFIG_SPI_ZYNQMP_GQSPI) += spi-zynqmp-gqspi.o
+obj-$(CONFIG_SPI_AMD) += spi-amd.o
# SPI slave protocol handlers
obj-$(CONFIG_SPI_SLAVE_TIME) += spi-slave-time.o
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0 OR BSD-3-Clause
+//
+// AMD SPI controller driver
+//
+// Copyright (c) 2020, Advanced Micro Devices, Inc.
+//
+// Author: Sanjay R Mehta <sanju.mehta@amd.com>
+
+#include <linux/acpi.h>
+#include <linux/init.h>
+#include <linux/module.h>
+#include <linux/platform_device.h>
+#include <linux/delay.h>
+#include <linux/spi/spi.h>
+
+#define AMD_SPI_CTRL0_REG 0x00
+#define AMD_SPI_EXEC_CMD BIT(16)
+#define AMD_SPI_FIFO_CLEAR BIT(20)
+#define AMD_SPI_BUSY BIT(31)
+
+#define AMD_SPI_OPCODE_MASK 0xFF
+
+#define AMD_SPI_ALT_CS_REG 0x1D
+#define AMD_SPI_ALT_CS_MASK 0x3
+
+#define AMD_SPI_FIFO_BASE 0x80
+#define AMD_SPI_TX_COUNT_REG 0x48
+#define AMD_SPI_RX_COUNT_REG 0x4B
+#define AMD_SPI_STATUS_REG 0x4C
+
+#define AMD_SPI_MEM_SIZE 200
+
+/* M_CMD OP codes for SPI */
+#define AMD_SPI_XFER_TX 1
+#define AMD_SPI_XFER_RX 2
+
+struct amd_spi {
+ void __iomem *io_remap_addr;
+ unsigned long io_base_addr;
+ u32 rom_addr;
+ u8 chip_select;
+};
+
+static inline u8 amd_spi_readreg8(struct spi_master *master, int idx)
+{
+ struct amd_spi *amd_spi = spi_master_get_devdata(master);
+
+ return ioread8((u8 __iomem *)amd_spi->io_remap_addr + idx);
+}
+
+static inline void amd_spi_writereg8(struct spi_master *master, int idx,
+ u8 val)
+{
+ struct amd_spi *amd_spi = spi_master_get_devdata(master);
+
+ iowrite8(val, ((u8 __iomem *)amd_spi->io_remap_addr + idx));
+}
+
+static inline void amd_spi_setclear_reg8(struct spi_master *master, int idx,
+ u8 set, u8 clear)
+{
+ u8 tmp = amd_spi_readreg8(master, idx);
+
+ tmp = (tmp & ~clear) | set;
+ amd_spi_writereg8(master, idx, tmp);
+}
+
+static inline u32 amd_spi_readreg32(struct spi_master *master, int idx)
+{
+ struct amd_spi *amd_spi = spi_master_get_devdata(master);
+
+ return ioread32((u8 __iomem *)amd_spi->io_remap_addr + idx);
+}
+
+static inline void amd_spi_writereg32(struct spi_master *master, int idx,
+ u32 val)
+{
+ struct amd_spi *amd_spi = spi_master_get_devdata(master);
+
+ iowrite32(val, ((u8 __iomem *)amd_spi->io_remap_addr + idx));
+}
+
+static inline void amd_spi_setclear_reg32(struct spi_master *master, int idx,
+ u32 set, u32 clear)
+{
+ u32 tmp = amd_spi_readreg32(master, idx);
+
+ tmp = (tmp & ~clear) | set;
+ amd_spi_writereg32(master, idx, tmp);
+}
+
+static void amd_spi_select_chip(struct spi_master *master)
+{
+ struct amd_spi *amd_spi = spi_master_get_devdata(master);
+ u8 chip_select = amd_spi->chip_select;
+
+ amd_spi_setclear_reg8(master, AMD_SPI_ALT_CS_REG, chip_select,
+ AMD_SPI_ALT_CS_MASK);
+}
+
+static void amd_spi_clear_fifo_ptr(struct spi_master *master)
+{
+ amd_spi_setclear_reg32(master, AMD_SPI_CTRL0_REG, AMD_SPI_FIFO_CLEAR,
+ AMD_SPI_FIFO_CLEAR);
+}
+
+static void amd_spi_set_opcode(struct spi_master *master, u8 cmd_opcode)
+{
+ amd_spi_setclear_reg32(master, AMD_SPI_CTRL0_REG, cmd_opcode,
+ AMD_SPI_OPCODE_MASK);
+}
+
+static inline void amd_spi_set_rx_count(struct spi_master *master,
+ u8 rx_count)
+{
+ amd_spi_setclear_reg8(master, AMD_SPI_RX_COUNT_REG, rx_count, 0xff);
+}
+
+static inline void amd_spi_set_tx_count(struct spi_master *master,
+ u8 tx_count)
+{
+ amd_spi_setclear_reg8(master, AMD_SPI_TX_COUNT_REG, tx_count, 0xff);
+}
+
+static inline int amd_spi_busy_wait(struct amd_spi *amd_spi)
+{
+ bool spi_busy;
+ int timeout = 100000;
+
+ /* poll for SPI bus to become idle */
+ spi_busy = (ioread32((u8 __iomem *)amd_spi->io_remap_addr +
+ AMD_SPI_CTRL0_REG) & AMD_SPI_BUSY) == AMD_SPI_BUSY;
+ while (spi_busy) {
+ usleep_range(10, 20);
+ if (timeout-- < 0)
+ return -ETIMEDOUT;
+
+ spi_busy = (ioread32((u8 __iomem *)amd_spi->io_remap_addr +
+ AMD_SPI_CTRL0_REG) & AMD_SPI_BUSY) == AMD_SPI_BUSY;
+ }
+
+ return 0;
+}
+
+static void amd_spi_execute_opcode(struct spi_master *master)
+{
+ struct amd_spi *amd_spi = spi_master_get_devdata(master);
+
+ /* Set ExecuteOpCode bit in the CTRL0 register */
+ amd_spi_setclear_reg32(master, AMD_SPI_CTRL0_REG, AMD_SPI_EXEC_CMD,
+ AMD_SPI_EXEC_CMD);
+
+ amd_spi_busy_wait(amd_spi);
+}
+
+static int amd_spi_master_setup(struct spi_device *spi)
+{
+ struct spi_master *master = spi->master;
+
+ amd_spi_clear_fifo_ptr(master);
+
+ return 0;
+}
+
+static inline int amd_spi_fifo_xfer(struct amd_spi *amd_spi,
+ struct spi_master *master,
+ struct spi_message *message)
+{
+ struct spi_transfer *xfer = NULL;
+ u8 cmd_opcode;
+ u8 *buf = NULL;
+ u32 m_cmd = 0;
+ u32 i = 0;
+ u32 tx_len = 0, rx_len = 0;
+
+ list_for_each_entry(xfer, &message->transfers,
+ transfer_list) {
+ if (xfer->rx_buf)
+ m_cmd = AMD_SPI_XFER_RX;
+ if (xfer->tx_buf)
+ m_cmd = AMD_SPI_XFER_TX;
+
+ if (m_cmd & AMD_SPI_XFER_TX) {
+ buf = (u8 *)xfer->tx_buf;
+ tx_len = xfer->len - 1;
+ cmd_opcode = *(u8 *)xfer->tx_buf;
+ buf++;
+ amd_spi_set_opcode(master, cmd_opcode);
+
+ /* Write data into the FIFO. */
+ for (i = 0; i < tx_len; i++) {
+ iowrite8(buf[i],
+ ((u8 __iomem *)amd_spi->io_remap_addr +
+ AMD_SPI_FIFO_BASE + i));
+ }
+
+ amd_spi_set_tx_count(master, tx_len);
+ amd_spi_clear_fifo_ptr(master);
+ /* Execute command */
+ amd_spi_execute_opcode(master);
+ }
+ if (m_cmd & AMD_SPI_XFER_RX) {
+ /*
+ * Store no. of bytes to be received from
+ * FIFO
+ */
+ rx_len = xfer->len;
+ buf = (u8 *)xfer->rx_buf;
+ amd_spi_set_rx_count(master, rx_len);
+ amd_spi_clear_fifo_ptr(master);
+ /* Execute command */
+ amd_spi_execute_opcode(master);
+ /* Read data from FIFO to receive buffer */
+ for (i = 0; i < rx_len; i++)
+ buf[i] = amd_spi_readreg8(master,
+ AMD_SPI_FIFO_BASE +
+ tx_len + i);
+ }
+ }
+
+ /* Update statistics */
+ message->actual_length = tx_len + rx_len + 1;
+ /* complete the transaction */
+ message->status = 0;
+ spi_finalize_current_message(master);
+
+ return 0;
+}
+
+static int amd_spi_master_transfer(struct spi_master *master,
+ struct spi_message *msg)
+{
+ struct amd_spi *amd_spi = spi_master_get_devdata(master);
+ struct spi_device *spi = msg->spi;
+
+ amd_spi->chip_select = spi->chip_select;
+ amd_spi_select_chip(master);
+
+ /*
+ * Extract spi_transfers from the spi message and
+ * program the controller.
+ */
+ amd_spi_fifo_xfer(amd_spi, master, msg);
+
+ return 0;
+}
+
+static int amd_spi_probe(struct platform_device *pdev)
+{
+ struct device *dev = &pdev->dev;
+ struct spi_master *master;
+ struct amd_spi *amd_spi;
+ struct resource *res;
+ int err = 0;
+
+ /* Allocate storage for spi_master and driver private data */
+ master = spi_alloc_master(dev, sizeof(struct amd_spi));
+ if (!master) {
+ dev_err(dev, "Error allocating SPI master\n");
+ return -ENOMEM;
+ }
+
+ amd_spi = spi_master_get_devdata(master);
+
+ res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
+ amd_spi->io_remap_addr = devm_ioremap_resource(&pdev->dev, res);
+ if (IS_ERR(amd_spi->io_remap_addr)) {
+ err = PTR_ERR(amd_spi->io_remap_addr);
+ dev_err(dev, "error %d ioremap of SPI registers failed\n", err);
+ goto err_free_master;
+ }
+ dev_dbg(dev, "io_remap_address: %p\n", amd_spi->io_remap_addr);
+
+ /* Initialize the spi_master fields */
+ master->bus_num = 0;
+ master->num_chipselect = 4;
+ master->mode_bits = 0;
+ master->flags = SPI_MASTER_HALF_DUPLEX;
+ master->setup = amd_spi_master_setup;
+ master->transfer_one_message = amd_spi_master_transfer;
+
+ /* Register the controller with SPI framework */
+ err = devm_spi_register_master(dev, master);
+ if (err) {
+ dev_err(dev, "error %d registering SPI controller\n", err);
+ goto err_free_master;
+ }
+
+ return 0;
+
+err_free_master:
+ spi_master_put(master);
+
+ return err;
+}
+
+static const struct acpi_device_id spi_acpi_match[] = {
+ { "AMDI0061", 0 },
+ {},
+};
+MODULE_DEVICE_TABLE(acpi, spi_acpi_match);
+
+static struct platform_driver amd_spi_driver = {
+ .driver = {
+ .name = "amd_spi",
+ .acpi_match_table = ACPI_PTR(spi_acpi_match),
+ },
+ .probe = amd_spi_probe,
+};
+
+module_platform_driver(amd_spi_driver);
+
+MODULE_LICENSE("Dual BSD/GPL");
+MODULE_AUTHOR("Sanjay Mehta <sanju.mehta@amd.com>");
+MODULE_DESCRIPTION("AMD SPI Master Controller Driver");
return -ETIMEDOUT;
}
-static int a3700_spi_init(struct a3700_spi *a3700_spi)
+static void a3700_spi_init(struct a3700_spi *a3700_spi)
{
struct spi_master *master = a3700_spi->master;
u32 val;
- int i, ret = 0;
+ int i;
/* Reset SPI unit */
val = spireg_read(a3700_spi, A3700_SPI_IF_CFG_REG);
/* Mask the interrupts and clear cause bits */
spireg_write(a3700_spi, A3700_SPI_INT_MASK_REG, 0);
spireg_write(a3700_spi, A3700_SPI_INT_STAT_REG, ~0U);
-
- return ret;
}
static irqreturn_t a3700_spi_interrupt(int irq, void *dev_id)
master->min_speed_hz = DIV_ROUND_UP(clk_get_rate(spi->clk),
A3700_SPI_MAX_PRESCALE);
- ret = a3700_spi_init(spi);
- if (ret)
- goto error_clk;
+ a3700_spi_init(spi);
ret = devm_request_irq(dev, spi->irq, a3700_spi_interrupt, 0,
dev_name(dev), master);
static int atmel_spi_next_xfer_dma_submit(struct spi_master *master,
struct spi_transfer *xfer,
u32 *plen)
+ __must_hold(&as->lock)
{
struct atmel_spi *as = spi_master_get_devdata(master);
struct dma_chan *rxchan = master->dma_rx;
spin_lock_init(&spi_engine->lock);
- spi_engine->base = devm_platform_ioremap_resource(pdev, 0);
- if (IS_ERR(spi_engine->base)) {
- ret = PTR_ERR(spi_engine->base);
- goto err_put_master;
- }
-
- version = readl(spi_engine->base + SPI_ENGINE_REG_VERSION);
- if (SPI_ENGINE_VERSION_MAJOR(version) != 1) {
- dev_err(&pdev->dev, "Unsupported peripheral version %u.%u.%c\n",
- SPI_ENGINE_VERSION_MAJOR(version),
- SPI_ENGINE_VERSION_MINOR(version),
- SPI_ENGINE_VERSION_PATCH(version));
- ret = -ENODEV;
- goto err_put_master;
- }
-
spi_engine->clk = devm_clk_get(&pdev->dev, "s_axi_aclk");
if (IS_ERR(spi_engine->clk)) {
ret = PTR_ERR(spi_engine->clk);
if (ret)
goto err_clk_disable;
+ spi_engine->base = devm_platform_ioremap_resource(pdev, 0);
+ if (IS_ERR(spi_engine->base)) {
+ ret = PTR_ERR(spi_engine->base);
+ goto err_ref_clk_disable;
+ }
+
+ version = readl(spi_engine->base + SPI_ENGINE_REG_VERSION);
+ if (SPI_ENGINE_VERSION_MAJOR(version) != 1) {
+ dev_err(&pdev->dev, "Unsupported peripheral version %u.%u.%c\n",
+ SPI_ENGINE_VERSION_MAJOR(version),
+ SPI_ENGINE_VERSION_MINOR(version),
+ SPI_ENGINE_VERSION_PATCH(version));
+ ret = -ENODEV;
+ goto err_ref_clk_disable;
+ }
+
writel_relaxed(0x00, spi_engine->base + SPI_ENGINE_REG_RESET);
writel_relaxed(0xff, spi_engine->base + SPI_ENGINE_REG_INT_PENDING);
writel_relaxed(0x00, spi_engine->base + SPI_ENGINE_REG_INT_ENABLE);
#define MSPI_MSPI_STATUS 0x020
#define MSPI_CPTQP 0x024
#define MSPI_SPCR3 0x028
+#define MSPI_REV 0x02c
#define MSPI_TXRAM 0x040
#define MSPI_RXRAM 0x0c0
#define MSPI_CDRAM 0x140
#define MSPI_SPCR2_SPE BIT(6)
#define MSPI_SPCR2_CONT_AFTER_CMD BIT(7)
+#define MSPI_SPCR3_FASTBR BIT(0)
+#define MSPI_SPCR3_FASTDT BIT(1)
+#define MSPI_SPCR3_SYSCLKSEL_MASK GENMASK(11, 10)
+#define MSPI_SPCR3_SYSCLKSEL_27 (MSPI_SPCR3_SYSCLKSEL_MASK & \
+ ~(BIT(10) | BIT(11)))
+#define MSPI_SPCR3_SYSCLKSEL_108 (MSPI_SPCR3_SYSCLKSEL_MASK & \
+ BIT(11))
+
#define MSPI_MSPI_STATUS_SPIF BIT(0)
#define INTR_BASE_BIT_SHIFT 0x02
#define INTR_COUNT 0x07
#define NUM_CHIPSELECT 4
-#define QSPI_SPBR_MIN 8U
#define QSPI_SPBR_MAX 255U
+#define MSPI_BASE_FREQ 27000000UL
#define OPCODE_DIOR 0xBB
#define OPCODE_QIOR 0xEB
struct bcm_qspi_dev_id *dev_ids;
struct completion mspi_done;
struct completion bspi_done;
+ u8 mspi_maj_rev;
+ u8 mspi_min_rev;
+ bool mspi_spcr3_sysclk;
};
static inline bool has_bspi(struct bcm_qspi *qspi)
return qspi->bspi_mode;
}
+/* hardware supports spcr3 and fast baud-rate */
+static inline bool bcm_qspi_has_fastbr(struct bcm_qspi *qspi)
+{
+ if (!has_bspi(qspi) &&
+ ((qspi->mspi_maj_rev >= 1) &&
+ (qspi->mspi_min_rev >= 5)))
+ return true;
+
+ return false;
+}
+
+/* hardware supports sys clk 108Mhz */
+static inline bool bcm_qspi_has_sysclk_108(struct bcm_qspi *qspi)
+{
+ if (!has_bspi(qspi) && (qspi->mspi_spcr3_sysclk ||
+ ((qspi->mspi_maj_rev >= 1) &&
+ (qspi->mspi_min_rev >= 6))))
+ return true;
+
+ return false;
+}
+
+static inline int bcm_qspi_spbr_min(struct bcm_qspi *qspi)
+{
+ if (bcm_qspi_has_fastbr(qspi))
+ return 1;
+ else
+ return 8;
+}
+
/* Read qspi controller register*/
static inline u32 bcm_qspi_read(struct bcm_qspi *qspi, enum base_type type,
unsigned int offset)
if (xp->speed_hz)
spbr = qspi->base_clk / (2 * xp->speed_hz);
- spcr = clamp_val(spbr, QSPI_SPBR_MIN, QSPI_SPBR_MAX);
+ spcr = clamp_val(spbr, bcm_qspi_spbr_min(qspi), QSPI_SPBR_MAX);
bcm_qspi_write(qspi, MSPI, MSPI_SPCR0_LSB, spcr);
- spcr = MSPI_MASTER_BIT;
+ if (!qspi->mspi_maj_rev)
+ /* legacy controller */
+ spcr = MSPI_MASTER_BIT;
+ else
+ spcr = 0;
+
/* for 16 bit the data should be zero */
if (xp->bits_per_word != 16)
spcr |= xp->bits_per_word << 2;
spcr |= xp->mode & 3;
+
bcm_qspi_write(qspi, MSPI, MSPI_SPCR0_MSB, spcr);
+ if (bcm_qspi_has_fastbr(qspi)) {
+ spcr = 0;
+
+ /* enable fastbr */
+ spcr |= MSPI_SPCR3_FASTBR;
+
+ if (bcm_qspi_has_sysclk_108(qspi)) {
+ /* SYSCLK_108 */
+ spcr |= MSPI_SPCR3_SYSCLKSEL_108;
+ qspi->base_clk = MSPI_BASE_FREQ * 4;
+ /* Change spbr as we changed sysclk */
+ bcm_qspi_write(qspi, MSPI, MSPI_SPCR0_LSB, 4);
+ }
+
+ bcm_qspi_write(qspi, MSPI, MSPI_SPCR3, spcr);
+ }
+
qspi->last_parms = *xp;
}
if (qt->trans->cs_change &&
(flags & TRANS_STATUS_BREAK_CS_CHANGE))
ret |= TRANS_STATUS_BREAK_CS_CHANGE;
- if (ret)
- goto done;
- dev_dbg(&qspi->pdev->dev, "advance msg exit\n");
if (bcm_qspi_mspi_transfer_is_last(qspi, qt))
- ret = TRANS_STATUS_BREAK_EOM;
+ ret |= TRANS_STATUS_BREAK_EOM;
else
- ret = TRANS_STATUS_BREAK_NO_BYTES;
+ ret |= TRANS_STATUS_BREAK_NO_BYTES;
qt->trans = NULL;
}
-done:
dev_dbg(&qspi->pdev->dev, "trans %p len %d byte %d ret %x\n",
qt->trans, qt->trans ? qt->trans->len : 0, qt->byte, ret);
return ret;
if (buf)
buf[tp.byte] = read_rxram_slot_u8(qspi, slot);
dev_dbg(&qspi->pdev->dev, "RD %02x\n",
- buf ? buf[tp.byte] : 0xff);
+ buf ? buf[tp.byte] : 0x0);
} else {
u16 *buf = tp.trans->rx_buf;
buf[tp.byte / 2] = read_rxram_slot_u16(qspi,
slot);
dev_dbg(&qspi->pdev->dev, "RD %04x\n",
- buf ? buf[tp.byte] : 0xffff);
+ buf ? buf[tp.byte / 2] : 0x0);
}
update_qspi_trans_byte_count(qspi, &tp,
while (!tstatus && slot < MSPI_NUM_CDRAM) {
if (tp.trans->bits_per_word <= 8) {
const u8 *buf = tp.trans->tx_buf;
- u8 val = buf ? buf[tp.byte] : 0xff;
+ u8 val = buf ? buf[tp.byte] : 0x00;
write_txram_slot_u8(qspi, slot, val);
dev_dbg(&qspi->pdev->dev, "WR %02x\n", val);
} else {
const u16 *buf = tp.trans->tx_buf;
- u16 val = buf ? buf[tp.byte / 2] : 0xffff;
+ u16 val = buf ? buf[tp.byte / 2] : 0x0000;
write_txram_slot_u16(qspi, slot, val);
dev_dbg(&qspi->pdev->dev, "WR %04x\n", val);
bcm_qspi_write(qspi, MSPI, MSPI_NEWQP, 0);
bcm_qspi_write(qspi, MSPI, MSPI_ENDQP, slot - 1);
- if (tstatus & TRANS_STATUS_BREAK_DESELECT) {
+ /*
+ * case 1) EOM =1, cs_change =0: SSb inactive
+ * case 2) EOM =1, cs_change =1: SSb stay active
+ * case 3) EOM =0, cs_change =0: SSb stay active
+ * case 4) EOM =0, cs_change =1: SSb inactive
+ */
+ if (((tstatus & TRANS_STATUS_BREAK_DESELECT)
+ == TRANS_STATUS_BREAK_CS_CHANGE) ||
+ ((tstatus & TRANS_STATUS_BREAK_DESELECT)
+ == TRANS_STATUS_BREAK_EOM)) {
mspi_cdram = read_cdram_slot(qspi, slot - 1) &
~MSPI_CDRAM_CONT_BIT;
write_cdram_slot(qspi, slot - 1, mspi_cdram);
.exec_op = bcm_qspi_exec_mem_op,
};
+struct bcm_qspi_data {
+ bool has_mspi_rev;
+ bool has_spcr3_sysclk;
+};
+
+static const struct bcm_qspi_data bcm_qspi_no_rev_data = {
+ .has_mspi_rev = false,
+ .has_spcr3_sysclk = false,
+};
+
+static const struct bcm_qspi_data bcm_qspi_rev_data = {
+ .has_mspi_rev = true,
+ .has_spcr3_sysclk = false,
+};
+
+static const struct bcm_qspi_data bcm_qspi_spcr3_data = {
+ .has_mspi_rev = true,
+ .has_spcr3_sysclk = true,
+};
+
static const struct of_device_id bcm_qspi_of_match[] = {
- { .compatible = "brcm,spi-bcm-qspi" },
+ {
+ .compatible = "brcm,spi-bcm7425-qspi",
+ .data = &bcm_qspi_no_rev_data,
+ },
+ {
+ .compatible = "brcm,spi-bcm7429-qspi",
+ .data = &bcm_qspi_no_rev_data,
+ },
+ {
+ .compatible = "brcm,spi-bcm7435-qspi",
+ .data = &bcm_qspi_no_rev_data,
+ },
+ {
+ .compatible = "brcm,spi-bcm-qspi",
+ .data = &bcm_qspi_rev_data,
+ },
+ {
+ .compatible = "brcm,spi-bcm7216-qspi",
+ .data = &bcm_qspi_spcr3_data,
+ },
+ {
+ .compatible = "brcm,spi-bcm7278-qspi",
+ .data = &bcm_qspi_spcr3_data,
+ },
{},
};
MODULE_DEVICE_TABLE(of, bcm_qspi_of_match);
int bcm_qspi_probe(struct platform_device *pdev,
struct bcm_qspi_soc_intc *soc_intc)
{
+ const struct of_device_id *of_id = NULL;
+ const struct bcm_qspi_data *data;
struct device *dev = &pdev->dev;
struct bcm_qspi *qspi;
struct spi_master *master;
struct resource *res;
int irq, ret = 0, num_ints = 0;
u32 val;
+ u32 rev = 0;
const char *name = NULL;
int num_irqs = ARRAY_SIZE(qspi_irq_tab);
if (!dev->of_node)
return -ENODEV;
- if (!of_match_node(bcm_qspi_of_match, dev->of_node))
+ of_id = of_match_node(bcm_qspi_of_match, dev->of_node);
+ if (!of_id)
return -ENODEV;
+ data = of_id->data;
+
master = spi_alloc_master(dev, sizeof(struct bcm_qspi));
if (!master) {
dev_err(dev, "error allocating spi_master\n");
}
qspi = spi_master_get_devdata(master);
+
+ qspi->clk = devm_clk_get_optional(&pdev->dev, NULL);
+ if (IS_ERR(qspi->clk))
+ return PTR_ERR(qspi->clk);
+
qspi->pdev = pdev;
qspi->trans_pos.trans = NULL;
qspi->trans_pos.byte = 0;
qspi->soc_intc = NULL;
}
- qspi->clk = devm_clk_get(&pdev->dev, NULL);
- if (IS_ERR(qspi->clk)) {
- dev_warn(dev, "unable to get clock\n");
- ret = PTR_ERR(qspi->clk);
- goto qspi_probe_err;
- }
-
ret = clk_prepare_enable(qspi->clk);
if (ret) {
dev_err(dev, "failed to prepare clock\n");
}
qspi->base_clk = clk_get_rate(qspi->clk);
- qspi->max_speed_hz = qspi->base_clk / (QSPI_SPBR_MIN * 2);
+
+ if (data->has_mspi_rev) {
+ rev = bcm_qspi_read(qspi, MSPI, MSPI_REV);
+ /* some older revs do not have a MSPI_REV register */
+ if ((rev & 0xff) == 0xff)
+ rev = 0;
+ }
+
+ qspi->mspi_maj_rev = (rev >> 4) & 0xf;
+ qspi->mspi_min_rev = rev & 0xf;
+ qspi->mspi_spcr3_sysclk = data->has_spcr3_sysclk;
+
+ qspi->max_speed_hz = qspi->base_clk / (bcm_qspi_spbr_min(qspi) * 2);
bcm_qspi_hw_init(qspi);
init_completion(&qspi->mspi_done);
bcm_qspi_read(qspi, BSPI, BSPI_STRAP_OVERRIDE_CTRL);
spi_master_suspend(qspi->master);
- clk_disable(qspi->clk);
+ clk_disable_unprepare(qspi->clk);
bcm_qspi_hw_uninit(qspi);
return 0;
qspi->soc_intc->bcm_qspi_int_set(qspi->soc_intc, MSPI_DONE,
true);
- ret = clk_enable(qspi->clk);
+ ret = clk_prepare_enable(qspi->clk);
if (!ret)
spi_master_resume(qspi->master);
}
#endif /* CONFIG_DEBUG_FS */
-static inline u32 bcm2835_rd(struct bcm2835_spi *bs, unsigned reg)
+static inline u32 bcm2835_rd(struct bcm2835_spi *bs, unsigned int reg)
{
return readl(bs->regs + reg);
}
-static inline void bcm2835_wr(struct bcm2835_spi *bs, unsigned reg, u32 val)
+static inline void bcm2835_wr(struct bcm2835_spi *bs, unsigned int reg, u32 val)
{
writel(val, bs->regs + reg);
}
if (dma_mapping_error(ctlr->dma_tx->device->dev, bs->fill_tx_addr)) {
dev_err(dev, "cannot map zero page - not using DMA mode\n");
bs->fill_tx_addr = 0;
+ ret = -ENOMEM;
goto err_release;
}
DMA_MEM_TO_DEV, 0);
if (!bs->fill_tx_desc) {
dev_err(dev, "cannot prepare fill_tx_desc - not using DMA mode\n");
+ ret = -ENOMEM;
goto err_release;
}
if (dma_mapping_error(ctlr->dma_rx->device->dev, bs->clear_rx_addr)) {
dev_err(dev, "cannot map clear_rx_cs - not using DMA mode\n");
bs->clear_rx_addr = 0;
+ ret = -ENOMEM;
goto err_release;
}
DMA_MEM_TO_DEV, 0);
if (!bs->clear_rx_desc[i]) {
dev_err(dev, "cannot prepare clear_rx_desc - not using DMA mode\n");
+ ret = -ENOMEM;
goto err_release;
}
goto out_dma_release;
}
- err = devm_spi_register_controller(&pdev->dev, ctlr);
+ err = spi_register_controller(ctlr);
if (err) {
dev_err(&pdev->dev, "could not register SPI controller: %d\n",
err);
bcm2835_debugfs_remove(bs);
+ spi_unregister_controller(ctlr);
+
+ bcm2835_dma_release(ctlr, bs);
+
/* Clear FIFOs, and disable the HW block */
bcm2835_wr(bs, BCM2835_SPI_CS,
BCM2835_SPI_CS_CLEAR_RX | BCM2835_SPI_CS_CLEAR_TX);
clk_disable_unprepare(bs->clk);
- bcm2835_dma_release(ctlr, bs);
-
return 0;
}
+static void bcm2835_spi_shutdown(struct platform_device *pdev)
+{
+ int ret;
+
+ ret = bcm2835_spi_remove(pdev);
+ if (ret)
+ dev_err(&pdev->dev, "failed to shutdown\n");
+}
+
static const struct of_device_id bcm2835_spi_match[] = {
{ .compatible = "brcm,bcm2835-spi", },
{}
},
.probe = bcm2835_spi_probe,
.remove = bcm2835_spi_remove,
+ .shutdown = bcm2835_spi_shutdown,
};
module_platform_driver(bcm2835_spi_driver);
goto out_clk_disable;
}
- err = devm_spi_register_master(&pdev->dev, master);
+ err = spi_register_master(master);
if (err) {
dev_err(&pdev->dev, "could not register SPI master: %d\n", err);
goto out_clk_disable;
bcm2835aux_debugfs_remove(bs);
+ spi_unregister_master(master);
+
bcm2835aux_spi_reset_hw(bs);
/* disable the HW block by releasing the clock */
u8 tmode; /* TR/TO/RO/EEPROM */
u8 type; /* SPI/SSP/MicroWire */
- u8 poll_mode; /* 1 means use poll mode */
-
u16 clk_div; /* baud rate divider */
u32 speed_hz; /* baud rate */
- void (*cs_control)(u32 command);
};
#ifdef CONFIG_DEBUG_FS
-#define SPI_REGS_BUFSIZE 1024
-static ssize_t dw_spi_show_regs(struct file *file, char __user *user_buf,
- size_t count, loff_t *ppos)
-{
- struct dw_spi *dws = file->private_data;
- char *buf;
- u32 len = 0;
- ssize_t ret;
-
- buf = kzalloc(SPI_REGS_BUFSIZE, GFP_KERNEL);
- if (!buf)
- return 0;
-
- len += scnprintf(buf + len, SPI_REGS_BUFSIZE - len,
- "%s registers:\n", dev_name(&dws->master->dev));
- len += scnprintf(buf + len, SPI_REGS_BUFSIZE - len,
- "=================================\n");
- len += scnprintf(buf + len, SPI_REGS_BUFSIZE - len,
- "CTRL0: \t\t0x%08x\n", dw_readl(dws, DW_SPI_CTRL0));
- len += scnprintf(buf + len, SPI_REGS_BUFSIZE - len,
- "CTRL1: \t\t0x%08x\n", dw_readl(dws, DW_SPI_CTRL1));
- len += scnprintf(buf + len, SPI_REGS_BUFSIZE - len,
- "SSIENR: \t0x%08x\n", dw_readl(dws, DW_SPI_SSIENR));
- len += scnprintf(buf + len, SPI_REGS_BUFSIZE - len,
- "SER: \t\t0x%08x\n", dw_readl(dws, DW_SPI_SER));
- len += scnprintf(buf + len, SPI_REGS_BUFSIZE - len,
- "BAUDR: \t\t0x%08x\n", dw_readl(dws, DW_SPI_BAUDR));
- len += scnprintf(buf + len, SPI_REGS_BUFSIZE - len,
- "TXFTLR: \t0x%08x\n", dw_readl(dws, DW_SPI_TXFLTR));
- len += scnprintf(buf + len, SPI_REGS_BUFSIZE - len,
- "RXFTLR: \t0x%08x\n", dw_readl(dws, DW_SPI_RXFLTR));
- len += scnprintf(buf + len, SPI_REGS_BUFSIZE - len,
- "TXFLR: \t\t0x%08x\n", dw_readl(dws, DW_SPI_TXFLR));
- len += scnprintf(buf + len, SPI_REGS_BUFSIZE - len,
- "RXFLR: \t\t0x%08x\n", dw_readl(dws, DW_SPI_RXFLR));
- len += scnprintf(buf + len, SPI_REGS_BUFSIZE - len,
- "SR: \t\t0x%08x\n", dw_readl(dws, DW_SPI_SR));
- len += scnprintf(buf + len, SPI_REGS_BUFSIZE - len,
- "IMR: \t\t0x%08x\n", dw_readl(dws, DW_SPI_IMR));
- len += scnprintf(buf + len, SPI_REGS_BUFSIZE - len,
- "ISR: \t\t0x%08x\n", dw_readl(dws, DW_SPI_ISR));
- len += scnprintf(buf + len, SPI_REGS_BUFSIZE - len,
- "DMACR: \t\t0x%08x\n", dw_readl(dws, DW_SPI_DMACR));
- len += scnprintf(buf + len, SPI_REGS_BUFSIZE - len,
- "DMATDLR: \t0x%08x\n", dw_readl(dws, DW_SPI_DMATDLR));
- len += scnprintf(buf + len, SPI_REGS_BUFSIZE - len,
- "DMARDLR: \t0x%08x\n", dw_readl(dws, DW_SPI_DMARDLR));
- len += scnprintf(buf + len, SPI_REGS_BUFSIZE - len,
- "=================================\n");
-
- ret = simple_read_from_buffer(user_buf, count, ppos, buf, len);
- kfree(buf);
- return ret;
+
+#define DW_SPI_DBGFS_REG(_name, _off) \
+{ \
+ .name = _name, \
+ .offset = _off, \
}
-static const struct file_operations dw_spi_regs_ops = {
- .owner = THIS_MODULE,
- .open = simple_open,
- .read = dw_spi_show_regs,
- .llseek = default_llseek,
+static const struct debugfs_reg32 dw_spi_dbgfs_regs[] = {
+ DW_SPI_DBGFS_REG("CTRLR0", DW_SPI_CTRLR0),
+ DW_SPI_DBGFS_REG("CTRLR1", DW_SPI_CTRLR1),
+ DW_SPI_DBGFS_REG("SSIENR", DW_SPI_SSIENR),
+ DW_SPI_DBGFS_REG("SER", DW_SPI_SER),
+ DW_SPI_DBGFS_REG("BAUDR", DW_SPI_BAUDR),
+ DW_SPI_DBGFS_REG("TXFTLR", DW_SPI_TXFTLR),
+ DW_SPI_DBGFS_REG("RXFTLR", DW_SPI_RXFTLR),
+ DW_SPI_DBGFS_REG("TXFLR", DW_SPI_TXFLR),
+ DW_SPI_DBGFS_REG("RXFLR", DW_SPI_RXFLR),
+ DW_SPI_DBGFS_REG("SR", DW_SPI_SR),
+ DW_SPI_DBGFS_REG("IMR", DW_SPI_IMR),
+ DW_SPI_DBGFS_REG("ISR", DW_SPI_ISR),
+ DW_SPI_DBGFS_REG("DMACR", DW_SPI_DMACR),
+ DW_SPI_DBGFS_REG("DMATDLR", DW_SPI_DMATDLR),
+ DW_SPI_DBGFS_REG("DMARDLR", DW_SPI_DMARDLR),
};
static int dw_spi_debugfs_init(struct dw_spi *dws)
if (!dws->debugfs)
return -ENOMEM;
- debugfs_create_file("registers", S_IFREG | S_IRUGO,
- dws->debugfs, (void *)dws, &dw_spi_regs_ops);
+ dws->regset.regs = dw_spi_dbgfs_regs;
+ dws->regset.nregs = ARRAY_SIZE(dw_spi_dbgfs_regs);
+ dws->regset.base = dws->regs;
+ debugfs_create_regset32("registers", 0400, dws->debugfs, &dws->regset);
+
return 0;
}
void dw_spi_set_cs(struct spi_device *spi, bool enable)
{
struct dw_spi *dws = spi_controller_get_devdata(spi->controller);
- struct chip_data *chip = spi_get_ctldata(spi);
-
- /* Chip select logic is inverted from spi_set_cs() */
- if (chip && chip->cs_control)
- chip->cs_control(!enable);
+ bool cs_high = !!(spi->mode & SPI_CS_HIGH);
- if (!enable)
+ /*
+ * DW SPI controller demands any native CS being set in order to
+ * proceed with data transfer. So in order to activate the SPI
+ * communications we must set a corresponding bit in the Slave
+ * Enable register no matter whether the SPI core is configured to
+ * support active-high or active-low CS level.
+ */
+ if (cs_high == enable)
dw_writel(dws, DW_SPI_SER, BIT(spi->chip_select));
else if (dws->cs_override)
dw_writel(dws, DW_SPI_SER, 0);
return dws->transfer_handler(dws);
}
-/* Must be called inside pump_transfers() */
-static int poll_transfer(struct dw_spi *dws)
+/* Configure CTRLR0 for DW_apb_ssi */
+u32 dw_spi_update_cr0(struct spi_controller *master, struct spi_device *spi,
+ struct spi_transfer *transfer)
{
- do {
- dw_writer(dws);
- dw_reader(dws);
- cpu_relax();
- } while (dws->rx_end > dws->rx);
+ struct chip_data *chip = spi_get_ctldata(spi);
+ u32 cr0;
- return 0;
+ /* Default SPI mode is SCPOL = 0, SCPH = 0 */
+ cr0 = (transfer->bits_per_word - 1)
+ | (chip->type << SPI_FRF_OFFSET)
+ | ((((spi->mode & SPI_CPOL) ? 1 : 0) << SPI_SCOL_OFFSET) |
+ (((spi->mode & SPI_CPHA) ? 1 : 0) << SPI_SCPH_OFFSET) |
+ (((spi->mode & SPI_LOOP) ? 1 : 0) << SPI_SRL_OFFSET))
+ | (chip->tmode << SPI_TMOD_OFFSET);
+
+ return cr0;
+}
+EXPORT_SYMBOL_GPL(dw_spi_update_cr0);
+
+/* Configure CTRLR0 for DWC_ssi */
+u32 dw_spi_update_cr0_v1_01a(struct spi_controller *master,
+ struct spi_device *spi,
+ struct spi_transfer *transfer)
+{
+ struct chip_data *chip = spi_get_ctldata(spi);
+ u32 cr0;
+
+ /* CTRLR0[ 4: 0] Data Frame Size */
+ cr0 = (transfer->bits_per_word - 1);
+
+ /* CTRLR0[ 7: 6] Frame Format */
+ cr0 |= chip->type << DWC_SSI_CTRLR0_FRF_OFFSET;
+
+ /*
+ * SPI mode (SCPOL|SCPH)
+ * CTRLR0[ 8] Serial Clock Phase
+ * CTRLR0[ 9] Serial Clock Polarity
+ */
+ cr0 |= ((spi->mode & SPI_CPOL) ? 1 : 0) << DWC_SSI_CTRLR0_SCPOL_OFFSET;
+ cr0 |= ((spi->mode & SPI_CPHA) ? 1 : 0) << DWC_SSI_CTRLR0_SCPH_OFFSET;
+
+ /* CTRLR0[11:10] Transfer Mode */
+ cr0 |= chip->tmode << DWC_SSI_CTRLR0_TMOD_OFFSET;
+
+ /* CTRLR0[13] Shift Register Loop */
+ cr0 |= ((spi->mode & SPI_LOOP) ? 1 : 0) << DWC_SSI_CTRLR0_SRL_OFFSET;
+
+ return cr0;
}
+EXPORT_SYMBOL_GPL(dw_spi_update_cr0_v1_01a);
static int dw_spi_transfer_one(struct spi_controller *master,
struct spi_device *spi, struct spi_transfer *transfer)
spi_set_clk(dws, chip->clk_div);
}
+ transfer->effective_speed_hz = dws->max_freq / chip->clk_div;
dws->n_bytes = DIV_ROUND_UP(transfer->bits_per_word, BITS_PER_BYTE);
- dws->dma_width = DIV_ROUND_UP(transfer->bits_per_word, BITS_PER_BYTE);
-
- /* Default SPI mode is SCPOL = 0, SCPH = 0 */
- cr0 = (transfer->bits_per_word - 1)
- | (chip->type << SPI_FRF_OFFSET)
- | ((((spi->mode & SPI_CPOL) ? 1 : 0) << SPI_SCOL_OFFSET) |
- (((spi->mode & SPI_CPHA) ? 1 : 0) << SPI_SCPH_OFFSET) |
- (((spi->mode & SPI_LOOP) ? 1 : 0) << SPI_SRL_OFFSET))
- | (chip->tmode << SPI_TMOD_OFFSET);
- /*
- * Adjust transfer mode if necessary. Requires platform dependent
- * chipselect mechanism.
- */
- if (chip->cs_control) {
- if (dws->rx && dws->tx)
- chip->tmode = SPI_TMOD_TR;
- else if (dws->rx)
- chip->tmode = SPI_TMOD_RO;
- else
- chip->tmode = SPI_TMOD_TO;
-
- cr0 &= ~SPI_TMOD_MASK;
- cr0 |= (chip->tmode << SPI_TMOD_OFFSET);
- }
-
- dw_writel(dws, DW_SPI_CTRL0, cr0);
+ cr0 = dws->update_cr0(master, spi, transfer);
+ dw_writel(dws, DW_SPI_CTRLR0, cr0);
/* Check if current transfer is a DMA transaction */
if (master->can_dma && master->can_dma(master, spi, transfer))
spi_enable_chip(dws, 1);
return ret;
}
- } else if (!chip->poll_mode) {
+ } else {
txlevel = min_t(u16, dws->fifo_len / 2, dws->len / dws->n_bytes);
- dw_writel(dws, DW_SPI_TXFLTR, txlevel);
+ dw_writel(dws, DW_SPI_TXFTLR, txlevel);
/* Set the interrupt mask */
imask |= SPI_INT_TXEI | SPI_INT_TXOI |
spi_enable_chip(dws, 1);
- if (dws->dma_mapped) {
- ret = dws->dma_ops->dma_transfer(dws, transfer);
- if (ret < 0)
- return ret;
- }
-
- if (chip->poll_mode)
- return poll_transfer(dws);
+ if (dws->dma_mapped)
+ return dws->dma_ops->dma_transfer(dws, transfer);
return 1;
}
/* This may be called twice for each spi dev */
static int dw_spi_setup(struct spi_device *spi)
{
- struct dw_spi_chip *chip_info = NULL;
struct chip_data *chip;
/* Only alloc on first setup */
spi_set_ctldata(spi, chip);
}
- /*
- * Protocol drivers may change the chip settings, so...
- * if chip_info exists, use it
- */
- chip_info = spi->controller_data;
-
- /* chip_info doesn't always exist */
- if (chip_info) {
- if (chip_info->cs_control)
- chip->cs_control = chip_info->cs_control;
-
- chip->poll_mode = chip_info->poll_mode;
- chip->type = chip_info->type;
- }
-
chip->tmode = SPI_TMOD_TR;
return 0;
u32 fifo;
for (fifo = 1; fifo < 256; fifo++) {
- dw_writel(dws, DW_SPI_TXFLTR, fifo);
- if (fifo != dw_readl(dws, DW_SPI_TXFLTR))
+ dw_writel(dws, DW_SPI_TXFTLR, fifo);
+ if (fifo != dw_readl(dws, DW_SPI_TXFTLR))
break;
}
- dw_writel(dws, DW_SPI_TXFLTR, 0);
+ dw_writel(dws, DW_SPI_TXFTLR, 0);
dws->fifo_len = (fifo == 1) ? 0 : fifo;
dev_dbg(dev, "Detected FIFO size: %u bytes\n", dws->fifo_len);
dws->master = master;
dws->type = SSI_MOTO_SPI;
- dws->dma_inited = 0;
dws->dma_addr = (dma_addr_t)(dws->paddr + DW_SPI_DR);
spin_lock_init(&dws->buf_lock);
spi_hw_init(dev, dws);
if (dws->dma_ops && dws->dma_ops->dma_init) {
- ret = dws->dma_ops->dma_init(dws);
+ ret = dws->dma_ops->dma_init(dev, dws);
if (ret) {
dev_warn(dev, "DMA init failed\n");
- dws->dma_inited = 0;
} else {
master->can_dma = dws->dma_ops->can_dma;
+ master->flags |= SPI_CONTROLLER_MUST_TX;
}
}
- ret = devm_spi_register_controller(dev, master);
+ ret = spi_register_controller(master);
if (ret) {
dev_err(&master->dev, "problem registering spi master\n");
goto err_dma_exit;
{
dw_spi_debugfs_remove(dws);
+ spi_unregister_controller(dws->master);
+
if (dws->dma_ops && dws->dma_ops->dma_exit)
dws->dma_ops->dma_exit(dws);
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Special handling for DW DMA core
+ *
+ * Copyright (c) 2009, 2014 Intel Corporation.
+ */
+
+#include <linux/completion.h>
+#include <linux/dma-mapping.h>
+#include <linux/dmaengine.h>
+#include <linux/irqreturn.h>
+#include <linux/jiffies.h>
+#include <linux/pci.h>
+#include <linux/platform_data/dma-dw.h>
+#include <linux/spi/spi.h>
+#include <linux/types.h>
+
+#include "spi-dw.h"
+
+#define WAIT_RETRIES 5
+#define RX_BUSY 0
+#define RX_BURST_LEVEL 16
+#define TX_BUSY 1
+#define TX_BURST_LEVEL 16
+
+static bool dw_spi_dma_chan_filter(struct dma_chan *chan, void *param)
+{
+ struct dw_dma_slave *s = param;
+
+ if (s->dma_dev != chan->device->dev)
+ return false;
+
+ chan->private = s;
+ return true;
+}
+
+static void dw_spi_dma_maxburst_init(struct dw_spi *dws)
+{
+ struct dma_slave_caps caps;
+ u32 max_burst, def_burst;
+ int ret;
+
+ def_burst = dws->fifo_len / 2;
+
+ ret = dma_get_slave_caps(dws->rxchan, &caps);
+ if (!ret && caps.max_burst)
+ max_burst = caps.max_burst;
+ else
+ max_burst = RX_BURST_LEVEL;
+
+ dws->rxburst = min(max_burst, def_burst);
+
+ ret = dma_get_slave_caps(dws->txchan, &caps);
+ if (!ret && caps.max_burst)
+ max_burst = caps.max_burst;
+ else
+ max_burst = TX_BURST_LEVEL;
+
+ dws->txburst = min(max_burst, def_burst);
+}
+
+static int dw_spi_dma_init_mfld(struct device *dev, struct dw_spi *dws)
+{
+ struct dw_dma_slave dma_tx = { .dst_id = 1 }, *tx = &dma_tx;
+ struct dw_dma_slave dma_rx = { .src_id = 0 }, *rx = &dma_rx;
+ struct pci_dev *dma_dev;
+ dma_cap_mask_t mask;
+
+ /*
+ * Get pci device for DMA controller, currently it could only
+ * be the DMA controller of Medfield
+ */
+ dma_dev = pci_get_device(PCI_VENDOR_ID_INTEL, 0x0827, NULL);
+ if (!dma_dev)
+ return -ENODEV;
+
+ dma_cap_zero(mask);
+ dma_cap_set(DMA_SLAVE, mask);
+
+ /* 1. Init rx channel */
+ rx->dma_dev = &dma_dev->dev;
+ dws->rxchan = dma_request_channel(mask, dw_spi_dma_chan_filter, rx);
+ if (!dws->rxchan)
+ goto err_exit;
+
+ /* 2. Init tx channel */
+ tx->dma_dev = &dma_dev->dev;
+ dws->txchan = dma_request_channel(mask, dw_spi_dma_chan_filter, tx);
+ if (!dws->txchan)
+ goto free_rxchan;
+
+ dws->master->dma_rx = dws->rxchan;
+ dws->master->dma_tx = dws->txchan;
+
+ init_completion(&dws->dma_completion);
+
+ dw_spi_dma_maxburst_init(dws);
+
+ return 0;
+
+free_rxchan:
+ dma_release_channel(dws->rxchan);
+ dws->rxchan = NULL;
+err_exit:
+ return -EBUSY;
+}
+
+static int dw_spi_dma_init_generic(struct device *dev, struct dw_spi *dws)
+{
+ dws->rxchan = dma_request_slave_channel(dev, "rx");
+ if (!dws->rxchan)
+ return -ENODEV;
+
+ dws->txchan = dma_request_slave_channel(dev, "tx");
+ if (!dws->txchan) {
+ dma_release_channel(dws->rxchan);
+ dws->rxchan = NULL;
+ return -ENODEV;
+ }
+
+ dws->master->dma_rx = dws->rxchan;
+ dws->master->dma_tx = dws->txchan;
+
+ init_completion(&dws->dma_completion);
+
+ dw_spi_dma_maxburst_init(dws);
+
+ return 0;
+}
+
+static void dw_spi_dma_exit(struct dw_spi *dws)
+{
+ if (dws->txchan) {
+ dmaengine_terminate_sync(dws->txchan);
+ dma_release_channel(dws->txchan);
+ }
+
+ if (dws->rxchan) {
+ dmaengine_terminate_sync(dws->rxchan);
+ dma_release_channel(dws->rxchan);
+ }
+
+ dw_writel(dws, DW_SPI_DMACR, 0);
+}
+
+static irqreturn_t dw_spi_dma_transfer_handler(struct dw_spi *dws)
+{
+ u16 irq_status = dw_readl(dws, DW_SPI_ISR);
+
+ if (!irq_status)
+ return IRQ_NONE;
+
+ dw_readl(dws, DW_SPI_ICR);
+ spi_reset_chip(dws);
+
+ dev_err(&dws->master->dev, "%s: FIFO overrun/underrun\n", __func__);
+ dws->master->cur_msg->status = -EIO;
+ complete(&dws->dma_completion);
+ return IRQ_HANDLED;
+}
+
+static bool dw_spi_can_dma(struct spi_controller *master,
+ struct spi_device *spi, struct spi_transfer *xfer)
+{
+ struct dw_spi *dws = spi_controller_get_devdata(master);
+
+ return xfer->len > dws->fifo_len;
+}
+
+static enum dma_slave_buswidth dw_spi_dma_convert_width(u8 n_bytes)
+{
+ if (n_bytes == 1)
+ return DMA_SLAVE_BUSWIDTH_1_BYTE;
+ else if (n_bytes == 2)
+ return DMA_SLAVE_BUSWIDTH_2_BYTES;
+
+ return DMA_SLAVE_BUSWIDTH_UNDEFINED;
+}
+
+static int dw_spi_dma_wait(struct dw_spi *dws, struct spi_transfer *xfer)
+{
+ unsigned long long ms;
+
+ ms = xfer->len * MSEC_PER_SEC * BITS_PER_BYTE;
+ do_div(ms, xfer->effective_speed_hz);
+ ms += ms + 200;
+
+ if (ms > UINT_MAX)
+ ms = UINT_MAX;
+
+ ms = wait_for_completion_timeout(&dws->dma_completion,
+ msecs_to_jiffies(ms));
+
+ if (ms == 0) {
+ dev_err(&dws->master->cur_msg->spi->dev,
+ "DMA transaction timed out\n");
+ return -ETIMEDOUT;
+ }
+
+ return 0;
+}
+
+static inline bool dw_spi_dma_tx_busy(struct dw_spi *dws)
+{
+ return !(dw_readl(dws, DW_SPI_SR) & SR_TF_EMPT);
+}
+
+static int dw_spi_dma_wait_tx_done(struct dw_spi *dws,
+ struct spi_transfer *xfer)
+{
+ int retry = WAIT_RETRIES;
+ struct spi_delay delay;
+ u32 nents;
+
+ nents = dw_readl(dws, DW_SPI_TXFLR);
+ delay.unit = SPI_DELAY_UNIT_SCK;
+ delay.value = nents * dws->n_bytes * BITS_PER_BYTE;
+
+ while (dw_spi_dma_tx_busy(dws) && retry--)
+ spi_delay_exec(&delay, xfer);
+
+ if (retry < 0) {
+ dev_err(&dws->master->dev, "Tx hanged up\n");
+ return -EIO;
+ }
+
+ return 0;
+}
+
+/*
+ * dws->dma_chan_busy is set before the dma transfer starts, callback for tx
+ * channel will clear a corresponding bit.
+ */
+static void dw_spi_dma_tx_done(void *arg)
+{
+ struct dw_spi *dws = arg;
+
+ clear_bit(TX_BUSY, &dws->dma_chan_busy);
+ if (test_bit(RX_BUSY, &dws->dma_chan_busy))
+ return;
+
+ dw_writel(dws, DW_SPI_DMACR, 0);
+ complete(&dws->dma_completion);
+}
+
+static struct dma_async_tx_descriptor *
+dw_spi_dma_prepare_tx(struct dw_spi *dws, struct spi_transfer *xfer)
+{
+ struct dma_slave_config txconf;
+ struct dma_async_tx_descriptor *txdesc;
+
+ if (!xfer->tx_buf)
+ return NULL;
+
+ memset(&txconf, 0, sizeof(txconf));
+ txconf.direction = DMA_MEM_TO_DEV;
+ txconf.dst_addr = dws->dma_addr;
+ txconf.dst_maxburst = dws->txburst;
+ txconf.src_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES;
+ txconf.dst_addr_width = dw_spi_dma_convert_width(dws->n_bytes);
+ txconf.device_fc = false;
+
+ dmaengine_slave_config(dws->txchan, &txconf);
+
+ txdesc = dmaengine_prep_slave_sg(dws->txchan,
+ xfer->tx_sg.sgl,
+ xfer->tx_sg.nents,
+ DMA_MEM_TO_DEV,
+ DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
+ if (!txdesc)
+ return NULL;
+
+ txdesc->callback = dw_spi_dma_tx_done;
+ txdesc->callback_param = dws;
+
+ return txdesc;
+}
+
+static inline bool dw_spi_dma_rx_busy(struct dw_spi *dws)
+{
+ return !!(dw_readl(dws, DW_SPI_SR) & SR_RF_NOT_EMPT);
+}
+
+static int dw_spi_dma_wait_rx_done(struct dw_spi *dws)
+{
+ int retry = WAIT_RETRIES;
+ struct spi_delay delay;
+ unsigned long ns, us;
+ u32 nents;
+
+ /*
+ * It's unlikely that DMA engine is still doing the data fetching, but
+ * if it's let's give it some reasonable time. The timeout calculation
+ * is based on the synchronous APB/SSI reference clock rate, on a
+ * number of data entries left in the Rx FIFO, times a number of clock
+ * periods normally needed for a single APB read/write transaction
+ * without PREADY signal utilized (which is true for the DW APB SSI
+ * controller).
+ */
+ nents = dw_readl(dws, DW_SPI_RXFLR);
+ ns = 4U * NSEC_PER_SEC / dws->max_freq * nents;
+ if (ns <= NSEC_PER_USEC) {
+ delay.unit = SPI_DELAY_UNIT_NSECS;
+ delay.value = ns;
+ } else {
+ us = DIV_ROUND_UP(ns, NSEC_PER_USEC);
+ delay.unit = SPI_DELAY_UNIT_USECS;
+ delay.value = clamp_val(us, 0, USHRT_MAX);
+ }
+
+ while (dw_spi_dma_rx_busy(dws) && retry--)
+ spi_delay_exec(&delay, NULL);
+
+ if (retry < 0) {
+ dev_err(&dws->master->dev, "Rx hanged up\n");
+ return -EIO;
+ }
+
+ return 0;
+}
+
+/*
+ * dws->dma_chan_busy is set before the dma transfer starts, callback for rx
+ * channel will clear a corresponding bit.
+ */
+static void dw_spi_dma_rx_done(void *arg)
+{
+ struct dw_spi *dws = arg;
+
+ clear_bit(RX_BUSY, &dws->dma_chan_busy);
+ if (test_bit(TX_BUSY, &dws->dma_chan_busy))
+ return;
+
+ dw_writel(dws, DW_SPI_DMACR, 0);
+ complete(&dws->dma_completion);
+}
+
+static struct dma_async_tx_descriptor *dw_spi_dma_prepare_rx(struct dw_spi *dws,
+ struct spi_transfer *xfer)
+{
+ struct dma_slave_config rxconf;
+ struct dma_async_tx_descriptor *rxdesc;
+
+ if (!xfer->rx_buf)
+ return NULL;
+
+ memset(&rxconf, 0, sizeof(rxconf));
+ rxconf.direction = DMA_DEV_TO_MEM;
+ rxconf.src_addr = dws->dma_addr;
+ rxconf.src_maxburst = dws->rxburst;
+ rxconf.dst_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES;
+ rxconf.src_addr_width = dw_spi_dma_convert_width(dws->n_bytes);
+ rxconf.device_fc = false;
+
+ dmaengine_slave_config(dws->rxchan, &rxconf);
+
+ rxdesc = dmaengine_prep_slave_sg(dws->rxchan,
+ xfer->rx_sg.sgl,
+ xfer->rx_sg.nents,
+ DMA_DEV_TO_MEM,
+ DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
+ if (!rxdesc)
+ return NULL;
+
+ rxdesc->callback = dw_spi_dma_rx_done;
+ rxdesc->callback_param = dws;
+
+ return rxdesc;
+}
+
+static int dw_spi_dma_setup(struct dw_spi *dws, struct spi_transfer *xfer)
+{
+ u16 imr = 0, dma_ctrl = 0;
+
+ dw_writel(dws, DW_SPI_DMARDLR, dws->rxburst - 1);
+ dw_writel(dws, DW_SPI_DMATDLR, dws->fifo_len - dws->txburst);
+
+ if (xfer->tx_buf)
+ dma_ctrl |= SPI_DMA_TDMAE;
+ if (xfer->rx_buf)
+ dma_ctrl |= SPI_DMA_RDMAE;
+ dw_writel(dws, DW_SPI_DMACR, dma_ctrl);
+
+ /* Set the interrupt mask */
+ if (xfer->tx_buf)
+ imr |= SPI_INT_TXOI;
+ if (xfer->rx_buf)
+ imr |= SPI_INT_RXUI | SPI_INT_RXOI;
+ spi_umask_intr(dws, imr);
+
+ reinit_completion(&dws->dma_completion);
+
+ dws->transfer_handler = dw_spi_dma_transfer_handler;
+
+ return 0;
+}
+
+static int dw_spi_dma_transfer(struct dw_spi *dws, struct spi_transfer *xfer)
+{
+ struct dma_async_tx_descriptor *txdesc, *rxdesc;
+ int ret;
+
+ /* Prepare the TX dma transfer */
+ txdesc = dw_spi_dma_prepare_tx(dws, xfer);
+
+ /* Prepare the RX dma transfer */
+ rxdesc = dw_spi_dma_prepare_rx(dws, xfer);
+
+ /* rx must be started before tx due to spi instinct */
+ if (rxdesc) {
+ set_bit(RX_BUSY, &dws->dma_chan_busy);
+ dmaengine_submit(rxdesc);
+ dma_async_issue_pending(dws->rxchan);
+ }
+
+ if (txdesc) {
+ set_bit(TX_BUSY, &dws->dma_chan_busy);
+ dmaengine_submit(txdesc);
+ dma_async_issue_pending(dws->txchan);
+ }
+
+ ret = dw_spi_dma_wait(dws, xfer);
+ if (ret)
+ return ret;
+
+ if (txdesc && dws->master->cur_msg->status == -EINPROGRESS) {
+ ret = dw_spi_dma_wait_tx_done(dws, xfer);
+ if (ret)
+ return ret;
+ }
+
+ if (rxdesc && dws->master->cur_msg->status == -EINPROGRESS)
+ ret = dw_spi_dma_wait_rx_done(dws);
+
+ return ret;
+}
+
+static void dw_spi_dma_stop(struct dw_spi *dws)
+{
+ if (test_bit(TX_BUSY, &dws->dma_chan_busy)) {
+ dmaengine_terminate_sync(dws->txchan);
+ clear_bit(TX_BUSY, &dws->dma_chan_busy);
+ }
+ if (test_bit(RX_BUSY, &dws->dma_chan_busy)) {
+ dmaengine_terminate_sync(dws->rxchan);
+ clear_bit(RX_BUSY, &dws->dma_chan_busy);
+ }
+
+ dw_writel(dws, DW_SPI_DMACR, 0);
+}
+
+static const struct dw_spi_dma_ops dw_spi_dma_mfld_ops = {
+ .dma_init = dw_spi_dma_init_mfld,
+ .dma_exit = dw_spi_dma_exit,
+ .dma_setup = dw_spi_dma_setup,
+ .can_dma = dw_spi_can_dma,
+ .dma_transfer = dw_spi_dma_transfer,
+ .dma_stop = dw_spi_dma_stop,
+};
+
+void dw_spi_dma_setup_mfld(struct dw_spi *dws)
+{
+ dws->dma_ops = &dw_spi_dma_mfld_ops;
+}
+EXPORT_SYMBOL_GPL(dw_spi_dma_setup_mfld);
+
+static const struct dw_spi_dma_ops dw_spi_dma_generic_ops = {
+ .dma_init = dw_spi_dma_init_generic,
+ .dma_exit = dw_spi_dma_exit,
+ .dma_setup = dw_spi_dma_setup,
+ .can_dma = dw_spi_can_dma,
+ .dma_transfer = dw_spi_dma_transfer,
+ .dma_stop = dw_spi_dma_stop,
+};
+
+void dw_spi_dma_setup_generic(struct dw_spi *dws)
+{
+ dws->dma_ops = &dw_spi_dma_generic_ops;
+}
+EXPORT_SYMBOL_GPL(dw_spi_dma_setup_generic);
+++ /dev/null
-// SPDX-License-Identifier: GPL-2.0-only
-/*
- * Special handling for DW core on Intel MID platform
- *
- * Copyright (c) 2009, 2014 Intel Corporation.
- */
-
-#include <linux/dma-mapping.h>
-#include <linux/dmaengine.h>
-#include <linux/interrupt.h>
-#include <linux/slab.h>
-#include <linux/spi/spi.h>
-#include <linux/types.h>
-
-#include "spi-dw.h"
-
-#ifdef CONFIG_SPI_DW_MID_DMA
-#include <linux/pci.h>
-#include <linux/platform_data/dma-dw.h>
-
-#define RX_BUSY 0
-#define TX_BUSY 1
-
-static struct dw_dma_slave mid_dma_tx = { .dst_id = 1 };
-static struct dw_dma_slave mid_dma_rx = { .src_id = 0 };
-
-static bool mid_spi_dma_chan_filter(struct dma_chan *chan, void *param)
-{
- struct dw_dma_slave *s = param;
-
- if (s->dma_dev != chan->device->dev)
- return false;
-
- chan->private = s;
- return true;
-}
-
-static int mid_spi_dma_init(struct dw_spi *dws)
-{
- struct pci_dev *dma_dev;
- struct dw_dma_slave *tx = dws->dma_tx;
- struct dw_dma_slave *rx = dws->dma_rx;
- dma_cap_mask_t mask;
-
- /*
- * Get pci device for DMA controller, currently it could only
- * be the DMA controller of Medfield
- */
- dma_dev = pci_get_device(PCI_VENDOR_ID_INTEL, 0x0827, NULL);
- if (!dma_dev)
- return -ENODEV;
-
- dma_cap_zero(mask);
- dma_cap_set(DMA_SLAVE, mask);
-
- /* 1. Init rx channel */
- rx->dma_dev = &dma_dev->dev;
- dws->rxchan = dma_request_channel(mask, mid_spi_dma_chan_filter, rx);
- if (!dws->rxchan)
- goto err_exit;
- dws->master->dma_rx = dws->rxchan;
-
- /* 2. Init tx channel */
- tx->dma_dev = &dma_dev->dev;
- dws->txchan = dma_request_channel(mask, mid_spi_dma_chan_filter, tx);
- if (!dws->txchan)
- goto free_rxchan;
- dws->master->dma_tx = dws->txchan;
-
- dws->dma_inited = 1;
- return 0;
-
-free_rxchan:
- dma_release_channel(dws->rxchan);
-err_exit:
- return -EBUSY;
-}
-
-static void mid_spi_dma_exit(struct dw_spi *dws)
-{
- if (!dws->dma_inited)
- return;
-
- dmaengine_terminate_sync(dws->txchan);
- dma_release_channel(dws->txchan);
-
- dmaengine_terminate_sync(dws->rxchan);
- dma_release_channel(dws->rxchan);
-}
-
-static irqreturn_t dma_transfer(struct dw_spi *dws)
-{
- u16 irq_status = dw_readl(dws, DW_SPI_ISR);
-
- if (!irq_status)
- return IRQ_NONE;
-
- dw_readl(dws, DW_SPI_ICR);
- spi_reset_chip(dws);
-
- dev_err(&dws->master->dev, "%s: FIFO overrun/underrun\n", __func__);
- dws->master->cur_msg->status = -EIO;
- spi_finalize_current_transfer(dws->master);
- return IRQ_HANDLED;
-}
-
-static bool mid_spi_can_dma(struct spi_controller *master,
- struct spi_device *spi, struct spi_transfer *xfer)
-{
- struct dw_spi *dws = spi_controller_get_devdata(master);
-
- if (!dws->dma_inited)
- return false;
-
- return xfer->len > dws->fifo_len;
-}
-
-static enum dma_slave_buswidth convert_dma_width(u32 dma_width) {
- if (dma_width == 1)
- return DMA_SLAVE_BUSWIDTH_1_BYTE;
- else if (dma_width == 2)
- return DMA_SLAVE_BUSWIDTH_2_BYTES;
-
- return DMA_SLAVE_BUSWIDTH_UNDEFINED;
-}
-
-/*
- * dws->dma_chan_busy is set before the dma transfer starts, callback for tx
- * channel will clear a corresponding bit.
- */
-static void dw_spi_dma_tx_done(void *arg)
-{
- struct dw_spi *dws = arg;
-
- clear_bit(TX_BUSY, &dws->dma_chan_busy);
- if (test_bit(RX_BUSY, &dws->dma_chan_busy))
- return;
- spi_finalize_current_transfer(dws->master);
-}
-
-static struct dma_async_tx_descriptor *dw_spi_dma_prepare_tx(struct dw_spi *dws,
- struct spi_transfer *xfer)
-{
- struct dma_slave_config txconf;
- struct dma_async_tx_descriptor *txdesc;
-
- if (!xfer->tx_buf)
- return NULL;
-
- txconf.direction = DMA_MEM_TO_DEV;
- txconf.dst_addr = dws->dma_addr;
- txconf.dst_maxburst = 16;
- txconf.src_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES;
- txconf.dst_addr_width = convert_dma_width(dws->dma_width);
- txconf.device_fc = false;
-
- dmaengine_slave_config(dws->txchan, &txconf);
-
- txdesc = dmaengine_prep_slave_sg(dws->txchan,
- xfer->tx_sg.sgl,
- xfer->tx_sg.nents,
- DMA_MEM_TO_DEV,
- DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
- if (!txdesc)
- return NULL;
-
- txdesc->callback = dw_spi_dma_tx_done;
- txdesc->callback_param = dws;
-
- return txdesc;
-}
-
-/*
- * dws->dma_chan_busy is set before the dma transfer starts, callback for rx
- * channel will clear a corresponding bit.
- */
-static void dw_spi_dma_rx_done(void *arg)
-{
- struct dw_spi *dws = arg;
-
- clear_bit(RX_BUSY, &dws->dma_chan_busy);
- if (test_bit(TX_BUSY, &dws->dma_chan_busy))
- return;
- spi_finalize_current_transfer(dws->master);
-}
-
-static struct dma_async_tx_descriptor *dw_spi_dma_prepare_rx(struct dw_spi *dws,
- struct spi_transfer *xfer)
-{
- struct dma_slave_config rxconf;
- struct dma_async_tx_descriptor *rxdesc;
-
- if (!xfer->rx_buf)
- return NULL;
-
- rxconf.direction = DMA_DEV_TO_MEM;
- rxconf.src_addr = dws->dma_addr;
- rxconf.src_maxburst = 16;
- rxconf.dst_addr_width = DMA_SLAVE_BUSWIDTH_4_BYTES;
- rxconf.src_addr_width = convert_dma_width(dws->dma_width);
- rxconf.device_fc = false;
-
- dmaengine_slave_config(dws->rxchan, &rxconf);
-
- rxdesc = dmaengine_prep_slave_sg(dws->rxchan,
- xfer->rx_sg.sgl,
- xfer->rx_sg.nents,
- DMA_DEV_TO_MEM,
- DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
- if (!rxdesc)
- return NULL;
-
- rxdesc->callback = dw_spi_dma_rx_done;
- rxdesc->callback_param = dws;
-
- return rxdesc;
-}
-
-static int mid_spi_dma_setup(struct dw_spi *dws, struct spi_transfer *xfer)
-{
- u16 dma_ctrl = 0;
-
- dw_writel(dws, DW_SPI_DMARDLR, 0xf);
- dw_writel(dws, DW_SPI_DMATDLR, 0x10);
-
- if (xfer->tx_buf)
- dma_ctrl |= SPI_DMA_TDMAE;
- if (xfer->rx_buf)
- dma_ctrl |= SPI_DMA_RDMAE;
- dw_writel(dws, DW_SPI_DMACR, dma_ctrl);
-
- /* Set the interrupt mask */
- spi_umask_intr(dws, SPI_INT_TXOI | SPI_INT_RXUI | SPI_INT_RXOI);
-
- dws->transfer_handler = dma_transfer;
-
- return 0;
-}
-
-static int mid_spi_dma_transfer(struct dw_spi *dws, struct spi_transfer *xfer)
-{
- struct dma_async_tx_descriptor *txdesc, *rxdesc;
-
- /* Prepare the TX dma transfer */
- txdesc = dw_spi_dma_prepare_tx(dws, xfer);
-
- /* Prepare the RX dma transfer */
- rxdesc = dw_spi_dma_prepare_rx(dws, xfer);
-
- /* rx must be started before tx due to spi instinct */
- if (rxdesc) {
- set_bit(RX_BUSY, &dws->dma_chan_busy);
- dmaengine_submit(rxdesc);
- dma_async_issue_pending(dws->rxchan);
- }
-
- if (txdesc) {
- set_bit(TX_BUSY, &dws->dma_chan_busy);
- dmaengine_submit(txdesc);
- dma_async_issue_pending(dws->txchan);
- }
-
- return 0;
-}
-
-static void mid_spi_dma_stop(struct dw_spi *dws)
-{
- if (test_bit(TX_BUSY, &dws->dma_chan_busy)) {
- dmaengine_terminate_sync(dws->txchan);
- clear_bit(TX_BUSY, &dws->dma_chan_busy);
- }
- if (test_bit(RX_BUSY, &dws->dma_chan_busy)) {
- dmaengine_terminate_sync(dws->rxchan);
- clear_bit(RX_BUSY, &dws->dma_chan_busy);
- }
-}
-
-static const struct dw_spi_dma_ops mid_dma_ops = {
- .dma_init = mid_spi_dma_init,
- .dma_exit = mid_spi_dma_exit,
- .dma_setup = mid_spi_dma_setup,
- .can_dma = mid_spi_can_dma,
- .dma_transfer = mid_spi_dma_transfer,
- .dma_stop = mid_spi_dma_stop,
-};
-#endif
-
-/* Some specific info for SPI0 controller on Intel MID */
-
-/* HW info for MRST Clk Control Unit, 32b reg per controller */
-#define MRST_SPI_CLK_BASE 100000000 /* 100m */
-#define MRST_CLK_SPI_REG 0xff11d86c
-#define CLK_SPI_BDIV_OFFSET 0
-#define CLK_SPI_BDIV_MASK 0x00000007
-#define CLK_SPI_CDIV_OFFSET 9
-#define CLK_SPI_CDIV_MASK 0x00000e00
-#define CLK_SPI_DISABLE_OFFSET 8
-
-int dw_spi_mid_init(struct dw_spi *dws)
-{
- void __iomem *clk_reg;
- u32 clk_cdiv;
-
- clk_reg = ioremap(MRST_CLK_SPI_REG, 16);
- if (!clk_reg)
- return -ENOMEM;
-
- /* Get SPI controller operating freq info */
- clk_cdiv = readl(clk_reg + dws->bus_num * sizeof(u32));
- clk_cdiv &= CLK_SPI_CDIV_MASK;
- clk_cdiv >>= CLK_SPI_CDIV_OFFSET;
- dws->max_freq = MRST_SPI_CLK_BASE / (clk_cdiv + 1);
-
- iounmap(clk_reg);
-
-#ifdef CONFIG_SPI_DW_MID_DMA
- dws->dma_tx = &mid_dma_tx;
- dws->dma_rx = &mid_dma_rx;
- dws->dma_ops = &mid_dma_ops;
-#endif
- return 0;
-}
#include <linux/clk.h>
#include <linux/err.h>
-#include <linux/interrupt.h>
#include <linux/platform_device.h>
#include <linux/pm_runtime.h>
#include <linux/slab.h>
#include <linux/acpi.h>
#include <linux/property.h>
#include <linux/regmap.h>
+#include <linux/reset.h>
#include "spi-dw.h"
struct clk *clk;
struct clk *pclk;
void *priv;
+ struct reset_control *rstc;
};
#define MSCC_CPU_SYSTEM_CTRL_GENERAL_CTRL 0x24
#define MSCC_SPI_MST_SW_MODE_SW_PIN_CTRL_MODE BIT(13)
#define MSCC_SPI_MST_SW_MODE_SW_SPI_CS(x) (x << 5)
+/*
+ * For Keem Bay, CTRLR0[31] is used to select controller mode.
+ * 0: SSI is slave
+ * 1: SSI is master
+ */
+#define KEEMBAY_CTRLR0_SSIC_IS_MST BIT(31)
+
struct dw_spi_mscc {
struct regmap *syscon;
void __iomem *spi_mst;
dwsmmio->dws.set_cs = dw_spi_mscc_set_cs;
dwsmmio->priv = dwsmscc;
+ /* Register hook to configure CTRLR0 */
+ dwsmmio->dws.update_cr0 = dw_spi_update_cr0;
+
return 0;
}
{
dwsmmio->dws.cs_override = 1;
+ /* Register hook to configure CTRLR0 */
+ dwsmmio->dws.update_cr0 = dw_spi_update_cr0;
+
+ return 0;
+}
+
+static int dw_spi_dw_apb_init(struct platform_device *pdev,
+ struct dw_spi_mmio *dwsmmio)
+{
+ /* Register hook to configure CTRLR0 */
+ dwsmmio->dws.update_cr0 = dw_spi_update_cr0;
+
+ dw_spi_dma_setup_generic(&dwsmmio->dws);
+
+ return 0;
+}
+
+static int dw_spi_dwc_ssi_init(struct platform_device *pdev,
+ struct dw_spi_mmio *dwsmmio)
+{
+ /* Register hook to configure CTRLR0 */
+ dwsmmio->dws.update_cr0 = dw_spi_update_cr0_v1_01a;
+
+ dw_spi_dma_setup_generic(&dwsmmio->dws);
+
+ return 0;
+}
+
+static u32 dw_spi_update_cr0_keembay(struct spi_controller *master,
+ struct spi_device *spi,
+ struct spi_transfer *transfer)
+{
+ u32 cr0 = dw_spi_update_cr0_v1_01a(master, spi, transfer);
+
+ return cr0 | KEEMBAY_CTRLR0_SSIC_IS_MST;
+}
+
+static int dw_spi_keembay_init(struct platform_device *pdev,
+ struct dw_spi_mmio *dwsmmio)
+{
+ /* Register hook to configure CTRLR0 */
+ dwsmmio->dws.update_cr0 = dw_spi_update_cr0_keembay;
+
return 0;
}
int (*init_func)(struct platform_device *pdev,
struct dw_spi_mmio *dwsmmio);
struct dw_spi_mmio *dwsmmio;
+ struct resource *mem;
struct dw_spi *dws;
int ret;
int num_cs;
dws = &dwsmmio->dws;
/* Get basic io resource and map it */
- dws->regs = devm_platform_ioremap_resource(pdev, 0);
- if (IS_ERR(dws->regs)) {
- dev_err(&pdev->dev, "SPI region map failed\n");
+ dws->regs = devm_platform_get_and_ioremap_resource(pdev, 0, &mem);
+ if (IS_ERR(dws->regs))
return PTR_ERR(dws->regs);
- }
+
+ dws->paddr = mem->start;
dws->irq = platform_get_irq(pdev, 0);
if (dws->irq < 0)
if (ret)
goto out_clk;
+ /* find an optional reset controller */
+ dwsmmio->rstc = devm_reset_control_get_optional_exclusive(&pdev->dev, "spi");
+ if (IS_ERR(dwsmmio->rstc)) {
+ ret = PTR_ERR(dwsmmio->rstc);
+ goto out_clk;
+ }
+ reset_control_deassert(dwsmmio->rstc);
+
dws->bus_num = pdev->id;
dws->max_freq = clk_get_rate(dwsmmio->clk);
clk_disable_unprepare(dwsmmio->pclk);
out_clk:
clk_disable_unprepare(dwsmmio->clk);
+ reset_control_assert(dwsmmio->rstc);
+
return ret;
}
pm_runtime_disable(&pdev->dev);
clk_disable_unprepare(dwsmmio->pclk);
clk_disable_unprepare(dwsmmio->clk);
+ reset_control_assert(dwsmmio->rstc);
return 0;
}
static const struct of_device_id dw_spi_mmio_of_match[] = {
- { .compatible = "snps,dw-apb-ssi", },
+ { .compatible = "snps,dw-apb-ssi", .data = dw_spi_dw_apb_init},
{ .compatible = "mscc,ocelot-spi", .data = dw_spi_mscc_ocelot_init},
{ .compatible = "mscc,jaguar2-spi", .data = dw_spi_mscc_jaguar2_init},
{ .compatible = "amazon,alpine-dw-apb-ssi", .data = dw_spi_alpine_init},
- { .compatible = "renesas,rzn1-spi", },
+ { .compatible = "renesas,rzn1-spi", .data = dw_spi_dw_apb_init},
+ { .compatible = "snps,dwc-ssi-1.01a", .data = dw_spi_dwc_ssi_init},
+ { .compatible = "intel,keembay-ssi", .data = dw_spi_keembay_init},
{ /* end of table */}
};
MODULE_DEVICE_TABLE(of, dw_spi_mmio_of_match);
+#ifdef CONFIG_ACPI
static const struct acpi_device_id dw_spi_mmio_acpi_match[] = {
- {"HISI0173", 0},
+ {"HISI0173", (kernel_ulong_t)dw_spi_dw_apb_init},
{},
};
MODULE_DEVICE_TABLE(acpi, dw_spi_mmio_acpi_match);
+#endif
static struct platform_driver dw_spi_mmio_driver = {
.probe = dw_spi_mmio_probe,
* Copyright (c) 2009, 2014 Intel Corporation.
*/
-#include <linux/interrupt.h>
#include <linux/pci.h>
#include <linux/pm_runtime.h>
#include <linux/slab.h>
#define DRIVER_NAME "dw_spi_pci"
+/* HW info for MRST Clk Control Unit, 32b reg per controller */
+#define MRST_SPI_CLK_BASE 100000000 /* 100m */
+#define MRST_CLK_SPI_REG 0xff11d86c
+#define CLK_SPI_BDIV_OFFSET 0
+#define CLK_SPI_BDIV_MASK 0x00000007
+#define CLK_SPI_CDIV_OFFSET 9
+#define CLK_SPI_CDIV_MASK 0x00000e00
+#define CLK_SPI_DISABLE_OFFSET 8
+
struct spi_pci_desc {
int (*setup)(struct dw_spi *);
u16 num_cs;
u32 max_freq;
};
+static int spi_mid_init(struct dw_spi *dws)
+{
+ void __iomem *clk_reg;
+ u32 clk_cdiv;
+
+ clk_reg = ioremap(MRST_CLK_SPI_REG, 16);
+ if (!clk_reg)
+ return -ENOMEM;
+
+ /* Get SPI controller operating freq info */
+ clk_cdiv = readl(clk_reg + dws->bus_num * sizeof(u32));
+ clk_cdiv &= CLK_SPI_CDIV_MASK;
+ clk_cdiv >>= CLK_SPI_CDIV_OFFSET;
+ dws->max_freq = MRST_SPI_CLK_BASE / (clk_cdiv + 1);
+
+ iounmap(clk_reg);
+
+ /* Register hook to configure CTRLR0 */
+ dws->update_cr0 = dw_spi_update_cr0;
+
+ dw_spi_dma_setup_mfld(dws);
+
+ return 0;
+}
+
+static int spi_generic_init(struct dw_spi *dws)
+{
+ /* Register hook to configure CTRLR0 */
+ dws->update_cr0 = dw_spi_update_cr0;
+
+ dw_spi_dma_setup_generic(dws);
+
+ return 0;
+}
+
static struct spi_pci_desc spi_pci_mid_desc_1 = {
- .setup = dw_spi_mid_init,
+ .setup = spi_mid_init,
.num_cs = 5,
.bus_num = 0,
};
static struct spi_pci_desc spi_pci_mid_desc_2 = {
- .setup = dw_spi_mid_init,
+ .setup = spi_mid_init,
.num_cs = 2,
.bus_num = 1,
};
static struct spi_pci_desc spi_pci_ehl_desc = {
+ .setup = spi_generic_init,
.num_cs = 2,
.bus_num = -1,
.max_freq = 100000000,
#ifndef DW_SPI_HEADER_H
#define DW_SPI_HEADER_H
+#include <linux/completion.h>
+#include <linux/debugfs.h>
+#include <linux/irqreturn.h>
#include <linux/io.h>
#include <linux/scatterlist.h>
/* Register offsets */
-#define DW_SPI_CTRL0 0x00
-#define DW_SPI_CTRL1 0x04
+#define DW_SPI_CTRLR0 0x00
+#define DW_SPI_CTRLR1 0x04
#define DW_SPI_SSIENR 0x08
#define DW_SPI_MWCR 0x0c
#define DW_SPI_SER 0x10
#define DW_SPI_BAUDR 0x14
-#define DW_SPI_TXFLTR 0x18
-#define DW_SPI_RXFLTR 0x1c
+#define DW_SPI_TXFTLR 0x18
+#define DW_SPI_RXFTLR 0x1c
#define DW_SPI_TXFLR 0x20
#define DW_SPI_RXFLR 0x24
#define DW_SPI_SR 0x28
#define SPI_SRL_OFFSET 11
#define SPI_CFS_OFFSET 12
+/* Bit fields in CTRLR0 based on DWC_ssi_databook.pdf v1.01a */
+#define DWC_SSI_CTRLR0_SRL_OFFSET 13
+#define DWC_SSI_CTRLR0_TMOD_OFFSET 10
+#define DWC_SSI_CTRLR0_TMOD_MASK GENMASK(11, 10)
+#define DWC_SSI_CTRLR0_SCPOL_OFFSET 9
+#define DWC_SSI_CTRLR0_SCPH_OFFSET 8
+#define DWC_SSI_CTRLR0_FRF_OFFSET 6
+#define DWC_SSI_CTRLR0_DFS_OFFSET 0
+
/* Bit fields in SR, 7 bits */
#define SR_MASK 0x7f /* cover 7 bits */
#define SR_BUSY (1 << 0)
struct dw_spi;
struct dw_spi_dma_ops {
- int (*dma_init)(struct dw_spi *dws);
+ int (*dma_init)(struct device *dev, struct dw_spi *dws);
void (*dma_exit)(struct dw_spi *dws);
int (*dma_setup)(struct dw_spi *dws, struct spi_transfer *xfer);
bool (*can_dma)(struct spi_controller *master, struct spi_device *spi,
u16 bus_num;
u16 num_cs; /* supported slave numbers */
void (*set_cs)(struct spi_device *spi, bool enable);
+ u32 (*update_cr0)(struct spi_controller *master, struct spi_device *spi,
+ struct spi_transfer *transfer);
/* Current message transfer state info */
size_t len;
void *rx_end;
int dma_mapped;
u8 n_bytes; /* current is a 1/2 bytes op */
- u32 dma_width;
irqreturn_t (*transfer_handler)(struct dw_spi *dws);
u32 current_freq; /* frequency in hz */
/* DMA info */
- int dma_inited;
struct dma_chan *txchan;
+ u32 txburst;
struct dma_chan *rxchan;
+ u32 rxburst;
unsigned long dma_chan_busy;
dma_addr_t dma_addr; /* phy address of the Data register */
const struct dw_spi_dma_ops *dma_ops;
- void *dma_tx;
- void *dma_rx;
+ struct completion dma_completion;
- /* Bus interface info */
- void *priv;
#ifdef CONFIG_DEBUG_FS
struct dentry *debugfs;
+ struct debugfs_regset32 regset;
#endif
};
spi_set_clk(dws, 0);
}
-/*
- * Each SPI slave device to work with dw_api controller should
- * has such a structure claiming its working mode (poll or PIO/DMA),
- * which can be save in the "controller_data" member of the
- * struct spi_device.
- */
-struct dw_spi_chip {
- u8 poll_mode; /* 1 for controller polling mode */
- u8 type; /* SPI/SSP/MicroWire */
- void (*cs_control)(u32 command);
-};
-
extern void dw_spi_set_cs(struct spi_device *spi, bool enable);
extern int dw_spi_add_host(struct device *dev, struct dw_spi *dws);
extern void dw_spi_remove_host(struct dw_spi *dws);
extern int dw_spi_suspend_host(struct dw_spi *dws);
extern int dw_spi_resume_host(struct dw_spi *dws);
+extern u32 dw_spi_update_cr0(struct spi_controller *master,
+ struct spi_device *spi,
+ struct spi_transfer *transfer);
+extern u32 dw_spi_update_cr0_v1_01a(struct spi_controller *master,
+ struct spi_device *spi,
+ struct spi_transfer *transfer);
+
+#ifdef CONFIG_SPI_DW_DMA
+
+extern void dw_spi_dma_setup_mfld(struct dw_spi *dws);
+extern void dw_spi_dma_setup_generic(struct dw_spi *dws);
+
+#else
+
+static inline void dw_spi_dma_setup_mfld(struct dw_spi *dws) {}
+static inline void dw_spi_dma_setup_generic(struct dw_spi *dws) {}
+
+#endif /* !CONFIG_SPI_DW_DMA */
-/* platform related setup */
-extern int dw_spi_mid_init(struct dw_spi *dws); /* Intel MID platforms */
#endif /* DW_SPI_HEADER_H */
#include <linux/platform_data/spi-ep93xx.h>
#define SSPCR0 0x0000
-#define SSPCR0_MODE_SHIFT 6
+#define SSPCR0_SPO BIT(6)
+#define SSPCR0_SPH BIT(7)
#define SSPCR0_SCR_SHIFT 8
#define SSPCR1 0x0004
return err;
cr0 = div_scr << SSPCR0_SCR_SHIFT;
- cr0 |= (spi->mode & (SPI_CPHA | SPI_CPOL)) << SSPCR0_MODE_SHIFT;
+ if (spi->mode & SPI_CPOL)
+ cr0 |= SSPCR0_SPO;
+ if (spi->mode & SPI_CPHA)
+ cr0 |= SSPCR0_SPH;
cr0 |= dss;
dev_dbg(&master->dev, "setup: mode %d, cpsr %d, scr %d, dss %d\n",
// SPDX-License-Identifier: GPL-2.0+
//
// Copyright 2013 Freescale Semiconductor, Inc.
+// Copyright 2020 NXP
//
// Freescale DSPI driver
// This file contains a driver for the Freescale DSPI
#define SPI_MCR_CLR_TXF BIT(11)
#define SPI_MCR_CLR_RXF BIT(10)
#define SPI_MCR_XSPI BIT(3)
+#define SPI_MCR_DIS_TXF BIT(13)
+#define SPI_MCR_DIS_RXF BIT(12)
+#define SPI_MCR_HALT BIT(0)
#define SPI_TCR 0x08
#define SPI_TCR_GET_TCNT(x) (((x) & GENMASK(31, 16)) >> 16)
static void dspi_native_host_to_dev(struct fsl_dspi *dspi, u32 *txdata)
{
- memcpy(txdata, dspi->tx, dspi->oper_word_size);
+ switch (dspi->oper_word_size) {
+ case 1:
+ *txdata = *(u8 *)dspi->tx;
+ break;
+ case 2:
+ *txdata = *(u16 *)dspi->tx;
+ break;
+ case 4:
+ *txdata = *(u32 *)dspi->tx;
+ break;
+ }
dspi->tx += dspi->oper_word_size;
}
static void dspi_native_dev_to_host(struct fsl_dspi *dspi, u32 rxdata)
{
- memcpy(dspi->rx, &rxdata, dspi->oper_word_size);
+ switch (dspi->oper_word_size) {
+ case 1:
+ *(u8 *)dspi->rx = rxdata;
+ break;
+ case 2:
+ *(u16 *)dspi->rx = rxdata;
+ break;
+ case 4:
+ *(u32 *)dspi->rx = rxdata;
+ break;
+ }
dspi->rx += dspi->oper_word_size;
}
return 0;
}
+static void dspi_shutdown(struct platform_device *pdev)
+{
+ struct spi_controller *ctlr = platform_get_drvdata(pdev);
+ struct fsl_dspi *dspi = spi_controller_get_devdata(ctlr);
+
+ /* Disable RX and TX */
+ regmap_update_bits(dspi->regmap, SPI_MCR,
+ SPI_MCR_DIS_TXF | SPI_MCR_DIS_RXF,
+ SPI_MCR_DIS_TXF | SPI_MCR_DIS_RXF);
+
+ /* Stop Running */
+ regmap_update_bits(dspi->regmap, SPI_MCR, SPI_MCR_HALT, SPI_MCR_HALT);
+
+ dspi_release_dma(dspi);
+ clk_disable_unprepare(dspi->clk);
+ spi_unregister_controller(dspi->ctlr);
+}
+
static struct platform_driver fsl_dspi_driver = {
.driver.name = DRIVER_NAME,
.driver.of_match_table = fsl_dspi_dt_ids,
.driver.pm = &dspi_pm,
.probe = dspi_probe,
.remove = dspi_remove,
+ .shutdown = dspi_shutdown,
};
module_platform_driver(fsl_dspi_driver);
bytes_per_word = fsl_lpspi_bytes_per_word(transfer->bits_per_word);
- switch (bytes_per_word)
- {
- case 1:
- case 2:
- case 4:
- break;
- default:
- return false;
+ switch (bytes_per_word) {
+ case 1:
+ case 2:
+ case 4:
+ break;
+ default:
+ return false;
}
return true;
ret = pm_runtime_get_sync(fsl_lpspi->dev);
if (ret < 0) {
dev_err(fsl_lpspi->dev, "failed to enable clock\n");
- goto out_controller_put;
+ goto out_pm_get;
}
temp = readl(fsl_lpspi->base + IMX7ULP_PARAM);
ret = fsl_lpspi_dma_init(&pdev->dev, fsl_lpspi, controller);
if (ret == -EPROBE_DEFER)
- goto out_controller_put;
+ goto out_pm_get;
if (ret < 0)
dev_err(&pdev->dev, "dma setup error %d, use pio\n", ret);
return 0;
+out_pm_get:
+ pm_runtime_put_noidle(fsl_lpspi->dev);
out_controller_put:
spi_controller_put(controller);
res = platform_get_resource_byname(pdev, IORESOURCE_MEM,
"QuadSPI-memory");
- q->ahb_addr = devm_ioremap_resource(dev, res);
- if (IS_ERR(q->ahb_addr)) {
- ret = PTR_ERR(q->ahb_addr);
+ q->memmap_phy = res->start;
+ /* Since there are 4 cs, map size required is 4 times ahb_buf_size */
+ q->ahb_addr = devm_ioremap(dev, q->memmap_phy,
+ (q->devtype_data->ahb_buf_size * 4));
+ if (!q->ahb_addr) {
+ ret = -ENOMEM;
goto err_put_ctrl;
}
- q->memmap_phy = res->start;
-
/* find the clocks */
q->clk_en = devm_clk_get(dev, "qspi_en");
if (IS_ERR(q->clk_en)) {
pdata->cs_control = fsl_spi_grlib_cs_control;
}
-static struct spi_master * fsl_spi_probe(struct device *dev,
+static struct spi_master *fsl_spi_probe(struct device *dev,
struct resource *mem, unsigned int irq)
{
struct fsl_spi_platform_data *pdata = dev_get_platdata(dev);
#define HISI_SFC_V3XX_VERSION (0x1f8)
+#define HISI_SFC_V3XX_INT_STAT (0x120)
+#define HISI_SFC_V3XX_INT_STAT_PP_ERR BIT(2)
+#define HISI_SFC_V3XX_INT_STAT_ADDR_IACCES BIT(5)
+#define HISI_SFC_V3XX_INT_CLR (0x12c)
+#define HISI_SFC_V3XX_INT_CLR_CLEAR (0xff)
#define HISI_SFC_V3XX_CMD_CFG (0x300)
#define HISI_SFC_V3XX_CMD_CFG_DUAL_IN_DUAL_OUT (1 << 17)
#define HISI_SFC_V3XX_CMD_CFG_DUAL_IO (2 << 17)
u8 chip_select)
{
int ret, len = op->data.nbytes;
- u32 config = 0;
+ u32 int_stat, config = 0;
if (op->addr.nbytes)
config |= HISI_SFC_V3XX_CMD_CFG_ADDR_EN_MSK;
if (ret)
return ret;
+ /*
+ * The interrupt status register indicates whether an error occurs
+ * after per operation. Check it, and clear the interrupts for
+ * next time judgement.
+ */
+ int_stat = readl(host->regbase + HISI_SFC_V3XX_INT_STAT);
+ writel(HISI_SFC_V3XX_INT_CLR_CLEAR,
+ host->regbase + HISI_SFC_V3XX_INT_CLR);
+
+ if (int_stat & HISI_SFC_V3XX_INT_STAT_ADDR_IACCES) {
+ dev_err(host->dev, "fail to access protected address\n");
+ return -EIO;
+ }
+
+ if (int_stat & HISI_SFC_V3XX_INT_STAT_PP_ERR) {
+ dev_err(host->dev, "page program operation failed\n");
+ return -EIO;
+ }
+
if (op->data.dir == SPI_MEM_DATA_IN)
hisi_sfc_v3xx_read_databuf(host, op->data.buf.in, len);
void (*reset)(struct spi_imx_data *);
void (*setup_wml)(struct spi_imx_data *);
void (*disable)(struct spi_imx_data *);
+ void (*disable_dma)(struct spi_imx_data *);
bool has_dmamode;
bool has_slavemode;
unsigned int fifo_size;
writel(reg, spi_imx->base + MX51_ECSPI_CTRL);
}
+static void mx51_disable_dma(struct spi_imx_data *spi_imx)
+{
+ writel(0, spi_imx->base + MX51_ECSPI_DMA);
+}
+
static void mx51_ecspi_disable(struct spi_imx_data *spi_imx)
{
u32 ctrl;
.rx_available = mx51_ecspi_rx_available,
.reset = mx51_ecspi_reset,
.setup_wml = mx51_setup_wml,
+ .disable_dma = mx51_disable_dma,
.fifo_size = 64,
.has_dmamode = true,
.dynamic_burst = true,
.prepare_transfer = mx51_ecspi_prepare_transfer,
.trigger = mx51_ecspi_trigger,
.rx_available = mx51_ecspi_rx_available,
+ .disable_dma = mx51_disable_dma,
.reset = mx51_ecspi_reset,
.fifo_size = 64,
.has_dmamode = true,
DMA_PREP_INTERRUPT | DMA_CTRL_ACK);
if (!desc_tx) {
dmaengine_terminate_all(master->dma_tx);
+ dmaengine_terminate_all(master->dma_rx);
return -EINVAL;
}
struct spi_transfer *transfer)
{
struct spi_imx_data *spi_imx = spi_master_get_devdata(spi->master);
+ int ret;
/* flush rxfifo before transfer */
while (spi_imx->devtype_data->rx_available(spi_imx))
if (spi_imx->slave_mode)
return spi_imx_pio_transfer_slave(spi, transfer);
- if (spi_imx->usedma)
- return spi_imx_dma_transfer(spi_imx, transfer);
- else
- return spi_imx_pio_transfer(spi, transfer);
+ /*
+ * fallback PIO mode if dma setup error happen, for example sdma
+ * firmware may not be updated as ERR009165 required.
+ */
+ if (spi_imx->usedma) {
+ ret = spi_imx_dma_transfer(spi_imx, transfer);
+ if (ret != -EINVAL)
+ return ret;
+
+ spi_imx->devtype_data->disable_dma(spi_imx);
+
+ spi_imx->usedma = false;
+ spi_imx->dynamic_burst = spi_imx->devtype_data->dynamic_burst;
+ dev_dbg(&spi->dev, "Fallback to PIO mode\n");
+ }
+
+ return spi_imx_pio_transfer(spi, transfer);
}
static int spi_imx_setup(struct spi_device *spi)
return 0;
case 2:
- if ((tx && (mode & (SPI_TX_DUAL | SPI_TX_QUAD))) ||
- (!tx && (mode & (SPI_RX_DUAL | SPI_RX_QUAD))))
+ if ((tx &&
+ (mode & (SPI_TX_DUAL | SPI_TX_QUAD | SPI_TX_OCTAL))) ||
+ (!tx &&
+ (mode & (SPI_RX_DUAL | SPI_RX_QUAD | SPI_RX_OCTAL))))
return 0;
break;
case 4:
- if ((tx && (mode & SPI_TX_QUAD)) ||
- (!tx && (mode & SPI_RX_QUAD)))
+ if ((tx && (mode & (SPI_TX_QUAD | SPI_TX_OCTAL))) ||
+ (!tx && (mode & (SPI_RX_QUAD | SPI_RX_OCTAL))))
return 0;
break;
return mtk_nor_cmd_exec(sp, MTK_NOR_CMD_WRITE, 6 * BITS_PER_BYTE);
}
-int mtk_nor_exec_op(struct spi_mem *mem, const struct spi_mem_op *op)
+static int mtk_nor_exec_op(struct spi_mem *mem, const struct spi_mem_op *op)
{
struct mtk_nor *sp = spi_controller_get_devdata(mem->spi->master);
int ret;
struct spi_mux_priv *priv = spi_controller_get_devdata(spi->controller);
int ret;
+ ret = mux_control_select(priv->mux, spi->chip_select);
+ if (ret)
+ return ret;
+
if (priv->current_cs == spi->chip_select)
return 0;
priv->spi->mode = spi->mode;
priv->spi->bits_per_word = spi->bits_per_word;
- ret = mux_control_select(priv->mux, spi->chip_select);
- if (ret)
- return ret;
-
priv->current_cs = spi->chip_select;
return 0;
#include <linux/of.h>
#include <linux/of_address.h>
#include <linux/of_device.h>
-#include <linux/of_gpio.h>
#include <linux/clk.h>
#include <linux/sizes.h>
-#include <linux/gpio.h>
#include <asm/unaligned.h>
#define DRIVER_NAME "orion_spi"
struct clk *clk;
struct clk *axi_clk;
const struct orion_spi_dev *devdata;
- int unused_hw_gpio;
struct orion_child_options child[ORION_NUM_CHIPSELECTS];
};
static void orion_spi_set_cs(struct spi_device *spi, bool enable)
{
struct orion_spi *orion_spi;
- int cs;
orion_spi = spi_master_get_devdata(spi->master);
- if (gpio_is_valid(spi->cs_gpio))
- cs = orion_spi->unused_hw_gpio;
- else
- cs = spi->chip_select;
-
+ /*
+ * If this line is using a GPIO to control chip select, this internal
+ * .set_cs() function will still be called, so we clear any previous
+ * chip select. The CS we activate will not have any elecrical effect,
+ * as it is handled by a GPIO, but that doesn't matter. What we need
+ * is to deassert the old chip select and assert some other chip select.
+ */
orion_spi_clrbits(orion_spi, ORION_SPI_IF_CTRL_REG, ORION_SPI_CS_MASK);
orion_spi_setbits(orion_spi, ORION_SPI_IF_CTRL_REG,
- ORION_SPI_CS(cs));
+ ORION_SPI_CS(spi->chip_select));
- /* Chip select logic is inverted from spi_set_cs */
+ /*
+ * Chip select logic is inverted from spi_set_cs(). For lines using a
+ * GPIO to do chip select SPI_CS_HIGH is enforced and inversion happens
+ * in the GPIO library, but we don't care about that, because in those
+ * cases we are dealing with an unused native CS anyways so the polarity
+ * doesn't matter.
+ */
if (!enable)
orion_spi_setbits(orion_spi, ORION_SPI_IF_CTRL_REG, 0x1);
else
static int orion_spi_setup(struct spi_device *spi)
{
- if (gpio_is_valid(spi->cs_gpio)) {
- gpio_direction_output(spi->cs_gpio, !(spi->mode & SPI_CS_HIGH));
- }
return orion_spi_setup_transfer(spi, NULL);
}
master->setup = orion_spi_setup;
master->bits_per_word_mask = SPI_BPW_MASK(8) | SPI_BPW_MASK(16);
master->auto_runtime_pm = true;
+ master->use_gpio_descriptors = true;
master->flags = SPI_MASTER_GPIO_SS;
platform_set_drvdata(pdev, master);
spi = spi_master_get_devdata(master);
spi->master = master;
- spi->unused_hw_gpio = -1;
of_id = of_match_device(orion_spi_of_match_table, &pdev->dev);
devdata = (of_id) ? of_id->data : &orion_spi_dev_data;
for_each_available_child_of_node(pdev->dev.of_node, np) {
struct orion_direct_acc *dir_acc;
u32 cs;
- int cs_gpio;
/* Get chip-select number from the "reg" property */
status = of_property_read_u32(np, "reg", &cs);
continue;
}
- /*
- * Initialize the CS GPIO:
- * - properly request the actual GPIO signal
- * - de-assert the logical signal so that all GPIO CS lines
- * are inactive when probing for slaves
- * - find an unused physical CS which will be driven for any
- * slave which uses a CS GPIO
- */
- cs_gpio = of_get_named_gpio(pdev->dev.of_node, "cs-gpios", cs);
- if (cs_gpio > 0) {
- char *gpio_name;
- int cs_flags;
-
- if (spi->unused_hw_gpio == -1) {
- dev_info(&pdev->dev,
- "Selected unused HW CS#%d for any GPIO CSes\n",
- cs);
- spi->unused_hw_gpio = cs;
- }
-
- gpio_name = devm_kasprintf(&pdev->dev, GFP_KERNEL,
- "%s-CS%d", dev_name(&pdev->dev), cs);
- if (!gpio_name) {
- status = -ENOMEM;
- goto out_rel_axi_clk;
- }
-
- cs_flags = of_property_read_bool(np, "spi-cs-high") ?
- GPIOF_OUT_INIT_LOW : GPIOF_OUT_INIT_HIGH;
- status = devm_gpio_request_one(&pdev->dev, cs_gpio,
- cs_flags, gpio_name);
- if (status) {
- dev_err(&pdev->dev,
- "Can't request GPIO for CS %d\n", cs);
- goto out_rel_axi_clk;
- }
- }
-
/*
* Check if an address is configured for this SPI device. If
* not, the MBus mapping via the 'ranges' property in the 'soc'
.tx_threshold_hi = 48,
.cs_sel_shift = 8,
.cs_sel_mask = 3 << 8,
+ .cs_clk_stays_gated = true,
},
{ /* LPSS_CNL_SSP */
.offset = 0x200,
/* Register with the SPI framework */
platform_set_drvdata(pdev, drv_data);
- status = devm_spi_register_controller(&pdev->dev, controller);
+ status = spi_register_controller(controller);
if (status != 0) {
dev_err(&pdev->dev, "problem registering spi controller\n");
goto out_error_pm_runtime_enabled;
return status;
out_error_pm_runtime_enabled:
- pm_runtime_put_noidle(&pdev->dev);
pm_runtime_disable(&pdev->dev);
out_error_clock_enabled:
pm_runtime_get_sync(&pdev->dev);
+ spi_unregister_controller(drv_data->controller);
+
/* Disable the SSP at the peripheral and SOC level */
pxa2xx_spi_write(drv_data, SSCR0, 0);
clk_disable_unprepare(ssp->clk);
#include <linux/platform_device.h>
#include <linux/clk.h>
#include <linux/spi/spi.h>
+#include <linux/of.h>
#include <asm/mach-ath79/ar71xx_regs.h>
if (IS_ERR(ahb_clk))
return PTR_ERR(ahb_clk);
+ master->dev.of_node = pdev->dev.of_node;
master->bus_num = 0;
master->num_chipselect = 3;
master->mode_bits = SPI_TX_DUAL;
master->transfer_one = rb4xx_transfer_one;
master->set_cs = rb4xx_set_cs;
+ rbspi = spi_master_get_devdata(master);
+ rbspi->base = spi_base;
+ rbspi->clk = ahb_clk;
+ platform_set_drvdata(pdev, rbspi);
+
err = devm_spi_register_master(&pdev->dev, master);
if (err) {
dev_err(&pdev->dev, "failed to register SPI master\n");
if (err)
return err;
- rbspi = spi_master_get_devdata(master);
- rbspi->base = spi_base;
- rbspi->clk = ahb_clk;
- platform_set_drvdata(pdev, rbspi);
-
/* Enable SPI */
rb4xx_write(rbspi, AR71XX_SPI_REG_FS, AR71XX_SPI_FS_GPIO);
return 0;
}
+static const struct of_device_id rb4xx_spi_dt_match[] = {
+ { .compatible = "mikrotik,rb4xx-spi" },
+ { },
+};
+MODULE_DEVICE_TABLE(of, rb4xx_spi_dt_match);
+
static struct platform_driver rb4xx_spi_drv = {
.probe = rb4xx_spi_probe,
.remove = rb4xx_spi_remove,
.driver = {
.name = "rb4xx-spi",
+ .of_match_table = of_match_ptr(rb4xx_spi_dt_match),
},
};
u8 rsd;
bool cs_asserted[ROCKCHIP_SPI_MAX_CS_NUM];
+
+ bool slave_abort;
};
static inline void spi_enable_chip(struct rockchip_spi *rs, bool enable)
static void rockchip_spi_set_cs(struct spi_device *spi, bool enable)
{
- struct spi_master *master = spi->master;
- struct rockchip_spi *rs = spi_master_get_devdata(master);
+ struct spi_controller *ctlr = spi->controller;
+ struct rockchip_spi *rs = spi_controller_get_devdata(ctlr);
bool cs_asserted = !enable;
/* Return immediately for no-op */
rs->cs_asserted[spi->chip_select] = cs_asserted;
}
-static void rockchip_spi_handle_err(struct spi_master *master,
+static void rockchip_spi_handle_err(struct spi_controller *ctlr,
struct spi_message *msg)
{
- struct rockchip_spi *rs = spi_master_get_devdata(master);
+ struct rockchip_spi *rs = spi_controller_get_devdata(ctlr);
/* stop running spi transfer
* this also flushes both rx and tx fifos
writel_relaxed(0, rs->regs + ROCKCHIP_SPI_IMR);
if (atomic_read(&rs->state) & TXDMA)
- dmaengine_terminate_async(master->dma_tx);
+ dmaengine_terminate_async(ctlr->dma_tx);
if (atomic_read(&rs->state) & RXDMA)
- dmaengine_terminate_async(master->dma_rx);
+ dmaengine_terminate_async(ctlr->dma_rx);
}
static void rockchip_spi_pio_writer(struct rockchip_spi *rs)
static irqreturn_t rockchip_spi_isr(int irq, void *dev_id)
{
- struct spi_master *master = dev_id;
- struct rockchip_spi *rs = spi_master_get_devdata(master);
+ struct spi_controller *ctlr = dev_id;
+ struct rockchip_spi *rs = spi_controller_get_devdata(ctlr);
if (rs->tx_left)
rockchip_spi_pio_writer(rs);
if (!rs->rx_left) {
spi_enable_chip(rs, false);
writel_relaxed(0, rs->regs + ROCKCHIP_SPI_IMR);
- spi_finalize_current_transfer(master);
+ spi_finalize_current_transfer(ctlr);
}
return IRQ_HANDLED;
static void rockchip_spi_dma_rxcb(void *data)
{
- struct spi_master *master = data;
- struct rockchip_spi *rs = spi_master_get_devdata(master);
+ struct spi_controller *ctlr = data;
+ struct rockchip_spi *rs = spi_controller_get_devdata(ctlr);
int state = atomic_fetch_andnot(RXDMA, &rs->state);
- if (state & TXDMA)
+ if (state & TXDMA && !rs->slave_abort)
return;
spi_enable_chip(rs, false);
- spi_finalize_current_transfer(master);
+ spi_finalize_current_transfer(ctlr);
}
static void rockchip_spi_dma_txcb(void *data)
{
- struct spi_master *master = data;
- struct rockchip_spi *rs = spi_master_get_devdata(master);
+ struct spi_controller *ctlr = data;
+ struct rockchip_spi *rs = spi_controller_get_devdata(ctlr);
int state = atomic_fetch_andnot(TXDMA, &rs->state);
- if (state & RXDMA)
+ if (state & RXDMA && !rs->slave_abort)
return;
/* Wait until the FIFO data completely. */
wait_for_idle(rs);
spi_enable_chip(rs, false);
- spi_finalize_current_transfer(master);
+ spi_finalize_current_transfer(ctlr);
}
static int rockchip_spi_prepare_dma(struct rockchip_spi *rs,
- struct spi_master *master, struct spi_transfer *xfer)
+ struct spi_controller *ctlr, struct spi_transfer *xfer)
{
struct dma_async_tx_descriptor *rxdesc, *txdesc;
.src_maxburst = 1,
};
- dmaengine_slave_config(master->dma_rx, &rxconf);
+ dmaengine_slave_config(ctlr->dma_rx, &rxconf);
rxdesc = dmaengine_prep_slave_sg(
- master->dma_rx,
+ ctlr->dma_rx,
xfer->rx_sg.sgl, xfer->rx_sg.nents,
DMA_DEV_TO_MEM, DMA_PREP_INTERRUPT);
if (!rxdesc)
return -EINVAL;
rxdesc->callback = rockchip_spi_dma_rxcb;
- rxdesc->callback_param = master;
+ rxdesc->callback_param = ctlr;
}
txdesc = NULL;
.dst_maxburst = rs->fifo_len / 4,
};
- dmaengine_slave_config(master->dma_tx, &txconf);
+ dmaengine_slave_config(ctlr->dma_tx, &txconf);
txdesc = dmaengine_prep_slave_sg(
- master->dma_tx,
+ ctlr->dma_tx,
xfer->tx_sg.sgl, xfer->tx_sg.nents,
DMA_MEM_TO_DEV, DMA_PREP_INTERRUPT);
if (!txdesc) {
if (rxdesc)
- dmaengine_terminate_sync(master->dma_rx);
+ dmaengine_terminate_sync(ctlr->dma_rx);
return -EINVAL;
}
txdesc->callback = rockchip_spi_dma_txcb;
- txdesc->callback_param = master;
+ txdesc->callback_param = ctlr;
}
/* rx must be started before tx due to spi instinct */
if (rxdesc) {
atomic_or(RXDMA, &rs->state);
dmaengine_submit(rxdesc);
- dma_async_issue_pending(master->dma_rx);
+ dma_async_issue_pending(ctlr->dma_rx);
}
spi_enable_chip(rs, true);
if (txdesc) {
atomic_or(TXDMA, &rs->state);
dmaengine_submit(txdesc);
- dma_async_issue_pending(master->dma_tx);
+ dma_async_issue_pending(ctlr->dma_tx);
}
/* 1 means the transfer is in progress */
static void rockchip_spi_config(struct rockchip_spi *rs,
struct spi_device *spi, struct spi_transfer *xfer,
- bool use_dma)
+ bool use_dma, bool slave_mode)
{
u32 cr0 = CR0_FRF_SPI << CR0_FRF_OFFSET
| CR0_BHT_8BIT << CR0_BHT_OFFSET
u32 cr1;
u32 dmacr = 0;
+ if (slave_mode)
+ cr0 |= CR0_OPM_SLAVE << CR0_OPM_OFFSET;
+ rs->slave_abort = false;
+
cr0 |= rs->rsd << CR0_RSD_OFFSET;
cr0 |= (spi->mode & 0x3U) << CR0_SCPH_OFFSET;
if (spi->mode & SPI_LSB_FIRST)
break;
default:
/* we only whitelist 4, 8 and 16 bit words in
- * master->bits_per_word_mask, so this shouldn't
+ * ctlr->bits_per_word_mask, so this shouldn't
* happen
*/
unreachable();
return ROCKCHIP_SPI_MAX_TRANLEN;
}
+static int rockchip_spi_slave_abort(struct spi_controller *ctlr)
+{
+ struct rockchip_spi *rs = spi_controller_get_devdata(ctlr);
+
+ rs->slave_abort = true;
+ complete(&ctlr->xfer_completion);
+
+ return 0;
+}
+
static int rockchip_spi_transfer_one(
- struct spi_master *master,
+ struct spi_controller *ctlr,
struct spi_device *spi,
struct spi_transfer *xfer)
{
- struct rockchip_spi *rs = spi_master_get_devdata(master);
+ struct rockchip_spi *rs = spi_controller_get_devdata(ctlr);
bool use_dma;
WARN_ON(readl_relaxed(rs->regs + ROCKCHIP_SPI_SSIENR) &&
rs->n_bytes = xfer->bits_per_word <= 8 ? 1 : 2;
- use_dma = master->can_dma ? master->can_dma(master, spi, xfer) : false;
+ use_dma = ctlr->can_dma ? ctlr->can_dma(ctlr, spi, xfer) : false;
- rockchip_spi_config(rs, spi, xfer, use_dma);
+ rockchip_spi_config(rs, spi, xfer, use_dma, ctlr->slave);
if (use_dma)
- return rockchip_spi_prepare_dma(rs, master, xfer);
+ return rockchip_spi_prepare_dma(rs, ctlr, xfer);
return rockchip_spi_prepare_irq(rs, xfer);
}
-static bool rockchip_spi_can_dma(struct spi_master *master,
+static bool rockchip_spi_can_dma(struct spi_controller *ctlr,
struct spi_device *spi,
struct spi_transfer *xfer)
{
- struct rockchip_spi *rs = spi_master_get_devdata(master);
+ struct rockchip_spi *rs = spi_controller_get_devdata(ctlr);
unsigned int bytes_per_word = xfer->bits_per_word <= 8 ? 1 : 2;
/* if the numbor of spi words to transfer is less than the fifo
{
int ret;
struct rockchip_spi *rs;
- struct spi_master *master;
+ struct spi_controller *ctlr;
struct resource *mem;
+ struct device_node *np = pdev->dev.of_node;
u32 rsd_nsecs;
+ bool slave_mode;
+
+ slave_mode = of_property_read_bool(np, "spi-slave");
+
+ if (slave_mode)
+ ctlr = spi_alloc_slave(&pdev->dev,
+ sizeof(struct rockchip_spi));
+ else
+ ctlr = spi_alloc_master(&pdev->dev,
+ sizeof(struct rockchip_spi));
- master = spi_alloc_master(&pdev->dev, sizeof(struct rockchip_spi));
- if (!master)
+ if (!ctlr)
return -ENOMEM;
- platform_set_drvdata(pdev, master);
+ platform_set_drvdata(pdev, ctlr);
- rs = spi_master_get_devdata(master);
+ rs = spi_controller_get_devdata(ctlr);
+ ctlr->slave = slave_mode;
/* Get basic io resource and map it */
mem = platform_get_resource(pdev, IORESOURCE_MEM, 0);
rs->regs = devm_ioremap_resource(&pdev->dev, mem);
if (IS_ERR(rs->regs)) {
ret = PTR_ERR(rs->regs);
- goto err_put_master;
+ goto err_put_ctlr;
}
rs->apb_pclk = devm_clk_get(&pdev->dev, "apb_pclk");
if (IS_ERR(rs->apb_pclk)) {
dev_err(&pdev->dev, "Failed to get apb_pclk\n");
ret = PTR_ERR(rs->apb_pclk);
- goto err_put_master;
+ goto err_put_ctlr;
}
rs->spiclk = devm_clk_get(&pdev->dev, "spiclk");
if (IS_ERR(rs->spiclk)) {
dev_err(&pdev->dev, "Failed to get spi_pclk\n");
ret = PTR_ERR(rs->spiclk);
- goto err_put_master;
+ goto err_put_ctlr;
}
ret = clk_prepare_enable(rs->apb_pclk);
if (ret < 0) {
dev_err(&pdev->dev, "Failed to enable apb_pclk\n");
- goto err_put_master;
+ goto err_put_ctlr;
}
ret = clk_prepare_enable(rs->spiclk);
goto err_disable_spiclk;
ret = devm_request_threaded_irq(&pdev->dev, ret, rockchip_spi_isr, NULL,
- IRQF_ONESHOT, dev_name(&pdev->dev), master);
+ IRQF_ONESHOT, dev_name(&pdev->dev), ctlr);
if (ret)
goto err_disable_spiclk;
pm_runtime_set_active(&pdev->dev);
pm_runtime_enable(&pdev->dev);
- master->auto_runtime_pm = true;
- master->bus_num = pdev->id;
- master->mode_bits = SPI_CPOL | SPI_CPHA | SPI_LOOP | SPI_LSB_FIRST;
- master->num_chipselect = ROCKCHIP_SPI_MAX_CS_NUM;
- master->dev.of_node = pdev->dev.of_node;
- master->bits_per_word_mask = SPI_BPW_MASK(16) | SPI_BPW_MASK(8) | SPI_BPW_MASK(4);
- master->min_speed_hz = rs->freq / BAUDR_SCKDV_MAX;
- master->max_speed_hz = min(rs->freq / BAUDR_SCKDV_MIN, MAX_SCLK_OUT);
-
- master->set_cs = rockchip_spi_set_cs;
- master->transfer_one = rockchip_spi_transfer_one;
- master->max_transfer_size = rockchip_spi_max_transfer_size;
- master->handle_err = rockchip_spi_handle_err;
- master->flags = SPI_MASTER_GPIO_SS;
-
- master->dma_tx = dma_request_chan(rs->dev, "tx");
- if (IS_ERR(master->dma_tx)) {
+ ctlr->auto_runtime_pm = true;
+ ctlr->bus_num = pdev->id;
+ ctlr->mode_bits = SPI_CPOL | SPI_CPHA | SPI_LOOP | SPI_LSB_FIRST;
+ if (slave_mode) {
+ ctlr->mode_bits |= SPI_NO_CS;
+ ctlr->slave_abort = rockchip_spi_slave_abort;
+ } else {
+ ctlr->flags = SPI_MASTER_GPIO_SS;
+ ctlr->max_native_cs = ROCKCHIP_SPI_MAX_CS_NUM;
+ /*
+ * rk spi0 has two native cs, spi1..5 one cs only
+ * if num-cs is missing in the dts, default to 1
+ */
+ if (of_property_read_u16(np, "num-cs", &ctlr->num_chipselect))
+ ctlr->num_chipselect = 1;
+ ctlr->use_gpio_descriptors = true;
+ }
+ ctlr->dev.of_node = pdev->dev.of_node;
+ ctlr->bits_per_word_mask = SPI_BPW_MASK(16) | SPI_BPW_MASK(8) | SPI_BPW_MASK(4);
+ ctlr->min_speed_hz = rs->freq / BAUDR_SCKDV_MAX;
+ ctlr->max_speed_hz = min(rs->freq / BAUDR_SCKDV_MIN, MAX_SCLK_OUT);
+
+ ctlr->set_cs = rockchip_spi_set_cs;
+ ctlr->transfer_one = rockchip_spi_transfer_one;
+ ctlr->max_transfer_size = rockchip_spi_max_transfer_size;
+ ctlr->handle_err = rockchip_spi_handle_err;
+
+ ctlr->dma_tx = dma_request_chan(rs->dev, "tx");
+ if (IS_ERR(ctlr->dma_tx)) {
/* Check tx to see if we need defer probing driver */
- if (PTR_ERR(master->dma_tx) == -EPROBE_DEFER) {
+ if (PTR_ERR(ctlr->dma_tx) == -EPROBE_DEFER) {
ret = -EPROBE_DEFER;
goto err_disable_pm_runtime;
}
dev_warn(rs->dev, "Failed to request TX DMA channel\n");
- master->dma_tx = NULL;
+ ctlr->dma_tx = NULL;
}
- master->dma_rx = dma_request_chan(rs->dev, "rx");
- if (IS_ERR(master->dma_rx)) {
- if (PTR_ERR(master->dma_rx) == -EPROBE_DEFER) {
+ ctlr->dma_rx = dma_request_chan(rs->dev, "rx");
+ if (IS_ERR(ctlr->dma_rx)) {
+ if (PTR_ERR(ctlr->dma_rx) == -EPROBE_DEFER) {
ret = -EPROBE_DEFER;
goto err_free_dma_tx;
}
dev_warn(rs->dev, "Failed to request RX DMA channel\n");
- master->dma_rx = NULL;
+ ctlr->dma_rx = NULL;
}
- if (master->dma_tx && master->dma_rx) {
+ if (ctlr->dma_tx && ctlr->dma_rx) {
rs->dma_addr_tx = mem->start + ROCKCHIP_SPI_TXDR;
rs->dma_addr_rx = mem->start + ROCKCHIP_SPI_RXDR;
- master->can_dma = rockchip_spi_can_dma;
+ ctlr->can_dma = rockchip_spi_can_dma;
}
- ret = devm_spi_register_master(&pdev->dev, master);
+ ret = devm_spi_register_controller(&pdev->dev, ctlr);
if (ret < 0) {
- dev_err(&pdev->dev, "Failed to register master\n");
+ dev_err(&pdev->dev, "Failed to register controller\n");
goto err_free_dma_rx;
}
return 0;
err_free_dma_rx:
- if (master->dma_rx)
- dma_release_channel(master->dma_rx);
+ if (ctlr->dma_rx)
+ dma_release_channel(ctlr->dma_rx);
err_free_dma_tx:
- if (master->dma_tx)
- dma_release_channel(master->dma_tx);
+ if (ctlr->dma_tx)
+ dma_release_channel(ctlr->dma_tx);
err_disable_pm_runtime:
pm_runtime_disable(&pdev->dev);
err_disable_spiclk:
clk_disable_unprepare(rs->spiclk);
err_disable_apbclk:
clk_disable_unprepare(rs->apb_pclk);
-err_put_master:
- spi_master_put(master);
+err_put_ctlr:
+ spi_controller_put(ctlr);
return ret;
}
static int rockchip_spi_remove(struct platform_device *pdev)
{
- struct spi_master *master = spi_master_get(platform_get_drvdata(pdev));
- struct rockchip_spi *rs = spi_master_get_devdata(master);
+ struct spi_controller *ctlr = spi_controller_get(platform_get_drvdata(pdev));
+ struct rockchip_spi *rs = spi_controller_get_devdata(ctlr);
pm_runtime_get_sync(&pdev->dev);
pm_runtime_disable(&pdev->dev);
pm_runtime_set_suspended(&pdev->dev);
- if (master->dma_tx)
- dma_release_channel(master->dma_tx);
- if (master->dma_rx)
- dma_release_channel(master->dma_rx);
+ if (ctlr->dma_tx)
+ dma_release_channel(ctlr->dma_tx);
+ if (ctlr->dma_rx)
+ dma_release_channel(ctlr->dma_rx);
- spi_master_put(master);
+ spi_controller_put(ctlr);
return 0;
}
static int rockchip_spi_suspend(struct device *dev)
{
int ret;
- struct spi_master *master = dev_get_drvdata(dev);
+ struct spi_controller *ctlr = dev_get_drvdata(dev);
- ret = spi_master_suspend(master);
+ ret = spi_controller_suspend(ctlr);
if (ret < 0)
return ret;
static int rockchip_spi_resume(struct device *dev)
{
int ret;
- struct spi_master *master = dev_get_drvdata(dev);
- struct rockchip_spi *rs = spi_master_get_devdata(master);
+ struct spi_controller *ctlr = dev_get_drvdata(dev);
+ struct rockchip_spi *rs = spi_controller_get_devdata(ctlr);
pinctrl_pm_select_default_state(dev);
if (ret < 0)
return ret;
- ret = spi_master_resume(master);
+ ret = spi_controller_resume(ctlr);
if (ret < 0) {
clk_disable_unprepare(rs->spiclk);
clk_disable_unprepare(rs->apb_pclk);
#ifdef CONFIG_PM
static int rockchip_spi_runtime_suspend(struct device *dev)
{
- struct spi_master *master = dev_get_drvdata(dev);
- struct rockchip_spi *rs = spi_master_get_devdata(master);
+ struct spi_controller *ctlr = dev_get_drvdata(dev);
+ struct rockchip_spi *rs = spi_controller_get_devdata(ctlr);
clk_disable_unprepare(rs->spiclk);
clk_disable_unprepare(rs->apb_pclk);
static int rockchip_spi_runtime_resume(struct device *dev)
{
int ret;
- struct spi_master *master = dev_get_drvdata(dev);
- struct rockchip_spi *rs = spi_master_get_devdata(master);
+ struct spi_controller *ctlr = dev_get_drvdata(dev);
+ struct rockchip_spi *rs = spi_controller_get_devdata(ctlr);
ret = clk_prepare_enable(rs->apb_pclk);
if (ret < 0)
module_i2c_driver(sc18is602_driver);
-MODULE_DESCRIPTION("SC18IC602/603 SPI Master Driver");
+MODULE_DESCRIPTION("SC18IS602/603 SPI Master Driver");
MODULE_AUTHOR("Guenter Roeck");
MODULE_LICENSE("GPL");
static SIMPLE_DEV_PM_OPS(sh_msiof_spi_pm_ops, sh_msiof_spi_suspend,
sh_msiof_spi_resume);
-#define DEV_PM_OPS &sh_msiof_spi_pm_ops
+#define DEV_PM_OPS (&sh_msiof_spi_pm_ops)
#else
#define DEV_PM_OPS NULL
#endif /* CONFIG_PM_SLEEP */
static void sprd_adi_set_wdt_rst_mode(struct sprd_adi *sadi)
{
-#ifdef CONFIG_SPRD_WATCHDOG
+#if IS_ENABLED(CONFIG_SPRD_WATCHDOG)
u32 val;
/* Set default watchdog reboot mode */
#include <linux/of.h>
#include <linux/of_device.h>
#include <linux/pinctrl/consumer.h>
+#include <linux/pm_runtime.h>
#include <linux/platform_device.h>
#include <linux/reset.h>
#include <linux/sizes.h>
#define STM32_BUSY_TIMEOUT_US 100000
#define STM32_ABT_TIMEOUT_US 100000
#define STM32_COMP_TIMEOUT_MS 1000
+#define STM32_AUTOSUSPEND_DELAY -1
struct stm32_qspi_flash {
struct stm32_qspi *qspi;
struct stm32_qspi *qspi = spi_controller_get_devdata(mem->spi->master);
int ret;
+ ret = pm_runtime_get_sync(qspi->dev);
+ if (ret < 0)
+ return ret;
+
mutex_lock(&qspi->lock);
ret = stm32_qspi_send(mem, op);
mutex_unlock(&qspi->lock);
+ pm_runtime_mark_last_busy(qspi->dev);
+ pm_runtime_put_autosuspend(qspi->dev);
+
return ret;
}
struct stm32_qspi *qspi = spi_controller_get_devdata(ctrl);
struct stm32_qspi_flash *flash;
u32 presc;
+ int ret;
if (ctrl->busy)
return -EBUSY;
if (!spi->max_speed_hz)
return -EINVAL;
+ ret = pm_runtime_get_sync(qspi->dev);
+ if (ret < 0)
+ return ret;
+
presc = DIV_ROUND_UP(qspi->clk_rate, spi->max_speed_hz) - 1;
flash = &qspi->flash[spi->chip_select];
writel_relaxed(qspi->dcr_reg, qspi->io_base + QSPI_DCR);
mutex_unlock(&qspi->lock);
+ pm_runtime_mark_last_busy(qspi->dev);
+ pm_runtime_put_autosuspend(qspi->dev);
+
return 0;
}
static void stm32_qspi_release(struct stm32_qspi *qspi)
{
+ pm_runtime_get_sync(qspi->dev);
/* disable qspi */
writel_relaxed(0, qspi->io_base + QSPI_CR);
stm32_qspi_dma_free(qspi);
mutex_destroy(&qspi->lock);
+ pm_runtime_put_noidle(qspi->dev);
+ pm_runtime_disable(qspi->dev);
+ pm_runtime_set_suspended(qspi->dev);
+ pm_runtime_dont_use_autosuspend(qspi->dev);
clk_disable_unprepare(qspi->clk);
}
ctrl->num_chipselect = STM32_QSPI_MAX_NORCHIP;
ctrl->dev.of_node = dev->of_node;
+ pm_runtime_set_autosuspend_delay(dev, STM32_AUTOSUSPEND_DELAY);
+ pm_runtime_use_autosuspend(dev);
+ pm_runtime_set_active(dev);
+ pm_runtime_enable(dev);
+ pm_runtime_get_noresume(dev);
+
ret = devm_spi_register_master(dev, ctrl);
- if (!ret)
- return 0;
+ if (ret)
+ goto err_qspi_release;
+
+ pm_runtime_mark_last_busy(dev);
+ pm_runtime_put_autosuspend(dev);
+
+ return 0;
err_qspi_release:
stm32_qspi_release(qspi);
struct stm32_qspi *qspi = platform_get_drvdata(pdev);
stm32_qspi_release(qspi);
+
return 0;
}
-static int __maybe_unused stm32_qspi_suspend(struct device *dev)
+static int __maybe_unused stm32_qspi_runtime_suspend(struct device *dev)
{
struct stm32_qspi *qspi = dev_get_drvdata(dev);
clk_disable_unprepare(qspi->clk);
+
+ return 0;
+}
+
+static int __maybe_unused stm32_qspi_runtime_resume(struct device *dev)
+{
+ struct stm32_qspi *qspi = dev_get_drvdata(dev);
+
+ return clk_prepare_enable(qspi->clk);
+}
+
+static int __maybe_unused stm32_qspi_suspend(struct device *dev)
+{
pinctrl_pm_select_sleep_state(dev);
return 0;
writel_relaxed(qspi->cr_reg, qspi->io_base + QSPI_CR);
writel_relaxed(qspi->dcr_reg, qspi->io_base + QSPI_DCR);
+ pm_runtime_mark_last_busy(qspi->dev);
+ pm_runtime_put_autosuspend(qspi->dev);
+
return 0;
}
-static SIMPLE_DEV_PM_OPS(stm32_qspi_pm_ops, stm32_qspi_suspend, stm32_qspi_resume);
+static const struct dev_pm_ops stm32_qspi_pm_ops = {
+ SET_RUNTIME_PM_OPS(stm32_qspi_runtime_suspend,
+ stm32_qspi_runtime_resume, NULL)
+ SET_SYSTEM_SLEEP_PM_OPS(stm32_qspi_suspend, stm32_qspi_resume)
+};
static const struct of_device_id stm32_qspi_match[] = {
{.compatible = "st,stm32f469-qspi"},
mask |= STM32F4_SPI_SR_TXE;
}
- if (!spi->cur_usedma && spi->cur_comm == SPI_FULL_DUPLEX) {
+ if (!spi->cur_usedma && (spi->cur_comm == SPI_FULL_DUPLEX ||
+ spi->cur_comm == SPI_SIMPLEX_RX ||
+ spi->cur_comm == SPI_3WIRE_RX)) {
/* TXE flag is set and is handled when RXNE flag occurs */
sr &= ~STM32F4_SPI_SR_TXE;
mask |= STM32F4_SPI_SR_RXNE | STM32F4_SPI_SR_OVR;
stm32f4_spi_read_rx(spi);
if (spi->rx_len == 0)
end = true;
- else /* Load data for discontinuous mode */
+ else if (spi->tx_buf)/* Load data for discontinuous mode */
stm32f4_spi_write_tx(spi);
}
/* Enable the interrupts relative to the current communication mode */
if (spi->cur_comm == SPI_SIMPLEX_TX || spi->cur_comm == SPI_3WIRE_TX) {
cr2 |= STM32F4_SPI_CR2_TXEIE;
- } else if (spi->cur_comm == SPI_FULL_DUPLEX) {
+ } else if (spi->cur_comm == SPI_FULL_DUPLEX ||
+ spi->cur_comm == SPI_SIMPLEX_RX ||
+ spi->cur_comm == SPI_3WIRE_RX) {
/* In transmit-only mode, the OVR flag is set in the SR register
* since the received data are never read. Therefore set OVR
* interrupt only when rx buffer is available.
stm32_spi_set_bits(spi, STM32F4_SPI_CR1,
STM32F4_SPI_CR1_BIDIMODE |
STM32F4_SPI_CR1_BIDIOE);
- } else if (comm_type == SPI_FULL_DUPLEX) {
+ } else if (comm_type == SPI_FULL_DUPLEX ||
+ comm_type == SPI_SIMPLEX_RX) {
stm32_spi_clr_bits(spi, STM32F4_SPI_CR1,
STM32F4_SPI_CR1_BIDIMODE |
STM32F4_SPI_CR1_BIDIOE);
+ } else if (comm_type == SPI_3WIRE_RX) {
+ stm32_spi_set_bits(spi, STM32F4_SPI_CR1,
+ STM32F4_SPI_CR1_BIDIMODE);
+ stm32_spi_clr_bits(spi, STM32F4_SPI_CR1,
+ STM32F4_SPI_CR1_BIDIOE);
} else {
return -EINVAL;
}
master->prepare_message = stm32_spi_prepare_msg;
master->transfer_one = stm32_spi_transfer_one;
master->unprepare_message = stm32_spi_unprepare_msg;
+ master->flags = SPI_MASTER_MUST_TX;
spi->dma_tx = dma_request_chan(spi->dev, "tx");
if (IS_ERR(spi->dma_tx)) {
master->max_speed_hz = 100 * 1000 * 1000;
master->min_speed_hz = 3 * 1000;
+ master->use_gpio_descriptors = true;
master->set_cs = sun6i_spi_set_cs;
master->transfer_one = sun6i_spi_transfer_one;
master->num_chipselect = 4;
ret = pm_runtime_get_sync(&pdev->dev);
if (ret < 0) {
dev_err(&pdev->dev, "pm runtime get failed, e = %d\n", ret);
+ pm_runtime_put_noidle(&pdev->dev);
goto exit_pm_disable;
}
ret = pm_runtime_get_sync(&pdev->dev);
if (ret < 0) {
dev_err(&pdev->dev, "pm runtime get failed, e = %d\n", ret);
+ pm_runtime_put_noidle(&pdev->dev);
goto exit_pm_disable;
}
ret = pm_runtime_get_sync(&pdev->dev);
if (ret < 0) {
dev_err(&pdev->dev, "pm runtime get failed, e = %d\n", ret);
+ pm_runtime_put_noidle(&pdev->dev);
goto exit_pm_disable;
}
tspi->def_command_reg = SLINK_M_S;
priv->master = master;
priv->is_save_param = false;
- res = platform_get_resource(pdev, IORESOURCE_MEM, 0);
- priv->base = devm_ioremap_resource(&pdev->dev, res);
+ priv->base = devm_platform_get_and_ioremap_resource(pdev, 0, &res);
if (IS_ERR(priv->base)) {
ret = PTR_ERR(priv->base);
goto out_master_put;
master->dma_tx = dma_request_chan(&pdev->dev, "tx");
if (IS_ERR_OR_NULL(master->dma_tx)) {
- if (PTR_ERR(master->dma_tx) == -EPROBE_DEFER)
+ if (PTR_ERR(master->dma_tx) == -EPROBE_DEFER) {
+ ret = -EPROBE_DEFER;
goto out_disable_clk;
+ }
master->dma_tx = NULL;
dma_tx_burst = INT_MAX;
} else {
master->dma_rx = dma_request_chan(&pdev->dev, "rx");
if (IS_ERR_OR_NULL(master->dma_rx)) {
- if (PTR_ERR(master->dma_rx) == -EPROBE_DEFER)
+ if (PTR_ERR(master->dma_rx) == -EPROBE_DEFER) {
+ ret = -EPROBE_DEFER;
goto out_disable_clk;
+ }
master->dma_rx = NULL;
dma_rx_burst = INT_MAX;
} else {
void *tmp;
unsigned int max_tx, max_rx;
- if (ctlr->flags & (SPI_CONTROLLER_MUST_RX | SPI_CONTROLLER_MUST_TX)) {
+ if ((ctlr->flags & (SPI_CONTROLLER_MUST_RX | SPI_CONTROLLER_MUST_TX))
+ && !(msg->spi->mode & SPI_3WIRE)) {
max_tx = 0;
max_rx = 0;
{
struct spi_statistics *statm = &ctlr->statistics;
struct spi_statistics *stats = &msg->spi->statistics;
- unsigned long long ms = 1;
+ unsigned long long ms;
if (spi_controller_is_slave(ctlr)) {
if (wait_for_completion_interruptible(&ctlr->xfer_completion)) {
{
int delay;
+ might_sleep();
+
if (!_delay)
return -EINVAL;
}
lookup->max_speed_hz = sb->connection_speed;
+ lookup->bits_per_word = sb->data_bit_length;
if (sb->clock_phase == ACPI_SPI_SECOND_PHASE)
lookup->mode |= SPI_CPHA;
struct spi_controller *found;
int id = ctlr->bus_num;
+ device_for_each_child(&ctlr->dev, NULL, __unregister);
+
/* First make sure that this controller was ever added */
mutex_lock(&board_lock);
found = idr_find(&spi_master_idr, id);
list_del(&ctlr->list);
mutex_unlock(&board_lock);
- device_for_each_child(&ctlr->dev, NULL, __unregister);
device_unregister(&ctlr->dev);
/* free bus id */
mutex_lock(&board_lock);
* is zero for success, else a negative errno status code.
* This call may only be used from a context that may sleep.
*
- * Parameters to this routine are always copied using a small buffer;
- * portable code should never use this for more than 32 bytes.
+ * Parameters to this routine are always copied using a small buffer.
* Performance-sensitive or bulk transfer code should instead use
* spi_{async,sync}() calls with dma-safe buffers.
*
#define SPI_MODE_MASK (SPI_CPHA | SPI_CPOL | SPI_CS_HIGH \
| SPI_LSB_FIRST | SPI_3WIRE | SPI_LOOP \
| SPI_NO_CS | SPI_READY | SPI_TX_DUAL \
- | SPI_TX_QUAD | SPI_RX_DUAL | SPI_RX_QUAD)
+ | SPI_TX_QUAD | SPI_TX_OCTAL | SPI_RX_DUAL \
+ | SPI_RX_QUAD | SPI_RX_OCTAL)
struct spidev_data {
dev_t devt;
}
if (C_CRTSCTS(tty) && C_BAUD(tty) != B0)
- newline.flow_control |= GB_SERIAL_AUTO_RTSCTS_EN;
+ newline.flow_control = GB_SERIAL_AUTO_RTSCTS_EN;
else
- newline.flow_control &= ~GB_SERIAL_AUTO_RTSCTS_EN;
+ newline.flow_control = 0;
if (memcmp(&gb_tty->line_coding, &newline, sizeof(newline))) {
memcpy(&gb_tty->line_coding, &newline, sizeof(newline));
static int ad2s1210_config_read(struct ad2s1210_state *st,
unsigned char address)
{
- struct spi_transfer xfer = {
- .len = 2,
- .rx_buf = st->rx,
- .tx_buf = st->tx,
+ struct spi_transfer xfers[] = {
+ {
+ .len = 1,
+ .rx_buf = &st->rx[0],
+ .tx_buf = &st->tx[0],
+ .cs_change = 1,
+ }, {
+ .len = 1,
+ .rx_buf = &st->rx[1],
+ .tx_buf = &st->tx[1],
+ },
};
int ret = 0;
ad2s1210_set_mode(MOD_CONFIG, st);
st->tx[0] = address | AD2S1210_MSB_IS_HIGH;
st->tx[1] = AD2S1210_REG_FAULT;
- ret = spi_sync_transfer(st->sdev, &xfer, 1);
+ ret = spi_sync_transfer(st->sdev, xfers, 2);
if (ret < 0)
return ret;
{
int err = 0;
struct kp2000_device *pcard;
- int rv;
unsigned long reg_bar_phys_addr;
unsigned long reg_bar_phys_len;
unsigned long dma_bar_phys_addr;
if (err < 0)
goto err_release_dma;
- rv = request_irq(pcard->pdev->irq, kp2000_irq_handler, IRQF_SHARED,
- pcard->name, pcard);
- if (rv) {
+ err = request_irq(pcard->pdev->irq, kp2000_irq_handler, IRQF_SHARED,
+ pcard->name, pcard);
+ if (err) {
dev_err(&pcard->pdev->dev,
- "%s: failed to request_irq: %d\n", __func__, rv);
+ "%s: failed to request_irq: %d\n", __func__, err);
goto err_disable_msi;
}
wvif->scan_abort = false;
reinit_completion(&wvif->scan_complete);
timeout = hif_scan(wvif, req, start_idx, i - start_idx);
- if (timeout < 0)
+ if (timeout < 0) {
+ wfx_tx_unlock(wvif->wdev);
return timeout;
+ }
ret = wait_for_completion_timeout(&wvif->scan_complete, timeout);
if (req->channels[start_idx]->max_power != wvif->vif->bss_conf.txpower)
hif_set_output_power(wvif, wvif->vif->bss_conf.txpower);
cmd->se_tfo->queue_tm_rsp(cmd);
+ transport_lun_remove_cmd(cmd);
transport_cmd_check_stop_to_fabric(cmd);
return;
static void __ssp_add_console_port(struct sifive_serial_port *ssp)
{
+ spin_lock_init(&ssp->port.lock);
sifive_serial_console_ports[ssp->port.line] = ssp;
}
* @ptr: address of device controller register to be read and changed
* @mask: bits requested to clar
*/
-void cdns3_clear_register_bit(void __iomem *ptr, u32 mask)
+static void cdns3_clear_register_bit(void __iomem *ptr, u32 mask)
{
mask = readl(ptr) & ~mask;
writel(mask, ptr);
*
* Returns buffer or NULL if no buffers in list
*/
-struct cdns3_aligned_buf *cdns3_next_align_buf(struct list_head *list)
+static struct cdns3_aligned_buf *cdns3_next_align_buf(struct list_head *list)
{
return list_first_entry_or_null(list, struct cdns3_aligned_buf, list);
}
*
* Returns request or NULL if no requests in list
*/
-struct cdns3_request *cdns3_next_priv_request(struct list_head *list)
+static struct cdns3_request *cdns3_next_priv_request(struct list_head *list)
{
return list_first_entry_or_null(list, struct cdns3_request, list);
}
return priv_ep->trb_pool_dma + offset;
}
-int cdns3_ring_size(struct cdns3_endpoint *priv_ep)
+static int cdns3_ring_size(struct cdns3_endpoint *priv_ep)
{
switch (priv_ep->type) {
case USB_ENDPOINT_XFER_ISOC:
cdns3_ep_inc_trb(&priv_ep->dequeue, &priv_ep->ccs, priv_ep->num_trbs);
}
-void cdns3_move_deq_to_next_trb(struct cdns3_request *priv_req)
+static void cdns3_move_deq_to_next_trb(struct cdns3_request *priv_req)
{
struct cdns3_endpoint *priv_ep = priv_req->priv_ep;
int current_trb = priv_req->start_trb;
}
}
-struct usb_request *cdns3_wa2_gadget_giveback(struct cdns3_device *priv_dev,
+static struct usb_request *cdns3_wa2_gadget_giveback(struct cdns3_device *priv_dev,
struct cdns3_endpoint *priv_ep,
struct cdns3_request *priv_req)
{
return &priv_req->request;
}
-int cdns3_wa2_gadget_ep_queue(struct cdns3_device *priv_dev,
+static int cdns3_wa2_gadget_ep_queue(struct cdns3_device *priv_dev,
struct cdns3_endpoint *priv_ep,
struct cdns3_request *priv_req)
{
cdns3_gadget_ep_free_request(&priv_ep->endpoint, request);
}
-void cdns3_wa1_restore_cycle_bit(struct cdns3_endpoint *priv_ep)
+static void cdns3_wa1_restore_cycle_bit(struct cdns3_endpoint *priv_ep)
{
/* Work around for stale data address in TRB*/
if (priv_ep->wa1_set) {
return 0;
}
-void cdns3_stream_ep_reconfig(struct cdns3_device *priv_dev,
+static void cdns3_stream_ep_reconfig(struct cdns3_device *priv_dev,
struct cdns3_endpoint *priv_ep)
{
if (!priv_ep->use_streams || priv_dev->gadget.speed < USB_SPEED_SUPER)
EP_CFG_TDL_CHK | EP_CFG_SID_CHK);
}
-void cdns3_configure_dmult(struct cdns3_device *priv_dev,
+static void cdns3_configure_dmult(struct cdns3_device *priv_dev,
struct cdns3_endpoint *priv_ep)
{
struct cdns3_usb_regs __iomem *regs = priv_dev->regs;
link_trb = priv_req->trb;
/* Update ring only if removed request is on pending_req_list list */
- if (req_on_hw_ring) {
+ if (req_on_hw_ring && link_trb) {
link_trb->buffer = TRB_BUFFER(priv_ep->trb_pool_dma +
((priv_req->end_trb + 1) * TRB_SIZE));
link_trb->control = (link_trb->control & TRB_CYCLE) |
usbm->vma_use_count = 1;
INIT_LIST_HEAD(&usbm->memlist);
- if (dma_mmap_coherent(hcd->self.sysdev, vma, mem, dma_handle, size)) {
- dec_usb_memory_use_count(usbm, &usbm->vma_use_count);
- return -EAGAIN;
+ if (hcd->localmem_pool || !hcd_uses_dma(hcd)) {
+ if (remap_pfn_range(vma, vma->vm_start,
+ virt_to_phys(usbm->mem) >> PAGE_SHIFT,
+ size, vma->vm_page_prot) < 0) {
+ dec_usb_memory_use_count(usbm, &usbm->vma_use_count);
+ return -EAGAIN;
+ }
+ } else {
+ if (dma_mmap_coherent(hcd->self.sysdev, vma, mem, dma_handle,
+ size)) {
+ dec_usb_memory_use_count(usbm, &usbm->vma_use_count);
+ return -EAGAIN;
+ }
}
vma->vm_flags |= VM_IO;
#define USB_VENDOR_GENESYS_LOGIC 0x05e3
#define USB_VENDOR_SMSC 0x0424
+#define USB_PRODUCT_USB5534B 0x5534
#define HUB_QUIRK_CHECK_PORT_AUTOSUSPEND 0x01
#define HUB_QUIRK_DISABLE_AUTOSUSPEND 0x02
}
static const struct usb_device_id hub_id_table[] = {
- { .match_flags = USB_DEVICE_ID_MATCH_VENDOR | USB_DEVICE_ID_MATCH_INT_CLASS,
+ { .match_flags = USB_DEVICE_ID_MATCH_VENDOR
+ | USB_DEVICE_ID_MATCH_PRODUCT
+ | USB_DEVICE_ID_MATCH_INT_CLASS,
.idVendor = USB_VENDOR_SMSC,
+ .idProduct = USB_PRODUCT_USB5534B,
.bInterfaceClass = USB_CLASS_HUB,
.driver_info = HUB_QUIRK_DISABLE_AUTOSUSPEND},
{ .match_flags = USB_DEVICE_ID_MATCH_VENDOR
tristate "DesignWare USB3 DRD Core Support"
depends on (USB || USB_GADGET) && HAS_DMA
select USB_XHCI_PLATFORM if USB_XHCI_HCD
+ select USB_ROLE_SWITCH if USB_DWC3_DUAL_ROLE
help
Say Y or M here if your system has a Dual Role SuperSpeed
USB controller based on the DesignWare USB3 IP Core.
static const struct property_entry dwc3_pci_mrfld_properties[] = {
PROPERTY_ENTRY_STRING("dr_mode", "otg"),
+ PROPERTY_ENTRY_STRING("linux,extcon-name", "mrfld_bcove_pwrsrc"),
PROPERTY_ENTRY_BOOL("linux,sysdev_is_parent"),
{}
};
for_each_sg(sg, s, pending, i) {
trb = &dep->trb_pool[dep->trb_dequeue];
- if (trb->ctrl & DWC3_TRB_CTRL_HWO)
- break;
-
req->sg = sg_next(s);
req->num_pending_sgs--;
char *name;
int ret;
+ if (strlen(page) < len)
+ return -EOVERFLOW;
+
name = kstrdup(page, GFP_KERNEL);
if (!name)
return -ENOMEM;
struct usb_descriptor_header *usb_desc;
usb_desc = usb_otg_descriptor_alloc(cdev->gadget);
- if (!usb_desc)
+ if (!usb_desc) {
+ status = -ENOMEM;
goto fail;
+ }
usb_otg_descriptor_init(cdev->gadget, usb_desc);
otg_desc[0] = usb_desc;
otg_desc[1] = NULL;
struct usb_descriptor_header *usb_desc;
usb_desc = usb_otg_descriptor_alloc(gadget);
- if (!usb_desc)
+ if (!usb_desc) {
+ status = -ENOMEM;
goto fail1;
+ }
usb_otg_descriptor_init(gadget, usb_desc);
otg_desc[0] = usb_desc;
otg_desc[1] = NULL;
req->buf = dev->rbuf;
req->context = NULL;
- value = -EOPNOTSUPP;
switch (ctrl->bRequest) {
case USB_REQ_GET_DESCRIPTOR:
dev_config (struct file *fd, const char __user *buf, size_t len, loff_t *ptr)
{
struct dev_data *dev = fd->private_data;
- ssize_t value = len, length = len;
+ ssize_t value, length = len;
unsigned total;
u32 tag;
char *kbuf;
struct usb_descriptor_header *usb_desc;
usb_desc = usb_otg_descriptor_alloc(gadget);
- if (!usb_desc)
+ if (!usb_desc) {
+ status = -ENOMEM;
goto fail;
+ }
usb_otg_descriptor_init(gadget, usb_desc);
otg_desc[0] = usb_desc;
otg_desc[1] = NULL;
*/
#include <linux/compiler.h>
+#include <linux/ctype.h>
#include <linux/debugfs.h>
#include <linux/delay.h>
#include <linux/kref.h>
struct raw_dev;
-#define USB_RAW_MAX_ENDPOINTS 32
-
enum ep_state {
STATE_EP_DISABLED,
STATE_EP_ENABLED,
struct raw_dev *dev;
enum ep_state state;
struct usb_ep *ep;
+ u8 addr;
struct usb_request *req;
bool urb_queued;
bool disabling;
bool ep0_out_pending;
bool ep0_urb_queued;
ssize_t ep0_status;
- struct raw_ep eps[USB_RAW_MAX_ENDPOINTS];
+ struct raw_ep eps[USB_RAW_EPS_NUM_MAX];
+ int eps_num;
struct completion ep0_done;
struct raw_event_queue queue;
usb_ep_free_request(dev->gadget->ep0, dev->req);
}
raw_event_queue_destroy(&dev->queue);
- for (i = 0; i < USB_RAW_MAX_ENDPOINTS; i++) {
- if (dev->eps[i].state != STATE_EP_ENABLED)
+ for (i = 0; i < dev->eps_num; i++) {
+ if (dev->eps[i].state == STATE_EP_DISABLED)
continue;
usb_ep_disable(dev->eps[i].ep);
usb_ep_free_request(dev->eps[i].ep, dev->eps[i].req);
complete(&dev->ep0_done);
}
+static u8 get_ep_addr(const char *name)
+{
+ /* If the endpoint has fixed function (named as e.g. "ep12out-bulk"),
+ * parse the endpoint address from its name. We deliberately use
+ * deprecated simple_strtoul() function here, as the number isn't
+ * followed by '\0' nor '\n'.
+ */
+ if (isdigit(name[2]))
+ return simple_strtoul(&name[2], NULL, 10);
+ /* Otherwise the endpoint is configurable (named as e.g. "ep-a"). */
+ return USB_RAW_EP_ADDR_ANY;
+}
+
static int gadget_bind(struct usb_gadget *gadget,
struct usb_gadget_driver *driver)
{
- int ret = 0;
+ int ret = 0, i = 0;
struct raw_dev *dev = container_of(driver, struct raw_dev, driver);
struct usb_request *req;
+ struct usb_ep *ep;
unsigned long flags;
if (strcmp(gadget->name, dev->udc_name) != 0)
dev->req->context = dev;
dev->req->complete = gadget_ep0_complete;
dev->gadget = gadget;
+ gadget_for_each_ep(ep, dev->gadget) {
+ dev->eps[i].ep = ep;
+ dev->eps[i].addr = get_ep_addr(ep->name);
+ dev->eps[i].state = STATE_EP_DISABLED;
+ i++;
+ }
+ dev->eps_num = i;
spin_unlock_irqrestore(&dev->lock, flags);
/* Matches kref_put() in gadget_unbind(). */
if (copy_from_user(io, ptr, sizeof(*io)))
return ERR_PTR(-EFAULT);
- if (io->ep >= USB_RAW_MAX_ENDPOINTS)
+ if (io->ep >= USB_RAW_EPS_NUM_MAX)
return ERR_PTR(-EINVAL);
if (!usb_raw_io_flags_valid(io->flags))
return ERR_PTR(-EINVAL);
if (IS_ERR(data))
return PTR_ERR(data);
ret = raw_process_ep0_io(dev, &io, data, false);
- if (ret)
+ if (ret < 0)
goto free;
length = min(io.length, (unsigned int)ret);
if (copy_to_user((void __user *)(value + sizeof(io)), data, length))
ret = -EFAULT;
+ else
+ ret = length;
free:
kfree(data);
return ret;
}
-static bool check_ep_caps(struct usb_ep *ep,
- struct usb_endpoint_descriptor *desc)
+static int raw_ioctl_ep0_stall(struct raw_dev *dev, unsigned long value)
{
- switch (usb_endpoint_type(desc)) {
- case USB_ENDPOINT_XFER_ISOC:
- if (!ep->caps.type_iso)
- return false;
- break;
- case USB_ENDPOINT_XFER_BULK:
- if (!ep->caps.type_bulk)
- return false;
- break;
- case USB_ENDPOINT_XFER_INT:
- if (!ep->caps.type_int)
- return false;
- break;
- default:
- return false;
+ int ret = 0;
+ unsigned long flags;
+
+ if (value)
+ return -EINVAL;
+ spin_lock_irqsave(&dev->lock, flags);
+ if (dev->state != STATE_DEV_RUNNING) {
+ dev_dbg(dev->dev, "fail, device is not running\n");
+ ret = -EINVAL;
+ goto out_unlock;
+ }
+ if (!dev->gadget) {
+ dev_dbg(dev->dev, "fail, gadget is not bound\n");
+ ret = -EBUSY;
+ goto out_unlock;
+ }
+ if (dev->ep0_urb_queued) {
+ dev_dbg(&dev->gadget->dev, "fail, urb already queued\n");
+ ret = -EBUSY;
+ goto out_unlock;
+ }
+ if (!dev->ep0_in_pending && !dev->ep0_out_pending) {
+ dev_dbg(&dev->gadget->dev, "fail, no request pending\n");
+ ret = -EBUSY;
+ goto out_unlock;
}
- if (usb_endpoint_dir_in(desc) && !ep->caps.dir_in)
- return false;
- if (usb_endpoint_dir_out(desc) && !ep->caps.dir_out)
- return false;
+ ret = usb_ep_set_halt(dev->gadget->ep0);
+ if (ret < 0)
+ dev_err(&dev->gadget->dev,
+ "fail, usb_ep_set_halt returned %d\n", ret);
+
+ if (dev->ep0_in_pending)
+ dev->ep0_in_pending = false;
+ else
+ dev->ep0_out_pending = false;
- return true;
+out_unlock:
+ spin_unlock_irqrestore(&dev->lock, flags);
+ return ret;
}
static int raw_ioctl_ep_enable(struct raw_dev *dev, unsigned long value)
int ret = 0, i;
unsigned long flags;
struct usb_endpoint_descriptor *desc;
- struct usb_ep *ep = NULL;
+ struct raw_ep *ep;
desc = memdup_user((void __user *)value, sizeof(*desc));
if (IS_ERR(desc))
goto out_free;
}
- for (i = 0; i < USB_RAW_MAX_ENDPOINTS; i++) {
- if (dev->eps[i].state == STATE_EP_ENABLED)
+ for (i = 0; i < dev->eps_num; i++) {
+ ep = &dev->eps[i];
+ if (ep->state != STATE_EP_DISABLED)
continue;
- break;
- }
- if (i == USB_RAW_MAX_ENDPOINTS) {
- dev_dbg(&dev->gadget->dev,
- "fail, no device endpoints available\n");
- ret = -EBUSY;
- goto out_free;
- }
-
- gadget_for_each_ep(ep, dev->gadget) {
- if (ep->enabled)
+ if (ep->addr != usb_endpoint_num(desc) &&
+ ep->addr != USB_RAW_EP_ADDR_ANY)
continue;
- if (!check_ep_caps(ep, desc))
+ if (!usb_gadget_ep_match_desc(dev->gadget, ep->ep, desc, NULL))
continue;
- ep->desc = desc;
- ret = usb_ep_enable(ep);
+ ep->ep->desc = desc;
+ ret = usb_ep_enable(ep->ep);
if (ret < 0) {
dev_err(&dev->gadget->dev,
"fail, usb_ep_enable returned %d\n", ret);
goto out_free;
}
- dev->eps[i].req = usb_ep_alloc_request(ep, GFP_ATOMIC);
- if (!dev->eps[i].req) {
+ ep->req = usb_ep_alloc_request(ep->ep, GFP_ATOMIC);
+ if (!ep->req) {
dev_err(&dev->gadget->dev,
"fail, usb_ep_alloc_request failed\n");
- usb_ep_disable(ep);
+ usb_ep_disable(ep->ep);
ret = -ENOMEM;
goto out_free;
}
- dev->eps[i].ep = ep;
- dev->eps[i].state = STATE_EP_ENABLED;
- ep->driver_data = &dev->eps[i];
+ ep->state = STATE_EP_ENABLED;
+ ep->ep->driver_data = ep;
ret = i;
goto out_unlock;
}
{
int ret = 0, i = value;
unsigned long flags;
- const void *desc;
-
- if (i < 0 || i >= USB_RAW_MAX_ENDPOINTS)
- return -EINVAL;
spin_lock_irqsave(&dev->lock, flags);
if (dev->state != STATE_DEV_RUNNING) {
ret = -EBUSY;
goto out_unlock;
}
- if (dev->eps[i].state != STATE_EP_ENABLED) {
+ if (i < 0 || i >= dev->eps_num) {
+ dev_dbg(dev->dev, "fail, invalid endpoint\n");
+ ret = -EBUSY;
+ goto out_unlock;
+ }
+ if (dev->eps[i].state == STATE_EP_DISABLED) {
dev_dbg(&dev->gadget->dev, "fail, endpoint is not enabled\n");
ret = -EINVAL;
goto out_unlock;
spin_lock_irqsave(&dev->lock, flags);
usb_ep_free_request(dev->eps[i].ep, dev->eps[i].req);
- desc = dev->eps[i].ep->desc;
- dev->eps[i].ep = NULL;
+ kfree(dev->eps[i].ep->desc);
dev->eps[i].state = STATE_EP_DISABLED;
- kfree(desc);
dev->eps[i].disabling = false;
out_unlock:
return ret;
}
+static int raw_ioctl_ep_set_clear_halt_wedge(struct raw_dev *dev,
+ unsigned long value, bool set, bool halt)
+{
+ int ret = 0, i = value;
+ unsigned long flags;
+
+ spin_lock_irqsave(&dev->lock, flags);
+ if (dev->state != STATE_DEV_RUNNING) {
+ dev_dbg(dev->dev, "fail, device is not running\n");
+ ret = -EINVAL;
+ goto out_unlock;
+ }
+ if (!dev->gadget) {
+ dev_dbg(dev->dev, "fail, gadget is not bound\n");
+ ret = -EBUSY;
+ goto out_unlock;
+ }
+ if (i < 0 || i >= dev->eps_num) {
+ dev_dbg(dev->dev, "fail, invalid endpoint\n");
+ ret = -EBUSY;
+ goto out_unlock;
+ }
+ if (dev->eps[i].state == STATE_EP_DISABLED) {
+ dev_dbg(&dev->gadget->dev, "fail, endpoint is not enabled\n");
+ ret = -EINVAL;
+ goto out_unlock;
+ }
+ if (dev->eps[i].disabling) {
+ dev_dbg(&dev->gadget->dev,
+ "fail, disable is in progress\n");
+ ret = -EINVAL;
+ goto out_unlock;
+ }
+ if (dev->eps[i].urb_queued) {
+ dev_dbg(&dev->gadget->dev,
+ "fail, waiting for urb completion\n");
+ ret = -EINVAL;
+ goto out_unlock;
+ }
+ if (usb_endpoint_xfer_isoc(dev->eps[i].ep->desc)) {
+ dev_dbg(&dev->gadget->dev,
+ "fail, can't halt/wedge ISO endpoint\n");
+ ret = -EINVAL;
+ goto out_unlock;
+ }
+
+ if (set && halt) {
+ ret = usb_ep_set_halt(dev->eps[i].ep);
+ if (ret < 0)
+ dev_err(&dev->gadget->dev,
+ "fail, usb_ep_set_halt returned %d\n", ret);
+ } else if (!set && halt) {
+ ret = usb_ep_clear_halt(dev->eps[i].ep);
+ if (ret < 0)
+ dev_err(&dev->gadget->dev,
+ "fail, usb_ep_clear_halt returned %d\n", ret);
+ } else if (set && !halt) {
+ ret = usb_ep_set_wedge(dev->eps[i].ep);
+ if (ret < 0)
+ dev_err(&dev->gadget->dev,
+ "fail, usb_ep_set_wedge returned %d\n", ret);
+ }
+
+out_unlock:
+ spin_unlock_irqrestore(&dev->lock, flags);
+ return ret;
+}
+
static void gadget_ep_complete(struct usb_ep *ep, struct usb_request *req)
{
struct raw_ep *r_ep = (struct raw_ep *)ep->driver_data;
{
int ret = 0;
unsigned long flags;
- struct raw_ep *ep = &dev->eps[io->ep];
+ struct raw_ep *ep;
DECLARE_COMPLETION_ONSTACK(done);
spin_lock_irqsave(&dev->lock, flags);
ret = -EBUSY;
goto out_unlock;
}
+ if (io->ep >= dev->eps_num) {
+ dev_dbg(&dev->gadget->dev, "fail, invalid endpoint\n");
+ ret = -EINVAL;
+ goto out_unlock;
+ }
+ ep = &dev->eps[io->ep];
if (ep->state != STATE_EP_ENABLED) {
dev_dbg(&dev->gadget->dev, "fail, endpoint is not enabled\n");
ret = -EBUSY;
if (IS_ERR(data))
return PTR_ERR(data);
ret = raw_process_ep_io(dev, &io, data, false);
- if (ret)
+ if (ret < 0)
goto free;
length = min(io.length, (unsigned int)ret);
if (copy_to_user((void __user *)(value + sizeof(io)), data, length))
ret = -EFAULT;
+ else
+ ret = length;
free:
kfree(data);
return ret;
return ret;
}
+static void fill_ep_caps(struct usb_ep_caps *caps,
+ struct usb_raw_ep_caps *raw_caps)
+{
+ raw_caps->type_control = caps->type_control;
+ raw_caps->type_iso = caps->type_iso;
+ raw_caps->type_bulk = caps->type_bulk;
+ raw_caps->type_int = caps->type_int;
+ raw_caps->dir_in = caps->dir_in;
+ raw_caps->dir_out = caps->dir_out;
+}
+
+static void fill_ep_limits(struct usb_ep *ep, struct usb_raw_ep_limits *limits)
+{
+ limits->maxpacket_limit = ep->maxpacket_limit;
+ limits->max_streams = ep->max_streams;
+}
+
+static int raw_ioctl_eps_info(struct raw_dev *dev, unsigned long value)
+{
+ int ret = 0, i;
+ unsigned long flags;
+ struct usb_raw_eps_info *info;
+ struct raw_ep *ep;
+
+ info = kmalloc(sizeof(*info), GFP_KERNEL);
+ if (!info) {
+ ret = -ENOMEM;
+ goto out;
+ }
+
+ spin_lock_irqsave(&dev->lock, flags);
+ if (dev->state != STATE_DEV_RUNNING) {
+ dev_dbg(dev->dev, "fail, device is not running\n");
+ ret = -EINVAL;
+ spin_unlock_irqrestore(&dev->lock, flags);
+ goto out_free;
+ }
+ if (!dev->gadget) {
+ dev_dbg(dev->dev, "fail, gadget is not bound\n");
+ ret = -EBUSY;
+ spin_unlock_irqrestore(&dev->lock, flags);
+ goto out_free;
+ }
+
+ memset(info, 0, sizeof(*info));
+ for (i = 0; i < dev->eps_num; i++) {
+ ep = &dev->eps[i];
+ strscpy(&info->eps[i].name[0], ep->ep->name,
+ USB_RAW_EP_NAME_MAX);
+ info->eps[i].addr = ep->addr;
+ fill_ep_caps(&ep->ep->caps, &info->eps[i].caps);
+ fill_ep_limits(ep->ep, &info->eps[i].limits);
+ }
+ ret = dev->eps_num;
+ spin_unlock_irqrestore(&dev->lock, flags);
+
+ if (copy_to_user((void __user *)value, info, sizeof(*info)))
+ ret = -EFAULT;
+
+out_free:
+ kfree(info);
+out:
+ return ret;
+}
+
static long raw_ioctl(struct file *fd, unsigned int cmd, unsigned long value)
{
struct raw_dev *dev = fd->private_data;
case USB_RAW_IOCTL_VBUS_DRAW:
ret = raw_ioctl_vbus_draw(dev, value);
break;
+ case USB_RAW_IOCTL_EPS_INFO:
+ ret = raw_ioctl_eps_info(dev, value);
+ break;
+ case USB_RAW_IOCTL_EP0_STALL:
+ ret = raw_ioctl_ep0_stall(dev, value);
+ break;
+ case USB_RAW_IOCTL_EP_SET_HALT:
+ ret = raw_ioctl_ep_set_clear_halt_wedge(
+ dev, value, true, true);
+ break;
+ case USB_RAW_IOCTL_EP_CLEAR_HALT:
+ ret = raw_ioctl_ep_set_clear_halt_wedge(
+ dev, value, false, true);
+ break;
+ case USB_RAW_IOCTL_EP_SET_WEDGE:
+ ret = raw_ioctl_ep_set_clear_halt_wedge(
+ dev, value, true, false);
+ break;
default:
ret = -EINVAL;
}
return 0;
}
-const struct file_operations queue_dbg_fops = {
+static const struct file_operations queue_dbg_fops = {
.owner = THIS_MODULE,
.open = queue_dbg_open,
.llseek = no_llseek,
.release = queue_dbg_release,
};
-const struct file_operations regs_dbg_fops = {
+static const struct file_operations regs_dbg_fops = {
.owner = THIS_MODULE,
.open = regs_dbg_open,
.llseek = generic_file_llseek,
err_req:
release_mem_region(base, len);
err:
+ kfree(dev);
+
return ret;
}
flush_work(&xudc->usb_role_sw_work);
- /* Forcibly disconnect before powergating. */
- tegra_xudc_device_mode_off(xudc);
-
- if (!pm_runtime_status_suspended(dev))
+ if (!pm_runtime_status_suspended(dev)) {
+ /* Forcibly disconnect before powergating. */
+ tegra_xudc_device_mode_off(xudc);
tegra_xudc_powergate(xudc);
+ }
pm_runtime_disable(dev);
struct clk *reg_clk = xhci->reg_clk;
struct usb_hcd *shared_hcd = xhci->shared_hcd;
+ pm_runtime_get_sync(&dev->dev);
xhci->xhc_state |= XHCI_STATE_REMOVING;
usb_remove_hcd(shared_hcd);
clk_disable_unprepare(reg_clk);
usb_put_hcd(hcd);
- pm_runtime_set_suspended(&dev->dev);
pm_runtime_disable(&dev->dev);
+ pm_runtime_put_noidle(&dev->dev);
+ pm_runtime_set_suspended(&dev->dev);
return 0;
}
/* New sg entry */
--num_sgs;
sent_len -= block_len;
- if (num_sgs != 0) {
- sg = sg_next(sg);
+ sg = sg_next(sg);
+ if (num_sgs != 0 && sg) {
block_len = sg_dma_len(sg);
addr = (u64) sg_dma_address(sg);
addr += sent_len;
.release = single_release,
};
-static struct debugfs_reg32 mtu3_prb_regs[] = {
+static const struct debugfs_reg32 mtu3_prb_regs[] = {
dump_prb_reg("enable", U3D_SSUSB_PRB_CTRL0),
dump_prb_reg("byte-sell", U3D_SSUSB_PRB_CTRL1),
dump_prb_reg("byte-selh", U3D_SSUSB_PRB_CTRL2),
static void mtu3_debugfs_create_prb_files(struct mtu3 *mtu)
{
struct ssusb_mtk *ssusb = mtu->ssusb;
- struct debugfs_reg32 *regs;
+ const struct debugfs_reg32 *regs;
struct dentry *dir_prb;
int i;
if (status < 0) {
dev_err(&pdev->dev, "can't get IRQ %d, err %d\n",
twl->irq1, status);
- return status;
+ goto err_put_regulator;
}
status = request_threaded_irq(twl->irq2, NULL, twl6030_usb_irq,
if (status < 0) {
dev_err(&pdev->dev, "can't get IRQ %d, err %d\n",
twl->irq2, status);
- free_irq(twl->irq1, twl);
- return status;
+ goto err_free_irq1;
}
twl->asleep = 0;
dev_info(&pdev->dev, "Initialized TWL6030 USB module\n");
return 0;
+
+err_free_irq1:
+ free_irq(twl->irq1, twl);
+err_put_regulator:
+ regulator_put(twl->usb3v3);
+
+ return status;
}
static int twl6030_usb_remove(struct platform_device *pdev)
#define PMC_USB_ALTMODE_DP_MODE_SHIFT 8
/* TBT specific Mode Data bits */
+#define PMC_USB_ALTMODE_HPD_HIGH BIT(14)
#define PMC_USB_ALTMODE_TBT_TYPE BIT(17)
#define PMC_USB_ALTMODE_CABLE_TYPE BIT(18)
#define PMC_USB_ALTMODE_ACTIVE_LINK BIT(20)
#define PMC_USB_ALTMODE_TBT_GEN(_g_) (((_g_) & GENMASK(1, 0)) << 28)
/* Display HPD Request bits */
+#define PMC_USB_DP_HPD_LVL BIT(4)
#define PMC_USB_DP_HPD_IRQ BIT(5)
-#define PMC_USB_DP_HPD_LVL BIT(6)
struct pmc_usb;
PMC_USB_ALTMODE_DP_MODE_SHIFT;
if (data->status & DP_STATUS_HPD_STATE)
- req.mode_data |= PMC_USB_DP_HPD_LVL <<
- PMC_USB_ALTMODE_DP_MODE_SHIFT;
+ req.mode_data |= PMC_USB_ALTMODE_HPD_HIGH;
return pmc_usb_command(port, (void *)&req, sizeof(req));
}
static void vdpasim_queue_ready(struct vdpasim *vdpasim, unsigned int idx)
{
struct vdpasim_virtqueue *vq = &vdpasim->vqs[idx];
- int ret;
- ret = vringh_init_iotlb(&vq->vring, vdpasim_features,
- VDPASIM_QUEUE_MAX, false,
- (struct vring_desc *)(uintptr_t)vq->desc_addr,
- (struct vring_avail *)
- (uintptr_t)vq->driver_addr,
- (struct vring_used *)
- (uintptr_t)vq->device_addr);
+ vringh_init_iotlb(&vq->vring, vdpasim_features,
+ VDPASIM_QUEUE_MAX, false,
+ (struct vring_desc *)(uintptr_t)vq->desc_addr,
+ (struct vring_avail *)
+ (uintptr_t)vq->driver_addr,
+ (struct vring_used *)
+ (uintptr_t)vq->device_addr);
}
static void vdpasim_vq_reset(struct vdpasim_virtqueue *vq)
if (!map)
return NULL;
- return (void *)(uintptr_t)(map->addr + addr - map->start);
+ return (void __user *)(uintptr_t)(map->addr + addr - map->start);
}
/* Can we switch to this memory table? */
* not happen in this case.
*/
static inline void __user *__vhost_get_user(struct vhost_virtqueue *vq,
- void *addr, unsigned int size,
+ void __user *addr, unsigned int size,
int type)
{
void __user *uaddr = vhost_vq_meta_fetch(vq,
struct afs_server *server = call->server;
unsigned int server_index = call->server_index;
unsigned int index = call->addr_ix;
- unsigned int rtt = UINT_MAX;
+ unsigned int rtt_us = 0;
bool have_result = false;
- u64 _rtt;
int ret = call->error;
_enter("%pU,%u", &server->uuid, index);
}
}
- /* Get the RTT and scale it to fit into a 32-bit value that represents
- * over a minute of time so that we can access it with one instruction
- * on a 32-bit system.
- */
- _rtt = rxrpc_kernel_get_rtt(call->net->socket, call->rxcall);
- _rtt /= 64;
- rtt = (_rtt > UINT_MAX) ? UINT_MAX : _rtt;
- if (rtt < server->probe.rtt) {
- server->probe.rtt = rtt;
+ rtt_us = rxrpc_kernel_get_srtt(call->net->socket, call->rxcall);
+ if (rtt_us < server->probe.rtt) {
+ server->probe.rtt = rtt_us;
alist->preferred = index;
have_result = true;
}
spin_unlock(&server->probe_lock);
_debug("probe [%u][%u] %pISpc rtt=%u ret=%d",
- server_index, index, &alist->addrs[index].transport,
- (unsigned int)rtt, ret);
+ server_index, index, &alist->addrs[index].transport, rtt_us, ret);
have_result |= afs_fs_probe_done(server);
if (have_result)
ASSERTCMP(req->offset, <=, PAGE_SIZE);
if (req->offset == PAGE_SIZE) {
req->offset = 0;
- if (req->page_done)
- req->page_done(req);
req->index++;
if (req->remain > 0)
goto begin_page;
if (req->offset < PAGE_SIZE)
zero_user_segment(req->pages[req->index],
req->offset, PAGE_SIZE);
- if (req->page_done)
- req->page_done(req);
req->offset = 0;
}
+ if (req->page_done)
+ for (req->index = 0; req->index < req->nr_pages; req->index++)
+ req->page_done(req);
+
_leave(" = 0 [done]");
return 0;
}
struct afs_addr_list *alist = call->alist;
struct afs_vlserver *server = call->vlserver;
unsigned int server_index = call->server_index;
+ unsigned int rtt_us = 0;
unsigned int index = call->addr_ix;
- unsigned int rtt = UINT_MAX;
bool have_result = false;
- u64 _rtt;
int ret = call->error;
_enter("%s,%u,%u,%d,%d", server->name, server_index, index, ret, call->abort_code);
}
}
- /* Get the RTT and scale it to fit into a 32-bit value that represents
- * over a minute of time so that we can access it with one instruction
- * on a 32-bit system.
- */
- _rtt = rxrpc_kernel_get_rtt(call->net->socket, call->rxcall);
- _rtt /= 64;
- rtt = (_rtt > UINT_MAX) ? UINT_MAX : _rtt;
- if (rtt < server->probe.rtt) {
- server->probe.rtt = rtt;
+ rtt_us = rxrpc_kernel_get_srtt(call->net->socket, call->rxcall);
+ if (rtt_us < server->probe.rtt) {
+ server->probe.rtt = rtt_us;
alist->preferred = index;
have_result = true;
}
spin_unlock(&server->probe_lock);
_debug("probe [%u][%u] %pISpc rtt=%u ret=%d",
- server_index, index, &alist->addrs[index].transport,
- (unsigned int)rtt, ret);
+ server_index, index, &alist->addrs[index].transport, rtt_us, ret);
have_result |= afs_vl_probe_done(server);
if (have_result) {
ASSERTCMP(req->offset, <=, PAGE_SIZE);
if (req->offset == PAGE_SIZE) {
req->offset = 0;
- if (req->page_done)
- req->page_done(req);
req->index++;
if (req->remain > 0)
goto begin_page;
if (req->offset < PAGE_SIZE)
zero_user_segment(req->pages[req->index],
req->offset, PAGE_SIZE);
- if (req->page_done)
- req->page_done(req);
req->offset = 0;
}
+ if (req->page_done)
+ for (req->index = 0; req->index < req->nr_pages; req->index++)
+ req->page_done(req);
+
_leave(" = 0 [done]");
return 0;
}
(!regset->active || regset->active(t->task, regset) > 0)) {
int ret;
size_t size = regset_size(t->task, regset);
- void *data = kmalloc(size, GFP_KERNEL);
+ void *data = kzalloc(size, GFP_KERNEL);
if (unlikely(!data))
return 0;
ret = regset->get(t->task, regset,
object = container_of(op->op.object, struct cachefiles_object, fscache);
spin_lock(&object->work_lock);
list_add_tail(&monitor->op_link, &op->to_do);
+ fscache_enqueue_retrieval(op);
spin_unlock(&object->work_lock);
- fscache_enqueue_retrieval(op);
fscache_put_retrieval(op);
return 0;
}
struct inode *inode;
sector_t block;
unsigned shift;
- int ret;
+ int ret, ret2;
object = container_of(op->op.object,
struct cachefiles_object, fscache);
block = page->index;
block <<= shift;
- ret = bmap(inode, &block);
- ASSERT(ret < 0);
+ ret2 = bmap(inode, &block);
+ ASSERT(ret2 == 0);
_debug("%llx -> %llx",
(unsigned long long) (page->index << shift),
block = page->index;
block <<= shift;
- ret = bmap(inode, &block);
- ASSERT(!ret);
+ ret2 = bmap(inode, &block);
+ ASSERT(ret2 == 0);
_debug("%llx -> %llx",
(unsigned long long) (page->index << shift),
__ceph_queue_cap_release(session, cap);
spin_unlock(&session->s_cap_lock);
}
- goto done;
+ goto flush_cap_releases;
}
/* these will work even if we don't have a cap yet */
}
}
+ kref_put(&wdata2->refcount, cifs_writedata_release);
if (rc) {
- kref_put(&wdata2->refcount, cifs_writedata_release);
if (is_retryable_error(rc))
continue;
i += nr_pages;
* than it negotiated since it will refuse the read
* then.
*/
- if ((tcon->ses) && !(tcon->ses->capabilities &
+ if (!(tcon->ses->capabilities &
tcon->ses->server->vals->cap_large_files)) {
current_read_size = min_t(uint,
current_read_size, CIFSMaxBufSize);
* cifs_backup_query_path_info - SMB1 fallback code to get ino
*
* Fallback code to get file metadata when we don't have access to
- * @full_path (EACCESS) and have backup creds.
+ * @full_path (EACCES) and have backup creds.
*
* @data will be set to search info result buffer
* @resp_buf will be set to cifs resp buf and needs to be freed with
/**
* fscrypt_free_bounce_page() - free a ciphertext bounce page
+ * @bounce_page: the bounce page to free, or NULL
*
* Free a bounce page that was allocated by fscrypt_encrypt_pagecache_blocks(),
* or by fscrypt_alloc_bounce_page() directly.
memset(iv, 0, ci->ci_mode->ivsize);
if (flags & FSCRYPT_POLICY_FLAG_IV_INO_LBLK_64) {
- WARN_ON_ONCE((u32)lblk_num != lblk_num);
+ WARN_ON_ONCE(lblk_num > U32_MAX);
+ WARN_ON_ONCE(ci->ci_inode->i_ino > U32_MAX);
lblk_num |= (u64)ci->ci_inode->i_ino << 32;
+ } else if (flags & FSCRYPT_POLICY_FLAG_IV_INO_LBLK_32) {
+ WARN_ON_ONCE(lblk_num > U32_MAX);
+ lblk_num = (u32)(ci->ci_hashed_ino + lblk_num);
} else if (flags & FSCRYPT_POLICY_FLAG_DIRECT_KEY) {
memcpy(iv->nonce, ci->ci_nonce, FS_KEY_DERIVATION_NONCE_SIZE);
}
}
/**
- * fscrypt_encrypt_pagecache_blocks() - Encrypt filesystem blocks from a pagecache page
+ * fscrypt_encrypt_pagecache_blocks() - Encrypt filesystem blocks from a
+ * pagecache page
* @page: The locked pagecache page containing the block(s) to encrypt
* @len: Total size of the block(s) to encrypt. Must be a nonzero
* multiple of the filesystem's block size.
EXPORT_SYMBOL(fscrypt_encrypt_block_inplace);
/**
- * fscrypt_decrypt_pagecache_blocks() - Decrypt filesystem blocks in a pagecache page
+ * fscrypt_decrypt_pagecache_blocks() - Decrypt filesystem blocks in a
+ * pagecache page
* @page: The locked pagecache page containing the block(s) to decrypt
* @len: Total size of the block(s) to decrypt. Must be a nonzero
* multiple of the filesystem's block size.
/**
* fscrypt_init() - Set up for fs encryption.
+ *
+ * Return: 0 on success; -errno on failure
*/
static int __init fscrypt_init(void)
{
#include <crypto/skcipher.h>
#include "fscrypt_private.h"
-/**
+/*
* struct fscrypt_nokey_name - identifier for directory entry when key is absent
*
* When userspace lists an encrypted directory without access to the key, the
tfm = prev_tfm;
}
}
- {
- SHASH_DESC_ON_STACK(desc, tfm);
- desc->tfm = tfm;
-
- return crypto_shash_digest(desc, data, data_len, result);
- }
+ return crypto_shash_tfm_digest(tfm, data, data_len, result);
}
static inline bool fscrypt_is_dot_dotdot(const struct qstr *str)
/**
* fscrypt_fname_encrypt() - encrypt a filename
- *
- * The output buffer must be at least as large as the input buffer.
- * Any extra space is filled with NUL padding before encryption.
+ * @inode: inode of the parent directory (for regular filenames)
+ * or of the symlink (for symlink targets)
+ * @iname: the filename to encrypt
+ * @out: (output) the encrypted filename
+ * @olen: size of the encrypted filename. It must be at least @iname->len.
+ * Any extra space is filled with NUL padding before encryption.
*
* Return: 0 on success, -errno on failure
*/
/**
* fname_decrypt() - decrypt a filename
- *
- * The caller must have allocated sufficient memory for the @oname string.
+ * @inode: inode of the parent directory (for regular filenames)
+ * or of the symlink (for symlink targets)
+ * @iname: the encrypted filename to decrypt
+ * @oname: (output) the decrypted filename. The caller must have allocated
+ * enough space for this, e.g. using fscrypt_fname_alloc_buffer().
*
* Return: 0 on success, -errno on failure
*/
#define BASE64_CHARS(nbytes) DIV_ROUND_UP((nbytes) * 4, 3)
/**
- * base64_encode() -
+ * base64_encode() - base64-encode some bytes
+ * @src: the bytes to encode
+ * @len: number of bytes to encode
+ * @dst: (output) the base64-encoded string. Not NUL-terminated.
*
* Encodes the input string using characters from the set [A-Za-z0-9+,].
* The encoded string is roughly 4/3 times the size of the input string.
}
/**
- * fscrypt_fname_alloc_buffer - allocate a buffer for presented filenames
+ * fscrypt_fname_alloc_buffer() - allocate a buffer for presented filenames
+ * @inode: inode of the parent directory (for regular filenames)
+ * or of the symlink (for symlink targets)
+ * @max_encrypted_len: maximum length of encrypted filenames the buffer will be
+ * used to present
+ * @crypto_str: (output) buffer to allocate
*
* Allocate a buffer that is large enough to hold any decrypted or encoded
* filename (null-terminated), for the given maximum encrypted filename length.
EXPORT_SYMBOL(fscrypt_fname_alloc_buffer);
/**
- * fscrypt_fname_free_buffer - free the buffer for presented filenames
+ * fscrypt_fname_free_buffer() - free a buffer for presented filenames
+ * @crypto_str: the buffer to free
*
- * Free the buffer allocated by fscrypt_fname_alloc_buffer().
+ * Free a buffer that was allocated by fscrypt_fname_alloc_buffer().
*/
void fscrypt_fname_free_buffer(struct fscrypt_str *crypto_str)
{
EXPORT_SYMBOL(fscrypt_fname_free_buffer);
/**
- * fscrypt_fname_disk_to_usr() - converts a filename from disk space to user
- * space
- *
- * The caller must have allocated sufficient memory for the @oname string.
+ * fscrypt_fname_disk_to_usr() - convert an encrypted filename to
+ * user-presentable form
+ * @inode: inode of the parent directory (for regular filenames)
+ * or of the symlink (for symlink targets)
+ * @hash: first part of the name's dirhash, if applicable. This only needs to
+ * be provided if the filename is located in an indexed directory whose
+ * encryption key may be unavailable. Not needed for symlink targets.
+ * @minor_hash: second part of the name's dirhash, if applicable
+ * @iname: encrypted filename to convert. May also be "." or "..", which
+ * aren't actually encrypted.
+ * @oname: output buffer for the user-presentable filename. The caller must
+ * have allocated enough space for this, e.g. using
+ * fscrypt_fname_alloc_buffer().
*
* If the key is available, we'll decrypt the disk name. Otherwise, we'll
* encode it for presentation in fscrypt_nokey_name format.
u8 nonce[FS_KEY_DERIVATION_NONCE_SIZE];
};
-/**
+/*
* fscrypt_context - the encryption context of an inode
*
* This is the on-disk equivalent of an fscrypt_policy, stored alongside each
BUG();
}
-/**
+/*
* For encrypted symlinks, the ciphertext length is stored at the beginning
* of the string in little-endian format.
*/
/* This inode's nonce, copied from the fscrypt_context */
u8 ci_nonce[FS_KEY_DERIVATION_NONCE_SIZE];
+
+ /* Hashed inode number. Only set for IV_INO_LBLK_32 */
+ u32 ci_hashed_ino;
};
typedef enum {
/* crypto.c */
extern struct kmem_cache *fscrypt_info_cachep;
-extern int fscrypt_initialize(unsigned int cop_flags);
-extern int fscrypt_crypt_block(const struct inode *inode,
- fscrypt_direction_t rw, u64 lblk_num,
- struct page *src_page, struct page *dest_page,
- unsigned int len, unsigned int offs,
- gfp_t gfp_flags);
-extern struct page *fscrypt_alloc_bounce_page(gfp_t gfp_flags);
-
-extern void __printf(3, 4) __cold
+int fscrypt_initialize(unsigned int cop_flags);
+int fscrypt_crypt_block(const struct inode *inode, fscrypt_direction_t rw,
+ u64 lblk_num, struct page *src_page,
+ struct page *dest_page, unsigned int len,
+ unsigned int offs, gfp_t gfp_flags);
+struct page *fscrypt_alloc_bounce_page(gfp_t gfp_flags);
+
+void __printf(3, 4) __cold
fscrypt_msg(const struct inode *inode, const char *level, const char *fmt, ...);
#define fscrypt_warn(inode, fmt, ...) \
const struct fscrypt_info *ci);
/* fname.c */
-extern int fscrypt_fname_encrypt(const struct inode *inode,
- const struct qstr *iname,
- u8 *out, unsigned int olen);
-extern bool fscrypt_fname_encrypted_size(const struct inode *inode,
- u32 orig_len, u32 max_len,
- u32 *encrypted_len_ret);
+int fscrypt_fname_encrypt(const struct inode *inode, const struct qstr *iname,
+ u8 *out, unsigned int olen);
+bool fscrypt_fname_encrypted_size(const struct inode *inode, u32 orig_len,
+ u32 max_len, u32 *encrypted_len_ret);
extern const struct dentry_operations fscrypt_d_ops;
/* hkdf.c */
struct crypto_shash *hmac_tfm;
};
-extern int fscrypt_init_hkdf(struct fscrypt_hkdf *hkdf, const u8 *master_key,
- unsigned int master_key_size);
+int fscrypt_init_hkdf(struct fscrypt_hkdf *hkdf, const u8 *master_key,
+ unsigned int master_key_size);
/*
* The list of contexts in which fscrypt uses HKDF. These values are used as
#define HKDF_CONTEXT_DIRECT_KEY 3
#define HKDF_CONTEXT_IV_INO_LBLK_64_KEY 4
#define HKDF_CONTEXT_DIRHASH_KEY 5
+#define HKDF_CONTEXT_IV_INO_LBLK_32_KEY 6
+#define HKDF_CONTEXT_INODE_HASH_KEY 7
-extern int fscrypt_hkdf_expand(const struct fscrypt_hkdf *hkdf, u8 context,
- const u8 *info, unsigned int infolen,
- u8 *okm, unsigned int okmlen);
+int fscrypt_hkdf_expand(const struct fscrypt_hkdf *hkdf, u8 context,
+ const u8 *info, unsigned int infolen,
+ u8 *okm, unsigned int okmlen);
-extern void fscrypt_destroy_hkdf(struct fscrypt_hkdf *hkdf);
+void fscrypt_destroy_hkdf(struct fscrypt_hkdf *hkdf);
/* keyring.c */
struct list_head mk_decrypted_inodes;
spinlock_t mk_decrypted_inodes_lock;
- /* Crypto API transforms for DIRECT_KEY policies, allocated on-demand */
- struct crypto_skcipher *mk_direct_tfms[__FSCRYPT_MODE_MAX + 1];
-
/*
- * Crypto API transforms for filesystem-layer implementation of
- * IV_INO_LBLK_64 policies, allocated on-demand.
+ * Per-mode encryption keys for the various types of encryption policies
+ * that use them. Allocated and derived on-demand.
*/
- struct crypto_skcipher *mk_iv_ino_lblk_64_tfms[__FSCRYPT_MODE_MAX + 1];
+ struct crypto_skcipher *mk_direct_keys[__FSCRYPT_MODE_MAX + 1];
+ struct crypto_skcipher *mk_iv_ino_lblk_64_keys[__FSCRYPT_MODE_MAX + 1];
+ struct crypto_skcipher *mk_iv_ino_lblk_32_keys[__FSCRYPT_MODE_MAX + 1];
+
+ /* Hash key for inode numbers. Initialized only when needed. */
+ siphash_key_t mk_ino_hash_key;
+ bool mk_ino_hash_key_initialized;
} __randomize_layout;
return 0;
}
-extern struct key *
+struct key *
fscrypt_find_master_key(struct super_block *sb,
const struct fscrypt_key_specifier *mk_spec);
-extern int fscrypt_verify_key_added(struct super_block *sb,
- const u8 identifier[FSCRYPT_KEY_IDENTIFIER_SIZE]);
+int fscrypt_add_test_dummy_key(struct super_block *sb,
+ struct fscrypt_key_specifier *key_spec);
+
+int fscrypt_verify_key_added(struct super_block *sb,
+ const u8 identifier[FSCRYPT_KEY_IDENTIFIER_SIZE]);
-extern int __init fscrypt_init_keyring(void);
+int __init fscrypt_init_keyring(void);
/* keysetup.c */
extern struct fscrypt_mode fscrypt_modes[];
-extern struct crypto_skcipher *
-fscrypt_allocate_skcipher(struct fscrypt_mode *mode, const u8 *raw_key,
- const struct inode *inode);
+struct crypto_skcipher *fscrypt_allocate_skcipher(struct fscrypt_mode *mode,
+ const u8 *raw_key,
+ const struct inode *inode);
-extern int fscrypt_set_per_file_enc_key(struct fscrypt_info *ci,
- const u8 *raw_key);
+int fscrypt_set_per_file_enc_key(struct fscrypt_info *ci, const u8 *raw_key);
-extern int fscrypt_derive_dirhash_key(struct fscrypt_info *ci,
- const struct fscrypt_master_key *mk);
+int fscrypt_derive_dirhash_key(struct fscrypt_info *ci,
+ const struct fscrypt_master_key *mk);
/* keysetup_v1.c */
-extern void fscrypt_put_direct_key(struct fscrypt_direct_key *dk);
+void fscrypt_put_direct_key(struct fscrypt_direct_key *dk);
+
+int fscrypt_setup_v1_file_key(struct fscrypt_info *ci,
+ const u8 *raw_master_key);
-extern int fscrypt_setup_v1_file_key(struct fscrypt_info *ci,
- const u8 *raw_master_key);
+int fscrypt_setup_v1_file_key_via_subscribed_keyrings(struct fscrypt_info *ci);
-extern int fscrypt_setup_v1_file_key_via_subscribed_keyrings(
- struct fscrypt_info *ci);
/* policy.c */
-extern bool fscrypt_policies_equal(const union fscrypt_policy *policy1,
- const union fscrypt_policy *policy2);
-extern bool fscrypt_supported_policy(const union fscrypt_policy *policy_u,
- const struct inode *inode);
-extern int fscrypt_policy_from_context(union fscrypt_policy *policy_u,
- const union fscrypt_context *ctx_u,
- int ctx_size);
+bool fscrypt_policies_equal(const union fscrypt_policy *policy1,
+ const union fscrypt_policy *policy2);
+bool fscrypt_supported_policy(const union fscrypt_policy *policy_u,
+ const struct inode *inode);
+int fscrypt_policy_from_context(union fscrypt_policy *policy_u,
+ const union fscrypt_context *ctx_u,
+ int ctx_size);
#endif /* _FSCRYPT_PRIVATE_H */
unsigned int ikmlen, u8 prk[HKDF_HASHLEN])
{
static const u8 default_salt[HKDF_HASHLEN];
- SHASH_DESC_ON_STACK(desc, hmac_tfm);
int err;
err = crypto_shash_setkey(hmac_tfm, default_salt, HKDF_HASHLEN);
if (err)
return err;
- desc->tfm = hmac_tfm;
- err = crypto_shash_digest(desc, ikm, ikmlen, prk);
- shash_desc_zero(desc);
- return err;
+ return crypto_shash_tfm_digest(hmac_tfm, ikm, ikmlen, prk);
}
/*
#include "fscrypt_private.h"
/**
- * fscrypt_file_open - prepare to open a possibly-encrypted regular file
+ * fscrypt_file_open() - prepare to open a possibly-encrypted regular file
* @inode: the inode being opened
* @filp: the struct file being set up
*
EXPORT_SYMBOL_GPL(__fscrypt_encrypt_symlink);
/**
- * fscrypt_get_symlink - get the target of an encrypted symlink
+ * fscrypt_get_symlink() - get the target of an encrypted symlink
* @inode: the symlink inode
* @caddr: the on-disk contents of the symlink
* @max_size: size of @caddr buffer
#include <crypto/skcipher.h>
#include <linux/key-type.h>
+#include <linux/random.h>
#include <linux/seq_file.h>
#include "fscrypt_private.h"
wipe_master_key_secret(&mk->mk_secret);
for (i = 0; i <= __FSCRYPT_MODE_MAX; i++) {
- crypto_free_skcipher(mk->mk_direct_tfms[i]);
- crypto_free_skcipher(mk->mk_iv_ino_lblk_64_tfms[i]);
+ crypto_free_skcipher(mk->mk_direct_keys[i]);
+ crypto_free_skcipher(mk->mk_iv_ino_lblk_64_keys[i]);
+ crypto_free_skcipher(mk->mk_iv_ino_lblk_32_keys[i]);
}
key_put(mk->mk_users);
return 0;
}
-static int add_master_key(struct super_block *sb,
- struct fscrypt_master_key_secret *secret,
- const struct fscrypt_key_specifier *mk_spec)
+static int do_add_master_key(struct super_block *sb,
+ struct fscrypt_master_key_secret *secret,
+ const struct fscrypt_key_specifier *mk_spec)
{
static DEFINE_MUTEX(fscrypt_add_key_mutex);
struct key *key;
return err;
}
+static int add_master_key(struct super_block *sb,
+ struct fscrypt_master_key_secret *secret,
+ struct fscrypt_key_specifier *key_spec)
+{
+ int err;
+
+ if (key_spec->type == FSCRYPT_KEY_SPEC_TYPE_IDENTIFIER) {
+ err = fscrypt_init_hkdf(&secret->hkdf, secret->raw,
+ secret->size);
+ if (err)
+ return err;
+
+ /*
+ * Now that the HKDF context is initialized, the raw key is no
+ * longer needed.
+ */
+ memzero_explicit(secret->raw, secret->size);
+
+ /* Calculate the key identifier */
+ err = fscrypt_hkdf_expand(&secret->hkdf,
+ HKDF_CONTEXT_KEY_IDENTIFIER, NULL, 0,
+ key_spec->u.identifier,
+ FSCRYPT_KEY_IDENTIFIER_SIZE);
+ if (err)
+ return err;
+ }
+ return do_add_master_key(sb, secret, key_spec);
+}
+
static int fscrypt_provisioning_key_preparse(struct key_preparsed_payload *prep)
{
const struct fscrypt_provisioning_key_payload *payload = prep->data;
if (memchr_inv(arg.__reserved, 0, sizeof(arg.__reserved)))
return -EINVAL;
+ /*
+ * Only root can add keys that are identified by an arbitrary descriptor
+ * rather than by a cryptographic hash --- since otherwise a malicious
+ * user could add the wrong key.
+ */
+ if (arg.key_spec.type == FSCRYPT_KEY_SPEC_TYPE_DESCRIPTOR &&
+ !capable(CAP_SYS_ADMIN))
+ return -EACCES;
+
memset(&secret, 0, sizeof(secret));
if (arg.key_id) {
if (arg.raw_size != 0)
goto out_wipe_secret;
}
- switch (arg.key_spec.type) {
- case FSCRYPT_KEY_SPEC_TYPE_DESCRIPTOR:
- /*
- * Only root can add keys that are identified by an arbitrary
- * descriptor rather than by a cryptographic hash --- since
- * otherwise a malicious user could add the wrong key.
- */
- err = -EACCES;
- if (!capable(CAP_SYS_ADMIN))
- goto out_wipe_secret;
- break;
- case FSCRYPT_KEY_SPEC_TYPE_IDENTIFIER:
- err = fscrypt_init_hkdf(&secret.hkdf, secret.raw, secret.size);
- if (err)
- goto out_wipe_secret;
-
- /*
- * Now that the HKDF context is initialized, the raw key is no
- * longer needed.
- */
- memzero_explicit(secret.raw, secret.size);
-
- /* Calculate the key identifier and return it to userspace. */
- err = fscrypt_hkdf_expand(&secret.hkdf,
- HKDF_CONTEXT_KEY_IDENTIFIER,
- NULL, 0, arg.key_spec.u.identifier,
- FSCRYPT_KEY_IDENTIFIER_SIZE);
- if (err)
- goto out_wipe_secret;
- err = -EFAULT;
- if (copy_to_user(uarg->key_spec.u.identifier,
- arg.key_spec.u.identifier,
- FSCRYPT_KEY_IDENTIFIER_SIZE))
- goto out_wipe_secret;
- break;
- default:
- WARN_ON(1);
- err = -EINVAL;
+ err = add_master_key(sb, &secret, &arg.key_spec);
+ if (err)
goto out_wipe_secret;
- }
- err = add_master_key(sb, &secret, &arg.key_spec);
+ /* Return the key identifier to userspace, if applicable */
+ err = -EFAULT;
+ if (arg.key_spec.type == FSCRYPT_KEY_SPEC_TYPE_IDENTIFIER &&
+ copy_to_user(uarg->key_spec.u.identifier, arg.key_spec.u.identifier,
+ FSCRYPT_KEY_IDENTIFIER_SIZE))
+ goto out_wipe_secret;
+ err = 0;
out_wipe_secret:
wipe_master_key_secret(&secret);
return err;
}
EXPORT_SYMBOL_GPL(fscrypt_ioctl_add_key);
+/*
+ * Add the key for '-o test_dummy_encryption' to the filesystem keyring.
+ *
+ * Use a per-boot random key to prevent people from misusing this option.
+ */
+int fscrypt_add_test_dummy_key(struct super_block *sb,
+ struct fscrypt_key_specifier *key_spec)
+{
+ static u8 test_key[FSCRYPT_MAX_KEY_SIZE];
+ struct fscrypt_master_key_secret secret;
+ int err;
+
+ get_random_once(test_key, FSCRYPT_MAX_KEY_SIZE);
+
+ memset(&secret, 0, sizeof(secret));
+ secret.size = FSCRYPT_MAX_KEY_SIZE;
+ memcpy(secret.raw, test_key, FSCRYPT_MAX_KEY_SIZE);
+
+ err = add_master_key(sb, &secret, key_spec);
+ wipe_master_key_secret(&secret);
+ return err;
+}
+
/*
* Verify that the current user has added a master key with the given identifier
* (returns -ENOKEY if not). This is needed to prevent a user from encrypting
},
};
+static DEFINE_MUTEX(fscrypt_mode_key_setup_mutex);
+
static struct fscrypt_mode *
select_encryption_mode(const union fscrypt_policy *policy,
const struct inode *inode)
const struct super_block *sb = inode->i_sb;
struct fscrypt_mode *mode = ci->ci_mode;
const u8 mode_num = mode - fscrypt_modes;
- struct crypto_skcipher *tfm, *prev_tfm;
+ struct crypto_skcipher *tfm;
u8 mode_key[FSCRYPT_MAX_KEY_SIZE];
u8 hkdf_info[sizeof(mode_num) + sizeof(sb->s_uuid)];
unsigned int hkdf_infolen = 0;
if (WARN_ON(mode_num > __FSCRYPT_MODE_MAX))
return -EINVAL;
- /* pairs with cmpxchg() below */
+ /* pairs with smp_store_release() below */
tfm = READ_ONCE(tfms[mode_num]);
- if (likely(tfm != NULL))
- goto done;
+ if (likely(tfm != NULL)) {
+ ci->ci_ctfm = tfm;
+ return 0;
+ }
+
+ mutex_lock(&fscrypt_mode_key_setup_mutex);
+
+ if (tfms[mode_num])
+ goto done_unlock;
BUILD_BUG_ON(sizeof(mode_num) != 1);
BUILD_BUG_ON(sizeof(sb->s_uuid) != 16);
hkdf_context, hkdf_info, hkdf_infolen,
mode_key, mode->keysize);
if (err)
- return err;
+ goto out_unlock;
tfm = fscrypt_allocate_skcipher(mode, mode_key, inode);
memzero_explicit(mode_key, mode->keysize);
- if (IS_ERR(tfm))
- return PTR_ERR(tfm);
-
- /* pairs with READ_ONCE() above */
- prev_tfm = cmpxchg(&tfms[mode_num], NULL, tfm);
- if (prev_tfm != NULL) {
- crypto_free_skcipher(tfm);
- tfm = prev_tfm;
+ if (IS_ERR(tfm)) {
+ err = PTR_ERR(tfm);
+ goto out_unlock;
}
-done:
+ /* pairs with READ_ONCE() above */
+ smp_store_release(&tfms[mode_num], tfm);
+done_unlock:
ci->ci_ctfm = tfm;
- return 0;
+ err = 0;
+out_unlock:
+ mutex_unlock(&fscrypt_mode_key_setup_mutex);
+ return err;
}
int fscrypt_derive_dirhash_key(struct fscrypt_info *ci,
return 0;
}
+static int fscrypt_setup_iv_ino_lblk_32_key(struct fscrypt_info *ci,
+ struct fscrypt_master_key *mk)
+{
+ int err;
+
+ err = setup_per_mode_enc_key(ci, mk, mk->mk_iv_ino_lblk_32_keys,
+ HKDF_CONTEXT_IV_INO_LBLK_32_KEY, true);
+ if (err)
+ return err;
+
+ /* pairs with smp_store_release() below */
+ if (!smp_load_acquire(&mk->mk_ino_hash_key_initialized)) {
+
+ mutex_lock(&fscrypt_mode_key_setup_mutex);
+
+ if (mk->mk_ino_hash_key_initialized)
+ goto unlock;
+
+ err = fscrypt_hkdf_expand(&mk->mk_secret.hkdf,
+ HKDF_CONTEXT_INODE_HASH_KEY, NULL, 0,
+ (u8 *)&mk->mk_ino_hash_key,
+ sizeof(mk->mk_ino_hash_key));
+ if (err)
+ goto unlock;
+ /* pairs with smp_load_acquire() above */
+ smp_store_release(&mk->mk_ino_hash_key_initialized, true);
+unlock:
+ mutex_unlock(&fscrypt_mode_key_setup_mutex);
+ if (err)
+ return err;
+ }
+
+ ci->ci_hashed_ino = (u32)siphash_1u64(ci->ci_inode->i_ino,
+ &mk->mk_ino_hash_key);
+ return 0;
+}
+
static int fscrypt_setup_v2_file_key(struct fscrypt_info *ci,
struct fscrypt_master_key *mk)
{
* encryption key. This ensures that the master key is
* consistently used only for HKDF, avoiding key reuse issues.
*/
- err = setup_per_mode_enc_key(ci, mk, mk->mk_direct_tfms,
+ err = setup_per_mode_enc_key(ci, mk, mk->mk_direct_keys,
HKDF_CONTEXT_DIRECT_KEY, false);
} else if (ci->ci_policy.v2.flags &
FSCRYPT_POLICY_FLAG_IV_INO_LBLK_64) {
* IV_INO_LBLK_64: encryption keys are derived from (master_key,
* mode_num, filesystem_uuid), and inode number is included in
* the IVs. This format is optimized for use with inline
- * encryption hardware compliant with the UFS or eMMC standards.
+ * encryption hardware compliant with the UFS standard.
*/
- err = setup_per_mode_enc_key(ci, mk, mk->mk_iv_ino_lblk_64_tfms,
+ err = setup_per_mode_enc_key(ci, mk, mk->mk_iv_ino_lblk_64_keys,
HKDF_CONTEXT_IV_INO_LBLK_64_KEY,
true);
+ } else if (ci->ci_policy.v2.flags &
+ FSCRYPT_POLICY_FLAG_IV_INO_LBLK_32) {
+ err = fscrypt_setup_iv_ino_lblk_32_key(ci, mk);
} else {
u8 derived_key[FSCRYPT_MAX_KEY_SIZE];
res = inode->i_sb->s_cop->get_context(inode, &ctx, sizeof(ctx));
if (res < 0) {
- if (!fscrypt_dummy_context_enabled(inode) ||
- IS_ENCRYPTED(inode)) {
+ const union fscrypt_context *dummy_ctx =
+ fscrypt_get_dummy_context(inode->i_sb);
+
+ if (IS_ENCRYPTED(inode) || !dummy_ctx) {
fscrypt_warn(inode,
"Error %d getting encryption context",
res);
return res;
}
/* Fake up a context for an unencrypted directory */
- memset(&ctx, 0, sizeof(ctx));
- ctx.version = FSCRYPT_CONTEXT_V1;
- ctx.v1.contents_encryption_mode = FSCRYPT_MODE_AES_256_XTS;
- ctx.v1.filenames_encryption_mode = FSCRYPT_MODE_AES_256_CTS;
- memset(ctx.v1.master_key_descriptor, 0x42,
- FSCRYPT_KEY_DESCRIPTOR_SIZE);
- res = sizeof(ctx.v1);
+ res = fscrypt_context_size(dummy_ctx);
+ memcpy(&ctx, dummy_ctx, res);
}
crypt_info = kmem_cache_zalloc(fscrypt_info_cachep, GFP_NOFS);
EXPORT_SYMBOL(fscrypt_get_encryption_info);
/**
- * fscrypt_put_encryption_info - free most of an inode's fscrypt data
+ * fscrypt_put_encryption_info() - free most of an inode's fscrypt data
+ * @inode: an inode being evicted
*
* Free the inode's fscrypt_info. Filesystems must call this when the inode is
* being evicted. An RCU grace period need not have elapsed yet.
EXPORT_SYMBOL(fscrypt_put_encryption_info);
/**
- * fscrypt_free_inode - free an inode's fscrypt data requiring RCU delay
+ * fscrypt_free_inode() - free an inode's fscrypt data requiring RCU delay
+ * @inode: an inode being freed
*
* Free the inode's cached decrypted symlink target, if any. Filesystems must
* call this after an RCU grace period, just before they free the inode.
EXPORT_SYMBOL(fscrypt_free_inode);
/**
- * fscrypt_drop_inode - check whether the inode's master key has been removed
+ * fscrypt_drop_inode() - check whether the inode's master key has been removed
+ * @inode: an inode being considered for eviction
*
* Filesystems supporting fscrypt must call this from their ->drop_inode()
* method so that encrypted inodes are evicted as soon as they're no longer in
*/
#include <linux/random.h>
+#include <linux/seq_file.h>
#include <linux/string.h>
#include <linux/mount.h>
#include "fscrypt_private.h"
/**
- * fscrypt_policies_equal - check whether two encryption policies are the same
+ * fscrypt_policies_equal() - check whether two encryption policies are the same
+ * @policy1: the first policy
+ * @policy2: the second policy
*
* Return: %true if equal, else %false
*/
return true;
}
-static bool supported_iv_ino_lblk_64_policy(
- const struct fscrypt_policy_v2 *policy,
- const struct inode *inode)
+static bool supported_iv_ino_lblk_policy(const struct fscrypt_policy_v2 *policy,
+ const struct inode *inode,
+ const char *type,
+ int max_ino_bits, int max_lblk_bits)
{
struct super_block *sb = inode->i_sb;
int ino_bits = 64, lblk_bits = 64;
- if (policy->flags & FSCRYPT_POLICY_FLAG_DIRECT_KEY) {
- fscrypt_warn(inode,
- "The DIRECT_KEY and IV_INO_LBLK_64 flags are mutually exclusive");
- return false;
- }
/*
* It's unsafe to include inode numbers in the IVs if the filesystem can
* potentially renumber inodes, e.g. via filesystem shrinking.
if (!sb->s_cop->has_stable_inodes ||
!sb->s_cop->has_stable_inodes(sb)) {
fscrypt_warn(inode,
- "Can't use IV_INO_LBLK_64 policy on filesystem '%s' because it doesn't have stable inode numbers",
- sb->s_id);
+ "Can't use %s policy on filesystem '%s' because it doesn't have stable inode numbers",
+ type, sb->s_id);
return false;
}
if (sb->s_cop->get_ino_and_lblk_bits)
sb->s_cop->get_ino_and_lblk_bits(sb, &ino_bits, &lblk_bits);
- if (ino_bits > 32 || lblk_bits > 32) {
+ if (ino_bits > max_ino_bits) {
+ fscrypt_warn(inode,
+ "Can't use %s policy on filesystem '%s' because its inode numbers are too long",
+ type, sb->s_id);
+ return false;
+ }
+ if (lblk_bits > max_lblk_bits) {
fscrypt_warn(inode,
- "Can't use IV_INO_LBLK_64 policy on filesystem '%s' because it doesn't use 32-bit inode and block numbers",
- sb->s_id);
+ "Can't use %s policy on filesystem '%s' because its block numbers are too long",
+ type, sb->s_id);
return false;
}
return true;
static bool fscrypt_supported_v2_policy(const struct fscrypt_policy_v2 *policy,
const struct inode *inode)
{
+ int count = 0;
+
if (!fscrypt_valid_enc_modes(policy->contents_encryption_mode,
policy->filenames_encryption_mode)) {
fscrypt_warn(inode,
return false;
}
+ count += !!(policy->flags & FSCRYPT_POLICY_FLAG_DIRECT_KEY);
+ count += !!(policy->flags & FSCRYPT_POLICY_FLAG_IV_INO_LBLK_64);
+ count += !!(policy->flags & FSCRYPT_POLICY_FLAG_IV_INO_LBLK_32);
+ if (count > 1) {
+ fscrypt_warn(inode, "Mutually exclusive encryption flags (0x%02x)",
+ policy->flags);
+ return false;
+ }
+
if ((policy->flags & FSCRYPT_POLICY_FLAG_DIRECT_KEY) &&
!supported_direct_key_modes(inode, policy->contents_encryption_mode,
policy->filenames_encryption_mode))
return false;
if ((policy->flags & FSCRYPT_POLICY_FLAG_IV_INO_LBLK_64) &&
- !supported_iv_ino_lblk_64_policy(policy, inode))
+ !supported_iv_ino_lblk_policy(policy, inode, "IV_INO_LBLK_64",
+ 32, 32))
+ return false;
+
+ if ((policy->flags & FSCRYPT_POLICY_FLAG_IV_INO_LBLK_32) &&
+ /* This uses hashed inode numbers, so ino_bits doesn't matter. */
+ !supported_iv_ino_lblk_policy(policy, inode, "IV_INO_LBLK_32",
+ INT_MAX, 32))
return false;
if (memchr_inv(policy->__reserved, 0, sizeof(policy->__reserved))) {
}
/**
- * fscrypt_supported_policy - check whether an encryption policy is supported
+ * fscrypt_supported_policy() - check whether an encryption policy is supported
+ * @policy_u: the encryption policy
+ * @inode: the inode on which the policy will be used
*
* Given an encryption policy, check whether all its encryption modes and other
* settings are supported by this kernel on the given inode. (But we don't
}
/**
- * fscrypt_new_context_from_policy - create a new fscrypt_context from a policy
+ * fscrypt_new_context_from_policy() - create a new fscrypt_context from
+ * an fscrypt_policy
+ * @ctx_u: output context
+ * @policy_u: input policy
*
* Create an fscrypt_context for an inode that is being assigned the given
* encryption policy. A new nonce is randomly generated.
}
/**
- * fscrypt_policy_from_context - convert an fscrypt_context to an fscrypt_policy
+ * fscrypt_policy_from_context() - convert an fscrypt_context to
+ * an fscrypt_policy
+ * @policy_u: output policy
+ * @ctx_u: input context
+ * @ctx_size: size of input context in bytes
*
* Given an fscrypt_context, build the corresponding fscrypt_policy.
*
policy->v2.master_key_identifier);
if (err)
return err;
+ if (policy->v2.flags & FSCRYPT_POLICY_FLAG_IV_INO_LBLK_32)
+ pr_warn_once("%s (pid %d) is setting an IV_INO_LBLK_32 encryption policy. This should only be used if there are certain hardware limitations.\n",
+ current->comm, current->pid);
break;
default:
WARN_ON(1);
return preload ? fscrypt_get_encryption_info(child): 0;
}
EXPORT_SYMBOL(fscrypt_inherit_context);
+
+/**
+ * fscrypt_set_test_dummy_encryption() - handle '-o test_dummy_encryption'
+ * @sb: the filesystem on which test_dummy_encryption is being specified
+ * @arg: the argument to the test_dummy_encryption option.
+ * If no argument was specified, then @arg->from == NULL.
+ * @dummy_ctx: the filesystem's current dummy context (input/output, see below)
+ *
+ * Handle the test_dummy_encryption mount option by creating a dummy encryption
+ * context, saving it in @dummy_ctx, and adding the corresponding dummy
+ * encryption key to the filesystem. If the @dummy_ctx is already set, then
+ * instead validate that it matches @arg. Don't support changing it via
+ * remount, as that is difficult to do safely.
+ *
+ * The reason we use an fscrypt_context rather than an fscrypt_policy is because
+ * we mustn't generate a new nonce each time we access a dummy-encrypted
+ * directory, as that would change the way filenames are encrypted.
+ *
+ * Return: 0 on success (dummy context set, or the same context is already set);
+ * -EEXIST if a different dummy context is already set;
+ * or another -errno value.
+ */
+int fscrypt_set_test_dummy_encryption(struct super_block *sb,
+ const substring_t *arg,
+ struct fscrypt_dummy_context *dummy_ctx)
+{
+ const char *argstr = "v2";
+ const char *argstr_to_free = NULL;
+ struct fscrypt_key_specifier key_spec = { 0 };
+ int version;
+ union fscrypt_context *ctx = NULL;
+ int err;
+
+ if (arg->from) {
+ argstr = argstr_to_free = match_strdup(arg);
+ if (!argstr)
+ return -ENOMEM;
+ }
+
+ if (!strcmp(argstr, "v1")) {
+ version = FSCRYPT_CONTEXT_V1;
+ key_spec.type = FSCRYPT_KEY_SPEC_TYPE_DESCRIPTOR;
+ memset(key_spec.u.descriptor, 0x42,
+ FSCRYPT_KEY_DESCRIPTOR_SIZE);
+ } else if (!strcmp(argstr, "v2")) {
+ version = FSCRYPT_CONTEXT_V2;
+ key_spec.type = FSCRYPT_KEY_SPEC_TYPE_IDENTIFIER;
+ /* key_spec.u.identifier gets filled in when adding the key */
+ } else {
+ err = -EINVAL;
+ goto out;
+ }
+
+ if (dummy_ctx->ctx) {
+ /*
+ * Note: if we ever make test_dummy_encryption support
+ * specifying other encryption settings, such as the encryption
+ * modes, we'll need to compare those settings here.
+ */
+ if (dummy_ctx->ctx->version == version)
+ err = 0;
+ else
+ err = -EEXIST;
+ goto out;
+ }
+
+ ctx = kzalloc(sizeof(*ctx), GFP_KERNEL);
+ if (!ctx) {
+ err = -ENOMEM;
+ goto out;
+ }
+
+ err = fscrypt_add_test_dummy_key(sb, &key_spec);
+ if (err)
+ goto out;
+
+ ctx->version = version;
+ switch (ctx->version) {
+ case FSCRYPT_CONTEXT_V1:
+ ctx->v1.contents_encryption_mode = FSCRYPT_MODE_AES_256_XTS;
+ ctx->v1.filenames_encryption_mode = FSCRYPT_MODE_AES_256_CTS;
+ memcpy(ctx->v1.master_key_descriptor, key_spec.u.descriptor,
+ FSCRYPT_KEY_DESCRIPTOR_SIZE);
+ break;
+ case FSCRYPT_CONTEXT_V2:
+ ctx->v2.contents_encryption_mode = FSCRYPT_MODE_AES_256_XTS;
+ ctx->v2.filenames_encryption_mode = FSCRYPT_MODE_AES_256_CTS;
+ memcpy(ctx->v2.master_key_identifier, key_spec.u.identifier,
+ FSCRYPT_KEY_IDENTIFIER_SIZE);
+ break;
+ default:
+ WARN_ON(1);
+ err = -EINVAL;
+ goto out;
+ }
+ dummy_ctx->ctx = ctx;
+ ctx = NULL;
+ err = 0;
+out:
+ kfree(ctx);
+ kfree(argstr_to_free);
+ return err;
+}
+EXPORT_SYMBOL_GPL(fscrypt_set_test_dummy_encryption);
+
+/**
+ * fscrypt_show_test_dummy_encryption() - show '-o test_dummy_encryption'
+ * @seq: the seq_file to print the option to
+ * @sep: the separator character to use
+ * @sb: the filesystem whose options are being shown
+ *
+ * Show the test_dummy_encryption mount option, if it was specified.
+ * This is mainly used for /proc/mounts.
+ */
+void fscrypt_show_test_dummy_encryption(struct seq_file *seq, char sep,
+ struct super_block *sb)
+{
+ const union fscrypt_context *ctx = fscrypt_get_dummy_context(sb);
+
+ if (!ctx)
+ return;
+ seq_printf(seq, "%ctest_dummy_encryption=v%d", sep, ctx->version);
+}
+EXPORT_SYMBOL_GPL(fscrypt_show_test_dummy_encryption);
}
}
-static int ecryptfs_hash_digest(struct crypto_shash *tfm,
- char *src, int len, char *dst)
-{
- SHASH_DESC_ON_STACK(desc, tfm);
- int err;
-
- desc->tfm = tfm;
- err = crypto_shash_digest(desc, src, len, dst);
- shash_desc_zero(desc);
- return err;
-}
-
/**
* ecryptfs_calculate_md5 - calculates the md5 of @src
* @dst: Pointer to 16 bytes of allocated memory
struct ecryptfs_crypt_stat *crypt_stat,
char *src, int len)
{
- struct crypto_shash *tfm;
- int rc = 0;
+ int rc = crypto_shash_tfm_digest(crypt_stat->hash_tfm, src, len, dst);
- tfm = crypt_stat->hash_tfm;
- rc = ecryptfs_hash_digest(tfm, src, len, dst);
if (rc) {
printk(KERN_ERR
"%s: Error computing crypto hash; rc = [%d]\n",
* event delivery.
*/
init_wait(&wait);
- write_lock_irq(&ep->lock);
- __add_wait_queue_exclusive(&ep->wq, &wait);
- write_unlock_irq(&ep->lock);
+ write_lock_irq(&ep->lock);
/*
- * We don't want to sleep if the ep_poll_callback() sends us
- * a wakeup in between. That's why we set the task state
- * to TASK_INTERRUPTIBLE before doing the checks.
+ * Barrierless variant, waitqueue_active() is called under
+ * the same lock on wakeup ep_poll_callback() side, so it
+ * is safe to avoid an explicit barrier.
*/
- set_current_state(TASK_INTERRUPTIBLE);
+ __set_current_state(TASK_INTERRUPTIBLE);
+
/*
- * Always short-circuit for fatal signals to allow
- * threads to make a timely exit without the chance of
- * finding more events available and fetching
- * repeatedly.
+ * Do the final check under the lock. ep_scan_ready_list()
+ * plays with two lists (->rdllist and ->ovflist) and there
+ * is always a race when both lists are empty for short
+ * period of time although events are pending, so lock is
+ * important.
*/
- if (fatal_signal_pending(current)) {
- res = -EINTR;
- break;
+ eavail = ep_events_available(ep);
+ if (!eavail) {
+ if (signal_pending(current))
+ res = -EINTR;
+ else
+ __add_wait_queue_exclusive(&ep->wq, &wait);
}
+ write_unlock_irq(&ep->lock);
- eavail = ep_events_available(ep);
- if (eavail)
- break;
- if (signal_pending(current)) {
- res = -EINTR;
+ if (eavail || res)
break;
- }
if (!schedule_hrtimeout_range(to, slack, HRTIMER_MODE_ABS)) {
timed_out = 1;
}
send_events:
+ if (fatal_signal_pending(current)) {
+ /*
+ * Always short-circuit for fatal signals to allow
+ * threads to make a timely exit without the chance of
+ * finding more events available and fetching
+ * repeatedly.
+ */
+ res = -EINTR;
+ }
/*
* Try to transfer events to user space. In case we get 0 events and
* there's still timeout left over, we go trying again in search of
*/
set_mm_exe_file(bprm->mm, bprm->file);
+ would_dump(bprm, bprm->file);
+
/*
* Release all of the old mmap stuff
*/
if (retval < 0)
goto out;
- would_dump(bprm, bprm->file);
-
retval = exec_binprm(bprm);
if (retval < 0)
goto out;
}
const struct file_operations exfat_file_operations = {
- .llseek = generic_file_llseek,
- .read_iter = generic_file_read_iter,
- .write_iter = generic_file_write_iter,
- .mmap = generic_file_mmap,
- .fsync = generic_file_fsync,
- .splice_read = generic_file_splice_read,
+ .llseek = generic_file_llseek,
+ .read_iter = generic_file_read_iter,
+ .write_iter = generic_file_write_iter,
+ .mmap = generic_file_mmap,
+ .fsync = generic_file_fsync,
+ .splice_read = generic_file_splice_read,
+ .splice_write = iter_file_splice_write,
};
const struct inode_operations exfat_file_inode_operations = {
exfat_fs_error(sb,
"non-zero size file starts with zero cluster (size : %llu, p_dir : %u, entry : 0x%08x)",
i_size_read(dir), ei->dir.dir, ei->entry);
+ kfree(es);
return -EIO;
}
Opt_errors,
Opt_discard,
Opt_time_offset,
+
+ /* Deprecated options */
+ Opt_utf8,
+ Opt_debug,
+ Opt_namecase,
+ Opt_codepage,
};
static const struct constant_table exfat_param_enums[] = {
fsparam_enum("errors", Opt_errors, exfat_param_enums),
fsparam_flag("discard", Opt_discard),
fsparam_s32("time_offset", Opt_time_offset),
+ __fsparam(NULL, "utf8", Opt_utf8, fs_param_deprecated,
+ NULL),
+ __fsparam(NULL, "debug", Opt_debug, fs_param_deprecated,
+ NULL),
+ __fsparam(fs_param_is_u32, "namecase", Opt_namecase,
+ fs_param_deprecated, NULL),
+ __fsparam(fs_param_is_u32, "codepage", Opt_codepage,
+ fs_param_deprecated, NULL),
{}
};
return -EINVAL;
opts->time_offset = result.int_32;
break;
+ case Opt_utf8:
+ case Opt_debug:
+ case Opt_namecase:
+ case Opt_codepage:
+ break;
default:
return -EINVAL;
}
#define EXT4_MAX_BLOCK_FILE_PHYS 0xFFFFFFFF
/* Max logical block we can support */
-#define EXT4_MAX_LOGICAL_BLOCK 0xFFFFFFFF
+#define EXT4_MAX_LOGICAL_BLOCK 0xFFFFFFFE
/*
* Structure of an inode on the disk
*/
#define EXT4_MF_MNTDIR_SAMPLED 0x0001
#define EXT4_MF_FS_ABORTED 0x0002 /* Fatal error detected */
-#define EXT4_MF_TEST_DUMMY_ENCRYPTION 0x0004
#ifdef CONFIG_FS_ENCRYPTION
-#define DUMMY_ENCRYPTION_ENABLED(sbi) (unlikely((sbi)->s_mount_flags & \
- EXT4_MF_TEST_DUMMY_ENCRYPTION))
+#define DUMMY_ENCRYPTION_ENABLED(sbi) ((sbi)->s_dummy_enc_ctx.ctx != NULL)
#else
#define DUMMY_ENCRYPTION_ENABLED(sbi) (0)
#endif
struct ratelimit_state s_warning_ratelimit_state;
struct ratelimit_state s_msg_ratelimit_state;
+ /* Encryption context for '-o test_dummy_encryption' */
+ struct fscrypt_dummy_context s_dummy_enc_ctx;
+
/*
* Barrier between writepages ops and changing any inode's JOURNAL_DATA
* or EXTENTS flag.
.iomap_begin = ext4_iomap_xattr_begin,
};
+static int ext4_fiemap_check_ranges(struct inode *inode, u64 start, u64 *len)
+{
+ u64 maxbytes;
+
+ if (ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS))
+ maxbytes = inode->i_sb->s_maxbytes;
+ else
+ maxbytes = EXT4_SB(inode->i_sb)->s_bitmap_maxbytes;
+
+ if (*len == 0)
+ return -EINVAL;
+ if (start > maxbytes)
+ return -EFBIG;
+
+ /*
+ * Shrink request scope to what the fs can actually handle.
+ */
+ if (*len > maxbytes || (maxbytes - *len) < start)
+ *len = maxbytes - start;
+ return 0;
+}
+
static int _ext4_fiemap(struct inode *inode, struct fiemap_extent_info *fieinfo,
__u64 start, __u64 len, bool from_es_cache)
{
if (fiemap_check_flags(fieinfo, ext4_fiemap_flags))
return -EBADR;
+ /*
+ * For bitmap files the maximum size limit could be smaller than
+ * s_maxbytes, so check len here manually instead of just relying on the
+ * generic check.
+ */
+ error = ext4_fiemap_check_ranges(inode, start, &len);
+ if (error)
+ return error;
+
if (fieinfo->fi_flags & FIEMAP_FLAG_XATTR) {
fieinfo->fi_flags &= ~FIEMAP_FLAG_XATTR;
error = iomap_fiemap(inode, fieinfo, start, len,
fa->fsx_projid = from_kprojid(&init_user_ns, ei->i_projid);
}
-/* copied from fs/ioctl.c */
-static int fiemap_check_ranges(struct super_block *sb,
- u64 start, u64 len, u64 *new_len)
-{
- u64 maxbytes = (u64) sb->s_maxbytes;
-
- *new_len = len;
-
- if (len == 0)
- return -EINVAL;
-
- if (start > maxbytes)
- return -EFBIG;
-
- /*
- * Shrink request scope to what the fs can actually handle.
- */
- if (len > maxbytes || (maxbytes - len) < start)
- *new_len = maxbytes - start;
-
- return 0;
-}
-
/* So that the fiemap access checks can't overflow on 32 bit machines. */
#define FIEMAP_MAX_EXTENTS (UINT_MAX / sizeof(struct fiemap_extent))
struct fiemap __user *ufiemap = (struct fiemap __user *) arg;
struct fiemap_extent_info fieinfo = { 0, };
struct inode *inode = file_inode(filp);
- struct super_block *sb = inode->i_sb;
- u64 len;
int error;
if (copy_from_user(&fiemap, ufiemap, sizeof(fiemap)))
if (fiemap.fm_extent_count > FIEMAP_MAX_EXTENTS)
return -EINVAL;
- error = fiemap_check_ranges(sb, fiemap.fm_start, fiemap.fm_length,
- &len);
- if (error)
- return error;
-
fieinfo.fi_flags = fiemap.fm_flags;
fieinfo.fi_extents_max = fiemap.fm_extent_count;
fieinfo.fi_extents_start = ufiemap->fm_extents;
if (fieinfo.fi_flags & FIEMAP_FLAG_SYNC)
filemap_write_and_wait(inode->i_mapping);
- error = ext4_get_es_cache(inode, &fieinfo, fiemap.fm_start, len);
+ error = ext4_get_es_cache(inode, &fieinfo, fiemap.fm_start,
+ fiemap.fm_length);
fiemap.fm_flags = fieinfo.fi_flags;
fiemap.fm_mapped_extents = fieinfo.fi_extents_mapped;
if (copy_to_user(ufiemap, &fiemap, sizeof(fiemap)))
crypto_free_shash(sbi->s_chksum_driver);
kfree(sbi->s_blockgroup_lock);
fs_put_dax(sbi->s_daxdev);
+ fscrypt_free_dummy_context(&sbi->s_dummy_enc_ctx);
#ifdef CONFIG_UNICODE
utf8_unload(sbi->s_encoding);
#endif
return res;
}
-static bool ext4_dummy_context(struct inode *inode)
+static const union fscrypt_context *
+ext4_get_dummy_context(struct super_block *sb)
{
- return DUMMY_ENCRYPTION_ENABLED(EXT4_SB(inode->i_sb));
+ return EXT4_SB(sb)->s_dummy_enc_ctx.ctx;
}
static bool ext4_has_stable_inodes(struct super_block *sb)
.key_prefix = "ext4:",
.get_context = ext4_get_context,
.set_context = ext4_set_context,
- .dummy_context = ext4_dummy_context,
+ .get_dummy_context = ext4_get_dummy_context,
.empty_dir = ext4_empty_dir,
.max_namelen = EXT4_NAME_LEN,
.has_stable_inodes = ext4_has_stable_inodes,
{Opt_init_itable, "init_itable"},
{Opt_noinit_itable, "noinit_itable"},
{Opt_max_dir_size_kb, "max_dir_size_kb=%u"},
+ {Opt_test_dummy_encryption, "test_dummy_encryption=%s"},
{Opt_test_dummy_encryption, "test_dummy_encryption"},
{Opt_nombcache, "nombcache"},
{Opt_nombcache, "no_mbcache"}, /* for backward compatibility */
{Opt_jqfmt_vfsv0, QFMT_VFS_V0, MOPT_QFMT},
{Opt_jqfmt_vfsv1, QFMT_VFS_V1, MOPT_QFMT},
{Opt_max_dir_size_kb, 0, MOPT_GTE0},
- {Opt_test_dummy_encryption, 0, MOPT_GTE0},
+ {Opt_test_dummy_encryption, 0, MOPT_STRING},
{Opt_nombcache, EXT4_MOUNT_NO_MBCACHE, MOPT_SET},
{Opt_err, 0, 0}
};
}
#endif
+static int ext4_set_test_dummy_encryption(struct super_block *sb,
+ const char *opt,
+ const substring_t *arg,
+ bool is_remount)
+{
+#ifdef CONFIG_FS_ENCRYPTION
+ struct ext4_sb_info *sbi = EXT4_SB(sb);
+ int err;
+
+ /*
+ * This mount option is just for testing, and it's not worthwhile to
+ * implement the extra complexity (e.g. RCU protection) that would be
+ * needed to allow it to be set or changed during remount. We do allow
+ * it to be specified during remount, but only if there is no change.
+ */
+ if (is_remount && !sbi->s_dummy_enc_ctx.ctx) {
+ ext4_msg(sb, KERN_WARNING,
+ "Can't set test_dummy_encryption on remount");
+ return -1;
+ }
+ err = fscrypt_set_test_dummy_encryption(sb, arg, &sbi->s_dummy_enc_ctx);
+ if (err) {
+ if (err == -EEXIST)
+ ext4_msg(sb, KERN_WARNING,
+ "Can't change test_dummy_encryption on remount");
+ else if (err == -EINVAL)
+ ext4_msg(sb, KERN_WARNING,
+ "Value of option \"%s\" is unrecognized", opt);
+ else
+ ext4_msg(sb, KERN_WARNING,
+ "Error processing option \"%s\" [%d]",
+ opt, err);
+ return -1;
+ }
+ ext4_msg(sb, KERN_WARNING, "Test dummy encryption mode enabled");
+#else
+ ext4_msg(sb, KERN_WARNING,
+ "Test dummy encryption mount option ignored");
+#endif
+ return 1;
+}
+
static int handle_mount_opt(struct super_block *sb, char *opt, int token,
substring_t *args, unsigned long *journal_devnum,
unsigned int *journal_ioprio, int is_remount)
*journal_ioprio =
IOPRIO_PRIO_VALUE(IOPRIO_CLASS_BE, arg);
} else if (token == Opt_test_dummy_encryption) {
-#ifdef CONFIG_FS_ENCRYPTION
- sbi->s_mount_flags |= EXT4_MF_TEST_DUMMY_ENCRYPTION;
- ext4_msg(sb, KERN_WARNING,
- "Test dummy encryption mode enabled");
-#else
- ext4_msg(sb, KERN_WARNING,
- "Test dummy encryption mount option ignored");
-#endif
+ return ext4_set_test_dummy_encryption(sb, opt, &args[0],
+ is_remount);
} else if (m->flags & MOPT_DATAJ) {
if (is_remount) {
if (!sbi->s_journal)
SEQ_OPTS_PRINT("max_dir_size_kb=%u", sbi->s_max_dir_size_kb);
if (test_opt(sb, DATA_ERR_ABORT))
SEQ_OPTS_PUTS("data_err=abort");
- if (DUMMY_ENCRYPTION_ENABLED(sbi))
- SEQ_OPTS_PUTS("test_dummy_encryption");
+
+ fscrypt_show_test_dummy_encryption(seq, sep, sb);
ext4_show_quota_options(seq, sb);
return 0;
for (i = 0; i < EXT4_MAXQUOTAS; i++)
kfree(get_qf_name(sb, sbi, i));
#endif
+ fscrypt_free_dummy_context(&sbi->s_dummy_enc_ctx);
ext4_blkdev_remove(sbi);
brelse(bh);
out_fail:
EXT4_ATTR_FEATURE(meta_bg_resize);
#ifdef CONFIG_FS_ENCRYPTION
EXT4_ATTR_FEATURE(encryption);
+EXT4_ATTR_FEATURE(test_dummy_encryption_v2);
#endif
#ifdef CONFIG_UNICODE
EXT4_ATTR_FEATURE(casefold);
ATTR_LIST(meta_bg_resize),
#ifdef CONFIG_FS_ENCRYPTION
ATTR_LIST(encryption),
+ ATTR_LIST(test_dummy_encryption_v2),
#endif
#ifdef CONFIG_UNICODE
ATTR_LIST(casefold),
int fsync_mode; /* fsync policy */
int fs_mode; /* fs mode: LFS or ADAPTIVE */
int bggc_mode; /* bggc mode: off, on or sync */
- bool test_dummy_encryption; /* test dummy encryption */
+ struct fscrypt_dummy_context dummy_enc_ctx; /* test dummy encryption */
block_t unusable_cap; /* Amount of space allowed to be
* unusable when disabling checkpoint
*/
#ifdef CONFIG_FS_ENCRYPTION
#define DUMMY_ENCRYPTION_ENABLED(sbi) \
- (unlikely(F2FS_OPTION(sbi).test_dummy_encryption))
+ (unlikely(F2FS_OPTION(sbi).dummy_enc_ctx.ctx != NULL))
#else
#define DUMMY_ENCRYPTION_ENABLED(sbi) (0)
#endif
#include <linux/types.h>
#include <linux/fs.h>
#include <linux/f2fs_fs.h>
-#include <linux/cryptohash.h>
#include <linux/pagemap.h>
#include <linux/unicode.h>
{Opt_whint, "whint_mode=%s"},
{Opt_alloc, "alloc_mode=%s"},
{Opt_fsync, "fsync_mode=%s"},
+ {Opt_test_dummy_encryption, "test_dummy_encryption=%s"},
{Opt_test_dummy_encryption, "test_dummy_encryption"},
{Opt_checkpoint_disable, "checkpoint=disable"},
{Opt_checkpoint_disable_cap, "checkpoint=disable:%u"},
}
#endif
-static int parse_options(struct super_block *sb, char *options)
+static int f2fs_set_test_dummy_encryption(struct super_block *sb,
+ const char *opt,
+ const substring_t *arg,
+ bool is_remount)
+{
+ struct f2fs_sb_info *sbi = F2FS_SB(sb);
+#ifdef CONFIG_FS_ENCRYPTION
+ int err;
+
+ if (!f2fs_sb_has_encrypt(sbi)) {
+ f2fs_err(sbi, "Encrypt feature is off");
+ return -EINVAL;
+ }
+
+ /*
+ * This mount option is just for testing, and it's not worthwhile to
+ * implement the extra complexity (e.g. RCU protection) that would be
+ * needed to allow it to be set or changed during remount. We do allow
+ * it to be specified during remount, but only if there is no change.
+ */
+ if (is_remount && !F2FS_OPTION(sbi).dummy_enc_ctx.ctx) {
+ f2fs_warn(sbi, "Can't set test_dummy_encryption on remount");
+ return -EINVAL;
+ }
+ err = fscrypt_set_test_dummy_encryption(
+ sb, arg, &F2FS_OPTION(sbi).dummy_enc_ctx);
+ if (err) {
+ if (err == -EEXIST)
+ f2fs_warn(sbi,
+ "Can't change test_dummy_encryption on remount");
+ else if (err == -EINVAL)
+ f2fs_warn(sbi, "Value of option \"%s\" is unrecognized",
+ opt);
+ else
+ f2fs_warn(sbi, "Error processing option \"%s\" [%d]",
+ opt, err);
+ return -EINVAL;
+ }
+ f2fs_warn(sbi, "Test dummy encryption mode enabled");
+#else
+ f2fs_warn(sbi, "Test dummy encryption mount option ignored");
+#endif
+ return 0;
+}
+
+static int parse_options(struct super_block *sb, char *options, bool is_remount)
{
struct f2fs_sb_info *sbi = F2FS_SB(sb);
substring_t args[MAX_OPT_ARGS];
int arg = 0, ext_cnt;
kuid_t uid;
kgid_t gid;
-#ifdef CONFIG_QUOTA
int ret;
-#endif
if (!options)
return 0;
kvfree(name);
break;
case Opt_test_dummy_encryption:
-#ifdef CONFIG_FS_ENCRYPTION
- if (!f2fs_sb_has_encrypt(sbi)) {
- f2fs_err(sbi, "Encrypt feature is off");
- return -EINVAL;
- }
-
- F2FS_OPTION(sbi).test_dummy_encryption = true;
- f2fs_info(sbi, "Test dummy encryption mode enabled");
-#else
- f2fs_info(sbi, "Test dummy encryption mount option ignored");
-#endif
+ ret = f2fs_set_test_dummy_encryption(sb, p, &args[0],
+ is_remount);
+ if (ret)
+ return ret;
break;
case Opt_checkpoint_disable_cap_perc:
if (args->from && match_int(args, &arg))
for (i = 0; i < MAXQUOTAS; i++)
kvfree(F2FS_OPTION(sbi).s_qf_names[i]);
#endif
+ fscrypt_free_dummy_context(&F2FS_OPTION(sbi).dummy_enc_ctx);
destroy_percpu_info(sbi);
for (i = 0; i < NR_PAGE_TYPE; i++)
kvfree(sbi->write_io[i]);
seq_printf(seq, ",whint_mode=%s", "user-based");
else if (F2FS_OPTION(sbi).whint_mode == WHINT_MODE_FS)
seq_printf(seq, ",whint_mode=%s", "fs-based");
-#ifdef CONFIG_FS_ENCRYPTION
- if (F2FS_OPTION(sbi).test_dummy_encryption)
- seq_puts(seq, ",test_dummy_encryption");
-#endif
+
+ fscrypt_show_test_dummy_encryption(seq, ',', sbi->sb);
if (F2FS_OPTION(sbi).alloc_mode == ALLOC_MODE_DEFAULT)
seq_printf(seq, ",alloc_mode=%s", "default");
F2FS_OPTION(sbi).whint_mode = WHINT_MODE_OFF;
F2FS_OPTION(sbi).alloc_mode = ALLOC_MODE_DEFAULT;
F2FS_OPTION(sbi).fsync_mode = FSYNC_MODE_POSIX;
- F2FS_OPTION(sbi).test_dummy_encryption = false;
F2FS_OPTION(sbi).s_resuid = make_kuid(&init_user_ns, F2FS_DEF_RESUID);
F2FS_OPTION(sbi).s_resgid = make_kgid(&init_user_ns, F2FS_DEF_RESGID);
F2FS_OPTION(sbi).compress_algorithm = COMPRESS_LZ4;
default_options(sbi);
/* parse mount options */
- err = parse_options(sb, data);
+ err = parse_options(sb, data, true);
if (err)
goto restore_opts;
checkpoint_changed =
ctx, len, fs_data, XATTR_CREATE);
}
-static bool f2fs_dummy_context(struct inode *inode)
+static const union fscrypt_context *
+f2fs_get_dummy_context(struct super_block *sb)
{
- return DUMMY_ENCRYPTION_ENABLED(F2FS_I_SB(inode));
+ return F2FS_OPTION(F2FS_SB(sb)).dummy_enc_ctx.ctx;
}
static bool f2fs_has_stable_inodes(struct super_block *sb)
.key_prefix = "f2fs:",
.get_context = f2fs_get_context,
.set_context = f2fs_set_context,
- .dummy_context = f2fs_dummy_context,
+ .get_dummy_context = f2fs_get_dummy_context,
.empty_dir = f2fs_empty_dir,
.max_namelen = F2FS_NAME_LEN,
.has_stable_inodes = f2fs_has_stable_inodes,
goto free_sb_buf;
}
- err = parse_options(sb, options);
+ err = parse_options(sb, options, false);
if (err)
goto free_options;
for (i = 0; i < MAXQUOTAS; i++)
kvfree(F2FS_OPTION(sbi).s_qf_names[i]);
#endif
+ fscrypt_free_dummy_context(&F2FS_OPTION(sbi).dummy_enc_ctx);
kvfree(options);
free_sb_buf:
kvfree(raw_super);
FEAT_SB_CHECKSUM,
FEAT_CASEFOLD,
FEAT_COMPRESSION,
+ FEAT_TEST_DUMMY_ENCRYPTION_V2,
};
static ssize_t f2fs_feature_show(struct f2fs_attr *a,
case FEAT_SB_CHECKSUM:
case FEAT_CASEFOLD:
case FEAT_COMPRESSION:
+ case FEAT_TEST_DUMMY_ENCRYPTION_V2:
return sprintf(buf, "supported\n");
}
return 0;
#ifdef CONFIG_FS_ENCRYPTION
F2FS_FEATURE_RO_ATTR(encryption, FEAT_CRYPTO);
+F2FS_FEATURE_RO_ATTR(test_dummy_encryption_v2, FEAT_TEST_DUMMY_ENCRYPTION_V2);
#endif
#ifdef CONFIG_BLK_DEV_ZONED
F2FS_FEATURE_RO_ATTR(block_zoned, FEAT_BLKZONED);
static struct attribute *f2fs_feat_attrs[] = {
#ifdef CONFIG_FS_ENCRYPTION
ATTR_LIST(encryption),
+ ATTR_LIST(test_dummy_encryption_v2),
#endif
#ifdef CONFIG_BLK_DEV_ZONED
ATTR_LIST(block_zoned),
*/
static void copy_fdtable(struct fdtable *nfdt, struct fdtable *ofdt)
{
- unsigned int cpy, set;
+ size_t cpy, set;
BUG_ON(nfdt->max_fds < ofdt->max_fds);
/* Advance in metadata tree. */
(mp->mp_list[hgt])++;
- if (mp->mp_list[hgt] >= sdp->sd_inptrs) {
- if (!hgt)
+ if (hgt) {
+ if (mp->mp_list[hgt] >= sdp->sd_inptrs)
+ goto lower_metapath;
+ } else {
+ if (mp->mp_list[hgt] >= sdp->sd_diptrs)
break;
- goto lower_metapath;
}
fill_up_metapath:
ret = -ENOENT;
goto unlock;
} else {
- /* report a hole */
iomap->offset = pos;
iomap->length = length;
- goto do_alloc;
+ goto hole_found;
}
}
iomap->length = size;
return ret;
do_alloc:
- iomap->addr = IOMAP_NULL_ADDR;
- iomap->type = IOMAP_HOLE;
if (flags & IOMAP_REPORT) {
if (pos >= size)
ret = -ENOENT;
if (pos < size && height == ip->i_height)
ret = gfs2_hole_size(inode, lblock, len, mp, iomap);
}
+hole_found:
+ iomap->addr = IOMAP_NULL_ADDR;
+ iomap->type = IOMAP_HOLE;
goto out;
}
fs_err(sdp, "Error %d syncing glock \n", ret);
gfs2_dump_glock(NULL, gl, true);
}
- return;
+ goto skip_inval;
}
}
if (test_bit(GLF_INVALIDATE_IN_PROGRESS, &gl->gl_flags)) {
clear_bit(GLF_INVALIDATE_IN_PROGRESS, &gl->gl_flags);
}
+skip_inval:
gfs2_glock_hold(gl);
/*
* Check for an error encountered since we called go_sync and go_inval.
goto out_unlock;
if (nonblock)
goto out_sched;
- smp_mb();
- if (atomic_read(&gl->gl_revokes) != 0)
- goto out_sched;
set_bit(GLF_DEMOTE_IN_PROGRESS, &gl->gl_flags);
GLOCK_BUG_ON(gl, gl->gl_demote_state == LM_ST_EXCLUSIVE);
gl->gl_target = gl->gl_demote_state;
error = finish_no_open(file, NULL);
}
gfs2_glock_dq_uninit(ghs);
- return error;
+ goto fail;
} else if (error != -ENOENT) {
goto fail_gunlock;
}
error = finish_open(file, dentry, gfs2_open_common);
}
gfs2_glock_dq_uninit(ghs);
+ gfs2_qa_put(ip);
gfs2_glock_dq_uninit(ghs + 1);
clear_bit(GLF_INODE_CREATING, &io_gl->gl_flags);
gfs2_glock_put(io_gl);
+ gfs2_qa_put(dip);
return error;
fail_gunlock3:
clear_bit(GLF_INODE_CREATING, &io_gl->gl_flags);
gfs2_glock_put(io_gl);
fail_free_inode:
- gfs2_qa_put(ip);
if (ip->i_gl) {
glock_clear_object(ip->i_gl, ip);
gfs2_glock_put(ip->i_gl);
out_child:
gfs2_glock_dq(ghs);
out_parent:
- gfs2_qa_put(ip);
+ gfs2_qa_put(dip);
gfs2_holder_uninit(ghs);
gfs2_holder_uninit(ghs + 1);
return error;
struct buffer_head *bh = bd->bd_bh;
struct gfs2_glock *gl = bd->bd_gl;
+ sdp->sd_log_num_revoke++;
+ if (atomic_inc_return(&gl->gl_revokes) == 1)
+ gfs2_glock_hold(gl);
bh->b_private = NULL;
bd->bd_blkno = bh->b_blocknr;
gfs2_remove_from_ail(bd); /* drops ref on bh */
bd->bd_bh = NULL;
- sdp->sd_log_num_revoke++;
- if (atomic_inc_return(&gl->gl_revokes) == 1)
- gfs2_glock_hold(gl);
set_bit(GLF_LFLUSH, &gl->gl_flags);
list_add(&bd->bd_list, &sdp->sd_log_revokes);
}
while (!kthread_should_stop()) {
+ if (gfs2_withdrawn(sdp)) {
+ msleep_interruptible(HZ);
+ continue;
+ }
/* Check for errors writing to the journal */
if (sdp->sd_log_error) {
gfs2_lm(sdp,
"prevent further damage.\n",
sdp->sd_fsname, sdp->sd_log_error);
gfs2_withdraw(sdp);
+ continue;
}
did_flush = false;
struct super_block *sb = sdp->sd_vfs;
struct bio *bio = bio_alloc(GFP_NOIO, BIO_MAX_PAGES);
- bio->bi_iter.bi_sector = blkno << (sb->s_blocksize_bits - 9);
+ bio->bi_iter.bi_sector = blkno << sdp->sd_fsb2bb_shift;
bio_set_dev(bio, sb->s_bdev);
bio->bi_end_io = end_io;
bio->bi_private = sdp;
unsigned int bsize = sdp->sd_sb.sb_bsize, off;
unsigned int bsize_shift = sdp->sd_sb.sb_bsize_shift;
unsigned int shift = PAGE_SHIFT - bsize_shift;
- unsigned int readahead_blocks = BIO_MAX_PAGES << shift;
+ unsigned int max_blocks = 2 * 1024 * 1024 >> bsize_shift;
struct gfs2_journal_extent *je;
int sz, ret = 0;
struct bio *bio = NULL;
struct page *page = NULL;
- bool bio_chained = false, done = false;
+ bool done = false;
errseq_t since;
memset(head, 0, sizeof(*head));
off = 0;
}
- if (!bio || (bio_chained && !off)) {
- /* start new bio */
- } else {
- sz = bio_add_page(bio, page, bsize, off);
- if (sz == bsize)
- goto block_added;
+ if (bio && (off || block < blocks_submitted + max_blocks)) {
+ sector_t sector = dblock << sdp->sd_fsb2bb_shift;
+
+ if (bio_end_sector(bio) == sector) {
+ sz = bio_add_page(bio, page, bsize, off);
+ if (sz == bsize)
+ goto block_added;
+ }
if (off) {
unsigned int blocks =
(PAGE_SIZE - off) >> bsize_shift;
bio = gfs2_chain_bio(bio, blocks);
- bio_chained = true;
goto add_block_to_new_bio;
}
}
if (bio) {
- blocks_submitted = block + 1;
+ blocks_submitted = block;
submit_bio(bio);
}
bio = gfs2_log_alloc_bio(sdp, dblock, gfs2_end_log_read);
bio->bi_opf = REQ_OP_READ;
- bio_chained = false;
add_block_to_new_bio:
sz = bio_add_page(bio, page, bsize, off);
BUG_ON(sz != bsize);
off += bsize;
if (off == PAGE_SIZE)
page = NULL;
- if (blocks_submitted < blocks_read + readahead_blocks) {
+ if (blocks_submitted <= blocks_read + max_blocks) {
/* Keep at least one bio in flight */
continue;
}
int num = 0;
if (unlikely(gfs2_withdrawn(sdp)) &&
- (!sdp->sd_jdesc || (blkno != sdp->sd_jdesc->jd_no_addr))) {
+ (!sdp->sd_jdesc || gl != sdp->sd_jinode_gl)) {
*bhp = NULL;
return -EIO;
}
u32 x;
int error = 0;
- if (capable(CAP_SYS_RESOURCE) ||
- sdp->sd_args.ar_quota != GFS2_QUOTA_ON)
+ if (sdp->sd_args.ar_quota != GFS2_QUOTA_ON)
return 0;
error = gfs2_quota_hold(ip, uid, gid);
int found;
if (!test_and_clear_bit(GIF_QD_LOCKED, &ip->i_flags))
- goto out;
+ return;
for (x = 0; x < ip->i_qadata->qa_qd_num; x++) {
struct gfs2_quota_data *qd;
qd_unlock(qda[x]);
}
-out:
gfs2_quota_unhold(ip);
}
if (!test_bit(GIF_QD_LOCKED, &ip->i_flags))
return 0;
- if (sdp->sd_args.ar_quota != GFS2_QUOTA_ON)
- return 0;
-
for (x = 0; x < ip->i_qadata->qa_qd_num; x++) {
qd = ip->i_qadata->qa_qd[x];
if (ip->i_diskflags & GFS2_DIF_SYSTEM)
return;
- BUG_ON(ip->i_qadata->qa_ref <= 0);
+ if (gfs2_assert_withdraw(sdp, ip->i_qadata &&
+ ip->i_qadata->qa_ref > 0))
+ return;
for (x = 0; x < ip->i_qadata->qa_qd_num; x++) {
qd = ip->i_qadata->qa_qd[x];
int ret;
ap->allowed = UINT_MAX; /* Assume we are permitted a whole lot */
- if (sdp->sd_args.ar_quota == GFS2_QUOTA_OFF)
+ if (capable(CAP_SYS_RESOURCE) ||
+ sdp->sd_args.ar_quota == GFS2_QUOTA_OFF)
return 0;
ret = gfs2_quota_lock(ip, NO_UID_QUOTA_CHANGE, NO_GID_QUOTA_CHANGE);
if (ret)
if (ip->i_qadata)
gfs2_assert_warn(sdp, ip->i_qadata->qa_ref == 0);
gfs2_rs_delete(ip, NULL);
- gfs2_qa_put(ip);
gfs2_ordered_del_inode(ip);
clear_inode(inode);
gfs2_dir_hash_inval(ip);
if (!sb_rdonly(sdp->sd_vfs))
ret = gfs2_make_fs_ro(sdp);
+ if (sdp->sd_lockstruct.ls_ops->lm_lock == NULL) { /* lock_nolock */
+ if (!ret)
+ ret = -EIO;
+ clear_bit(SDF_WITHDRAW_RECOVERY, &sdp->sd_flags);
+ goto skip_recovery;
+ }
/*
* Drop the glock for our journal so another node can recover it.
*/
wait_on_bit(&gl->gl_flags, GLF_FREEING, TASK_UNINTERRUPTIBLE);
}
- if (sdp->sd_lockstruct.ls_ops->lm_lock == NULL) { /* lock_nolock */
- clear_bit(SDF_WITHDRAW_RECOVERY, &sdp->sd_flags);
- goto skip_recovery;
- }
/*
* Dequeue the "live" glock, but keep a reference so it's never freed.
*/
bool needs_fixed_file;
u8 opcode;
+ u16 buf_index;
+
struct io_ring_ctx *ctx;
struct list_head list;
unsigned int flags;
goto err;
ctx->flags = p->flags;
+ init_waitqueue_head(&ctx->sqo_wait);
init_waitqueue_head(&ctx->cq_wait);
INIT_LIST_HEAD(&ctx->cq_overflow_list);
init_completion(&ctx->completions[0]);
for (i = 0; i < rb->to_free; i++) {
struct io_kiocb *req = rb->reqs[i];
- if (req->flags & REQ_F_FIXED_FILE) {
- req->file = NULL;
- percpu_ref_put(req->fixed_file_refs);
- }
if (req->flags & REQ_F_INFLIGHT)
inflight++;
__io_req_aux_free(req);
if ((req->flags & REQ_F_LINK_HEAD) || io_is_fallback_req(req))
return false;
- if (!(req->flags & REQ_F_FIXED_FILE) || req->io)
+ if (req->file || req->io)
rb->need_iter++;
rb->reqs[rb->to_free++] = req;
req->rw.addr = READ_ONCE(sqe->addr);
req->rw.len = READ_ONCE(sqe->len);
- /* we own ->private, reuse it for the buffer index / buffer ID */
- req->rw.kiocb.private = (void *) (unsigned long)
- READ_ONCE(sqe->buf_index);
+ req->buf_index = READ_ONCE(sqe->buf_index);
return 0;
}
struct io_ring_ctx *ctx = req->ctx;
size_t len = req->rw.len;
struct io_mapped_ubuf *imu;
- unsigned index, buf_index;
+ u16 index, buf_index;
size_t offset;
u64 buf_addr;
if (unlikely(!ctx->user_bufs))
return -EFAULT;
- buf_index = (unsigned long) req->rw.kiocb.private;
+ buf_index = req->buf_index;
if (unlikely(buf_index >= ctx->nr_user_bufs))
return -EFAULT;
bool needs_lock)
{
struct io_buffer *kbuf;
- int bgid;
+ u16 bgid;
kbuf = (struct io_buffer *) (unsigned long) req->rw.addr;
- bgid = (int) (unsigned long) req->rw.kiocb.private;
+ bgid = req->buf_index;
kbuf = io_buffer_select(req, len, bgid, kbuf, needs_lock);
if (IS_ERR(kbuf))
return kbuf;
}
/* buffer index only valid with fixed read/write, or buffer select */
- if (req->rw.kiocb.private && !(req->flags & REQ_F_BUFFER_SELECT))
+ if (req->buf_index && !(req->flags & REQ_F_BUFFER_SELECT))
return -EINVAL;
if (opcode == IORING_OP_READ || opcode == IORING_OP_WRITE) {
struct file *out = sp->file_out;
unsigned int flags = sp->flags & ~SPLICE_F_FD_IN_FIXED;
loff_t *poff_in, *poff_out;
- long ret;
+ long ret = 0;
if (force_nonblock)
return -EAGAIN;
poff_in = (sp->off_in == -1) ? NULL : &sp->off_in;
poff_out = (sp->off_out == -1) ? NULL : &sp->off_out;
- ret = do_splice(in, poff_in, out, poff_out, sp->len, flags);
- if (force_nonblock && ret == -EAGAIN)
- return -EAGAIN;
+
+ if (sp->len)
+ ret = do_splice(in, poff_in, out, poff_out, sp->len, flags);
io_put_file(req, in, (sp->flags & SPLICE_F_FD_IN_FIXED));
req->flags &= ~REQ_F_NEED_CLEANUP;
req->result = mask;
init_task_work(&req->task_work, func);
/*
- * If this fails, then the task is exiting. Punt to one of the io-wq
- * threads to ensure the work gets run, we can't always rely on exit
- * cancelation taking care of this.
+ * If this fails, then the task is exiting. When a task exits, the
+ * work gets canceled, so just cancel this request as well instead
+ * of executing it. We can't safely execute it anyway, as we may not
+ * have the needed state needed for it anyway.
*/
ret = task_work_add(tsk, &req->task_work, true);
if (unlikely(ret)) {
+ WRITE_ONCE(poll->canceled, true);
tsk = io_wq_get_task(req->ctx->io_wq);
task_work_add(tsk, &req->task_work, true);
}
if (!req_need_defer(req) && list_empty_careful(&ctx->defer_list))
return 0;
- if (!req->io && io_alloc_async_ctx(req))
- return -EAGAIN;
-
- ret = io_req_defer_prep(req, sqe);
- if (ret < 0)
- return ret;
+ if (!req->io) {
+ if (io_alloc_async_ctx(req))
+ return -EAGAIN;
+ ret = io_req_defer_prep(req, sqe);
+ if (ret < 0)
+ return ret;
+ }
spin_lock_irq(&ctx->completion_lock);
if (!req_need_defer(req) && list_empty(&ctx->defer_list)) {
if (ret)
return ret;
- if (ctx->flags & IORING_SETUP_IOPOLL) {
+ /* If the op doesn't have a file, we're not polling for it */
+ if ((ctx->flags & IORING_SETUP_IOPOLL) && req->file) {
const bool in_async = io_wq_current_is_worker();
if (req->result == -EAGAIN)
io_double_put_req(req);
}
} else if (req->flags & REQ_F_FORCE_ASYNC) {
- ret = io_req_defer_prep(req, sqe);
- if (unlikely(ret < 0))
- goto fail_req;
+ if (!req->io) {
+ ret = -EAGAIN;
+ if (io_alloc_async_ctx(req))
+ goto fail_req;
+ ret = io_req_defer_prep(req, sqe);
+ if (unlikely(ret < 0))
+ goto fail_req;
+ }
+
/*
* Never try inline submit of IOSQE_ASYNC is set, go straight
* to async execution.
finish_wait(&ctx->sqo_wait, &wait);
ctx->rings->sq_flags &= ~IORING_SQ_NEED_WAKEUP;
+ ret = 0;
continue;
}
finish_wait(&ctx->sqo_wait, &wait);
{
int ret;
- init_waitqueue_head(&ctx->sqo_wait);
mmgrab(current->mm);
ctx->sqo_mm = current->mm;
nfss->fscache_key = NULL;
nfss->fscache = NULL;
- if (!(nfss->options & NFS_OPTION_FSCACHE))
- return;
if (!uniq) {
uniq = "";
ulen = 1;
/* create a cache index for looking up filehandles */
nfss->fscache = fscache_acquire_cookie(nfss->nfs_client->fscache,
&nfs_fscache_super_index_def,
- key, sizeof(*key) + ulen,
+ &key->key,
+ sizeof(key->key) + ulen,
NULL, 0,
nfss, 0, true);
dfprintk(FSCACHE, "NFS: get superblock cookie (0x%p/0x%p)\n",
}
}
+static void nfs_fscache_update_auxdata(struct nfs_fscache_inode_auxdata *auxdata,
+ struct nfs_inode *nfsi)
+{
+ memset(auxdata, 0, sizeof(*auxdata));
+ auxdata->mtime_sec = nfsi->vfs_inode.i_mtime.tv_sec;
+ auxdata->mtime_nsec = nfsi->vfs_inode.i_mtime.tv_nsec;
+ auxdata->ctime_sec = nfsi->vfs_inode.i_ctime.tv_sec;
+ auxdata->ctime_nsec = nfsi->vfs_inode.i_ctime.tv_nsec;
+
+ if (NFS_SERVER(&nfsi->vfs_inode)->nfs_client->rpc_ops->version == 4)
+ auxdata->change_attr = inode_peek_iversion_raw(&nfsi->vfs_inode);
+}
+
/*
* Initialise the per-inode cache cookie pointer for an NFS inode.
*/
if (!(nfss->fscache && S_ISREG(inode->i_mode)))
return;
- memset(&auxdata, 0, sizeof(auxdata));
- auxdata.mtime_sec = nfsi->vfs_inode.i_mtime.tv_sec;
- auxdata.mtime_nsec = nfsi->vfs_inode.i_mtime.tv_nsec;
- auxdata.ctime_sec = nfsi->vfs_inode.i_ctime.tv_sec;
- auxdata.ctime_nsec = nfsi->vfs_inode.i_ctime.tv_nsec;
-
- if (NFS_SERVER(&nfsi->vfs_inode)->nfs_client->rpc_ops->version == 4)
- auxdata.change_attr = inode_peek_iversion_raw(&nfsi->vfs_inode);
+ nfs_fscache_update_auxdata(&auxdata, nfsi);
nfsi->fscache = fscache_acquire_cookie(NFS_SB(inode->i_sb)->fscache,
&nfs_fscache_inode_object_def,
dfprintk(FSCACHE, "NFS: clear cookie (0x%p/0x%p)\n", nfsi, cookie);
- memset(&auxdata, 0, sizeof(auxdata));
- auxdata.mtime_sec = nfsi->vfs_inode.i_mtime.tv_sec;
- auxdata.mtime_nsec = nfsi->vfs_inode.i_mtime.tv_nsec;
- auxdata.ctime_sec = nfsi->vfs_inode.i_ctime.tv_sec;
- auxdata.ctime_nsec = nfsi->vfs_inode.i_ctime.tv_nsec;
+ nfs_fscache_update_auxdata(&auxdata, nfsi);
fscache_relinquish_cookie(cookie, &auxdata, false);
nfsi->fscache = NULL;
}
if (!fscache_cookie_valid(cookie))
return;
- memset(&auxdata, 0, sizeof(auxdata));
- auxdata.mtime_sec = nfsi->vfs_inode.i_mtime.tv_sec;
- auxdata.mtime_nsec = nfsi->vfs_inode.i_mtime.tv_nsec;
- auxdata.ctime_sec = nfsi->vfs_inode.i_ctime.tv_sec;
- auxdata.ctime_nsec = nfsi->vfs_inode.i_ctime.tv_nsec;
+ nfs_fscache_update_auxdata(&auxdata, nfsi);
if (inode_is_open_for_write(inode)) {
dfprintk(FSCACHE, "NFS: nfsi 0x%p disabling cache\n", nfsi);
#define encode_dirpath_sz (1 + XDR_QUADLEN(MNTPATHLEN))
#define MNT_status_sz (1)
#define MNT_fhandle_sz XDR_QUADLEN(NFS2_FHSIZE)
+#define MNT_fhandlev3_sz XDR_QUADLEN(NFS3_FHSIZE)
#define MNT_authflav3_sz (1 + NFS_MAX_SECFLAVORS)
/*
*/
#define MNT_enc_dirpath_sz encode_dirpath_sz
#define MNT_dec_mountres_sz (MNT_status_sz + MNT_fhandle_sz)
-#define MNT_dec_mountres3_sz (MNT_status_sz + MNT_fhandle_sz + \
+#define MNT_dec_mountres3_sz (MNT_status_sz + MNT_fhandlev3_sz + \
MNT_authflav3_sz)
/*
.rpc_client = server->client,
.rpc_message = &msg,
.callback_ops = &nfs4_delegreturn_ops,
- .flags = RPC_TASK_ASYNC | RPC_TASK_CRED_NOREF | RPC_TASK_TIMEOUT,
+ .flags = RPC_TASK_ASYNC | RPC_TASK_TIMEOUT,
};
int status = 0;
state = new;
state->owner = owner;
atomic_inc(&owner->so_count);
- list_add_rcu(&state->inode_states, &nfsi->open_states);
ihold(inode);
state->inode = inode;
+ list_add_rcu(&state->inode_states, &nfsi->open_states);
spin_unlock(&inode->i_lock);
/* Note: The reclaim code dictates that we add stateless
* and read-only stateids to the end of the list */
.callback_ops = call_ops,
.callback_data = hdr,
.workqueue = nfsiod_workqueue,
- .flags = RPC_TASK_ASYNC | RPC_TASK_CRED_NOREF | flags,
+ .flags = RPC_TASK_ASYNC | flags,
};
hdr->rw_ops->rw_initiate(hdr, &msg, rpc_ops, &task_setup_data, how);
hdr->cred,
NFS_PROTO(hdr->inode),
desc->pg_rpc_callops,
- desc->pg_ioflags, 0);
+ desc->pg_ioflags,
+ RPC_TASK_CRED_NOREF);
return ret;
}
nfs_init_commit(data, NULL, NULL, cinfo);
nfs_initiate_commit(NFS_CLIENT(inode), data,
NFS_PROTO(data->inode),
- data->mds_ops, how, 0);
+ data->mds_ops, how,
+ RPC_TASK_CRED_NOREF);
} else {
nfs_init_commit(data, NULL, data->lseg, cinfo);
initiate_commit(data, how);
uniq = ctx->fscache_uniq;
ulen = strlen(ctx->fscache_uniq);
}
- return;
}
nfs_fscache_get_super_cookie(sb, uniq, ulen);
.callback_ops = call_ops,
.callback_data = data,
.workqueue = nfsiod_workqueue,
- .flags = RPC_TASK_ASYNC | RPC_TASK_CRED_NOREF | flags,
+ .flags = RPC_TASK_ASYNC | flags,
.priority = priority,
};
/* Set up the initial task struct. */
nfs_init_commit(data, head, NULL, cinfo);
atomic_inc(&cinfo->mds->rpcs_out);
return nfs_initiate_commit(NFS_CLIENT(inode), data, NFS_PROTO(inode),
- data->mds_ops, how, 0);
+ data->mds_ops, how, RPC_TASK_CRED_NOREF);
}
/*
goto out;
}
- {
- SHASH_DESC_ON_STACK(desc, tfm);
-
- desc->tfm = tfm;
-
- status = crypto_shash_digest(desc, clname->data, clname->len,
- cksum.data);
- shash_desc_zero(desc);
- }
-
+ status = crypto_shash_tfm_digest(tfm, clname->data, clname->len,
+ cksum.data);
if (status)
goto out;
struct crypto_shash *tfm = cn->cn_tfm;
struct xdr_netobj cksum;
char *principal = NULL;
- SHASH_DESC_ON_STACK(desc, tfm);
/* Don't upcall if it's already stored */
if (test_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags))
else if (clp->cl_cred.cr_principal)
principal = clp->cl_cred.cr_principal;
if (principal) {
- desc->tfm = tfm;
cksum.len = crypto_shash_digestsize(tfm);
cksum.data = kmalloc(cksum.len, GFP_KERNEL);
if (cksum.data == NULL) {
ret = -ENOMEM;
goto out;
}
- ret = crypto_shash_digest(desc, principal, strlen(principal),
- cksum.data);
- shash_desc_zero(desc);
+ ret = crypto_shash_tfm_digest(tfm, principal, strlen(principal),
+ cksum.data);
if (ret) {
kfree(cksum.data);
goto out;
struct crypto_shash *tfm = cn->cn_tfm;
struct xdr_netobj cksum;
char *principal = NULL;
- SHASH_DESC_ON_STACK(desc, tfm);
/* did we already find that this client is stable? */
if (test_bit(NFSD4_CLIENT_STABLE, &clp->cl_flags))
principal = clp->cl_cred.cr_principal;
if (principal == NULL)
return -ENOENT;
- desc->tfm = tfm;
cksum.len = crypto_shash_digestsize(tfm);
cksum.data = kmalloc(cksum.len, GFP_KERNEL);
if (cksum.data == NULL)
return -ENOENT;
- status = crypto_shash_digest(desc, principal, strlen(principal),
- cksum.data);
- shash_desc_zero(desc);
+ status = crypto_shash_tfm_digest(tfm, principal,
+ strlen(principal), cksum.data);
if (status) {
kfree(cksum.data);
return -ENOENT;
BUILD_BUG_ON(FAN_OPEN_EXEC != FS_OPEN_EXEC);
BUILD_BUG_ON(FAN_OPEN_EXEC_PERM != FS_OPEN_EXEC_PERM);
- BUILD_BUG_ON(HWEIGHT32(ALL_FANOTIFY_EVENT_BITS) != 20);
+ BUILD_BUG_ON(HWEIGHT32(ALL_FANOTIFY_EVENT_BITS) != 19);
mask = fanotify_group_event_mask(group, iter_info, mask, data,
data_type);
if (fh_type != OVL_FILEID_V0)
return ERR_PTR(-EINVAL);
+ if (buflen <= OVL_FH_WIRE_OFFSET)
+ return ERR_PTR(-EINVAL);
+
fh = kzalloc(buflen, GFP_KERNEL);
if (!fh)
return ERR_PTR(-ENOMEM);
if (attr->ia_valid & (ATTR_KILL_SUID|ATTR_KILL_SGID))
attr->ia_valid &= ~ATTR_MODE;
+ /*
+ * We might have to translate ovl file into real file object
+ * once use cases emerge. For now, simply don't let underlying
+ * filesystem rely on attr->ia_file
+ */
+ attr->ia_valid &= ~ATTR_FILE;
+
+ /*
+ * If open(O_TRUNC) is done, VFS calls ->setattr with ATTR_OPEN
+ * set. Overlayfs does not pass O_TRUNC flag to underlying
+ * filesystem during open -> do not pass ATTR_OPEN. This
+ * disables optimization in fuse which assumes open(O_TRUNC)
+ * already set file size to 0. But we never passed O_TRUNC to
+ * fuse. So by clearing ATTR_OPEN, fuse will be forced to send
+ * setattr request to server.
+ */
+ attr->ia_valid &= ~ATTR_OPEN;
+
inode_lock(upperdentry->d_inode);
old_cred = ovl_override_creds(dentry->d_sb);
err = notify_change(upperdentry, attr, NULL);
[ilog2(VM_GROWSDOWN)] = "gd",
[ilog2(VM_PFNMAP)] = "pf",
[ilog2(VM_DENYWRITE)] = "dw",
-#ifdef CONFIG_X86_INTEL_MPX
- [ilog2(VM_MPX)] = "mp",
-#endif
[ilog2(VM_LOCKED)] = "lo",
[ilog2(VM_IO)] = "io",
[ilog2(VM_SEQ_READ)] = "sr",
"ramoops.ko".
For more information, see Documentation/admin-guide/ramoops.rst.
+
+config PSTORE_ZONE
+ tristate
+ depends on PSTORE
+ help
+ The common layer for pstore/blk (and pstore/ram in the future)
+ to manage storage in zones.
+
+config PSTORE_BLK
+ tristate "Log panic/oops to a block device"
+ depends on PSTORE
+ depends on BLOCK
+ select PSTORE_ZONE
+ default n
+ help
+ This enables panic and oops message to be logged to a block dev
+ where it can be read back at some later point.
+
+ For more information, see Documentation/admin-guide/pstore-blk.rst
+
+ If unsure, say N.
+
+config PSTORE_BLK_BLKDEV
+ string "block device identifier"
+ depends on PSTORE_BLK
+ default ""
+ help
+ Which block device should be used for pstore/blk.
+
+ It accepts the following variants:
+ 1) <hex_major><hex_minor> device number in hexadecimal representation,
+ with no leading 0x, for example b302.
+ 2) /dev/<disk_name> represents the device name of disk
+ 3) /dev/<disk_name><decimal> represents the device name and number
+ of partition - device number of disk plus the partition number
+ 4) /dev/<disk_name>p<decimal> - same as the above, this form is
+ used when disk name of partitioned disk ends with a digit.
+ 5) PARTUUID=00112233-4455-6677-8899-AABBCCDDEEFF representing the
+ unique id of a partition if the partition table provides it.
+ The UUID may be either an EFI/GPT UUID, or refer to an MSDOS
+ partition using the format SSSSSSSS-PP, where SSSSSSSS is a zero-
+ filled hex representation of the 32-bit "NT disk signature", and PP
+ is a zero-filled hex representation of the 1-based partition number.
+ 6) PARTUUID=<UUID>/PARTNROFF=<int> to select a partition in relation
+ to a partition with a known unique id.
+ 7) <major>:<minor> major and minor number of the device separated by
+ a colon.
+
+ NOTE that, both Kconfig and module parameters can configure
+ pstore/blk, but module parameters have priority over Kconfig.
+
+config PSTORE_BLK_KMSG_SIZE
+ int "Size in Kbytes of kmsg dump log to store"
+ depends on PSTORE_BLK
+ default 64
+ help
+ This just sets size of kmsg dump (oops, panic, etc) log for
+ pstore/blk. The size is in KB and must be a multiple of 4.
+
+ NOTE that, both Kconfig and module parameters can configure
+ pstore/blk, but module parameters have priority over Kconfig.
+
+config PSTORE_BLK_MAX_REASON
+ int "Maximum kmsg dump reason to store"
+ depends on PSTORE_BLK
+ default 2
+ help
+ The maximum reason for kmsg dumps to store. The default is
+ 2 (KMSG_DUMP_OOPS), see include/linux/kmsg_dump.h's
+ enum kmsg_dump_reason for more details.
+
+ NOTE that, both Kconfig and module parameters can configure
+ pstore/blk, but module parameters have priority over Kconfig.
+
+config PSTORE_BLK_PMSG_SIZE
+ int "Size in Kbytes of pmsg to store"
+ depends on PSTORE_BLK
+ depends on PSTORE_PMSG
+ default 64
+ help
+ This just sets size of pmsg (pmsg_size) for pstore/blk. The size is
+ in KB and must be a multiple of 4.
+
+ NOTE that, both Kconfig and module parameters can configure
+ pstore/blk, but module parameters have priority over Kconfig.
+
+config PSTORE_BLK_CONSOLE_SIZE
+ int "Size in Kbytes of console log to store"
+ depends on PSTORE_BLK
+ depends on PSTORE_CONSOLE
+ default 64
+ help
+ This just sets size of console log (console_size) to store via
+ pstore/blk. The size is in KB and must be a multiple of 4.
+
+ NOTE that, both Kconfig and module parameters can configure
+ pstore/blk, but module parameters have priority over Kconfig.
+
+config PSTORE_BLK_FTRACE_SIZE
+ int "Size in Kbytes of ftrace log to store"
+ depends on PSTORE_BLK
+ depends on PSTORE_FTRACE
+ default 64
+ help
+ This just sets size of ftrace log (ftrace_size) for pstore/blk. The
+ size is in KB and must be a multiple of 4.
+
+ NOTE that, both Kconfig and module parameters can configure
+ pstore/blk, but module parameters have priority over Kconfig.
ramoops-objs += ram.o ram_core.o
obj-$(CONFIG_PSTORE_RAM) += ramoops.o
+
+pstore_zone-objs += zone.o
+obj-$(CONFIG_PSTORE_ZONE) += pstore_zone.o
+
+pstore_blk-objs += blk.o
+obj-$(CONFIG_PSTORE_BLK) += pstore_blk.o
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Implements pstore backend driver that write to block (or non-block) storage
+ * devices, using the pstore/zone API.
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include "../../block/blk.h"
+#include <linux/blkdev.h>
+#include <linux/string.h>
+#include <linux/of.h>
+#include <linux/of_address.h>
+#include <linux/platform_device.h>
+#include <linux/pstore_blk.h>
+#include <linux/mount.h>
+#include <linux/uio.h>
+
+static long kmsg_size = CONFIG_PSTORE_BLK_KMSG_SIZE;
+module_param(kmsg_size, long, 0400);
+MODULE_PARM_DESC(kmsg_size, "kmsg dump record size in kbytes");
+
+static int max_reason = CONFIG_PSTORE_BLK_MAX_REASON;
+module_param(max_reason, int, 0400);
+MODULE_PARM_DESC(max_reason,
+ "maximum reason for kmsg dump (default 2: Oops and Panic)");
+
+#if IS_ENABLED(CONFIG_PSTORE_PMSG)
+static long pmsg_size = CONFIG_PSTORE_BLK_PMSG_SIZE;
+#else
+static long pmsg_size = -1;
+#endif
+module_param(pmsg_size, long, 0400);
+MODULE_PARM_DESC(pmsg_size, "pmsg size in kbytes");
+
+#if IS_ENABLED(CONFIG_PSTORE_CONSOLE)
+static long console_size = CONFIG_PSTORE_BLK_CONSOLE_SIZE;
+#else
+static long console_size = -1;
+#endif
+module_param(console_size, long, 0400);
+MODULE_PARM_DESC(console_size, "console size in kbytes");
+
+#if IS_ENABLED(CONFIG_PSTORE_FTRACE)
+static long ftrace_size = CONFIG_PSTORE_BLK_FTRACE_SIZE;
+#else
+static long ftrace_size = -1;
+#endif
+module_param(ftrace_size, long, 0400);
+MODULE_PARM_DESC(ftrace_size, "ftrace size in kbytes");
+
+static bool best_effort;
+module_param(best_effort, bool, 0400);
+MODULE_PARM_DESC(best_effort, "use best effort to write (i.e. do not require storage driver pstore support, default: off)");
+
+/*
+ * blkdev - the block device to use for pstore storage
+ *
+ * Usually, this will be a partition of a block device.
+ *
+ * blkdev accepts the following variants:
+ * 1) <hex_major><hex_minor> device number in hexadecimal representation,
+ * with no leading 0x, for example b302.
+ * 2) /dev/<disk_name> represents the device number of disk
+ * 3) /dev/<disk_name><decimal> represents the device number
+ * of partition - device number of disk plus the partition number
+ * 4) /dev/<disk_name>p<decimal> - same as the above, that form is
+ * used when disk name of partitioned disk ends on a digit.
+ * 5) PARTUUID=00112233-4455-6677-8899-AABBCCDDEEFF representing the
+ * unique id of a partition if the partition table provides it.
+ * The UUID may be either an EFI/GPT UUID, or refer to an MSDOS
+ * partition using the format SSSSSSSS-PP, where SSSSSSSS is a zero-
+ * filled hex representation of the 32-bit "NT disk signature", and PP
+ * is a zero-filled hex representation of the 1-based partition number.
+ * 6) PARTUUID=<UUID>/PARTNROFF=<int> to select a partition in relation to
+ * a partition with a known unique id.
+ * 7) <major>:<minor> major and minor number of the device separated by
+ * a colon.
+ */
+static char blkdev[80] = CONFIG_PSTORE_BLK_BLKDEV;
+module_param_string(blkdev, blkdev, 80, 0400);
+MODULE_PARM_DESC(blkdev, "block device for pstore storage");
+
+/*
+ * All globals must only be accessed under the pstore_blk_lock
+ * during the register/unregister functions.
+ */
+static DEFINE_MUTEX(pstore_blk_lock);
+static struct block_device *psblk_bdev;
+static struct pstore_zone_info *pstore_zone_info;
+static pstore_blk_panic_write_op blkdev_panic_write;
+
+struct bdev_info {
+ dev_t devt;
+ sector_t nr_sects;
+ sector_t start_sect;
+};
+
+#define check_size(name, alignsize) ({ \
+ long _##name_ = (name); \
+ _##name_ = _##name_ <= 0 ? 0 : (_##name_ * 1024); \
+ if (_##name_ & ((alignsize) - 1)) { \
+ pr_info(#name " must align to %d\n", \
+ (alignsize)); \
+ _##name_ = ALIGN(name, (alignsize)); \
+ } \
+ _##name_; \
+})
+
+static int __register_pstore_device(struct pstore_device_info *dev)
+{
+ int ret;
+
+ lockdep_assert_held(&pstore_blk_lock);
+
+ if (!dev || !dev->total_size || !dev->read || !dev->write)
+ return -EINVAL;
+
+ /* someone already registered before */
+ if (pstore_zone_info)
+ return -EBUSY;
+
+ pstore_zone_info = kzalloc(sizeof(struct pstore_zone_info), GFP_KERNEL);
+ if (!pstore_zone_info)
+ return -ENOMEM;
+
+ /* zero means not limit on which backends to attempt to store. */
+ if (!dev->flags)
+ dev->flags = UINT_MAX;
+
+#define verify_size(name, alignsize, enabled) { \
+ long _##name_; \
+ if (enabled) \
+ _##name_ = check_size(name, alignsize); \
+ else \
+ _##name_ = 0; \
+ name = _##name_ / 1024; \
+ pstore_zone_info->name = _##name_; \
+ }
+
+ verify_size(kmsg_size, 4096, dev->flags & PSTORE_FLAGS_DMESG);
+ verify_size(pmsg_size, 4096, dev->flags & PSTORE_FLAGS_PMSG);
+ verify_size(console_size, 4096, dev->flags & PSTORE_FLAGS_CONSOLE);
+ verify_size(ftrace_size, 4096, dev->flags & PSTORE_FLAGS_FTRACE);
+#undef verify_size
+
+ pstore_zone_info->total_size = dev->total_size;
+ pstore_zone_info->max_reason = max_reason;
+ pstore_zone_info->read = dev->read;
+ pstore_zone_info->write = dev->write;
+ pstore_zone_info->erase = dev->erase;
+ pstore_zone_info->panic_write = dev->panic_write;
+ pstore_zone_info->name = KBUILD_MODNAME;
+ pstore_zone_info->owner = THIS_MODULE;
+
+ ret = register_pstore_zone(pstore_zone_info);
+ if (ret) {
+ kfree(pstore_zone_info);
+ pstore_zone_info = NULL;
+ }
+ return ret;
+}
+/**
+ * register_pstore_device() - register non-block device to pstore/blk
+ *
+ * @dev: non-block device information
+ *
+ * Return:
+ * * 0 - OK
+ * * Others - something error.
+ */
+int register_pstore_device(struct pstore_device_info *dev)
+{
+ int ret;
+
+ mutex_lock(&pstore_blk_lock);
+ ret = __register_pstore_device(dev);
+ mutex_unlock(&pstore_blk_lock);
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(register_pstore_device);
+
+static void __unregister_pstore_device(struct pstore_device_info *dev)
+{
+ lockdep_assert_held(&pstore_blk_lock);
+ if (pstore_zone_info && pstore_zone_info->read == dev->read) {
+ unregister_pstore_zone(pstore_zone_info);
+ kfree(pstore_zone_info);
+ pstore_zone_info = NULL;
+ }
+}
+
+/**
+ * unregister_pstore_device() - unregister non-block device from pstore/blk
+ *
+ * @dev: non-block device information
+ */
+void unregister_pstore_device(struct pstore_device_info *dev)
+{
+ mutex_lock(&pstore_blk_lock);
+ __unregister_pstore_device(dev);
+ mutex_unlock(&pstore_blk_lock);
+}
+EXPORT_SYMBOL_GPL(unregister_pstore_device);
+
+/**
+ * psblk_get_bdev() - open block device
+ *
+ * @holder: Exclusive holder identifier
+ * @info: Information about bdev to fill in
+ *
+ * Return: pointer to block device on success and others on error.
+ *
+ * On success, the returned block_device has reference count of one.
+ */
+static struct block_device *psblk_get_bdev(void *holder,
+ struct bdev_info *info)
+{
+ struct block_device *bdev = ERR_PTR(-ENODEV);
+ fmode_t mode = FMODE_READ | FMODE_WRITE;
+ sector_t nr_sects;
+
+ lockdep_assert_held(&pstore_blk_lock);
+
+ if (pstore_zone_info)
+ return ERR_PTR(-EBUSY);
+
+ if (!blkdev[0])
+ return ERR_PTR(-ENODEV);
+
+ if (holder)
+ mode |= FMODE_EXCL;
+ bdev = blkdev_get_by_path(blkdev, mode, holder);
+ if (IS_ERR(bdev)) {
+ dev_t devt;
+
+ devt = name_to_dev_t(blkdev);
+ if (devt == 0)
+ return ERR_PTR(-ENODEV);
+ bdev = blkdev_get_by_dev(devt, mode, holder);
+ if (IS_ERR(bdev))
+ return bdev;
+ }
+
+ nr_sects = part_nr_sects_read(bdev->bd_part);
+ if (!nr_sects) {
+ pr_err("not enough space for '%s'\n", blkdev);
+ blkdev_put(bdev, mode);
+ return ERR_PTR(-ENOSPC);
+ }
+
+ if (info) {
+ info->devt = bdev->bd_dev;
+ info->nr_sects = nr_sects;
+ info->start_sect = get_start_sect(bdev);
+ }
+
+ return bdev;
+}
+
+static void psblk_put_bdev(struct block_device *bdev, void *holder)
+{
+ fmode_t mode = FMODE_READ | FMODE_WRITE;
+
+ lockdep_assert_held(&pstore_blk_lock);
+
+ if (!bdev)
+ return;
+
+ if (holder)
+ mode |= FMODE_EXCL;
+ blkdev_put(bdev, mode);
+}
+
+static ssize_t psblk_generic_blk_read(char *buf, size_t bytes, loff_t pos)
+{
+ struct block_device *bdev = psblk_bdev;
+ struct file file;
+ struct kiocb kiocb;
+ struct iov_iter iter;
+ struct kvec iov = {.iov_base = buf, .iov_len = bytes};
+
+ if (!bdev)
+ return -ENODEV;
+
+ memset(&file, 0, sizeof(struct file));
+ file.f_mapping = bdev->bd_inode->i_mapping;
+ file.f_flags = O_DSYNC | __O_SYNC | O_NOATIME;
+ file.f_inode = bdev->bd_inode;
+ file_ra_state_init(&file.f_ra, file.f_mapping);
+
+ init_sync_kiocb(&kiocb, &file);
+ kiocb.ki_pos = pos;
+ iov_iter_kvec(&iter, READ, &iov, 1, bytes);
+
+ return generic_file_read_iter(&kiocb, &iter);
+}
+
+static ssize_t psblk_generic_blk_write(const char *buf, size_t bytes,
+ loff_t pos)
+{
+ struct block_device *bdev = psblk_bdev;
+ struct iov_iter iter;
+ struct kiocb kiocb;
+ struct file file;
+ ssize_t ret;
+ struct kvec iov = {.iov_base = (void *)buf, .iov_len = bytes};
+
+ if (!bdev)
+ return -ENODEV;
+
+ /* Console/Ftrace backend may handle buffer until flush dirty zones */
+ if (in_interrupt() || irqs_disabled())
+ return -EBUSY;
+
+ memset(&file, 0, sizeof(struct file));
+ file.f_mapping = bdev->bd_inode->i_mapping;
+ file.f_flags = O_DSYNC | __O_SYNC | O_NOATIME;
+ file.f_inode = bdev->bd_inode;
+
+ init_sync_kiocb(&kiocb, &file);
+ kiocb.ki_pos = pos;
+ iov_iter_kvec(&iter, WRITE, &iov, 1, bytes);
+
+ inode_lock(bdev->bd_inode);
+ ret = generic_write_checks(&kiocb, &iter);
+ if (ret > 0)
+ ret = generic_perform_write(&file, &iter, pos);
+ inode_unlock(bdev->bd_inode);
+
+ if (likely(ret > 0)) {
+ const struct file_operations f_op = {.fsync = blkdev_fsync};
+
+ file.f_op = &f_op;
+ kiocb.ki_pos += ret;
+ ret = generic_write_sync(&kiocb, ret);
+ }
+ return ret;
+}
+
+static ssize_t psblk_blk_panic_write(const char *buf, size_t size,
+ loff_t off)
+{
+ int ret;
+
+ if (!blkdev_panic_write)
+ return -EOPNOTSUPP;
+
+ /* size and off must align to SECTOR_SIZE for block device */
+ ret = blkdev_panic_write(buf, off >> SECTOR_SHIFT,
+ size >> SECTOR_SHIFT);
+ /* try next zone */
+ if (ret == -ENOMSG)
+ return ret;
+ return ret ? -EIO : size;
+}
+
+static int __register_pstore_blk(struct pstore_blk_info *info)
+{
+ char bdev_name[BDEVNAME_SIZE];
+ struct block_device *bdev;
+ struct pstore_device_info dev;
+ struct bdev_info binfo;
+ void *holder = blkdev;
+ int ret = -ENODEV;
+
+ lockdep_assert_held(&pstore_blk_lock);
+
+ /* hold bdev exclusively */
+ memset(&binfo, 0, sizeof(binfo));
+ bdev = psblk_get_bdev(holder, &binfo);
+ if (IS_ERR(bdev)) {
+ pr_err("failed to open '%s'!\n", blkdev);
+ return PTR_ERR(bdev);
+ }
+
+ /* only allow driver matching the @blkdev */
+ if (!binfo.devt || (!best_effort &&
+ MAJOR(binfo.devt) != info->major)) {
+ pr_debug("invalid major %u (expect %u)\n",
+ info->major, MAJOR(binfo.devt));
+ ret = -ENODEV;
+ goto err_put_bdev;
+ }
+
+ /* psblk_bdev must be assigned before register to pstore/blk */
+ psblk_bdev = bdev;
+ blkdev_panic_write = info->panic_write;
+
+ /* Copy back block device details. */
+ info->devt = binfo.devt;
+ info->nr_sects = binfo.nr_sects;
+ info->start_sect = binfo.start_sect;
+
+ memset(&dev, 0, sizeof(dev));
+ dev.total_size = info->nr_sects << SECTOR_SHIFT;
+ dev.flags = info->flags;
+ dev.read = psblk_generic_blk_read;
+ dev.write = psblk_generic_blk_write;
+ dev.erase = NULL;
+ dev.panic_write = info->panic_write ? psblk_blk_panic_write : NULL;
+
+ ret = __register_pstore_device(&dev);
+ if (ret)
+ goto err_put_bdev;
+
+ bdevname(bdev, bdev_name);
+ pr_info("attached %s%s\n", bdev_name,
+ info->panic_write ? "" : " (no dedicated panic_write!)");
+ return 0;
+
+err_put_bdev:
+ psblk_bdev = NULL;
+ blkdev_panic_write = NULL;
+ psblk_put_bdev(bdev, holder);
+ return ret;
+}
+
+/**
+ * register_pstore_blk() - register block device to pstore/blk
+ *
+ * @info: details on the desired block device interface
+ *
+ * Return:
+ * * 0 - OK
+ * * Others - something error.
+ */
+int register_pstore_blk(struct pstore_blk_info *info)
+{
+ int ret;
+
+ mutex_lock(&pstore_blk_lock);
+ ret = __register_pstore_blk(info);
+ mutex_unlock(&pstore_blk_lock);
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(register_pstore_blk);
+
+static void __unregister_pstore_blk(unsigned int major)
+{
+ struct pstore_device_info dev = { .read = psblk_generic_blk_read };
+ void *holder = blkdev;
+
+ lockdep_assert_held(&pstore_blk_lock);
+ if (psblk_bdev && MAJOR(psblk_bdev->bd_dev) == major) {
+ __unregister_pstore_device(&dev);
+ psblk_put_bdev(psblk_bdev, holder);
+ blkdev_panic_write = NULL;
+ psblk_bdev = NULL;
+ }
+}
+
+/**
+ * unregister_pstore_blk() - unregister block device from pstore/blk
+ *
+ * @major: the major device number of device
+ */
+void unregister_pstore_blk(unsigned int major)
+{
+ mutex_lock(&pstore_blk_lock);
+ __unregister_pstore_blk(major);
+ mutex_unlock(&pstore_blk_lock);
+}
+EXPORT_SYMBOL_GPL(unregister_pstore_blk);
+
+/* get information of pstore/blk */
+int pstore_blk_get_config(struct pstore_blk_config *info)
+{
+ strncpy(info->device, blkdev, 80);
+ info->max_reason = max_reason;
+ info->kmsg_size = check_size(kmsg_size, 4096);
+ info->pmsg_size = check_size(pmsg_size, 4096);
+ info->ftrace_size = check_size(ftrace_size, 4096);
+ info->console_size = check_size(console_size, 4096);
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(pstore_blk_get_config);
+
+static int __init pstore_blk_init(void)
+{
+ struct pstore_blk_info info = { };
+ int ret = 0;
+
+ mutex_lock(&pstore_blk_lock);
+ if (!pstore_zone_info && best_effort && blkdev[0])
+ ret = __register_pstore_blk(&info);
+ mutex_unlock(&pstore_blk_lock);
+
+ return ret;
+}
+late_initcall(pstore_blk_init);
+
+static void __exit pstore_blk_exit(void)
+{
+ mutex_lock(&pstore_blk_lock);
+ if (psblk_bdev)
+ __unregister_pstore_blk(MAJOR(psblk_bdev->bd_dev));
+ else {
+ struct pstore_device_info dev = { };
+
+ if (pstore_zone_info)
+ dev.read = pstore_zone_info->read;
+ __unregister_pstore_device(&dev);
+ }
+ mutex_unlock(&pstore_blk_lock);
+}
+module_exit(pstore_blk_exit);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("WeiXiong Liao <liaoweixiong@allwinnertech.com>");
+MODULE_AUTHOR("Kees Cook <keescook@chromium.org>");
+MODULE_DESCRIPTION("pstore backend for block devices");
#include <linux/debugfs.h>
#include <linux/err.h>
#include <linux/cache.h>
+#include <linux/slab.h>
#include <asm/barrier.h>
#include "internal.h"
debugfs_remove_recursive(pstore_ftrace_dir);
}
+
+ssize_t pstore_ftrace_combine_log(char **dest_log, size_t *dest_log_size,
+ const char *src_log, size_t src_log_size)
+{
+ size_t dest_size, src_size, total, dest_off, src_off;
+ size_t dest_idx = 0, src_idx = 0, merged_idx = 0;
+ void *merged_buf;
+ struct pstore_ftrace_record *drec, *srec, *mrec;
+ size_t record_size = sizeof(struct pstore_ftrace_record);
+
+ dest_off = *dest_log_size % record_size;
+ dest_size = *dest_log_size - dest_off;
+
+ src_off = src_log_size % record_size;
+ src_size = src_log_size - src_off;
+
+ total = dest_size + src_size;
+ merged_buf = kmalloc(total, GFP_KERNEL);
+ if (!merged_buf)
+ return -ENOMEM;
+
+ drec = (struct pstore_ftrace_record *)(*dest_log + dest_off);
+ srec = (struct pstore_ftrace_record *)(src_log + src_off);
+ mrec = (struct pstore_ftrace_record *)(merged_buf);
+
+ while (dest_size > 0 && src_size > 0) {
+ if (pstore_ftrace_read_timestamp(&drec[dest_idx]) <
+ pstore_ftrace_read_timestamp(&srec[src_idx])) {
+ mrec[merged_idx++] = drec[dest_idx++];
+ dest_size -= record_size;
+ } else {
+ mrec[merged_idx++] = srec[src_idx++];
+ src_size -= record_size;
+ }
+ }
+
+ while (dest_size > 0) {
+ mrec[merged_idx++] = drec[dest_idx++];
+ dest_size -= record_size;
+ }
+
+ while (src_size > 0) {
+ mrec[merged_idx++] = srec[src_idx++];
+ src_size -= record_size;
+ }
+
+ kfree(*dest_log);
+ *dest_log = merged_buf;
+ *dest_log_size = total;
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(pstore_ftrace_combine_log);
#include <linux/magic.h>
#include <linux/pstore.h>
#include <linux/slab.h>
-#include <linux/spinlock.h>
#include <linux/uaccess.h>
#include "internal.h"
#define PSTORE_NAMELEN 64
-static DEFINE_SPINLOCK(allpstore_lock);
-static LIST_HEAD(allpstore);
+static DEFINE_MUTEX(records_list_lock);
+static LIST_HEAD(records_list);
+
+static DEFINE_MUTEX(pstore_sb_lock);
+static struct super_block *pstore_sb;
struct pstore_private {
struct list_head list;
+ struct dentry *dentry;
struct pstore_record *record;
size_t total_size;
};
{
struct pstore_private *p = d_inode(dentry)->i_private;
struct pstore_record *record = p->record;
+ int rc = 0;
if (!record->psi->erase)
return -EPERM;
+ /* Make sure we can't race while removing this file. */
+ mutex_lock(&records_list_lock);
+ if (!list_empty(&p->list))
+ list_del_init(&p->list);
+ else
+ rc = -ENOENT;
+ p->dentry = NULL;
+ mutex_unlock(&records_list_lock);
+ if (rc)
+ return rc;
+
mutex_lock(&record->psi->read_mutex);
record->psi->erase(record);
mutex_unlock(&record->psi->read_mutex);
static void pstore_evict_inode(struct inode *inode)
{
struct pstore_private *p = inode->i_private;
- unsigned long flags;
clear_inode(inode);
- if (p) {
- spin_lock_irqsave(&allpstore_lock, flags);
- list_del(&p->list);
- spin_unlock_irqrestore(&allpstore_lock, flags);
- free_pstore_private(p);
- }
+ free_pstore_private(p);
}
static const struct inode_operations pstore_dir_inode_operations = {
.show_options = pstore_show_options,
};
-static struct super_block *pstore_sb;
+static struct dentry *psinfo_lock_root(void)
+{
+ struct dentry *root;
-bool pstore_is_mounted(void)
+ mutex_lock(&pstore_sb_lock);
+ /*
+ * Having no backend is fine -- no records appear.
+ * Not being mounted is fine -- nothing to do.
+ */
+ if (!psinfo || !pstore_sb) {
+ mutex_unlock(&pstore_sb_lock);
+ return NULL;
+ }
+
+ root = pstore_sb->s_root;
+ inode_lock(d_inode(root));
+ mutex_unlock(&pstore_sb_lock);
+
+ return root;
+}
+
+int pstore_put_backend_records(struct pstore_info *psi)
{
- return pstore_sb != NULL;
+ struct pstore_private *pos, *tmp;
+ struct dentry *root;
+ int rc = 0;
+
+ root = psinfo_lock_root();
+ if (!root)
+ return 0;
+
+ mutex_lock(&records_list_lock);
+ list_for_each_entry_safe(pos, tmp, &records_list, list) {
+ if (pos->record->psi == psi) {
+ list_del_init(&pos->list);
+ rc = simple_unlink(d_inode(root), pos->dentry);
+ if (WARN_ON(rc))
+ break;
+ d_drop(pos->dentry);
+ dput(pos->dentry);
+ pos->dentry = NULL;
+ }
+ }
+ mutex_unlock(&records_list_lock);
+
+ inode_unlock(d_inode(root));
+
+ return rc;
}
/*
int rc = 0;
char name[PSTORE_NAMELEN];
struct pstore_private *private, *pos;
- unsigned long flags;
size_t size = record->size + record->ecc_notice_size;
- WARN_ON(!inode_is_locked(d_inode(root)));
+ if (WARN_ON(!inode_is_locked(d_inode(root))))
+ return -EINVAL;
- spin_lock_irqsave(&allpstore_lock, flags);
- list_for_each_entry(pos, &allpstore, list) {
+ rc = -EEXIST;
+ /* Skip records that are already present in the filesystem. */
+ mutex_lock(&records_list_lock);
+ list_for_each_entry(pos, &records_list, list) {
if (pos->record->type == record->type &&
pos->record->id == record->id &&
- pos->record->psi == record->psi) {
- rc = -EEXIST;
- break;
- }
+ pos->record->psi == record->psi)
+ goto fail;
}
- spin_unlock_irqrestore(&allpstore_lock, flags);
- if (rc)
- return rc;
rc = -ENOMEM;
inode = pstore_get_inode(root->d_sb);
if (!dentry)
goto fail_private;
+ private->dentry = dentry;
private->record = record;
inode->i_size = private->total_size = size;
inode->i_private = private;
d_add(dentry, inode);
- spin_lock_irqsave(&allpstore_lock, flags);
- list_add(&private->list, &allpstore);
- spin_unlock_irqrestore(&allpstore_lock, flags);
+ list_add(&private->list, &records_list);
+ mutex_unlock(&records_list_lock);
return 0;
free_pstore_private(private);
fail_inode:
iput(inode);
-
fail:
+ mutex_unlock(&records_list_lock);
return rc;
}
*/
void pstore_get_records(int quiet)
{
- struct pstore_info *psi = psinfo;
struct dentry *root;
- if (!psi || !pstore_sb)
+ root = psinfo_lock_root();
+ if (!root)
return;
- root = pstore_sb->s_root;
-
- inode_lock(d_inode(root));
- pstore_get_backend_records(psi, root, quiet);
+ pstore_get_backend_records(psinfo, root, quiet);
inode_unlock(d_inode(root));
}
{
struct inode *inode;
- pstore_sb = sb;
-
sb->s_maxbytes = MAX_LFS_FILESIZE;
sb->s_blocksize = PAGE_SIZE;
sb->s_blocksize_bits = PAGE_SHIFT;
if (!sb->s_root)
return -ENOMEM;
+ mutex_lock(&pstore_sb_lock);
+ pstore_sb = sb;
+ mutex_unlock(&pstore_sb_lock);
+
pstore_get_records(0);
return 0;
static void pstore_kill_sb(struct super_block *sb)
{
+ mutex_lock(&pstore_sb_lock);
+ WARN_ON(pstore_sb != sb);
+
kill_litter_super(sb);
pstore_sb = NULL;
+
+ mutex_lock(&records_list_lock);
+ INIT_LIST_HEAD(&records_list);
+ mutex_unlock(&records_list_lock);
+
+ mutex_unlock(&pstore_sb_lock);
}
static struct file_system_type pstore_fs_type = {
#ifdef CONFIG_PSTORE_FTRACE
extern void pstore_register_ftrace(void);
extern void pstore_unregister_ftrace(void);
+ssize_t pstore_ftrace_combine_log(char **dest_log, size_t *dest_log_size,
+ const char *src_log, size_t src_log_size);
#else
static inline void pstore_register_ftrace(void) {}
static inline void pstore_unregister_ftrace(void) {}
+static inline ssize_t
+pstore_ftrace_combine_log(char **dest_log, size_t *dest_log_size,
+ const char *src_log, size_t src_log_size)
+{
+ *dest_log_size = 0;
+ return 0;
+}
#endif
#ifdef CONFIG_PSTORE_PMSG
extern void pstore_get_records(int);
extern void pstore_get_backend_records(struct pstore_info *psi,
struct dentry *root, int quiet);
+extern int pstore_put_backend_records(struct pstore_info *psi);
extern int pstore_mkfile(struct dentry *root,
struct pstore_record *record);
-extern bool pstore_is_mounted(void);
extern void pstore_record_init(struct pstore_record *record,
struct pstore_info *psi);
module_param_named(update_ms, pstore_update_ms, int, 0600);
MODULE_PARM_DESC(update_ms, "milliseconds before pstore updates its content "
"(default is -1, which means runtime updates are disabled; "
- "enabling this option is not safe, it may lead to further "
+ "enabling this option may not be safe; it may lead to further "
"corruption on Oopses)");
/* Names should be in the same order as the enum pstore_type_id */
static DECLARE_WORK(pstore_work, pstore_dowork);
/*
- * pstore_lock just protects "psinfo" during
- * calls to pstore_register()
+ * psinfo_lock protects "psinfo" during calls to
+ * pstore_register(), pstore_unregister(), and
+ * the filesystem mount/unmount routines.
*/
-static DEFINE_SPINLOCK(pstore_lock);
+static DEFINE_MUTEX(psinfo_lock);
struct pstore_info *psinfo;
static char *backend;
+module_param(backend, charp, 0444);
+MODULE_PARM_DESC(backend, "specific backend to use");
+
static char *compress =
#ifdef CONFIG_PSTORE_COMPRESS_DEFAULT
CONFIG_PSTORE_COMPRESS_DEFAULT;
#else
NULL;
#endif
+module_param(compress, charp, 0444);
+MODULE_PARM_DESC(compress, "compression to use");
/* Compression parameters */
static struct crypto_comp *tfm;
}
EXPORT_SYMBOL_GPL(pstore_name_to_type);
-static const char *get_reason_str(enum kmsg_dump_reason reason)
+static void pstore_timer_kick(void)
{
- switch (reason) {
- case KMSG_DUMP_PANIC:
- return "Panic";
- case KMSG_DUMP_OOPS:
- return "Oops";
- case KMSG_DUMP_EMERG:
- return "Emergency";
- case KMSG_DUMP_RESTART:
- return "Restart";
- case KMSG_DUMP_HALT:
- return "Halt";
- case KMSG_DUMP_POWEROFF:
- return "Poweroff";
- default:
- return "Unknown";
- }
+ if (pstore_update_ms < 0)
+ return;
+
+ mod_timer(&pstore_timer, jiffies + msecs_to_jiffies(pstore_update_ms));
}
/*
unsigned int part = 1;
int ret;
- why = get_reason_str(reason);
+ why = kmsg_dump_reason_str(reason);
if (down_trylock(&psinfo->buf_lock)) {
/* Failed to acquire lock: give up if we cannot wait. */
}
ret = psinfo->write(&record);
- if (ret == 0 && reason == KMSG_DUMP_OOPS && pstore_is_mounted())
+ if (ret == 0 && reason == KMSG_DUMP_OOPS) {
pstore_new_entry = 1;
+ pstore_timer_kick();
+ }
total += record.size;
part++;
}
static struct console pstore_console = {
- .name = "pstore",
.write = pstore_console_write,
- .flags = CON_PRINTBUFFER | CON_ENABLED | CON_ANYTIME,
.index = -1,
};
static void pstore_register_console(void)
{
+ /* Show which backend is going to get console writes. */
+ strscpy(pstore_console.name, psinfo->name,
+ sizeof(pstore_console.name));
+ /*
+ * Always initialize flags here since prior unregister_console()
+ * calls may have changed settings (specifically CON_ENABLED).
+ */
+ pstore_console.flags = CON_PRINTBUFFER | CON_ENABLED | CON_ANYTIME;
register_console(&pstore_console);
}
*/
int pstore_register(struct pstore_info *psi)
{
- struct module *owner = psi->owner;
-
if (backend && strcmp(backend, psi->name)) {
pr_warn("ignoring unexpected backend '%s'\n", psi->name);
return -EPERM;
return -EINVAL;
}
- spin_lock(&pstore_lock);
+ mutex_lock(&psinfo_lock);
if (psinfo) {
pr_warn("backend '%s' already loaded: ignoring '%s'\n",
psinfo->name, psi->name);
- spin_unlock(&pstore_lock);
+ mutex_unlock(&psinfo_lock);
return -EBUSY;
}
psinfo = psi;
mutex_init(&psinfo->read_mutex);
sema_init(&psinfo->buf_lock, 1);
- spin_unlock(&pstore_lock);
-
- if (owner && !try_module_get(owner)) {
- psinfo = NULL;
- return -EINVAL;
- }
if (psi->flags & PSTORE_FLAGS_DMESG)
allocate_buf_for_compression();
- if (pstore_is_mounted())
- pstore_get_records(0);
+ pstore_get_records(0);
- if (psi->flags & PSTORE_FLAGS_DMESG)
+ if (psi->flags & PSTORE_FLAGS_DMESG) {
+ pstore_dumper.max_reason = psinfo->max_reason;
pstore_register_kmsg();
+ }
if (psi->flags & PSTORE_FLAGS_CONSOLE)
pstore_register_console();
if (psi->flags & PSTORE_FLAGS_FTRACE)
pstore_register_pmsg();
/* Start watching for new records, if desired. */
- if (pstore_update_ms >= 0) {
- pstore_timer.expires = jiffies +
- msecs_to_jiffies(pstore_update_ms);
- add_timer(&pstore_timer);
- }
+ pstore_timer_kick();
/*
* Update the module parameter backend, so it is visible
* through /sys/module/pstore/parameters/backend
*/
- backend = psi->name;
+ backend = kstrdup(psi->name, GFP_KERNEL);
pr_info("Registered %s as persistent store backend\n", psi->name);
- module_put(owner);
-
+ mutex_unlock(&psinfo_lock);
return 0;
}
EXPORT_SYMBOL_GPL(pstore_register);
void pstore_unregister(struct pstore_info *psi)
{
- /* Stop timer and make sure all work has finished. */
- pstore_update_ms = -1;
- del_timer_sync(&pstore_timer);
- flush_work(&pstore_work);
+ /* It's okay to unregister nothing. */
+ if (!psi)
+ return;
+
+ mutex_lock(&psinfo_lock);
+
+ /* Only one backend can be registered at a time. */
+ if (WARN_ON(psi != psinfo)) {
+ mutex_unlock(&psinfo_lock);
+ return;
+ }
+ /* Unregister all callbacks. */
if (psi->flags & PSTORE_FLAGS_PMSG)
pstore_unregister_pmsg();
if (psi->flags & PSTORE_FLAGS_FTRACE)
if (psi->flags & PSTORE_FLAGS_DMESG)
pstore_unregister_kmsg();
+ /* Stop timer and make sure all work has finished. */
+ del_timer_sync(&pstore_timer);
+ flush_work(&pstore_work);
+
+ /* Remove all backend records from filesystem tree. */
+ pstore_put_backend_records(psi);
+
free_buf_for_compression();
psinfo = NULL;
+ kfree(backend);
backend = NULL;
+ mutex_unlock(&psinfo_lock);
}
EXPORT_SYMBOL_GPL(pstore_unregister);
schedule_work(&pstore_work);
}
- if (pstore_update_ms >= 0)
- mod_timer(&pstore_timer,
- jiffies + msecs_to_jiffies(pstore_update_ms));
+ pstore_timer_kick();
}
static void __init pstore_choose_compression(void)
}
module_exit(pstore_exit)
-module_param(compress, charp, 0444);
-MODULE_PARM_DESC(compress, "Pstore compression to use");
-
-module_param(backend, charp, 0444);
-MODULE_PARM_DESC(backend, "Pstore backend to use");
-
MODULE_AUTHOR("Tony Luck <tony.luck@intel.com>");
MODULE_LICENSE("GPL");
#include <linux/pstore_ram.h>
#include <linux/of.h>
#include <linux/of_address.h>
+#include "internal.h"
#define RAMOOPS_KERNMSG_HDR "===="
#define MIN_MEM_SIZE 4096UL
"size of reserved RAM used to store oops/panic logs");
static unsigned int mem_type;
-module_param(mem_type, uint, 0600);
+module_param(mem_type, uint, 0400);
MODULE_PARM_DESC(mem_type,
"set to 1 to try to use unbuffered memory (default 0)");
-static int dump_oops = 1;
-module_param(dump_oops, int, 0600);
-MODULE_PARM_DESC(dump_oops,
- "set to 1 to dump oopses, 0 to only dump panics (default 1)");
+static int ramoops_max_reason = -1;
+module_param_named(max_reason, ramoops_max_reason, int, 0400);
+MODULE_PARM_DESC(max_reason,
+ "maximum reason for kmsg dump (default 2: Oops and Panic) ");
static int ramoops_ecc;
-module_param_named(ecc, ramoops_ecc, int, 0600);
+module_param_named(ecc, ramoops_ecc, int, 0400);
MODULE_PARM_DESC(ramoops_ecc,
"if non-zero, the option enables ECC support and specifies "
"ECC buffer size in bytes (1 is a special value, means 16 "
"bytes ECC)");
+static int ramoops_dump_oops = -1;
+module_param_named(dump_oops, ramoops_dump_oops, int, 0400);
+MODULE_PARM_DESC(dump_oops,
+ "(deprecated: use max_reason instead) set to 1 to dump oopses & panics, 0 to only dump panics");
+
struct ramoops_context {
struct persistent_ram_zone **dprzs; /* Oops dump zones */
struct persistent_ram_zone *cprz; /* Console zone */
size_t console_size;
size_t ftrace_size;
size_t pmsg_size;
- int dump_oops;
u32 flags;
struct persistent_ram_ecc_info ecc_info;
unsigned int max_dump_cnt;
persistent_ram_ecc_string(prz, NULL, 0));
}
-static ssize_t ftrace_log_combine(struct persistent_ram_zone *dest,
- struct persistent_ram_zone *src)
-{
- size_t dest_size, src_size, total, dest_off, src_off;
- size_t dest_idx = 0, src_idx = 0, merged_idx = 0;
- void *merged_buf;
- struct pstore_ftrace_record *drec, *srec, *mrec;
- size_t record_size = sizeof(struct pstore_ftrace_record);
-
- dest_off = dest->old_log_size % record_size;
- dest_size = dest->old_log_size - dest_off;
-
- src_off = src->old_log_size % record_size;
- src_size = src->old_log_size - src_off;
-
- total = dest_size + src_size;
- merged_buf = kmalloc(total, GFP_KERNEL);
- if (!merged_buf)
- return -ENOMEM;
-
- drec = (struct pstore_ftrace_record *)(dest->old_log + dest_off);
- srec = (struct pstore_ftrace_record *)(src->old_log + src_off);
- mrec = (struct pstore_ftrace_record *)(merged_buf);
-
- while (dest_size > 0 && src_size > 0) {
- if (pstore_ftrace_read_timestamp(&drec[dest_idx]) <
- pstore_ftrace_read_timestamp(&srec[src_idx])) {
- mrec[merged_idx++] = drec[dest_idx++];
- dest_size -= record_size;
- } else {
- mrec[merged_idx++] = srec[src_idx++];
- src_size -= record_size;
- }
- }
-
- while (dest_size > 0) {
- mrec[merged_idx++] = drec[dest_idx++];
- dest_size -= record_size;
- }
-
- while (src_size > 0) {
- mrec[merged_idx++] = srec[src_idx++];
- src_size -= record_size;
- }
-
- kfree(dest->old_log);
- dest->old_log = merged_buf;
- dest->old_log_size = total;
-
- return 0;
-}
-
static ssize_t ramoops_pstore_read(struct pstore_record *record)
{
ssize_t size = 0;
tmp_prz->corrected_bytes +=
prz_next->corrected_bytes;
tmp_prz->bad_blocks += prz_next->bad_blocks;
- size = ftrace_log_combine(tmp_prz, prz_next);
+
+ size = pstore_ftrace_combine_log(
+ &tmp_prz->old_log,
+ &tmp_prz->old_log_size,
+ prz_next->old_log,
+ prz_next->old_log_size);
if (size)
goto out;
}
return -EINVAL;
/*
- * Out of the various dmesg dump types, ramoops is currently designed
- * to only store crash logs, rather than storing general kernel logs.
+ * We could filter on record->reason here if we wanted to (which
+ * would duplicate what happened before the "max_reason" setting
+ * was added), but that would defeat the purpose of a system
+ * changing printk.always_kmsg_dump, so instead log everything that
+ * the kmsg dumper sends us, since it should be doing the filtering
+ * based on the combination of printk.always_kmsg_dump and our
+ * requested "max_reason".
*/
- if (record->reason != KMSG_DUMP_OOPS &&
- record->reason != KMSG_DUMP_PANIC)
- return -EINVAL;
-
- /* Skip Oopes when configured to do so. */
- if (record->reason == KMSG_DUMP_OOPS && !cxt->dump_oops)
- return -EINVAL;
/*
* Explicitly only take the first part of any new crash.
return 0;
}
-static int ramoops_parse_dt_size(struct platform_device *pdev,
- const char *propname, u32 *value)
+/* Read a u32 from a dt property and make sure it's safe for an int. */
+static int ramoops_parse_dt_u32(struct platform_device *pdev,
+ const char *propname,
+ u32 default_value, u32 *value)
{
u32 val32 = 0;
int ret;
ret = of_property_read_u32(pdev->dev.of_node, propname, &val32);
- if (ret < 0 && ret != -EINVAL) {
+ if (ret == -EINVAL) {
+ /* field is missing, use default value. */
+ val32 = default_value;
+ } else if (ret < 0) {
dev_err(&pdev->dev, "failed to parse property %s: %d\n",
propname, ret);
return ret;
}
+ /* Sanity check our results. */
if (val32 > INT_MAX) {
dev_err(&pdev->dev, "%s %u > INT_MAX\n", propname, val32);
return -EOVERFLOW;
pdata->mem_size = resource_size(res);
pdata->mem_address = res->start;
pdata->mem_type = of_property_read_bool(of_node, "unbuffered");
- pdata->dump_oops = !of_property_read_bool(of_node, "no-dump-oops");
-
-#define parse_size(name, field) { \
- ret = ramoops_parse_dt_size(pdev, name, &value); \
+ /*
+ * Setting "no-dump-oops" is deprecated and will be ignored if
+ * "max_reason" is also specified.
+ */
+ if (of_property_read_bool(of_node, "no-dump-oops"))
+ pdata->max_reason = KMSG_DUMP_PANIC;
+ else
+ pdata->max_reason = KMSG_DUMP_OOPS;
+
+#define parse_u32(name, field, default_value) { \
+ ret = ramoops_parse_dt_u32(pdev, name, default_value, \
+ &value); \
if (ret < 0) \
return ret; \
field = value; \
}
- parse_size("record-size", pdata->record_size);
- parse_size("console-size", pdata->console_size);
- parse_size("ftrace-size", pdata->ftrace_size);
- parse_size("pmsg-size", pdata->pmsg_size);
- parse_size("ecc-size", pdata->ecc_info.ecc_size);
- parse_size("flags", pdata->flags);
+ parse_u32("record-size", pdata->record_size, 0);
+ parse_u32("console-size", pdata->console_size, 0);
+ parse_u32("ftrace-size", pdata->ftrace_size, 0);
+ parse_u32("pmsg-size", pdata->pmsg_size, 0);
+ parse_u32("ecc-size", pdata->ecc_info.ecc_size, 0);
+ parse_u32("flags", pdata->flags, 0);
+ parse_u32("max-reason", pdata->max_reason, pdata->max_reason);
-#undef parse_size
+#undef parse_u32
/*
* Some old Chromebooks relied on the kernel setting the
cxt->console_size = pdata->console_size;
cxt->ftrace_size = pdata->ftrace_size;
cxt->pmsg_size = pdata->pmsg_size;
- cxt->dump_oops = pdata->dump_oops;
cxt->flags = pdata->flags;
cxt->ecc_info = pdata->ecc_info;
* the single region size is how to check.
*/
cxt->pstore.flags = 0;
- if (cxt->max_dump_cnt)
+ if (cxt->max_dump_cnt) {
cxt->pstore.flags |= PSTORE_FLAGS_DMESG;
+ cxt->pstore.max_reason = pdata->max_reason;
+ }
if (cxt->console_size)
cxt->pstore.flags |= PSTORE_FLAGS_CONSOLE;
if (cxt->max_ftrace_cnt)
mem_size = pdata->mem_size;
mem_address = pdata->mem_address;
record_size = pdata->record_size;
- dump_oops = pdata->dump_oops;
+ ramoops_max_reason = pdata->max_reason;
ramoops_console_size = pdata->console_size;
ramoops_pmsg_size = pdata->pmsg_size;
ramoops_ftrace_size = pdata->ftrace_size;
pdata.console_size = ramoops_console_size;
pdata.ftrace_size = ramoops_ftrace_size;
pdata.pmsg_size = ramoops_pmsg_size;
- pdata.dump_oops = dump_oops;
+ /* If "max_reason" is set, its value has priority over "dump_oops". */
+ if (ramoops_max_reason >= 0)
+ pdata.max_reason = ramoops_max_reason;
+ /* Otherwise, if "dump_oops" is set, parse it into "max_reason". */
+ else if (ramoops_dump_oops != -1)
+ pdata.max_reason = ramoops_dump_oops ? KMSG_DUMP_OOPS
+ : KMSG_DUMP_PANIC;
+ /* And if neither are explicitly set, use the default. */
+ else
+ pdata.max_reason = KMSG_DUMP_OOPS;
pdata.flags = RAMOOPS_FLAG_FTRACE_PER_CPU;
/*
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Provide a pstore intermediate backend, organized into kernel memory
+ * allocated zones that are then mapped and flushed into a single
+ * contiguous region on a storage backend of some kind (block, mtd, etc).
+ */
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+#include <linux/slab.h>
+#include <linux/mount.h>
+#include <linux/printk.h>
+#include <linux/fs.h>
+#include <linux/pstore_zone.h>
+#include <linux/kdev_t.h>
+#include <linux/device.h>
+#include <linux/namei.h>
+#include <linux/fcntl.h>
+#include <linux/uio.h>
+#include <linux/writeback.h>
+#include "internal.h"
+
+/**
+ * struct psz_head - header of zone to flush to storage
+ *
+ * @sig: signature to indicate header (PSZ_SIG xor PSZONE-type value)
+ * @datalen: length of data in @data
+ * @start: offset into @data where the beginning of the stored bytes begin
+ * @data: zone data.
+ */
+struct psz_buffer {
+#define PSZ_SIG (0x43474244) /* DBGC */
+ uint32_t sig;
+ atomic_t datalen;
+ atomic_t start;
+ uint8_t data[];
+};
+
+/**
+ * struct psz_kmsg_header - kmsg dump-specific header to flush to storage
+ *
+ * @magic: magic num for kmsg dump header
+ * @time: kmsg dump trigger time
+ * @compressed: whether conpressed
+ * @counter: kmsg dump counter
+ * @reason: the kmsg dump reason (e.g. oops, panic, etc)
+ * @data: pointer to log data
+ *
+ * This is a sub-header for a kmsg dump, trailing after &psz_buffer.
+ */
+struct psz_kmsg_header {
+#define PSTORE_KMSG_HEADER_MAGIC 0x4dfc3ae5 /* Just a random number */
+ uint32_t magic;
+ struct timespec64 time;
+ bool compressed;
+ uint32_t counter;
+ enum kmsg_dump_reason reason;
+ uint8_t data[];
+};
+
+/**
+ * struct pstore_zone - single stored buffer
+ *
+ * @off: zone offset of storage
+ * @type: front-end type for this zone
+ * @name: front-end name for this zone
+ * @buffer: pointer to data buffer managed by this zone
+ * @oldbuf: pointer to old data buffer
+ * @buffer_size: bytes in @buffer->data
+ * @should_recover: whether this zone should recover from storage
+ * @dirty: whether the data in @buffer dirty
+ *
+ * zone structure in memory.
+ */
+struct pstore_zone {
+ loff_t off;
+ const char *name;
+ enum pstore_type_id type;
+
+ struct psz_buffer *buffer;
+ struct psz_buffer *oldbuf;
+ size_t buffer_size;
+ bool should_recover;
+ atomic_t dirty;
+};
+
+/**
+ * struct psz_context - all about running state of pstore/zone
+ *
+ * @kpszs: kmsg dump storage zones
+ * @ppsz: pmsg storage zone
+ * @cpsz: console storage zone
+ * @fpszs: ftrace storage zones
+ * @kmsg_max_cnt: max count of @kpszs
+ * @kmsg_read_cnt: counter of total read kmsg dumps
+ * @kmsg_write_cnt: counter of total kmsg dump writes
+ * @pmsg_read_cnt: counter of total read pmsg zone
+ * @console_read_cnt: counter of total read console zone
+ * @ftrace_max_cnt: max count of @fpszs
+ * @ftrace_read_cnt: counter of max read ftrace zone
+ * @oops_counter: counter of oops dumps
+ * @panic_counter: counter of panic dumps
+ * @recovered: whether finished recovering data from storage
+ * @on_panic: whether panic is happening
+ * @pstore_zone_info_lock: lock to @pstore_zone_info
+ * @pstore_zone_info: information from backend
+ * @pstore: structure for pstore
+ */
+struct psz_context {
+ struct pstore_zone **kpszs;
+ struct pstore_zone *ppsz;
+ struct pstore_zone *cpsz;
+ struct pstore_zone **fpszs;
+ unsigned int kmsg_max_cnt;
+ unsigned int kmsg_read_cnt;
+ unsigned int kmsg_write_cnt;
+ unsigned int pmsg_read_cnt;
+ unsigned int console_read_cnt;
+ unsigned int ftrace_max_cnt;
+ unsigned int ftrace_read_cnt;
+ /*
+ * These counters should be calculated during recovery.
+ * It records the oops/panic times after crashes rather than boots.
+ */
+ unsigned int oops_counter;
+ unsigned int panic_counter;
+ atomic_t recovered;
+ atomic_t on_panic;
+
+ /*
+ * pstore_zone_info_lock protects this entire structure during calls
+ * to register_pstore_zone()/unregister_pstore_zone().
+ */
+ struct mutex pstore_zone_info_lock;
+ struct pstore_zone_info *pstore_zone_info;
+ struct pstore_info pstore;
+};
+static struct psz_context pstore_zone_cxt;
+
+static void psz_flush_all_dirty_zones(struct work_struct *);
+static DECLARE_DELAYED_WORK(psz_cleaner, psz_flush_all_dirty_zones);
+
+/**
+ * enum psz_flush_mode - flush mode for psz_zone_write()
+ *
+ * @FLUSH_NONE: do not flush to storage but update data on memory
+ * @FLUSH_PART: just flush part of data including meta data to storage
+ * @FLUSH_META: just flush meta data of zone to storage
+ * @FLUSH_ALL: flush all of zone
+ */
+enum psz_flush_mode {
+ FLUSH_NONE = 0,
+ FLUSH_PART,
+ FLUSH_META,
+ FLUSH_ALL,
+};
+
+static inline int buffer_datalen(struct pstore_zone *zone)
+{
+ return atomic_read(&zone->buffer->datalen);
+}
+
+static inline int buffer_start(struct pstore_zone *zone)
+{
+ return atomic_read(&zone->buffer->start);
+}
+
+static inline bool is_on_panic(void)
+{
+ return atomic_read(&pstore_zone_cxt.on_panic);
+}
+
+static ssize_t psz_zone_read_buffer(struct pstore_zone *zone, char *buf,
+ size_t len, unsigned long off)
+{
+ if (!buf || !zone || !zone->buffer)
+ return -EINVAL;
+ if (off > zone->buffer_size)
+ return -EINVAL;
+ len = min_t(size_t, len, zone->buffer_size - off);
+ memcpy(buf, zone->buffer->data + off, len);
+ return len;
+}
+
+static int psz_zone_read_oldbuf(struct pstore_zone *zone, char *buf,
+ size_t len, unsigned long off)
+{
+ if (!buf || !zone || !zone->oldbuf)
+ return -EINVAL;
+ if (off > zone->buffer_size)
+ return -EINVAL;
+ len = min_t(size_t, len, zone->buffer_size - off);
+ memcpy(buf, zone->oldbuf->data + off, len);
+ return 0;
+}
+
+static int psz_zone_write(struct pstore_zone *zone,
+ enum psz_flush_mode flush_mode, const char *buf,
+ size_t len, unsigned long off)
+{
+ struct pstore_zone_info *info = pstore_zone_cxt.pstore_zone_info;
+ ssize_t wcnt = 0;
+ ssize_t (*writeop)(const char *buf, size_t bytes, loff_t pos);
+ size_t wlen;
+
+ if (off > zone->buffer_size)
+ return -EINVAL;
+
+ wlen = min_t(size_t, len, zone->buffer_size - off);
+ if (buf && wlen) {
+ memcpy(zone->buffer->data + off, buf, wlen);
+ atomic_set(&zone->buffer->datalen, wlen + off);
+ }
+
+ /* avoid to damage old records */
+ if (!is_on_panic() && !atomic_read(&pstore_zone_cxt.recovered))
+ goto dirty;
+
+ writeop = is_on_panic() ? info->panic_write : info->write;
+ if (!writeop)
+ goto dirty;
+
+ switch (flush_mode) {
+ case FLUSH_NONE:
+ if (unlikely(buf && wlen))
+ goto dirty;
+ return 0;
+ case FLUSH_PART:
+ wcnt = writeop((const char *)zone->buffer->data + off, wlen,
+ zone->off + sizeof(*zone->buffer) + off);
+ if (wcnt != wlen)
+ goto dirty;
+ fallthrough;
+ case FLUSH_META:
+ wlen = sizeof(struct psz_buffer);
+ wcnt = writeop((const char *)zone->buffer, wlen, zone->off);
+ if (wcnt != wlen)
+ goto dirty;
+ break;
+ case FLUSH_ALL:
+ wlen = zone->buffer_size + sizeof(*zone->buffer);
+ wcnt = writeop((const char *)zone->buffer, wlen, zone->off);
+ if (wcnt != wlen)
+ goto dirty;
+ break;
+ }
+
+ return 0;
+dirty:
+ /* no need to mark dirty if going to try next zone */
+ if (wcnt == -ENOMSG)
+ return -ENOMSG;
+ atomic_set(&zone->dirty, true);
+ /* flush dirty zones nicely */
+ if (wcnt == -EBUSY && !is_on_panic())
+ schedule_delayed_work(&psz_cleaner, msecs_to_jiffies(500));
+ return -EBUSY;
+}
+
+static int psz_flush_dirty_zone(struct pstore_zone *zone)
+{
+ int ret;
+
+ if (unlikely(!zone))
+ return -EINVAL;
+
+ if (unlikely(!atomic_read(&pstore_zone_cxt.recovered)))
+ return -EBUSY;
+
+ if (!atomic_xchg(&zone->dirty, false))
+ return 0;
+
+ ret = psz_zone_write(zone, FLUSH_ALL, NULL, 0, 0);
+ if (ret)
+ atomic_set(&zone->dirty, true);
+ return ret;
+}
+
+static int psz_flush_dirty_zones(struct pstore_zone **zones, unsigned int cnt)
+{
+ int i, ret;
+ struct pstore_zone *zone;
+
+ if (!zones)
+ return -EINVAL;
+
+ for (i = 0; i < cnt; i++) {
+ zone = zones[i];
+ if (!zone)
+ return -EINVAL;
+ ret = psz_flush_dirty_zone(zone);
+ if (ret)
+ return ret;
+ }
+ return 0;
+}
+
+static int psz_move_zone(struct pstore_zone *old, struct pstore_zone *new)
+{
+ const char *data = (const char *)old->buffer->data;
+ int ret;
+
+ ret = psz_zone_write(new, FLUSH_ALL, data, buffer_datalen(old), 0);
+ if (ret) {
+ atomic_set(&new->buffer->datalen, 0);
+ atomic_set(&new->dirty, false);
+ return ret;
+ }
+ atomic_set(&old->buffer->datalen, 0);
+ return 0;
+}
+
+static void psz_flush_all_dirty_zones(struct work_struct *work)
+{
+ struct psz_context *cxt = &pstore_zone_cxt;
+ int ret = 0;
+
+ if (cxt->ppsz)
+ ret |= psz_flush_dirty_zone(cxt->ppsz);
+ if (cxt->cpsz)
+ ret |= psz_flush_dirty_zone(cxt->cpsz);
+ if (cxt->kpszs)
+ ret |= psz_flush_dirty_zones(cxt->kpszs, cxt->kmsg_max_cnt);
+ if (cxt->fpszs)
+ ret |= psz_flush_dirty_zones(cxt->fpszs, cxt->ftrace_max_cnt);
+ if (ret && cxt->pstore_zone_info)
+ schedule_delayed_work(&psz_cleaner, msecs_to_jiffies(1000));
+}
+
+static int psz_kmsg_recover_data(struct psz_context *cxt)
+{
+ struct pstore_zone_info *info = cxt->pstore_zone_info;
+ struct pstore_zone *zone = NULL;
+ struct psz_buffer *buf;
+ unsigned long i;
+ ssize_t rcnt;
+
+ if (!info->read)
+ return -EINVAL;
+
+ for (i = 0; i < cxt->kmsg_max_cnt; i++) {
+ zone = cxt->kpszs[i];
+ if (unlikely(!zone))
+ return -EINVAL;
+ if (atomic_read(&zone->dirty)) {
+ unsigned int wcnt = cxt->kmsg_write_cnt;
+ struct pstore_zone *new = cxt->kpszs[wcnt];
+ int ret;
+
+ ret = psz_move_zone(zone, new);
+ if (ret) {
+ pr_err("move zone from %lu to %d failed\n",
+ i, wcnt);
+ return ret;
+ }
+ cxt->kmsg_write_cnt = (wcnt + 1) % cxt->kmsg_max_cnt;
+ }
+ if (!zone->should_recover)
+ continue;
+ buf = zone->buffer;
+ rcnt = info->read((char *)buf, zone->buffer_size + sizeof(*buf),
+ zone->off);
+ if (rcnt != zone->buffer_size + sizeof(*buf))
+ return (int)rcnt < 0 ? (int)rcnt : -EIO;
+ }
+ return 0;
+}
+
+static int psz_kmsg_recover_meta(struct psz_context *cxt)
+{
+ struct pstore_zone_info *info = cxt->pstore_zone_info;
+ struct pstore_zone *zone;
+ size_t rcnt, len;
+ struct psz_buffer *buf;
+ struct psz_kmsg_header *hdr;
+ struct timespec64 time = { };
+ unsigned long i;
+ /*
+ * Recover may on panic, we can't allocate any memory by kmalloc.
+ * So, we use local array instead.
+ */
+ char buffer_header[sizeof(*buf) + sizeof(*hdr)] = {0};
+
+ if (!info->read)
+ return -EINVAL;
+
+ len = sizeof(*buf) + sizeof(*hdr);
+ buf = (struct psz_buffer *)buffer_header;
+ for (i = 0; i < cxt->kmsg_max_cnt; i++) {
+ zone = cxt->kpszs[i];
+ if (unlikely(!zone))
+ return -EINVAL;
+
+ rcnt = info->read((char *)buf, len, zone->off);
+ if (rcnt == -ENOMSG) {
+ pr_debug("%s with id %lu may be broken, skip\n",
+ zone->name, i);
+ continue;
+ } else if (rcnt != len) {
+ pr_err("read %s with id %lu failed\n", zone->name, i);
+ return (int)rcnt < 0 ? (int)rcnt : -EIO;
+ }
+
+ if (buf->sig != zone->buffer->sig) {
+ pr_debug("no valid data in kmsg dump zone %lu\n", i);
+ continue;
+ }
+
+ if (zone->buffer_size < atomic_read(&buf->datalen)) {
+ pr_info("found overtop zone: %s: id %lu, off %lld, size %zu\n",
+ zone->name, i, zone->off,
+ zone->buffer_size);
+ continue;
+ }
+
+ hdr = (struct psz_kmsg_header *)buf->data;
+ if (hdr->magic != PSTORE_KMSG_HEADER_MAGIC) {
+ pr_info("found invalid zone: %s: id %lu, off %lld, size %zu\n",
+ zone->name, i, zone->off,
+ zone->buffer_size);
+ continue;
+ }
+
+ /*
+ * we get the newest zone, and the next one must be the oldest
+ * or unused zone, because we do write one by one like a circle.
+ */
+ if (hdr->time.tv_sec >= time.tv_sec) {
+ time.tv_sec = hdr->time.tv_sec;
+ cxt->kmsg_write_cnt = (i + 1) % cxt->kmsg_max_cnt;
+ }
+
+ if (hdr->reason == KMSG_DUMP_OOPS)
+ cxt->oops_counter =
+ max(cxt->oops_counter, hdr->counter);
+ else if (hdr->reason == KMSG_DUMP_PANIC)
+ cxt->panic_counter =
+ max(cxt->panic_counter, hdr->counter);
+
+ if (!atomic_read(&buf->datalen)) {
+ pr_debug("found erased zone: %s: id %lu, off %lld, size %zu, datalen %d\n",
+ zone->name, i, zone->off,
+ zone->buffer_size,
+ atomic_read(&buf->datalen));
+ continue;
+ }
+
+ if (!is_on_panic())
+ zone->should_recover = true;
+ pr_debug("found nice zone: %s: id %lu, off %lld, size %zu, datalen %d\n",
+ zone->name, i, zone->off,
+ zone->buffer_size, atomic_read(&buf->datalen));
+ }
+
+ return 0;
+}
+
+static int psz_kmsg_recover(struct psz_context *cxt)
+{
+ int ret;
+
+ if (!cxt->kpszs)
+ return 0;
+
+ ret = psz_kmsg_recover_meta(cxt);
+ if (ret)
+ goto recover_fail;
+
+ ret = psz_kmsg_recover_data(cxt);
+ if (ret)
+ goto recover_fail;
+
+ return 0;
+recover_fail:
+ pr_debug("psz_recover_kmsg failed\n");
+ return ret;
+}
+
+static int psz_recover_zone(struct psz_context *cxt, struct pstore_zone *zone)
+{
+ struct pstore_zone_info *info = cxt->pstore_zone_info;
+ struct psz_buffer *oldbuf, tmpbuf;
+ int ret = 0;
+ char *buf;
+ ssize_t rcnt, len, start, off;
+
+ if (!zone || zone->oldbuf)
+ return 0;
+
+ if (is_on_panic()) {
+ /* save data as much as possible */
+ psz_flush_dirty_zone(zone);
+ return 0;
+ }
+
+ if (unlikely(!info->read))
+ return -EINVAL;
+
+ len = sizeof(struct psz_buffer);
+ rcnt = info->read((char *)&tmpbuf, len, zone->off);
+ if (rcnt != len) {
+ pr_debug("read zone %s failed\n", zone->name);
+ return (int)rcnt < 0 ? (int)rcnt : -EIO;
+ }
+
+ if (tmpbuf.sig != zone->buffer->sig) {
+ pr_debug("no valid data in zone %s\n", zone->name);
+ return 0;
+ }
+
+ if (zone->buffer_size < atomic_read(&tmpbuf.datalen) ||
+ zone->buffer_size < atomic_read(&tmpbuf.start)) {
+ pr_info("found overtop zone: %s: off %lld, size %zu\n",
+ zone->name, zone->off, zone->buffer_size);
+ /* just keep going */
+ return 0;
+ }
+
+ if (!atomic_read(&tmpbuf.datalen)) {
+ pr_debug("found erased zone: %s: off %lld, size %zu, datalen %d\n",
+ zone->name, zone->off, zone->buffer_size,
+ atomic_read(&tmpbuf.datalen));
+ return 0;
+ }
+
+ pr_debug("found nice zone: %s: off %lld, size %zu, datalen %d\n",
+ zone->name, zone->off, zone->buffer_size,
+ atomic_read(&tmpbuf.datalen));
+
+ len = atomic_read(&tmpbuf.datalen) + sizeof(*oldbuf);
+ oldbuf = kzalloc(len, GFP_KERNEL);
+ if (!oldbuf)
+ return -ENOMEM;
+
+ memcpy(oldbuf, &tmpbuf, sizeof(*oldbuf));
+ buf = (char *)oldbuf + sizeof(*oldbuf);
+ len = atomic_read(&oldbuf->datalen);
+ start = atomic_read(&oldbuf->start);
+ off = zone->off + sizeof(*oldbuf);
+
+ /* get part of data */
+ rcnt = info->read(buf, len - start, off + start);
+ if (rcnt != len - start) {
+ pr_err("read zone %s failed\n", zone->name);
+ ret = (int)rcnt < 0 ? (int)rcnt : -EIO;
+ goto free_oldbuf;
+ }
+
+ /* get the rest of data */
+ rcnt = info->read(buf + len - start, start, off);
+ if (rcnt != start) {
+ pr_err("read zone %s failed\n", zone->name);
+ ret = (int)rcnt < 0 ? (int)rcnt : -EIO;
+ goto free_oldbuf;
+ }
+
+ zone->oldbuf = oldbuf;
+ psz_flush_dirty_zone(zone);
+ return 0;
+
+free_oldbuf:
+ kfree(oldbuf);
+ return ret;
+}
+
+static int psz_recover_zones(struct psz_context *cxt,
+ struct pstore_zone **zones, unsigned int cnt)
+{
+ int ret;
+ unsigned int i;
+ struct pstore_zone *zone;
+
+ if (!zones)
+ return 0;
+
+ for (i = 0; i < cnt; i++) {
+ zone = zones[i];
+ if (unlikely(!zone))
+ continue;
+ ret = psz_recover_zone(cxt, zone);
+ if (ret)
+ goto recover_fail;
+ }
+
+ return 0;
+recover_fail:
+ pr_debug("recover %s[%u] failed\n", zone->name, i);
+ return ret;
+}
+
+/**
+ * psz_recovery() - recover data from storage
+ * @cxt: the context of pstore/zone
+ *
+ * recovery means reading data back from storage after rebooting
+ *
+ * Return: 0 on success, others on failure.
+ */
+static inline int psz_recovery(struct psz_context *cxt)
+{
+ int ret;
+
+ if (atomic_read(&cxt->recovered))
+ return 0;
+
+ ret = psz_kmsg_recover(cxt);
+ if (ret)
+ goto out;
+
+ ret = psz_recover_zone(cxt, cxt->ppsz);
+ if (ret)
+ goto out;
+
+ ret = psz_recover_zone(cxt, cxt->cpsz);
+ if (ret)
+ goto out;
+
+ ret = psz_recover_zones(cxt, cxt->fpszs, cxt->ftrace_max_cnt);
+
+out:
+ if (unlikely(ret))
+ pr_err("recover failed\n");
+ else {
+ pr_debug("recover end!\n");
+ atomic_set(&cxt->recovered, 1);
+ }
+ return ret;
+}
+
+static int psz_pstore_open(struct pstore_info *psi)
+{
+ struct psz_context *cxt = psi->data;
+
+ cxt->kmsg_read_cnt = 0;
+ cxt->pmsg_read_cnt = 0;
+ cxt->console_read_cnt = 0;
+ cxt->ftrace_read_cnt = 0;
+ return 0;
+}
+
+static inline bool psz_old_ok(struct pstore_zone *zone)
+{
+ if (zone && zone->oldbuf && atomic_read(&zone->oldbuf->datalen))
+ return true;
+ return false;
+}
+
+static inline bool psz_ok(struct pstore_zone *zone)
+{
+ if (zone && zone->buffer && buffer_datalen(zone))
+ return true;
+ return false;
+}
+
+static inline int psz_kmsg_erase(struct psz_context *cxt,
+ struct pstore_zone *zone, struct pstore_record *record)
+{
+ struct psz_buffer *buffer = zone->buffer;
+ struct psz_kmsg_header *hdr =
+ (struct psz_kmsg_header *)buffer->data;
+ size_t size;
+
+ if (unlikely(!psz_ok(zone)))
+ return 0;
+
+ /* this zone is already updated, no need to erase */
+ if (record->count != hdr->counter)
+ return 0;
+
+ size = buffer_datalen(zone) + sizeof(*zone->buffer);
+ atomic_set(&zone->buffer->datalen, 0);
+ if (cxt->pstore_zone_info->erase)
+ return cxt->pstore_zone_info->erase(size, zone->off);
+ else
+ return psz_zone_write(zone, FLUSH_META, NULL, 0, 0);
+}
+
+static inline int psz_record_erase(struct psz_context *cxt,
+ struct pstore_zone *zone)
+{
+ if (unlikely(!psz_old_ok(zone)))
+ return 0;
+
+ kfree(zone->oldbuf);
+ zone->oldbuf = NULL;
+ /*
+ * if there are new data in zone buffer, that means the old data
+ * are already invalid. It is no need to flush 0 (erase) to
+ * block device.
+ */
+ if (!buffer_datalen(zone))
+ return psz_zone_write(zone, FLUSH_META, NULL, 0, 0);
+ psz_flush_dirty_zone(zone);
+ return 0;
+}
+
+static int psz_pstore_erase(struct pstore_record *record)
+{
+ struct psz_context *cxt = record->psi->data;
+
+ switch (record->type) {
+ case PSTORE_TYPE_DMESG:
+ if (record->id >= cxt->kmsg_max_cnt)
+ return -EINVAL;
+ return psz_kmsg_erase(cxt, cxt->kpszs[record->id], record);
+ case PSTORE_TYPE_PMSG:
+ return psz_record_erase(cxt, cxt->ppsz);
+ case PSTORE_TYPE_CONSOLE:
+ return psz_record_erase(cxt, cxt->cpsz);
+ case PSTORE_TYPE_FTRACE:
+ if (record->id >= cxt->ftrace_max_cnt)
+ return -EINVAL;
+ return psz_record_erase(cxt, cxt->fpszs[record->id]);
+ default: return -EINVAL;
+ }
+}
+
+static void psz_write_kmsg_hdr(struct pstore_zone *zone,
+ struct pstore_record *record)
+{
+ struct psz_context *cxt = record->psi->data;
+ struct psz_buffer *buffer = zone->buffer;
+ struct psz_kmsg_header *hdr =
+ (struct psz_kmsg_header *)buffer->data;
+
+ hdr->magic = PSTORE_KMSG_HEADER_MAGIC;
+ hdr->compressed = record->compressed;
+ hdr->time.tv_sec = record->time.tv_sec;
+ hdr->time.tv_nsec = record->time.tv_nsec;
+ hdr->reason = record->reason;
+ if (hdr->reason == KMSG_DUMP_OOPS)
+ hdr->counter = ++cxt->oops_counter;
+ else if (hdr->reason == KMSG_DUMP_PANIC)
+ hdr->counter = ++cxt->panic_counter;
+ else
+ hdr->counter = 0;
+}
+
+/*
+ * In case zone is broken, which may occur to MTD device, we try each zones,
+ * start at cxt->kmsg_write_cnt.
+ */
+static inline int notrace psz_kmsg_write_record(struct psz_context *cxt,
+ struct pstore_record *record)
+{
+ size_t size, hlen;
+ struct pstore_zone *zone;
+ unsigned int i;
+
+ for (i = 0; i < cxt->kmsg_max_cnt; i++) {
+ unsigned int zonenum, len;
+ int ret;
+
+ zonenum = (cxt->kmsg_write_cnt + i) % cxt->kmsg_max_cnt;
+ zone = cxt->kpszs[zonenum];
+ if (unlikely(!zone))
+ return -ENOSPC;
+
+ /* avoid destroying old data, allocate a new one */
+ len = zone->buffer_size + sizeof(*zone->buffer);
+ zone->oldbuf = zone->buffer;
+ zone->buffer = kzalloc(len, GFP_KERNEL);
+ if (!zone->buffer) {
+ zone->buffer = zone->oldbuf;
+ return -ENOMEM;
+ }
+ zone->buffer->sig = zone->oldbuf->sig;
+
+ pr_debug("write %s to zone id %d\n", zone->name, zonenum);
+ psz_write_kmsg_hdr(zone, record);
+ hlen = sizeof(struct psz_kmsg_header);
+ size = min_t(size_t, record->size, zone->buffer_size - hlen);
+ ret = psz_zone_write(zone, FLUSH_ALL, record->buf, size, hlen);
+ if (likely(!ret || ret != -ENOMSG)) {
+ cxt->kmsg_write_cnt = zonenum + 1;
+ cxt->kmsg_write_cnt %= cxt->kmsg_max_cnt;
+ /* no need to try next zone, free last zone buffer */
+ kfree(zone->oldbuf);
+ zone->oldbuf = NULL;
+ return ret;
+ }
+
+ pr_debug("zone %u may be broken, try next dmesg zone\n",
+ zonenum);
+ kfree(zone->buffer);
+ zone->buffer = zone->oldbuf;
+ zone->oldbuf = NULL;
+ }
+
+ return -EBUSY;
+}
+
+static int notrace psz_kmsg_write(struct psz_context *cxt,
+ struct pstore_record *record)
+{
+ int ret;
+
+ /*
+ * Explicitly only take the first part of any new crash.
+ * If our buffer is larger than kmsg_bytes, this can never happen,
+ * and if our buffer is smaller than kmsg_bytes, we don't want the
+ * report split across multiple records.
+ */
+ if (record->part != 1)
+ return -ENOSPC;
+
+ if (!cxt->kpszs)
+ return -ENOSPC;
+
+ ret = psz_kmsg_write_record(cxt, record);
+ if (!ret && is_on_panic()) {
+ /* ensure all data are flushed to storage when panic */
+ pr_debug("try to flush other dirty zones\n");
+ psz_flush_all_dirty_zones(NULL);
+ }
+
+ /* always return 0 as we had handled it on buffer */
+ return 0;
+}
+
+static int notrace psz_record_write(struct pstore_zone *zone,
+ struct pstore_record *record)
+{
+ size_t start, rem;
+ bool is_full_data = false;
+ char *buf;
+ int cnt;
+
+ if (!zone || !record)
+ return -ENOSPC;
+
+ if (atomic_read(&zone->buffer->datalen) >= zone->buffer_size)
+ is_full_data = true;
+
+ cnt = record->size;
+ buf = record->buf;
+ if (unlikely(cnt > zone->buffer_size)) {
+ buf += cnt - zone->buffer_size;
+ cnt = zone->buffer_size;
+ }
+
+ start = buffer_start(zone);
+ rem = zone->buffer_size - start;
+ if (unlikely(rem < cnt)) {
+ psz_zone_write(zone, FLUSH_PART, buf, rem, start);
+ buf += rem;
+ cnt -= rem;
+ start = 0;
+ is_full_data = true;
+ }
+
+ atomic_set(&zone->buffer->start, cnt + start);
+ psz_zone_write(zone, FLUSH_PART, buf, cnt, start);
+
+ /**
+ * psz_zone_write will set datalen as start + cnt.
+ * It work if actual data length lesser than buffer size.
+ * If data length greater than buffer size, pmsg will rewrite to
+ * beginning of zone, which make buffer->datalen wrongly.
+ * So we should reset datalen as buffer size once actual data length
+ * greater than buffer size.
+ */
+ if (is_full_data) {
+ atomic_set(&zone->buffer->datalen, zone->buffer_size);
+ psz_zone_write(zone, FLUSH_META, NULL, 0, 0);
+ }
+ return 0;
+}
+
+static int notrace psz_pstore_write(struct pstore_record *record)
+{
+ struct psz_context *cxt = record->psi->data;
+
+ if (record->type == PSTORE_TYPE_DMESG &&
+ record->reason == KMSG_DUMP_PANIC)
+ atomic_set(&cxt->on_panic, 1);
+
+ /*
+ * if on panic, do not write except panic records
+ * Fix case that panic_write prints log which wakes up console backend.
+ */
+ if (is_on_panic() && record->type != PSTORE_TYPE_DMESG)
+ return -EBUSY;
+
+ switch (record->type) {
+ case PSTORE_TYPE_DMESG:
+ return psz_kmsg_write(cxt, record);
+ case PSTORE_TYPE_CONSOLE:
+ return psz_record_write(cxt->cpsz, record);
+ case PSTORE_TYPE_PMSG:
+ return psz_record_write(cxt->ppsz, record);
+ case PSTORE_TYPE_FTRACE: {
+ int zonenum = smp_processor_id();
+
+ if (!cxt->fpszs)
+ return -ENOSPC;
+ return psz_record_write(cxt->fpszs[zonenum], record);
+ }
+ default:
+ return -EINVAL;
+ }
+}
+
+static struct pstore_zone *psz_read_next_zone(struct psz_context *cxt)
+{
+ struct pstore_zone *zone = NULL;
+
+ while (cxt->kmsg_read_cnt < cxt->kmsg_max_cnt) {
+ zone = cxt->kpszs[cxt->kmsg_read_cnt++];
+ if (psz_ok(zone))
+ return zone;
+ }
+
+ if (cxt->ftrace_read_cnt < cxt->ftrace_max_cnt)
+ /*
+ * No need psz_old_ok(). Let psz_ftrace_read() do so for
+ * combination. psz_ftrace_read() should traverse over
+ * all zones in case of some zone without data.
+ */
+ return cxt->fpszs[cxt->ftrace_read_cnt++];
+
+ if (cxt->pmsg_read_cnt == 0) {
+ cxt->pmsg_read_cnt++;
+ zone = cxt->ppsz;
+ if (psz_old_ok(zone))
+ return zone;
+ }
+
+ if (cxt->console_read_cnt == 0) {
+ cxt->console_read_cnt++;
+ zone = cxt->cpsz;
+ if (psz_old_ok(zone))
+ return zone;
+ }
+
+ return NULL;
+}
+
+static int psz_kmsg_read_hdr(struct pstore_zone *zone,
+ struct pstore_record *record)
+{
+ struct psz_buffer *buffer = zone->buffer;
+ struct psz_kmsg_header *hdr =
+ (struct psz_kmsg_header *)buffer->data;
+
+ if (hdr->magic != PSTORE_KMSG_HEADER_MAGIC)
+ return -EINVAL;
+ record->compressed = hdr->compressed;
+ record->time.tv_sec = hdr->time.tv_sec;
+ record->time.tv_nsec = hdr->time.tv_nsec;
+ record->reason = hdr->reason;
+ record->count = hdr->counter;
+ return 0;
+}
+
+static ssize_t psz_kmsg_read(struct pstore_zone *zone,
+ struct pstore_record *record)
+{
+ ssize_t size, hlen = 0;
+
+ size = buffer_datalen(zone);
+ /* Clear and skip this kmsg dump record if it has no valid header */
+ if (psz_kmsg_read_hdr(zone, record)) {
+ atomic_set(&zone->buffer->datalen, 0);
+ atomic_set(&zone->dirty, 0);
+ return -ENOMSG;
+ }
+ size -= sizeof(struct psz_kmsg_header);
+
+ if (!record->compressed) {
+ char *buf = kasprintf(GFP_KERNEL, "%s: Total %d times\n",
+ kmsg_dump_reason_str(record->reason),
+ record->count);
+ hlen = strlen(buf);
+ record->buf = krealloc(buf, hlen + size, GFP_KERNEL);
+ if (!record->buf) {
+ kfree(buf);
+ return -ENOMEM;
+ }
+ } else {
+ record->buf = kmalloc(size, GFP_KERNEL);
+ if (!record->buf)
+ return -ENOMEM;
+ }
+
+ size = psz_zone_read_buffer(zone, record->buf + hlen, size,
+ sizeof(struct psz_kmsg_header));
+ if (unlikely(size < 0)) {
+ kfree(record->buf);
+ return -ENOMSG;
+ }
+
+ return size + hlen;
+}
+
+/* try to combine all ftrace zones */
+static ssize_t psz_ftrace_read(struct pstore_zone *zone,
+ struct pstore_record *record)
+{
+ struct psz_context *cxt;
+ struct psz_buffer *buf;
+ int ret;
+
+ if (!zone || !record)
+ return -ENOSPC;
+
+ if (!psz_old_ok(zone))
+ goto out;
+
+ buf = (struct psz_buffer *)zone->oldbuf;
+ if (!buf)
+ return -ENOMSG;
+
+ ret = pstore_ftrace_combine_log(&record->buf, &record->size,
+ (char *)buf->data, atomic_read(&buf->datalen));
+ if (unlikely(ret))
+ return ret;
+
+out:
+ cxt = record->psi->data;
+ if (cxt->ftrace_read_cnt < cxt->ftrace_max_cnt)
+ /* then, read next ftrace zone */
+ return -ENOMSG;
+ record->id = 0;
+ return record->size ? record->size : -ENOMSG;
+}
+
+static ssize_t psz_record_read(struct pstore_zone *zone,
+ struct pstore_record *record)
+{
+ size_t len;
+ struct psz_buffer *buf;
+
+ if (!zone || !record)
+ return -ENOSPC;
+
+ buf = (struct psz_buffer *)zone->oldbuf;
+ if (!buf)
+ return -ENOMSG;
+
+ len = atomic_read(&buf->datalen);
+ record->buf = kmalloc(len, GFP_KERNEL);
+ if (!record->buf)
+ return -ENOMEM;
+
+ if (unlikely(psz_zone_read_oldbuf(zone, record->buf, len, 0))) {
+ kfree(record->buf);
+ return -ENOMSG;
+ }
+
+ return len;
+}
+
+static ssize_t psz_pstore_read(struct pstore_record *record)
+{
+ struct psz_context *cxt = record->psi->data;
+ ssize_t (*readop)(struct pstore_zone *zone,
+ struct pstore_record *record);
+ struct pstore_zone *zone;
+ ssize_t ret;
+
+ /* before read, we must recover from storage */
+ ret = psz_recovery(cxt);
+ if (ret)
+ return ret;
+
+next_zone:
+ zone = psz_read_next_zone(cxt);
+ if (!zone)
+ return 0;
+
+ record->type = zone->type;
+ switch (record->type) {
+ case PSTORE_TYPE_DMESG:
+ readop = psz_kmsg_read;
+ record->id = cxt->kmsg_read_cnt - 1;
+ break;
+ case PSTORE_TYPE_FTRACE:
+ readop = psz_ftrace_read;
+ break;
+ case PSTORE_TYPE_CONSOLE:
+ fallthrough;
+ case PSTORE_TYPE_PMSG:
+ readop = psz_record_read;
+ break;
+ default:
+ goto next_zone;
+ }
+
+ ret = readop(zone, record);
+ if (ret == -ENOMSG)
+ goto next_zone;
+ return ret;
+}
+
+static struct psz_context pstore_zone_cxt = {
+ .pstore_zone_info_lock =
+ __MUTEX_INITIALIZER(pstore_zone_cxt.pstore_zone_info_lock),
+ .recovered = ATOMIC_INIT(0),
+ .on_panic = ATOMIC_INIT(0),
+ .pstore = {
+ .owner = THIS_MODULE,
+ .open = psz_pstore_open,
+ .read = psz_pstore_read,
+ .write = psz_pstore_write,
+ .erase = psz_pstore_erase,
+ },
+};
+
+static void psz_free_zone(struct pstore_zone **pszone)
+{
+ struct pstore_zone *zone = *pszone;
+
+ if (!zone)
+ return;
+
+ kfree(zone->buffer);
+ kfree(zone);
+ *pszone = NULL;
+}
+
+static void psz_free_zones(struct pstore_zone ***pszones, unsigned int *cnt)
+{
+ struct pstore_zone **zones = *pszones;
+
+ if (!zones)
+ return;
+
+ while (*cnt > 0) {
+ (*cnt)--;
+ psz_free_zone(&(zones[*cnt]));
+ }
+ kfree(zones);
+ *pszones = NULL;
+}
+
+static void psz_free_all_zones(struct psz_context *cxt)
+{
+ if (cxt->kpszs)
+ psz_free_zones(&cxt->kpszs, &cxt->kmsg_max_cnt);
+ if (cxt->ppsz)
+ psz_free_zone(&cxt->ppsz);
+ if (cxt->cpsz)
+ psz_free_zone(&cxt->cpsz);
+ if (cxt->fpszs)
+ psz_free_zones(&cxt->fpszs, &cxt->ftrace_max_cnt);
+}
+
+static struct pstore_zone *psz_init_zone(enum pstore_type_id type,
+ loff_t *off, size_t size)
+{
+ struct pstore_zone_info *info = pstore_zone_cxt.pstore_zone_info;
+ struct pstore_zone *zone;
+ const char *name = pstore_type_to_name(type);
+
+ if (!size)
+ return NULL;
+
+ if (*off + size > info->total_size) {
+ pr_err("no room for %s (0x%zx@0x%llx over 0x%lx)\n",
+ name, size, *off, info->total_size);
+ return ERR_PTR(-ENOMEM);
+ }
+
+ zone = kzalloc(sizeof(struct pstore_zone), GFP_KERNEL);
+ if (!zone)
+ return ERR_PTR(-ENOMEM);
+
+ zone->buffer = kmalloc(size, GFP_KERNEL);
+ if (!zone->buffer) {
+ kfree(zone);
+ return ERR_PTR(-ENOMEM);
+ }
+ memset(zone->buffer, 0xFF, size);
+ zone->off = *off;
+ zone->name = name;
+ zone->type = type;
+ zone->buffer_size = size - sizeof(struct psz_buffer);
+ zone->buffer->sig = type ^ PSZ_SIG;
+ zone->oldbuf = NULL;
+ atomic_set(&zone->dirty, 0);
+ atomic_set(&zone->buffer->datalen, 0);
+ atomic_set(&zone->buffer->start, 0);
+
+ *off += size;
+
+ pr_debug("pszone %s: off 0x%llx, %zu header, %zu data\n", zone->name,
+ zone->off, sizeof(*zone->buffer), zone->buffer_size);
+ return zone;
+}
+
+static struct pstore_zone **psz_init_zones(enum pstore_type_id type,
+ loff_t *off, size_t total_size, ssize_t record_size,
+ unsigned int *cnt)
+{
+ struct pstore_zone_info *info = pstore_zone_cxt.pstore_zone_info;
+ struct pstore_zone **zones, *zone;
+ const char *name = pstore_type_to_name(type);
+ int c, i;
+
+ *cnt = 0;
+ if (!total_size || !record_size)
+ return NULL;
+
+ if (*off + total_size > info->total_size) {
+ pr_err("no room for zones %s (0x%zx@0x%llx over 0x%lx)\n",
+ name, total_size, *off, info->total_size);
+ return ERR_PTR(-ENOMEM);
+ }
+
+ c = total_size / record_size;
+ zones = kcalloc(c, sizeof(*zones), GFP_KERNEL);
+ if (!zones) {
+ pr_err("allocate for zones %s failed\n", name);
+ return ERR_PTR(-ENOMEM);
+ }
+ memset(zones, 0, c * sizeof(*zones));
+
+ for (i = 0; i < c; i++) {
+ zone = psz_init_zone(type, off, record_size);
+ if (!zone || IS_ERR(zone)) {
+ pr_err("initialize zones %s failed\n", name);
+ psz_free_zones(&zones, &i);
+ return (void *)zone;
+ }
+ zones[i] = zone;
+ }
+
+ *cnt = c;
+ return zones;
+}
+
+static int psz_alloc_zones(struct psz_context *cxt)
+{
+ struct pstore_zone_info *info = cxt->pstore_zone_info;
+ loff_t off = 0;
+ int err;
+ size_t off_size = 0;
+
+ off_size += info->pmsg_size;
+ cxt->ppsz = psz_init_zone(PSTORE_TYPE_PMSG, &off, info->pmsg_size);
+ if (IS_ERR(cxt->ppsz)) {
+ err = PTR_ERR(cxt->ppsz);
+ cxt->ppsz = NULL;
+ goto free_out;
+ }
+
+ off_size += info->console_size;
+ cxt->cpsz = psz_init_zone(PSTORE_TYPE_CONSOLE, &off,
+ info->console_size);
+ if (IS_ERR(cxt->cpsz)) {
+ err = PTR_ERR(cxt->cpsz);
+ cxt->cpsz = NULL;
+ goto free_out;
+ }
+
+ off_size += info->ftrace_size;
+ cxt->fpszs = psz_init_zones(PSTORE_TYPE_FTRACE, &off,
+ info->ftrace_size,
+ info->ftrace_size / nr_cpu_ids,
+ &cxt->ftrace_max_cnt);
+ if (IS_ERR(cxt->fpszs)) {
+ err = PTR_ERR(cxt->fpszs);
+ cxt->fpszs = NULL;
+ goto free_out;
+ }
+
+ cxt->kpszs = psz_init_zones(PSTORE_TYPE_DMESG, &off,
+ info->total_size - off_size,
+ info->kmsg_size, &cxt->kmsg_max_cnt);
+ if (IS_ERR(cxt->kpszs)) {
+ err = PTR_ERR(cxt->kpszs);
+ cxt->kpszs = NULL;
+ goto free_out;
+ }
+
+ return 0;
+free_out:
+ psz_free_all_zones(cxt);
+ return err;
+}
+
+/**
+ * register_pstore_zone() - register to pstore/zone
+ *
+ * @info: back-end driver information. See &struct pstore_zone_info.
+ *
+ * Only one back-end at one time.
+ *
+ * Return: 0 on success, others on failure.
+ */
+int register_pstore_zone(struct pstore_zone_info *info)
+{
+ int err = -EINVAL;
+ struct psz_context *cxt = &pstore_zone_cxt;
+
+ if (info->total_size < 4096) {
+ pr_warn("total_size must be >= 4096\n");
+ return -EINVAL;
+ }
+
+ if (!info->kmsg_size && !info->pmsg_size && !info->console_size &&
+ !info->ftrace_size) {
+ pr_warn("at least one record size must be non-zero\n");
+ return -EINVAL;
+ }
+
+ if (!info->name || !info->name[0])
+ return -EINVAL;
+
+#define check_size(name, size) { \
+ if (info->name > 0 && info->name < (size)) { \
+ pr_err(#name " must be over %d\n", (size)); \
+ return -EINVAL; \
+ } \
+ if (info->name & (size - 1)) { \
+ pr_err(#name " must be a multiple of %d\n", \
+ (size)); \
+ return -EINVAL; \
+ } \
+ }
+
+ check_size(total_size, 4096);
+ check_size(kmsg_size, SECTOR_SIZE);
+ check_size(pmsg_size, SECTOR_SIZE);
+ check_size(console_size, SECTOR_SIZE);
+ check_size(ftrace_size, SECTOR_SIZE);
+
+#undef check_size
+
+ /*
+ * the @read and @write must be applied.
+ * if no @read, pstore may mount failed.
+ * if no @write, pstore do not support to remove record file.
+ */
+ if (!info->read || !info->write) {
+ pr_err("no valid general read/write interface\n");
+ return -EINVAL;
+ }
+
+ mutex_lock(&cxt->pstore_zone_info_lock);
+ if (cxt->pstore_zone_info) {
+ pr_warn("'%s' already loaded: ignoring '%s'\n",
+ cxt->pstore_zone_info->name, info->name);
+ mutex_unlock(&cxt->pstore_zone_info_lock);
+ return -EBUSY;
+ }
+ cxt->pstore_zone_info = info;
+
+ pr_debug("register %s with properties:\n", info->name);
+ pr_debug("\ttotal size : %ld Bytes\n", info->total_size);
+ pr_debug("\tkmsg size : %ld Bytes\n", info->kmsg_size);
+ pr_debug("\tpmsg size : %ld Bytes\n", info->pmsg_size);
+ pr_debug("\tconsole size : %ld Bytes\n", info->console_size);
+ pr_debug("\tftrace size : %ld Bytes\n", info->ftrace_size);
+
+ err = psz_alloc_zones(cxt);
+ if (err) {
+ pr_err("alloc zones failed\n");
+ goto fail_out;
+ }
+
+ if (info->kmsg_size) {
+ cxt->pstore.bufsize = cxt->kpszs[0]->buffer_size -
+ sizeof(struct psz_kmsg_header);
+ cxt->pstore.buf = kzalloc(cxt->pstore.bufsize, GFP_KERNEL);
+ if (!cxt->pstore.buf) {
+ err = -ENOMEM;
+ goto fail_free;
+ }
+ }
+ cxt->pstore.data = cxt;
+
+ pr_info("registered %s as backend for", info->name);
+ cxt->pstore.max_reason = info->max_reason;
+ cxt->pstore.name = info->name;
+ if (info->kmsg_size) {
+ cxt->pstore.flags |= PSTORE_FLAGS_DMESG;
+ pr_cont(" kmsg(%s",
+ kmsg_dump_reason_str(cxt->pstore.max_reason));
+ if (cxt->pstore_zone_info->panic_write)
+ pr_cont(",panic_write");
+ pr_cont(")");
+ }
+ if (info->pmsg_size) {
+ cxt->pstore.flags |= PSTORE_FLAGS_PMSG;
+ pr_cont(" pmsg");
+ }
+ if (info->console_size) {
+ cxt->pstore.flags |= PSTORE_FLAGS_CONSOLE;
+ pr_cont(" console");
+ }
+ if (info->ftrace_size) {
+ cxt->pstore.flags |= PSTORE_FLAGS_FTRACE;
+ pr_cont(" ftrace");
+ }
+ pr_cont("\n");
+
+ err = pstore_register(&cxt->pstore);
+ if (err) {
+ pr_err("registering with pstore failed\n");
+ goto fail_free;
+ }
+ mutex_unlock(&pstore_zone_cxt.pstore_zone_info_lock);
+
+ return 0;
+
+fail_free:
+ kfree(cxt->pstore.buf);
+ cxt->pstore.buf = NULL;
+ cxt->pstore.bufsize = 0;
+ psz_free_all_zones(cxt);
+fail_out:
+ pstore_zone_cxt.pstore_zone_info = NULL;
+ mutex_unlock(&pstore_zone_cxt.pstore_zone_info_lock);
+ return err;
+}
+EXPORT_SYMBOL_GPL(register_pstore_zone);
+
+/**
+ * unregister_pstore_zone() - unregister to pstore/zone
+ *
+ * @info: back-end driver information. See struct pstore_zone_info.
+ */
+void unregister_pstore_zone(struct pstore_zone_info *info)
+{
+ struct psz_context *cxt = &pstore_zone_cxt;
+
+ mutex_lock(&cxt->pstore_zone_info_lock);
+ if (!cxt->pstore_zone_info) {
+ mutex_unlock(&cxt->pstore_zone_info_lock);
+ return;
+ }
+
+ /* Stop incoming writes from pstore. */
+ pstore_unregister(&cxt->pstore);
+
+ /* Flush any pending writes. */
+ psz_flush_all_dirty_zones(NULL);
+ flush_delayed_work(&psz_cleaner);
+
+ /* Clean up allocations. */
+ kfree(cxt->pstore.buf);
+ cxt->pstore.buf = NULL;
+ cxt->pstore.bufsize = 0;
+ cxt->pstore_zone_info = NULL;
+
+ psz_free_all_zones(cxt);
+
+ /* Clear counters and zone state. */
+ cxt->oops_counter = 0;
+ cxt->panic_counter = 0;
+ atomic_set(&cxt->recovered, 0);
+ atomic_set(&cxt->on_panic, 0);
+
+ mutex_unlock(&cxt->pstore_zone_info_lock);
+}
+EXPORT_SYMBOL_GPL(unregister_pstore_zone);
+
+MODULE_LICENSE("GPL");
+MODULE_AUTHOR("WeiXiong Liao <liaoweixiong@allwinnertech.com>");
+MODULE_AUTHOR("Kees Cook <keescook@chromium.org>");
+MODULE_DESCRIPTION("Storage Manager for pstore/blk");
* Check pipe occupancy without the inode lock first. This function
* is speculative anyways, so missing one is ok.
*/
- if (pipe_full(pipe->head, pipe->tail, pipe->max_usage))
+ if (!pipe_full(pipe->head, pipe->tail, pipe->max_usage))
return 0;
ret = 0;
#include <linux/slab.h>
#include <linux/percpu.h>
#include <linux/buffer_head.h>
+#include <linux/local_lock.h>
#include "squashfs_fs.h"
#include "squashfs_fs_sb.h"
*/
struct squashfs_stream {
- void *stream;
+ void *stream;
+ local_lock_t lock;
};
void *squashfs_decompressor_create(struct squashfs_sb_info *msblk,
err = PTR_ERR(stream->stream);
goto out;
}
+ local_lock_init(&stream->lock);
}
kfree(comp_opts);
int squashfs_decompress(struct squashfs_sb_info *msblk, struct buffer_head **bh,
int b, int offset, int length, struct squashfs_page_actor *output)
{
- struct squashfs_stream __percpu *percpu =
- (struct squashfs_stream __percpu *) msblk->stream;
- struct squashfs_stream *stream = get_cpu_ptr(percpu);
- int res = msblk->decompressor->decompress(msblk, stream->stream, bh, b,
- offset, length, output);
- put_cpu_ptr(stream);
+ struct squashfs_stream *stream;
+ int res;
+
+ local_lock(&msblk->stream->lock);
+ stream = this_cpu_ptr(msblk->stream);
+
+ res = msblk->decompressor->decompress(msblk, stream->stream, bh, b,
+ offset, length, output);
+
+ local_unlock(&msblk->stream->lock);
if (res < 0)
ERROR("%s decompression failed, data probably corrupt\n",
u8 *hash)
{
const struct ubifs_ch *ch = node;
- SHASH_DESC_ON_STACK(shash, c->hash_tfm);
- int err;
-
- shash->tfm = c->hash_tfm;
- err = crypto_shash_digest(shash, node, le32_to_cpu(ch->len), hash);
- if (err < 0)
- return err;
- return 0;
+ return crypto_shash_tfm_digest(c->hash_tfm, node, le32_to_cpu(ch->len),
+ hash);
}
/**
static int ubifs_hash_calc_hmac(const struct ubifs_info *c, const u8 *hash,
u8 *hmac)
{
- SHASH_DESC_ON_STACK(shash, c->hmac_tfm);
- int err;
-
- shash->tfm = c->hmac_tfm;
-
- err = crypto_shash_digest(shash, hash, c->hash_len, hmac);
- if (err < 0)
- return err;
- return 0;
+ return crypto_shash_tfm_digest(c->hmac_tfm, hash, c->hash_len, hmac);
}
/**
struct shash_desc *inhash)
{
struct ubifs_auth_node *auth = node;
- u8 *hash;
+ u8 hash[UBIFS_HASH_ARR_SZ];
int err;
- hash = kmalloc(crypto_shash_descsize(c->hash_tfm), GFP_NOFS);
- if (!hash)
- return -ENOMEM;
-
{
SHASH_DESC_ON_STACK(hash_desc, c->hash_tfm);
err = crypto_shash_final(hash_desc, hash);
if (err)
- goto out;
+ return err;
}
err = ubifs_hash_calc_hmac(c, hash, auth->hmac);
if (err)
- goto out;
+ return err;
auth->ch.node_type = UBIFS_AUTH_NODE;
ubifs_prepare_node(c, auth, ubifs_auth_node_sz(c), 0);
-
- err = 0;
-out:
- kfree(hash);
-
- return err;
+ return 0;
}
static struct shash_desc *ubifs_get_desc(const struct ubifs_info *c,
struct ubifs_info *c = inode->i_sb->s_fs_info;
struct ubifs_budget_req req = { .dirtied_ino = 1,
.dirtied_ino_d = ALIGN(ui->data_len, 8) };
- int iflags = I_DIRTY_TIME;
int err, release;
if (!IS_ENABLED(CONFIG_UBIFS_ATIME_SUPPORT))
if (flags & S_MTIME)
inode->i_mtime = *time;
- if (!(inode->i_sb->s_flags & SB_LAZYTIME))
- iflags |= I_DIRTY_SYNC;
-
release = ui->dirty;
- __mark_inode_dirty(inode, iflags);
+ __mark_inode_dirty(inode, I_DIRTY_SYNC);
mutex_unlock(&ui->ui_mutex);
if (release)
ubifs_release_budget(c, &req);
u8 calc[UBIFS_MAX_HASH_LEN];
const void *node = mst;
- SHASH_DESC_ON_STACK(shash, c->hash_tfm);
-
- shash->tfm = c->hash_tfm;
-
- crypto_shash_digest(shash, node + sizeof(struct ubifs_ch),
- UBIFS_MST_NODE_SZ - sizeof(struct ubifs_ch), calc);
+ crypto_shash_tfm_digest(c->hash_tfm, node + sizeof(struct ubifs_ch),
+ UBIFS_MST_NODE_SZ - sizeof(struct ubifs_ch),
+ calc);
if (ubifs_check_hash(c, expected, calc))
return -EPERM;
return data == 0xFFFFFFFF;
}
-/* authenticate_sleb_hash and authenticate_sleb_hmac are split out for stack usage */
+/* authenticate_sleb_hash is split out for stack usage */
static int authenticate_sleb_hash(struct ubifs_info *c, struct shash_desc *log_hash, u8 *hash)
{
SHASH_DESC_ON_STACK(hash_desc, c->hash_tfm);
return crypto_shash_final(hash_desc, hash);
}
-static int authenticate_sleb_hmac(struct ubifs_info *c, u8 *hash, u8 *hmac)
-{
- SHASH_DESC_ON_STACK(hmac_desc, c->hmac_tfm);
-
- hmac_desc->tfm = c->hmac_tfm;
-
- return crypto_shash_digest(hmac_desc, hash, c->hash_len, hmac);
-}
-
/**
* authenticate_sleb - authenticate one scan LEB
* @c: UBIFS file-system description object
struct ubifs_scan_node *snod;
int n_nodes = 0;
int err;
- u8 *hash, *hmac;
+ u8 hash[UBIFS_HASH_ARR_SZ];
+ u8 hmac[UBIFS_HMAC_ARR_SZ];
if (!ubifs_authenticated(c))
return sleb->nodes_cnt;
- hash = kmalloc(crypto_shash_descsize(c->hash_tfm), GFP_NOFS);
- hmac = kmalloc(c->hmac_desc_len, GFP_NOFS);
- if (!hash || !hmac) {
- err = -ENOMEM;
- goto out;
- }
-
list_for_each_entry(snod, &sleb->nodes, list) {
n_nodes++;
if (err)
goto out;
- err = authenticate_sleb_hmac(c, hash, hmac);
+ err = crypto_shash_tfm_digest(c->hmac_tfm, hash,
+ c->hash_len, hmac);
if (err)
goto out;
err = 0;
}
out:
- kfree(hash);
- kfree(hmac);
-
return err ? err : n_nodes - n_not_auth;
}
/**
* fsverity_ioctl_enable() - enable verity on a file
+ * @filp: file to enable verity on
+ * @uarg: user pointer to fsverity_enable_arg
*
* Enable fs-verity on a file. See the "FS_IOC_ENABLE_VERITY" section of
* Documentation/filesystems/fsverity.rst for the documentation.
u64 level_start[FS_VERITY_MAX_LEVELS];
};
-/**
+/*
* fsverity_info - cached verity metadata for an inode
*
* When a verity file is first opened, an instance of this struct is allocated
/* init.c */
-extern void __printf(3, 4) __cold
+void __printf(3, 4) __cold
fsverity_msg(const struct inode *inode, const char *level,
const char *fmt, ...);
/**
* fsverity_ioctl_measure() - get a verity file's measurement
+ * @filp: file to get measurement of
+ * @_uarg: user pointer to fsverity_digest
*
* Retrieve the file measurement that the kernel is enforcing for reads from a
* verity file. See the "FS_IOC_MEASURE_VERITY" section of
/**
* fsverity_cleanup_inode() - free the inode's verity info, if present
+ * @inode: an inode being evicted
*
* Filesystems must call this on inode eviction to free ->i_verity_info.
*/
/**
* fsverity_verify_signature() - check a verity file's signature
+ * @vi: the file's fsverity_info
+ * @desc: the file's fsverity_descriptor
+ * @desc_size: size of @desc
*
* If the file's fs-verity descriptor includes a signature of the file
* measurement, verify it against the certificates in the fs-verity keyring.
/**
* fsverity_verify_page() - verify a data page
+ * @page: the page to verity
*
* Verify a page that has just been read from a verity file. The page must be a
* pagecache page that is still locked and not yet uptodate.
#ifdef CONFIG_BLOCK
/**
* fsverity_verify_bio() - verify a 'read' bio that has just completed
+ * @bio: the bio to verify
*
* Verify a set of pages that have just been read from a verity file. The pages
* must be pagecache pages that are still locked and not yet uptodate. Pages
/**
* fsverity_enqueue_verify_work() - enqueue work on the fs-verity workqueue
+ * @work: the work to enqueue
*
* Enqueue verification work for asynchronous processing.
*/
struct simple_xattr *new_xattr = NULL;
int err = 0;
+ if (removed_size)
+ *removed_size = -1;
+
/* value == NULL means remove */
if (value) {
new_xattr = simple_xattr_alloc(value, size);
list_add(&new_xattr->list, &xattrs->head);
xattr = NULL;
}
-
- if (removed_size)
- *removed_size = -1;
out:
spin_unlock(&xattrs->lock);
if (xattr) {
/* Start and end of .opd section - used for function descriptors. */
extern char __start_opd[], __end_opd[];
+/* Start and end of instrumentation protected text section */
+extern char __noinstr_text_start[], __noinstr_text_end[];
+
extern __visible const void __nosave_begin, __nosave_end;
/* Function descriptor handling (if any). Override in asm/sections.h */
#ifdef CONFIG_NEED_MULTIPLE_NODES
#define cpumask_of_node(node) ((node) == 0 ? cpu_online_mask : cpu_none_mask)
#else
- #define cpumask_of_node(node) ((void)node, cpu_online_mask)
+ #define cpumask_of_node(node) ((void)(node), cpu_online_mask)
#endif
#endif
#ifndef pcibus_to_node
. = ALIGN((align)); \
__end_rodata = .;
+/*
+ * Non-instrumentable text section
+ */
+#define NOINSTR_TEXT \
+ ALIGN_FUNCTION(); \
+ __noinstr_text_start = .; \
+ *(.noinstr.text) \
+ __noinstr_text_end = .;
+
/*
* .text section. Map to function alignment to avoid address changes
* during second ld run in second ld pass when generating System.map
#define TEXT_TEXT \
ALIGN_FUNCTION(); \
*(.text.hot TEXT_MAIN .text.fixup .text.unlikely) \
+ NOINSTR_TEXT \
*(.text..refcount) \
*(.ref.text) \
MEM_KEEP(init.text*) \
{
type &= ~CRYPTO_ALG_TYPE_MASK;
type |= CRYPTO_ALG_TYPE_ACOMPRESS;
- mask |= CRYPTO_ALG_TYPE_MASK;
+ mask |= CRYPTO_ALG_TYPE_ACOMPRESS_MASK;
return crypto_has_alg(alg_name, type, mask);
}
void crypto_init_queue(struct crypto_queue *queue, unsigned int max_qlen);
int crypto_enqueue_request(struct crypto_queue *queue,
struct crypto_async_request *request);
+void crypto_enqueue_request_head(struct crypto_queue *queue,
+ struct crypto_async_request *request);
struct crypto_async_request *crypto_dequeue_request(struct crypto_queue *queue);
static inline unsigned int crypto_queue_len(struct crypto_queue *queue)
{
static inline size_t drbg_max_requests(struct drbg_state *drbg)
{
/* SP800-90A requires 2**48 maximum requests before reseeding */
-#if (__BITS_PER_LONG == 32)
- return SIZE_MAX;
-#else
- return (1UL<<48);
-#endif
+ return (1<<20);
}
/*
* @idling: the engine is entering idle state
* @busy: request pump is busy
* @running: the engine is on working
- * @cur_req_prepared: current request is prepared
+ * @retry_support: indication that the hardware allows re-execution
+ * of a failed backlog request
+ * crypto-engine, in head position to keep order
* @list: link with the global crypto engine list
* @queue_lock: spinlock to syncronise access to request queue
* @queue: the crypto queue of the engine
* @unprepare_crypt_hardware: there are currently no more requests on the
* queue so the subsystem notifies the driver that it may relax the
* hardware by issuing this call
+ * @do_batch_requests: execute a batch of requests. Depends on multiple
+ * requests support.
* @kworker: kthread worker struct for request pump
* @pump_requests: work struct for scheduling work to the request pump
* @priv_data: the engine private data
bool idling;
bool busy;
bool running;
- bool cur_req_prepared;
+
+ bool retry_support;
struct list_head list;
spinlock_t queue_lock;
int (*prepare_crypt_hardware)(struct crypto_engine *engine);
int (*unprepare_crypt_hardware)(struct crypto_engine *engine);
+ int (*do_batch_requests)(struct crypto_engine *engine);
+
struct kthread_worker *kworker;
struct kthread_work pump_requests;
int crypto_engine_start(struct crypto_engine *engine);
int crypto_engine_stop(struct crypto_engine *engine);
struct crypto_engine *crypto_engine_alloc_init(struct device *dev, bool rt);
+struct crypto_engine *crypto_engine_alloc_init_and_set(struct device *dev,
+ bool retry_support,
+ int (*cbk_do_batch)(struct crypto_engine *engine),
+ bool rt, int qlen);
int crypto_engine_exit(struct crypto_engine *engine);
#endif /* _CRYPTO_ENGINE_H */
int crypto_shash_digest(struct shash_desc *desc, const u8 *data,
unsigned int len, u8 *out);
+/**
+ * crypto_shash_tfm_digest() - calculate message digest for buffer
+ * @tfm: hash transformation object
+ * @data: see crypto_shash_update()
+ * @len: see crypto_shash_update()
+ * @out: see crypto_shash_final()
+ *
+ * This is a simplified version of crypto_shash_digest() for users who don't
+ * want to allocate their own hash descriptor (shash_desc). Instead,
+ * crypto_shash_tfm_digest() takes a hash transformation object (crypto_shash)
+ * directly, and it allocates a hash descriptor on the stack internally.
+ * Note that this stack allocation may be fairly large.
+ *
+ * Context: Any context.
+ * Return: 0 on success; < 0 if an error occurred.
+ */
+int crypto_shash_tfm_digest(struct crypto_shash *tfm, const u8 *data,
+ unsigned int len, u8 *out);
+
/**
* crypto_shash_export() - extract operational state for message digest
* @desc: reference to the operational state handle whose state is exported
extern int crypto_sha512_finup(struct shash_desc *desc, const u8 *data,
unsigned int len, u8 *hash);
+/*
+ * An implementation of SHA-1's compression function. Don't use in new code!
+ * You shouldn't be using SHA-1, and even if you *have* to use SHA-1, this isn't
+ * the correct way to hash something with SHA-1 (use crypto_shash instead).
+ */
+#define SHA1_DIGEST_WORDS (SHA1_DIGEST_SIZE / 4)
+#define SHA1_WORKSPACE_WORDS 16
+void sha1_init(__u32 *buf);
+void sha1_transform(__u32 *digest, const char *data, __u32 *W);
+
/*
* Stand-alone implementation of the SHA256 algorithm. It is designed to
* have as little dependencies as possible so it can be used in the
* For details see lib/crypto/sha256.c
*/
-static inline int sha256_init(struct sha256_state *sctx)
+static inline void sha256_init(struct sha256_state *sctx)
{
sctx->state[0] = SHA256_H0;
sctx->state[1] = SHA256_H1;
sctx->state[6] = SHA256_H6;
sctx->state[7] = SHA256_H7;
sctx->count = 0;
-
- return 0;
}
-extern int sha256_update(struct sha256_state *sctx, const u8 *input,
- unsigned int length);
-extern int sha256_final(struct sha256_state *sctx, u8 *hash);
+void sha256_update(struct sha256_state *sctx, const u8 *data, unsigned int len);
+void sha256_final(struct sha256_state *sctx, u8 *out);
-static inline int sha224_init(struct sha256_state *sctx)
+static inline void sha224_init(struct sha256_state *sctx)
{
sctx->state[0] = SHA224_H0;
sctx->state[1] = SHA224_H1;
sctx->state[6] = SHA224_H6;
sctx->state[7] = SHA224_H7;
sctx->count = 0;
-
- return 0;
}
-extern int sha224_update(struct sha256_state *sctx, const u8 *input,
- unsigned int length);
-extern int sha224_final(struct sha256_state *sctx, u8 *hash);
+void sha224_update(struct sha256_state *sctx, const u8 *data, unsigned int len);
+void sha224_final(struct sha256_state *sctx, u8 *out);
#endif
{
struct sha256_state *sctx = shash_desc_ctx(desc);
- return sha224_init(sctx);
+ sha224_init(sctx);
+ return 0;
}
static inline int sha256_base_init(struct shash_desc *desc)
{
struct sha256_state *sctx = shash_desc_ctx(desc);
- return sha256_init(sctx);
+ sha256_init(sctx);
+ return 0;
}
static inline int sha256_base_do_update(struct shash_desc *desc,
* @MODE_HSYNC: hsync out of range
* @MODE_VSYNC: vsync out of range
* @MODE_H_ILLEGAL: mode has illegal horizontal timings
- * @MODE_V_ILLEGAL: mode has illegal horizontal timings
+ * @MODE_V_ILLEGAL: mode has illegal vertical timings
* @MODE_BAD_WIDTH: requires an unsupported linepitch
* @MODE_NOMODE: no mode with a matching name
* @MODE_NO_INTERLACE: interlaced mode not supported
#define BCM54810_EXP_BROADREACH_LRE_MISC_CTL_EN (1 << 0)
#define BCM54810_SHD_CLK_CTL 0x3
#define BCM54810_SHD_CLK_CTL_GTXCLK_EN (1 << 9)
+#define BCM54810_SHD_SCR3_TRDDAPD 0x0100
/* BCM54612E Registers */
#define BCM54612E_EXP_SPARE0 (MII_BCM54XX_EXP_SEL_ETC + 0x34)
extern bool capable_wrt_inode_uidgid(const struct inode *inode, int cap);
extern bool file_ns_capable(const struct file *file, struct user_namespace *ns, int cap);
extern bool ptracer_capable(struct task_struct *tsk, struct user_namespace *ns);
+static inline bool perfmon_capable(void)
+{
+ return capable(CAP_PERFMON) || capable(CAP_SYS_ADMIN);
+}
/* audit system wants to get cap info from files as well */
extern int get_vfs_caps_from_disk(const struct dentry *dentry, struct cpu_vfs_cap_data *cpu_caps);
/* Annotate a C jump table to allow objtool to follow the code flow */
#define __annotate_jump_table __section(.rodata..c_jump_table)
+#ifdef CONFIG_DEBUG_ENTRY
+/* Begin/end of an instrumentation safe region */
+#define instrumentation_begin() ({ \
+ asm volatile("%c0:\n\t" \
+ ".pushsection .discard.instr_begin\n\t" \
+ ".long %c0b - .\n\t" \
+ ".popsection\n\t" : : "i" (__COUNTER__)); \
+})
+
+/*
+ * Because instrumentation_{begin,end}() can nest, objtool validation considers
+ * _begin() a +1 and _end() a -1 and computes a sum over the instructions.
+ * When the value is greater than 0, we consider instrumentation allowed.
+ *
+ * There is a problem with code like:
+ *
+ * noinstr void foo()
+ * {
+ * instrumentation_begin();
+ * ...
+ * if (cond) {
+ * instrumentation_begin();
+ * ...
+ * instrumentation_end();
+ * }
+ * bar();
+ * instrumentation_end();
+ * }
+ *
+ * If instrumentation_end() would be an empty label, like all the other
+ * annotations, the inner _end(), which is at the end of a conditional block,
+ * would land on the instruction after the block.
+ *
+ * If we then consider the sum of the !cond path, we'll see that the call to
+ * bar() is with a 0-value, even though, we meant it to happen with a positive
+ * value.
+ *
+ * To avoid this, have _end() be a NOP instruction, this ensures it will be
+ * part of the condition block and does not escape.
+ */
+#define instrumentation_end() ({ \
+ asm volatile("%c0: nop\n\t" \
+ ".pushsection .discard.instr_end\n\t" \
+ ".long %c0b - .\n\t" \
+ ".popsection\n\t" : : "i" (__COUNTER__)); \
+})
+#endif /* CONFIG_DEBUG_ENTRY */
+
#else
#define annotate_reachable()
#define annotate_unreachable()
#define __annotate_jump_table
#endif
+#ifndef instrumentation_begin
+#define instrumentation_begin() do { } while(0)
+#define instrumentation_end() do { } while(0)
+#endif
+
#ifndef ASM_UNREACHABLE
# define ASM_UNREACHABLE
#endif
/* &a[0] degrades to a pointer: a different type from an array */
#define __must_be_array(a) BUILD_BUG_ON_ZERO(__same_type((a), &(a)[0]))
+/*
+ * This is needed in functions which generate the stack canary, see
+ * arch/x86/kernel/smpboot.c::start_secondary() for an example.
+ */
+#define prevent_tail_call_optimization() mb()
+
#endif /* __LINUX_COMPILER_H */
#define notrace __attribute__((__no_instrument_function__))
#endif
+/* Section for code which can't be instrumented at all */
+#define noinstr \
+ noinline notrace __attribute((__section__(".noinstr.text")))
+
/*
* it doesn't make sense on ARM (currently the only user of __naked)
* to trace naked functions because then mcount is called without
*/
#define CON_PRINTBUFFER (1)
-#define CON_CONSDEV (2) /* Last on the command line */
+#define CON_CONSDEV (2) /* Preferred console, /dev/console */
#define CON_ENABLED (4)
#define CON_BOOT (8)
#define CON_ANYTIME (16) /* Safe to call when cpu is offline */
u8 aer_info[96];
};
+/* Firmware Error Record Reference, UEFI v2.7 sec N.2.10 */
+struct cper_sec_fw_err_rec_ref {
+ u8 record_type;
+ u8 revision;
+ u8 reserved[6];
+ u64 record_identifier;
+ guid_t record_identifier_guid;
+};
+
/* Reset to default packing */
#pragma pack()
static inline void put_online_cpus(void) { cpus_read_unlock(); }
#ifdef CONFIG_PM_SLEEP_SMP
-int __freeze_secondary_cpus(int primary, bool suspend);
-static inline int freeze_secondary_cpus(int primary)
-{
- return __freeze_secondary_cpus(primary, true);
-}
-
-static inline int disable_nonboot_cpus(void)
-{
- return __freeze_secondary_cpus(0, false);
-}
-
-void enable_nonboot_cpus(void);
+extern int freeze_secondary_cpus(int primary);
+extern void thaw_secondary_cpus(void);
static inline int suspend_disable_secondary_cpus(void)
{
}
static inline void suspend_enable_secondary_cpus(void)
{
- return enable_nonboot_cpus();
+ return thaw_secondary_cpus();
}
#else /* !CONFIG_PM_SLEEP_SMP */
-static inline int disable_nonboot_cpus(void) { return 0; }
-static inline void enable_nonboot_cpus(void) {}
+static inline void thaw_secondary_cpus(void) {}
static inline int suspend_disable_secondary_cpus(void) { return 0; }
static inline void suspend_enable_secondary_cpus(void) { }
#endif /* !CONFIG_PM_SLEEP_SMP */
static inline bool is_kdump_kernel(void) { return 0; }
#endif /* CONFIG_CRASH_DUMP */
-extern unsigned long saved_max_pfn;
-
/* Device Dump information to be filled by drivers */
struct vmcoredd_data {
char dump_name[VMCOREDD_MAX_NAME_BYTES]; /* Unique name of the dump */
+++ /dev/null
-/* SPDX-License-Identifier: GPL-2.0 */
-#ifndef __CRYPTOHASH_H
-#define __CRYPTOHASH_H
-
-#include <uapi/linux/types.h>
-
-#define SHA_DIGEST_WORDS 5
-#define SHA_MESSAGE_BYTES (512 /*bits*/ / 8)
-#define SHA_WORKSPACE_WORDS 16
-
-void sha_init(__u32 *buf);
-void sha_transform(__u32 *digest, const char *data, __u32 *W);
-
-#endif
/* SPDX-License-Identifier: GPL-2.0 */
#include <linux/fs.h>
-#include <linux/bpf-cgroup.h>
#define DEVCG_ACC_MKNOD 1
#define DEVCG_ACC_READ 2
#define DEVCG_DEV_CHAR 2
#define DEVCG_DEV_ALL 4 /* this represents all devices */
-#ifdef CONFIG_CGROUP_DEVICE
-int devcgroup_check_permission(short type, u32 major, u32 minor,
- short access);
-#else
-static inline int devcgroup_check_permission(short type, u32 major, u32 minor,
- short access)
-{ return 0; }
-#endif
#if defined(CONFIG_CGROUP_DEVICE) || defined(CONFIG_CGROUP_BPF)
+int devcgroup_check_permission(short type, u32 major, u32 minor,
+ short access);
static inline int devcgroup_inode_permission(struct inode *inode, int mask)
{
short type, access = 0;
}
#else
+static inline int devcgroup_check_permission(short type, u32 major, u32 minor,
+ short access)
+{ return 0; }
static inline int devcgroup_inode_permission(struct inode *inode, int mask)
{ return 0; }
static inline int devcgroup_inode_mknod(int mode, dev_t dev)
#define EFI_WRITE_PROTECTED ( 8 | (1UL << (BITS_PER_LONG-1)))
#define EFI_OUT_OF_RESOURCES ( 9 | (1UL << (BITS_PER_LONG-1)))
#define EFI_NOT_FOUND (14 | (1UL << (BITS_PER_LONG-1)))
+#define EFI_TIMEOUT (18 | (1UL << (BITS_PER_LONG-1)))
#define EFI_ABORTED (21 | (1UL << (BITS_PER_LONG-1)))
#define EFI_SECURITY_VIOLATION (26 | (1UL << (BITS_PER_LONG-1)))
typedef struct {
efi_guid_t guid;
- const char *name;
unsigned long *ptr;
+ const char name[16];
} efi_config_table_type_t;
#define EFI_SYSTEM_TABLE_SIGNATURE ((u64)0x5453595320494249ULL)
u32 tables;
} efi_system_table_32_t;
+typedef union efi_simple_text_input_protocol efi_simple_text_input_protocol_t;
typedef union efi_simple_text_output_protocol efi_simple_text_output_protocol_t;
typedef union {
unsigned long fw_vendor; /* physical addr of CHAR16 vendor string */
u32 fw_revision;
unsigned long con_in_handle;
- unsigned long con_in;
+ efi_simple_text_input_protocol_t *con_in;
unsigned long con_out_handle;
efi_simple_text_output_protocol_t *con_out;
unsigned long stderr_handle;
void __init efi_arch_mem_reserve(phys_addr_t addr, u64 size);
+char *efi_systab_show_arch(char *str);
+
#endif /* _LINUX_EFI_H */
* Directory entry modification events - reported only to directory
* where entry is modified and not to a watching parent.
*/
-#define FANOTIFY_DIRENT_EVENTS (FAN_MOVE | FAN_CREATE | FAN_DELETE | \
- FAN_DIR_MODIFY)
+#define FANOTIFY_DIRENT_EVENTS (FAN_MOVE | FAN_CREATE | FAN_DELETE)
/* Events that can only be reported with data type FSNOTIFY_EVENT_INODE */
#define FANOTIFY_INODE_EVENTS (FANOTIFY_DIRENT_EVENTS | \
#include <linux/workqueue.h>
#include <linux/sched.h>
#include <linux/capability.h>
-#include <linux/cryptohash.h>
#include <linux/set_memory.h>
#include <linux/kallsyms.h>
#include <linux/if_vlan.h>
#include <linux/vmalloc.h>
+#include <crypto/sha.h>
#include <net/sch_generic.h>
static inline u32 bpf_prog_tag_scratch_size(const struct bpf_prog *prog)
{
return round_up(bpf_prog_insn_size(prog) +
- sizeof(__be64) + 1, SHA_MESSAGE_BYTES);
+ sizeof(__be64) + 1, SHA1_BLOCK_SIZE);
}
static inline unsigned int bpf_prog_size(unsigned int proglen)
static void __used __section(.discard.func_stack_frame_non_standard) \
*__func_stack_frame_non_standard_##func = func
+/*
+ * This macro indicates that the following intra-function call is valid.
+ * Any non-annotated intra-function call will cause objtool to issue a warning.
+ */
+#define ANNOTATE_INTRA_FUNCTION_CALL \
+ 999: \
+ .pushsection .discard.intra_function_calls; \
+ .long 999b; \
+ .popsection;
+
#else /* !CONFIG_STACK_VALIDATION */
#define STACK_FRAME_NON_STANDARD(func)
+#define ANNOTATE_INTRA_FUNCTION_CALL
#endif /* CONFIG_STACK_VALIDATION */
#include <linux/fs.h>
#include <linux/mm.h>
+#include <linux/parser.h>
#include <linux/slab.h>
#include <uapi/linux/fscrypt.h>
#define FS_CRYPTO_BLOCK_SIZE 16
+union fscrypt_context;
struct fscrypt_info;
+struct seq_file;
struct fscrypt_str {
unsigned char *name;
struct fscrypt_operations {
unsigned int flags;
const char *key_prefix;
- int (*get_context)(struct inode *, void *, size_t);
- int (*set_context)(struct inode *, const void *, size_t, void *);
- bool (*dummy_context)(struct inode *);
- bool (*empty_dir)(struct inode *);
+ int (*get_context)(struct inode *inode, void *ctx, size_t len);
+ int (*set_context)(struct inode *inode, const void *ctx, size_t len,
+ void *fs_data);
+ const union fscrypt_context *(*get_dummy_context)(
+ struct super_block *sb);
+ bool (*empty_dir)(struct inode *inode);
unsigned int max_namelen;
bool (*has_stable_inodes)(struct super_block *sb);
void (*get_ino_and_lblk_bits)(struct super_block *sb,
/**
* fscrypt_needs_contents_encryption() - check whether an inode needs
* contents encryption
+ * @inode: the inode to check
*
* Return: %true iff the inode is an encrypted regular file and the kernel was
* built with fscrypt support.
return IS_ENCRYPTED(inode) && S_ISREG(inode->i_mode);
}
-static inline bool fscrypt_dummy_context_enabled(struct inode *inode)
+static inline const union fscrypt_context *
+fscrypt_get_dummy_context(struct super_block *sb)
{
- return inode->i_sb->s_cop->dummy_context &&
- inode->i_sb->s_cop->dummy_context(inode);
+ if (!sb->s_cop->get_dummy_context)
+ return NULL;
+ return sb->s_cop->get_dummy_context(sb);
}
/*
}
/* crypto.c */
-extern void fscrypt_enqueue_decrypt_work(struct work_struct *);
-
-extern struct page *fscrypt_encrypt_pagecache_blocks(struct page *page,
- unsigned int len,
- unsigned int offs,
- gfp_t gfp_flags);
-extern int fscrypt_encrypt_block_inplace(const struct inode *inode,
- struct page *page, unsigned int len,
- unsigned int offs, u64 lblk_num,
- gfp_t gfp_flags);
-
-extern int fscrypt_decrypt_pagecache_blocks(struct page *page, unsigned int len,
- unsigned int offs);
-extern int fscrypt_decrypt_block_inplace(const struct inode *inode,
- struct page *page, unsigned int len,
- unsigned int offs, u64 lblk_num);
+void fscrypt_enqueue_decrypt_work(struct work_struct *);
+
+struct page *fscrypt_encrypt_pagecache_blocks(struct page *page,
+ unsigned int len,
+ unsigned int offs,
+ gfp_t gfp_flags);
+int fscrypt_encrypt_block_inplace(const struct inode *inode, struct page *page,
+ unsigned int len, unsigned int offs,
+ u64 lblk_num, gfp_t gfp_flags);
+
+int fscrypt_decrypt_pagecache_blocks(struct page *page, unsigned int len,
+ unsigned int offs);
+int fscrypt_decrypt_block_inplace(const struct inode *inode, struct page *page,
+ unsigned int len, unsigned int offs,
+ u64 lblk_num);
static inline bool fscrypt_is_bounce_page(struct page *page)
{
return (struct page *)page_private(bounce_page);
}
-extern void fscrypt_free_bounce_page(struct page *bounce_page);
+void fscrypt_free_bounce_page(struct page *bounce_page);
/* policy.c */
-extern int fscrypt_ioctl_set_policy(struct file *, const void __user *);
-extern int fscrypt_ioctl_get_policy(struct file *, void __user *);
-extern int fscrypt_ioctl_get_policy_ex(struct file *, void __user *);
-extern int fscrypt_ioctl_get_nonce(struct file *filp, void __user *arg);
-extern int fscrypt_has_permitted_context(struct inode *, struct inode *);
-extern int fscrypt_inherit_context(struct inode *, struct inode *,
- void *, bool);
+int fscrypt_ioctl_set_policy(struct file *filp, const void __user *arg);
+int fscrypt_ioctl_get_policy(struct file *filp, void __user *arg);
+int fscrypt_ioctl_get_policy_ex(struct file *filp, void __user *arg);
+int fscrypt_ioctl_get_nonce(struct file *filp, void __user *arg);
+int fscrypt_has_permitted_context(struct inode *parent, struct inode *child);
+int fscrypt_inherit_context(struct inode *parent, struct inode *child,
+ void *fs_data, bool preload);
+
+struct fscrypt_dummy_context {
+ const union fscrypt_context *ctx;
+};
+
+int fscrypt_set_test_dummy_encryption(struct super_block *sb,
+ const substring_t *arg,
+ struct fscrypt_dummy_context *dummy_ctx);
+void fscrypt_show_test_dummy_encryption(struct seq_file *seq, char sep,
+ struct super_block *sb);
+static inline void
+fscrypt_free_dummy_context(struct fscrypt_dummy_context *dummy_ctx)
+{
+ kfree(dummy_ctx->ctx);
+ dummy_ctx->ctx = NULL;
+}
+
/* keyring.c */
-extern void fscrypt_sb_free(struct super_block *sb);
-extern int fscrypt_ioctl_add_key(struct file *filp, void __user *arg);
-extern int fscrypt_ioctl_remove_key(struct file *filp, void __user *arg);
-extern int fscrypt_ioctl_remove_key_all_users(struct file *filp,
- void __user *arg);
-extern int fscrypt_ioctl_get_key_status(struct file *filp, void __user *arg);
+void fscrypt_sb_free(struct super_block *sb);
+int fscrypt_ioctl_add_key(struct file *filp, void __user *arg);
+int fscrypt_ioctl_remove_key(struct file *filp, void __user *arg);
+int fscrypt_ioctl_remove_key_all_users(struct file *filp, void __user *arg);
+int fscrypt_ioctl_get_key_status(struct file *filp, void __user *arg);
/* keysetup.c */
-extern int fscrypt_get_encryption_info(struct inode *);
-extern void fscrypt_put_encryption_info(struct inode *);
-extern void fscrypt_free_inode(struct inode *);
-extern int fscrypt_drop_inode(struct inode *inode);
+int fscrypt_get_encryption_info(struct inode *inode);
+void fscrypt_put_encryption_info(struct inode *inode);
+void fscrypt_free_inode(struct inode *inode);
+int fscrypt_drop_inode(struct inode *inode);
/* fname.c */
-extern int fscrypt_setup_filename(struct inode *, const struct qstr *,
- int lookup, struct fscrypt_name *);
+int fscrypt_setup_filename(struct inode *inode, const struct qstr *iname,
+ int lookup, struct fscrypt_name *fname);
static inline void fscrypt_free_filename(struct fscrypt_name *fname)
{
kfree(fname->crypto_buf.name);
}
-extern int fscrypt_fname_alloc_buffer(const struct inode *, u32,
- struct fscrypt_str *);
-extern void fscrypt_fname_free_buffer(struct fscrypt_str *);
-extern int fscrypt_fname_disk_to_usr(const struct inode *inode,
- u32 hash, u32 minor_hash,
- const struct fscrypt_str *iname,
- struct fscrypt_str *oname);
-extern bool fscrypt_match_name(const struct fscrypt_name *fname,
- const u8 *de_name, u32 de_name_len);
-extern u64 fscrypt_fname_siphash(const struct inode *dir,
- const struct qstr *name);
+int fscrypt_fname_alloc_buffer(const struct inode *inode, u32 max_encrypted_len,
+ struct fscrypt_str *crypto_str);
+void fscrypt_fname_free_buffer(struct fscrypt_str *crypto_str);
+int fscrypt_fname_disk_to_usr(const struct inode *inode,
+ u32 hash, u32 minor_hash,
+ const struct fscrypt_str *iname,
+ struct fscrypt_str *oname);
+bool fscrypt_match_name(const struct fscrypt_name *fname,
+ const u8 *de_name, u32 de_name_len);
+u64 fscrypt_fname_siphash(const struct inode *dir, const struct qstr *name);
/* bio.c */
-extern void fscrypt_decrypt_bio(struct bio *);
-extern int fscrypt_zeroout_range(const struct inode *, pgoff_t, sector_t,
- unsigned int);
+void fscrypt_decrypt_bio(struct bio *bio);
+int fscrypt_zeroout_range(const struct inode *inode, pgoff_t lblk,
+ sector_t pblk, unsigned int len);
/* hooks.c */
-extern int fscrypt_file_open(struct inode *inode, struct file *filp);
-extern int __fscrypt_prepare_link(struct inode *inode, struct inode *dir,
- struct dentry *dentry);
-extern int __fscrypt_prepare_rename(struct inode *old_dir,
- struct dentry *old_dentry,
- struct inode *new_dir,
- struct dentry *new_dentry,
- unsigned int flags);
-extern int __fscrypt_prepare_lookup(struct inode *dir, struct dentry *dentry,
- struct fscrypt_name *fname);
-extern int fscrypt_prepare_setflags(struct inode *inode,
- unsigned int oldflags, unsigned int flags);
-extern int __fscrypt_prepare_symlink(struct inode *dir, unsigned int len,
- unsigned int max_len,
- struct fscrypt_str *disk_link);
-extern int __fscrypt_encrypt_symlink(struct inode *inode, const char *target,
- unsigned int len,
- struct fscrypt_str *disk_link);
-extern const char *fscrypt_get_symlink(struct inode *inode, const void *caddr,
- unsigned int max_size,
- struct delayed_call *done);
+int fscrypt_file_open(struct inode *inode, struct file *filp);
+int __fscrypt_prepare_link(struct inode *inode, struct inode *dir,
+ struct dentry *dentry);
+int __fscrypt_prepare_rename(struct inode *old_dir, struct dentry *old_dentry,
+ struct inode *new_dir, struct dentry *new_dentry,
+ unsigned int flags);
+int __fscrypt_prepare_lookup(struct inode *dir, struct dentry *dentry,
+ struct fscrypt_name *fname);
+int fscrypt_prepare_setflags(struct inode *inode,
+ unsigned int oldflags, unsigned int flags);
+int __fscrypt_prepare_symlink(struct inode *dir, unsigned int len,
+ unsigned int max_len,
+ struct fscrypt_str *disk_link);
+int __fscrypt_encrypt_symlink(struct inode *inode, const char *target,
+ unsigned int len, struct fscrypt_str *disk_link);
+const char *fscrypt_get_symlink(struct inode *inode, const void *caddr,
+ unsigned int max_size,
+ struct delayed_call *done);
static inline void fscrypt_set_ops(struct super_block *sb,
const struct fscrypt_operations *s_cop)
{
return false;
}
-static inline bool fscrypt_dummy_context_enabled(struct inode *inode)
+static inline const union fscrypt_context *
+fscrypt_get_dummy_context(struct super_block *sb)
{
- return false;
+ return NULL;
}
static inline void fscrypt_handle_d_move(struct dentry *dentry)
return -EOPNOTSUPP;
}
+struct fscrypt_dummy_context {
+};
+
+static inline void fscrypt_show_test_dummy_encryption(struct seq_file *seq,
+ char sep,
+ struct super_block *sb)
+{
+}
+
+static inline void
+fscrypt_free_dummy_context(struct fscrypt_dummy_context *dummy_ctx)
+{
+}
+
/* keyring.c */
static inline void fscrypt_sb_free(struct super_block *sb)
{
#endif /* !CONFIG_FS_ENCRYPTION */
/**
- * fscrypt_require_key - require an inode's encryption key
+ * fscrypt_require_key() - require an inode's encryption key
* @inode: the inode we need the key for
*
* If the inode is encrypted, set up its encryption key if not already done.
}
/**
- * fscrypt_prepare_link - prepare to link an inode into a possibly-encrypted directory
+ * fscrypt_prepare_link() - prepare to link an inode into a possibly-encrypted
+ * directory
* @old_dentry: an existing dentry for the inode being linked
* @dir: the target directory
* @dentry: negative dentry for the target filename
}
/**
- * fscrypt_prepare_rename - prepare for a rename between possibly-encrypted directories
+ * fscrypt_prepare_rename() - prepare for a rename between possibly-encrypted
+ * directories
* @old_dir: source directory
* @old_dentry: dentry for source file
* @new_dir: target directory
}
/**
- * fscrypt_prepare_lookup - prepare to lookup a name in a possibly-encrypted directory
+ * fscrypt_prepare_lookup() - prepare to lookup a name in a possibly-encrypted
+ * directory
* @dir: directory being searched
* @dentry: filename being looked up
* @fname: (output) the name to use to search the on-disk directory
}
/**
- * fscrypt_prepare_setattr - prepare to change a possibly-encrypted inode's attributes
+ * fscrypt_prepare_setattr() - prepare to change a possibly-encrypted inode's
+ * attributes
* @dentry: dentry through which the inode is being changed
* @attr: attributes to change
*
}
/**
- * fscrypt_prepare_symlink - prepare to create a possibly-encrypted symlink
+ * fscrypt_prepare_symlink() - prepare to create a possibly-encrypted symlink
* @dir: directory in which the symlink is being created
* @target: plaintext symlink target
* @len: length of @target excluding null terminator
unsigned int max_len,
struct fscrypt_str *disk_link)
{
- if (IS_ENCRYPTED(dir) || fscrypt_dummy_context_enabled(dir))
+ if (IS_ENCRYPTED(dir) || fscrypt_get_dummy_context(dir->i_sb) != NULL)
return __fscrypt_prepare_symlink(dir, len, max_len, disk_link);
disk_link->name = (unsigned char *)target;
}
/**
- * fscrypt_encrypt_symlink - encrypt the symlink target if needed
+ * fscrypt_encrypt_symlink() - encrypt the symlink target if needed
* @inode: symlink inode
* @target: plaintext symlink target
* @len: length of @target excluding null terminator
/* enable.c */
-extern int fsverity_ioctl_enable(struct file *filp, const void __user *arg);
+int fsverity_ioctl_enable(struct file *filp, const void __user *arg);
/* measure.c */
-extern int fsverity_ioctl_measure(struct file *filp, void __user *arg);
+int fsverity_ioctl_measure(struct file *filp, void __user *arg);
/* open.c */
-extern int fsverity_file_open(struct inode *inode, struct file *filp);
-extern int fsverity_prepare_setattr(struct dentry *dentry, struct iattr *attr);
-extern void fsverity_cleanup_inode(struct inode *inode);
+int fsverity_file_open(struct inode *inode, struct file *filp);
+int fsverity_prepare_setattr(struct dentry *dentry, struct iattr *attr);
+void fsverity_cleanup_inode(struct inode *inode);
/* verify.c */
-extern bool fsverity_verify_page(struct page *page);
-extern void fsverity_verify_bio(struct bio *bio);
-extern void fsverity_enqueue_verify_work(struct work_struct *work);
+bool fsverity_verify_page(struct page *page);
+void fsverity_verify_bio(struct bio *bio);
+void fsverity_enqueue_verify_work(struct work_struct *work);
#else /* !CONFIG_FS_VERITY */
/**
* fsverity_active() - do reads from the inode need to go through fs-verity?
+ * @inode: inode to check
*
* This checks whether ->i_verity_info has been set.
*
* be verified or not. Don't use IS_VERITY() for this purpose; it's subject to
* a race condition where the file is being read concurrently with
* FS_IOC_ENABLE_VERITY completing. (S_VERITY is set before ->i_verity_info.)
+ *
+ * Return: true if reads need to go through fs-verity, otherwise false
*/
static inline bool fsverity_active(const struct inode *inode)
{
#endif
};
+extern struct ftrace_ops __rcu *ftrace_ops_list;
+extern struct ftrace_ops ftrace_list_end;
+
+/*
+ * Traverse the ftrace_global_list, invoking all entries. The reason that we
+ * can use rcu_dereference_raw_check() is that elements removed from this list
+ * are simply leaked, so there is no need to interact with a grace-period
+ * mechanism. The rcu_dereference_raw_check() calls are needed to handle
+ * concurrent insertions into the ftrace_global_list.
+ *
+ * Silly Alpha and silly pointer-speculation compiler optimizations!
+ */
+#define do_for_each_ftrace_op(op, list) \
+ op = rcu_dereference_raw_check(list); \
+ do
+
+/*
+ * Optimized for just a single item in the list (as that is the normal case).
+ */
+#define while_for_each_ftrace_op(op) \
+ while (likely(op = rcu_dereference_raw_check((op)->next)) && \
+ unlikely((op) != &ftrace_list_end))
+
/*
* Type of the current tracing.
*/
#ifndef _LINUX_FTRACE_IRQ_H
#define _LINUX_FTRACE_IRQ_H
-
-#ifdef CONFIG_FTRACE_NMI_ENTER
-extern void arch_ftrace_nmi_enter(void);
-extern void arch_ftrace_nmi_exit(void);
-#else
-static inline void arch_ftrace_nmi_enter(void) { }
-static inline void arch_ftrace_nmi_exit(void) { }
-#endif
-
#ifdef CONFIG_HWLAT_TRACER
extern bool trace_hwlat_callback_enabled;
extern void trace_hwlat_callback(bool enter);
if (trace_hwlat_callback_enabled)
trace_hwlat_callback(true);
#endif
- arch_ftrace_nmi_enter();
}
static inline void ftrace_nmi_exit(void)
{
- arch_ftrace_nmi_exit();
#ifdef CONFIG_HWLAT_TRACER
if (trace_hwlat_callback_enabled)
trace_hwlat_callback(false);
#ifndef LINUX_HARDIRQ_H
#define LINUX_HARDIRQ_H
+#include <linux/context_tracking_state.h>
#include <linux/preempt.h>
#include <linux/lockdep.h>
#include <linux/ftrace_irq.h>
#include <linux/vtime.h>
#include <asm/hardirq.h>
-
extern void synchronize_irq(unsigned int irq);
extern bool synchronize_hardirq(unsigned int irq);
-#if defined(CONFIG_TINY_RCU)
-
-static inline void rcu_nmi_enter(void)
-{
-}
+#ifdef CONFIG_NO_HZ_FULL
+void __rcu_irq_enter_check_tick(void);
+#else
+static inline void __rcu_irq_enter_check_tick(void) { }
+#endif
-static inline void rcu_nmi_exit(void)
+static __always_inline void rcu_irq_enter_check_tick(void)
{
+ if (context_tracking_enabled())
+ __rcu_irq_enter_check_tick();
}
-#else
-extern void rcu_nmi_enter(void);
-extern void rcu_nmi_exit(void);
-#endif
-
/*
* It is safe to do non-atomic ops on ->hardirq_context,
* because NMI handlers may not preempt and the ops are
#define arch_nmi_exit() do { } while (0)
#endif
+#ifdef CONFIG_TINY_RCU
+static inline void rcu_nmi_enter(void) { }
+static inline void rcu_nmi_exit(void) { }
+#else
+extern void rcu_nmi_enter(void);
+extern void rcu_nmi_exit(void);
+#endif
+
+/*
+ * NMI vs Tracing
+ * --------------
+ *
+ * We must not land in a tracer until (or after) we've changed preempt_count
+ * such that in_nmi() becomes true. To that effect all NMI C entry points must
+ * be marked 'notrace' and call nmi_enter() as soon as possible.
+ */
+
+/*
+ * nmi_enter() can nest up to 15 times; see NMI_BITS.
+ */
#define nmi_enter() \
do { \
arch_nmi_enter(); \
printk_nmi_enter(); \
lockdep_off(); \
ftrace_nmi_enter(); \
- BUG_ON(in_nmi()); \
- preempt_count_add(NMI_OFFSET + HARDIRQ_OFFSET); \
+ BUG_ON(in_nmi() == NMI_MASK); \
+ __preempt_count_add(NMI_OFFSET + HARDIRQ_OFFSET); \
rcu_nmi_enter(); \
lockdep_hardirq_enter(); \
} while (0)
lockdep_hardirq_exit(); \
rcu_nmi_exit(); \
BUG_ON(!in_nmi()); \
- preempt_count_sub(NMI_OFFSET + HARDIRQ_OFFSET); \
+ __preempt_count_sub(NMI_OFFSET + HARDIRQ_OFFSET); \
ftrace_nmi_exit(); \
lockdep_on(); \
printk_nmi_exit(); \
HOST1X_CLASS_GR3D = 0x60,
};
+struct host1x;
struct host1x_client;
struct iommu_group;
+u64 host1x_get_dma_mask(struct host1x *host1x);
+
/**
* struct host1x_client_ops - host1x client operations
* @init: host1x client initialization code
void hwmon_device_unregister(struct device *dev);
void devm_hwmon_device_unregister(struct device *dev);
+int hwmon_notify_event(struct device *dev, enum hwmon_sensor_types type,
+ u32 attr, int channel);
+
/**
* hwmon_is_bad_char - Is the char invalid in a hwmon name
* @ch: the char to be considered
int num_adapters;
int max_adapters;
- struct i2c_adapter *adapter[0];
+ struct i2c_adapter *adapter[];
};
struct i2c_mux_core *i2c_mux_alloc(struct i2c_adapter *parent,
/*
* i2c.h - definitions for the Linux i2c bus interface
* Copyright (C) 1995-2000 Simon G. Vogl
- * Copyright (C) 2013-2019 Wolfram Sang <wsa@the-dreams.de>
+ * Copyright (C) 2013-2019 Wolfram Sang <wsa@kernel.org>
*
* With some changes from Kyösti Mälkki <kmalkki@cc.hut.fi> and
* Frodo Looijaard <frodol@dds.nl>
*/
static inline void idr_preload_end(void)
{
- preempt_enable();
+ local_unlock(&radix_tree_preloads.lock);
}
/**
}
/* HE Operation defines */
-#define IEEE80211_HE_OPERATION_DFLT_PE_DURATION_MASK 0x00000003
+#define IEEE80211_HE_OPERATION_DFLT_PE_DURATION_MASK 0x00000007
#define IEEE80211_HE_OPERATION_TWT_REQUIRED 0x00000008
#define IEEE80211_HE_OPERATION_RTS_THRESHOLD_MASK 0x00003ff0
#define IEEE80211_HE_OPERATION_RTS_THRESHOLD_OFFSET 4
/*
* public include for LM8333 keypad driver - same license as driver
- * Copyright (C) 2012 Wolfram Sang, Pengutronix <w.sang@pengutronix.de>
+ * Copyright (C) 2012 Wolfram Sang, Pengutronix <kernel@pengutronix.de>
*/
#ifndef _LM8333_H
KMSG_DUMP_PANIC,
KMSG_DUMP_OOPS,
KMSG_DUMP_EMERG,
- KMSG_DUMP_RESTART,
- KMSG_DUMP_HALT,
- KMSG_DUMP_POWEROFF,
+ KMSG_DUMP_SHUTDOWN,
+ KMSG_DUMP_MAX
};
/**
int kmsg_dump_register(struct kmsg_dumper *dumper);
int kmsg_dump_unregister(struct kmsg_dumper *dumper);
+
+const char *kmsg_dump_reason_str(enum kmsg_dump_reason reason);
#else
static inline void kmsg_dump(enum kmsg_dump_reason reason)
{
{
return -EINVAL;
}
+
+static inline const char *kmsg_dump_reason_str(enum kmsg_dump_reason reason)
+{
+ return "Disabled";
+}
#endif
#endif /* _LINUX_KMSG_DUMP_H */
void kvm_reload_remote_mmus(struct kvm *kvm);
bool kvm_make_vcpus_request_mask(struct kvm *kvm, unsigned int req,
+ struct kvm_vcpu *except,
unsigned long *vcpu_bitmap, cpumask_var_t tmp);
bool kvm_make_all_cpus_request(struct kvm *kvm, unsigned int req);
+bool kvm_make_all_cpus_request_except(struct kvm *kvm, unsigned int req,
+ struct kvm_vcpu *except);
bool kvm_make_cpus_request_mask(struct kvm *kvm, unsigned int req,
unsigned long *vcpu_bitmap);
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0 */
+/* Copyright (C) 2020 ROHM Semiconductors */
+
+#ifndef LINEAR_RANGE_H
+#define LINEAR_RANGE_H
+
+#include <linux/types.h>
+
+/**
+ * struct linear_range - table of selector - value pairs
+ *
+ * Define a lookup-table for range of values. Intended to help when looking
+ * for a register value matching certaing physical measure (like voltage).
+ * Usable when increment of one in register always results a constant increment
+ * of the physical measure (like voltage).
+ *
+ * @min: Lowest value in range
+ * @min_sel: Lowest selector for range
+ * @max_sel: Highest selector for range
+ * @step: Value step size
+ */
+struct linear_range {
+ unsigned int min;
+ unsigned int min_sel;
+ unsigned int max_sel;
+ unsigned int step;
+};
+
+unsigned int linear_range_values_in_range(const struct linear_range *r);
+unsigned int linear_range_values_in_range_array(const struct linear_range *r,
+ int ranges);
+unsigned int linear_range_get_max_value(const struct linear_range *r);
+
+int linear_range_get_value(const struct linear_range *r, unsigned int selector,
+ unsigned int *val);
+int linear_range_get_value_array(const struct linear_range *r, int ranges,
+ unsigned int selector, unsigned int *val);
+int linear_range_get_selector_low(const struct linear_range *r,
+ unsigned int val, unsigned int *selector,
+ bool *found);
+int linear_range_get_selector_high(const struct linear_range *r,
+ unsigned int val, unsigned int *selector,
+ bool *found);
+int linear_range_get_selector_low_array(const struct linear_range *r,
+ int ranges, unsigned int val,
+ unsigned int *selector, bool *found);
+
+#endif
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_LOCAL_LOCK_H
+#define _LINUX_LOCAL_LOCK_H
+
+#include <linux/local_lock_internal.h>
+
+/**
+ * local_lock_init - Runtime initialize a lock instance
+ */
+#define local_lock_init(lock) __local_lock_init(lock)
+
+/**
+ * local_lock - Acquire a per CPU local lock
+ * @lock: The lock variable
+ */
+#define local_lock(lock) __local_lock(lock)
+
+/**
+ * local_lock_irq - Acquire a per CPU local lock and disable interrupts
+ * @lock: The lock variable
+ */
+#define local_lock_irq(lock) __local_lock_irq(lock)
+
+/**
+ * local_lock_irqsave - Acquire a per CPU local lock, save and disable
+ * interrupts
+ * @lock: The lock variable
+ * @flags: Storage for interrupt flags
+ */
+#define local_lock_irqsave(lock, flags) \
+ __local_lock_irqsave(lock, flags)
+
+/**
+ * local_unlock - Release a per CPU local lock
+ * @lock: The lock variable
+ */
+#define local_unlock(lock) __local_unlock(lock)
+
+/**
+ * local_unlock_irq - Release a per CPU local lock and enable interrupts
+ * @lock: The lock variable
+ */
+#define local_unlock_irq(lock) __local_unlock_irq(lock)
+
+/**
+ * local_unlock_irqrestore - Release a per CPU local lock and restore
+ * interrupt flags
+ * @lock: The lock variable
+ * @flags: Interrupt flags to restore
+ */
+#define local_unlock_irqrestore(lock, flags) \
+ __local_unlock_irqrestore(lock, flags)
+
+#endif
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _LINUX_LOCAL_LOCK_H
+# error "Do not include directly, include linux/local_lock.h"
+#endif
+
+#include <linux/percpu-defs.h>
+#include <linux/lockdep.h>
+
+typedef struct {
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+ struct lockdep_map dep_map;
+ struct task_struct *owner;
+#endif
+} local_lock_t;
+
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+# define LL_DEP_MAP_INIT(lockname) \
+ .dep_map = { \
+ .name = #lockname, \
+ .wait_type_inner = LD_WAIT_CONFIG, \
+ }
+#else
+# define LL_DEP_MAP_INIT(lockname)
+#endif
+
+#define INIT_LOCAL_LOCK(lockname) { LL_DEP_MAP_INIT(lockname) }
+
+#define __local_lock_init(lock) \
+do { \
+ static struct lock_class_key __key; \
+ \
+ debug_check_no_locks_freed((void *)lock, sizeof(*lock));\
+ lockdep_init_map_wait(&(lock)->dep_map, #lock, &__key, 0, LD_WAIT_CONFIG);\
+} while (0)
+
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+static inline void local_lock_acquire(local_lock_t *l)
+{
+ lock_map_acquire(&l->dep_map);
+ DEBUG_LOCKS_WARN_ON(l->owner);
+ l->owner = current;
+}
+
+static inline void local_lock_release(local_lock_t *l)
+{
+ DEBUG_LOCKS_WARN_ON(l->owner != current);
+ l->owner = NULL;
+ lock_map_release(&l->dep_map);
+}
+
+#else /* CONFIG_DEBUG_LOCK_ALLOC */
+static inline void local_lock_acquire(local_lock_t *l) { }
+static inline void local_lock_release(local_lock_t *l) { }
+#endif /* !CONFIG_DEBUG_LOCK_ALLOC */
+
+#define __local_lock(lock) \
+ do { \
+ preempt_disable(); \
+ local_lock_acquire(this_cpu_ptr(lock)); \
+ } while (0)
+
+#define __local_lock_irq(lock) \
+ do { \
+ local_irq_disable(); \
+ local_lock_acquire(this_cpu_ptr(lock)); \
+ } while (0)
+
+#define __local_lock_irqsave(lock, flags) \
+ do { \
+ local_irq_save(flags); \
+ local_lock_acquire(this_cpu_ptr(lock)); \
+ } while (0)
+
+#define __local_unlock(lock) \
+ do { \
+ local_lock_release(this_cpu_ptr(lock)); \
+ preempt_enable(); \
+ } while (0)
+
+#define __local_unlock_irq(lock) \
+ do { \
+ local_lock_release(this_cpu_ptr(lock)); \
+ local_irq_enable(); \
+ } while (0)
+
+#define __local_unlock_irqrestore(lock, flags) \
+ do { \
+ local_lock_release(this_cpu_ptr(lock)); \
+ local_irq_restore(flags); \
+ } while (0)
extern void lockdep_init_task(struct task_struct *task);
-extern void lockdep_off(void);
-extern void lockdep_on(void);
+/*
+ * Split the recrursion counter in two to readily detect 'off' vs recursion.
+ */
+#define LOCKDEP_RECURSION_BITS 16
+#define LOCKDEP_OFF (1U << LOCKDEP_RECURSION_BITS)
+#define LOCKDEP_RECURSION_MASK (LOCKDEP_OFF - 1)
+
+/*
+ * lockdep_{off,on}() are macros to avoid tracing and kprobes; not inlines due
+ * to header dependencies.
+ */
+
+#define lockdep_off() \
+do { \
+ current->lockdep_recursion += LOCKDEP_OFF; \
+} while (0)
+
+#define lockdep_on() \
+do { \
+ current->lockdep_recursion -= LOCKDEP_OFF; \
+} while (0)
extern void lockdep_register_key(struct lock_class_key *key);
extern void lockdep_unregister_key(struct lock_class_key *key);
char **value)
LSM_HOOK(int, -EINVAL, setprocattr, const char *name, void *value, size_t size)
LSM_HOOK(int, 0, ismaclabel, const char *name)
-LSM_HOOK(int, 0, secid_to_secctx, u32 secid, char **secdata,
+LSM_HOOK(int, -EOPNOTSUPP, secid_to_secctx, u32 secid, char **secdata,
u32 *seclen)
LSM_HOOK(int, 0, secctx_to_secid, const char *secdata, u32 seclen, u32 *secid)
LSM_HOOK(void, LSM_RET_VOID, release_secctx, char *secdata, u32 seclen)
atomic_long_inc(&memcg->memory_events[event]);
cgroup_file_notify(&memcg->events_file);
+ if (!cgroup_subsys_on_dfl(memory_cgrp_subsys))
+ break;
if (cgrp_dfl_root.flags & CGRP_ROOT_MEMORY_LOCAL_EVENTS)
break;
} while ((memcg = parent_mem_cgroup(memcg)) &&
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0
+ *
+ * Copyright (C) 2020 Gateworks Corporation
+ */
+#ifndef __LINUX_MFD_GSC_H_
+#define __LINUX_MFD_GSC_H_
+
+#include <linux/regmap.h>
+
+/* Device Addresses */
+#define GSC_MISC 0x20
+#define GSC_UPDATE 0x21
+#define GSC_GPIO 0x23
+#define GSC_HWMON 0x29
+#define GSC_EEPROM0 0x50
+#define GSC_EEPROM1 0x51
+#define GSC_EEPROM2 0x52
+#define GSC_EEPROM3 0x53
+#define GSC_RTC 0x68
+
+/* Register offsets */
+enum {
+ GSC_CTRL_0 = 0x00,
+ GSC_CTRL_1 = 0x01,
+ GSC_TIME = 0x02,
+ GSC_TIME_ADD = 0x06,
+ GSC_IRQ_STATUS = 0x0A,
+ GSC_IRQ_ENABLE = 0x0B,
+ GSC_FW_CRC = 0x0C,
+ GSC_FW_VER = 0x0E,
+ GSC_WP = 0x0F,
+};
+
+/* Bit definitions */
+#define GSC_CTRL_0_PB_HARD_RESET 0
+#define GSC_CTRL_0_PB_CLEAR_SECURE_KEY 1
+#define GSC_CTRL_0_PB_SOFT_POWER_DOWN 2
+#define GSC_CTRL_0_PB_BOOT_ALTERNATE 3
+#define GSC_CTRL_0_PERFORM_CRC 4
+#define GSC_CTRL_0_TAMPER_DETECT 5
+#define GSC_CTRL_0_SWITCH_HOLD 6
+
+#define GSC_CTRL_1_SLEEP_ENABLE 0
+#define GSC_CTRL_1_SLEEP_ACTIVATE 1
+#define GSC_CTRL_1_SLEEP_ADD 2
+#define GSC_CTRL_1_SLEEP_NOWAKEPB 3
+#define GSC_CTRL_1_WDT_TIME 4
+#define GSC_CTRL_1_WDT_ENABLE 5
+#define GSC_CTRL_1_SWITCH_BOOT_ENABLE 6
+#define GSC_CTRL_1_SWITCH_BOOT_CLEAR 7
+
+#define GSC_IRQ_PB 0
+#define GSC_IRQ_KEY_ERASED 1
+#define GSC_IRQ_EEPROM_WP 2
+#define GSC_IRQ_RESV 3
+#define GSC_IRQ_GPIO 4
+#define GSC_IRQ_TAMPER 5
+#define GSC_IRQ_WDT_TIMEOUT 6
+#define GSC_IRQ_SWITCH_HOLD 7
+
+int gsc_read(void *context, unsigned int reg, unsigned int *val);
+int gsc_write(void *context, unsigned int reg, unsigned int val);
+
+struct gsc_dev {
+ struct device *dev;
+
+ struct i2c_client *i2c; /* 0x20: interrupt controller, WDT */
+ struct i2c_client *i2c_hwmon; /* 0x29: hwmon, fan controller */
+
+ struct regmap *regmap;
+
+ unsigned int fwver;
+ unsigned short fwcrc;
+};
+
+#endif /* __LINUX_MFD_GSC_H_ */
MAX8998_ENVICHG,
MAX8998_ESAFEOUT1,
MAX8998_ESAFEOUT2,
+ MAX8998_CHARGER,
};
/**
MLX5_PORT_DOWN = 2,
};
+enum mlx5_cmdif_state {
+ MLX5_CMDIF_STATE_UNINITIALIZED,
+ MLX5_CMDIF_STATE_UP,
+ MLX5_CMDIF_STATE_DOWN,
+};
+
struct mlx5_cmd_first {
__be32 data[4];
};
struct mlx5_cmd {
struct mlx5_nb nb;
+ enum mlx5_cmdif_state state;
void *cmd_alloc_buf;
dma_addr_t alloc_dma;
int alloc_size;
struct semaphore sem;
struct semaphore pages_sem;
int mode;
+ u16 allowed_opcode;
struct mlx5_cmd_work_ent *ent_arr[MLX5_MAX_COMMANDS];
struct dma_pool *pool;
struct mlx5_cmd_debug dbg;
struct delayed_work cb_timeout_work;
void *context;
int idx;
+ struct completion handling;
struct completion done;
struct mlx5_cmd *cmd;
struct work_struct work;
return min_t(u32, last_frag_stride_idx - fbc->strides_offset, fbc->sz_m1);
}
+enum {
+ CMD_ALLOWED_OPCODE_ALL,
+};
+
int mlx5_cmd_init(struct mlx5_core_dev *dev);
void mlx5_cmd_cleanup(struct mlx5_core_dev *dev);
+void mlx5_cmd_set_state(struct mlx5_core_dev *dev,
+ enum mlx5_cmdif_state cmdif_state);
void mlx5_cmd_use_events(struct mlx5_core_dev *dev);
void mlx5_cmd_use_polling(struct mlx5_core_dev *dev);
+void mlx5_cmd_allowed_opcode(struct mlx5_core_dev *dev, u16 opcode);
struct mlx5_async_ctx {
struct mlx5_core_dev *dev;
# define VM_MAPPED_COPY VM_ARCH_1 /* T if mapped copy of data (nommu mmap) */
#endif
-#if defined(CONFIG_X86_INTEL_MPX)
-/* MPX specific bounds table or bounds directory */
-# define VM_MPX VM_HIGH_ARCH_4
-#else
-# define VM_MPX VM_NONE
-#endif
-
#ifndef VM_GROWSUP
# define VM_GROWSUP VM_NONE
#endif
extern void kvfree(const void *addr);
+/*
+ * Mapcount of compound page as a whole, does not include mapped sub-pages.
+ *
+ * Must be called only for compound pages or any their tail sub-pages.
+ */
static inline int compound_mapcount(struct page *page)
{
VM_BUG_ON_PAGE(!PageCompound(page), page);
int __page_mapcount(struct page *page);
+/*
+ * Mapcount of 0-order page; when compound sub-page, includes
+ * compound_mapcount().
+ *
+ * Result is undefined for pages which cannot be mapped into userspace.
+ * For example SLAB or special types of pages. See function page_has_type().
+ * They use this place in struct page differently.
+ */
static inline int page_mapcount(struct page *page)
{
- VM_BUG_ON_PAGE(PageSlab(page), page);
-
if (unlikely(PageCompound(page)))
return __page_mapcount(page);
return atomic_read(&page->_mapcount) + 1;
__u16 vendor;
__u16 family;
__u16 model;
+ __u16 steppings;
__u16 feature; /* bit index */
kernel_ulong_t driver_data;
};
#define X86_VENDOR_ANY 0xffff
#define X86_FAMILY_ANY 0
#define X86_MODEL_ANY 0
+#define X86_STEPPING_ANY 0
#define X86_FEATURE_ANY 0 /* Same as FPU, you can't test for that */
/*
void __percpu *percpu;
unsigned int percpu_size;
#endif
+ void *noinstr_text_start;
+ unsigned int noinstr_text_size;
#ifdef CONFIG_TRACEPOINTS
unsigned int num_tracepoints;
unsigned int num_ftrace_callsites;
unsigned long *ftrace_callsites;
#endif
+#ifdef CONFIG_KPROBES
+ void *kprobes_text_start;
+ unsigned int kprobes_text_size;
+ unsigned long *kprobe_blacklist;
+ unsigned int num_kprobe_blacklist;
+#endif
#ifdef CONFIG_LIVEPATCH
bool klp; /* Is this a livepatch module? */
#include <net/netfilter/nf_conntrack_expect.h>
#include <uapi/linux/netfilter/nf_conntrack_tuple_common.h>
-extern const char *const pptp_msg_name[];
+const char *pptp_msg_name(u_int16_t msg);
/* state of the control session */
enum pptp_ctrlsess_state {
/**
* struct padata_instance - The overall control structure.
*
- * @node: Used by CPU hotplug.
+ * @cpu_online_node: Linkage for CPU online callback.
+ * @cpu_dead_node: Linkage for CPU offline callback.
* @parallel_wq: The workqueue used for parallel work.
* @serial_wq: The workqueue used for serial work.
* @pslist: List of padata_shell objects attached to this instance.
* @flags: padata flags.
*/
struct padata_instance {
- struct hlist_node node;
+ struct hlist_node cpu_online_node;
+ struct hlist_node cpu_dead_node;
struct workqueue_struct *parallel_wq;
struct workqueue_struct *serial_wq;
struct list_head pslist;
* but could potentially be used anywhere else that simple option=arg
* parsing is required.
*/
-
+#ifndef _LINUX_PARSER_H
+#define _LINUX_PARSER_H
/* associates an integer enumerator with a pattern string. */
struct match_token {
bool match_wildcard(const char *pattern, const char *str);
size_t match_strlcpy(char *, const substring_t *, size_t);
char *match_strdup(const substring_t *);
+
+#endif /* _LINUX_PARSER_H */
struct perf_callchain_entry {
__u64 nr;
- __u64 ip[0]; /* /proc/sys/kernel/perf_event_max_stack */
+ __u64 ip[]; /* /proc/sys/kernel/perf_event_max_stack */
};
struct perf_callchain_entry_ctx {
struct perf_branch_stack {
__u64 nr;
__u64 hw_idx;
- struct perf_branch_entry entries[0];
+ struct perf_branch_entry entries[];
};
struct task_struct;
static inline int perf_allow_kernel(struct perf_event_attr *attr)
{
- if (sysctl_perf_event_paranoid > 1 && !capable(CAP_SYS_ADMIN))
+ if (sysctl_perf_event_paranoid > 1 && !perfmon_capable())
return -EACCES;
return security_perf_event_open(attr, PERF_SECURITY_KERNEL);
static inline int perf_allow_cpu(struct perf_event_attr *attr)
{
- if (sysctl_perf_event_paranoid > 0 && !capable(CAP_SYS_ADMIN))
+ if (sysctl_perf_event_paranoid > 0 && !perfmon_capable())
return -EACCES;
return security_perf_event_open(attr, PERF_SECURITY_CPU);
static inline int perf_allow_tracepoint(struct perf_event_attr *attr)
{
- if (sysctl_perf_event_paranoid > -1 && !capable(CAP_SYS_ADMIN))
+ if (sysctl_perf_event_paranoid > -1 && !perfmon_capable())
return -EPERM;
return security_perf_event_open(attr, PERF_SECURITY_TRACEPOINT);
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef _GSC_HWMON_H
+#define _GSC_HWMON_H
+
+enum gsc_hwmon_mode {
+ mode_temperature,
+ mode_voltage,
+ mode_voltage_raw,
+ mode_max,
+};
+
+/**
+ * struct gsc_hwmon_channel - configuration parameters
+ * @reg: I2C register offset
+ * @mode: channel mode
+ * @name: channel name
+ * @mvoffset: voltage offset
+ * @vdiv: voltage divider array (2 resistor values in milli-ohms)
+ */
+struct gsc_hwmon_channel {
+ unsigned int reg;
+ unsigned int mode;
+ const char *name;
+ unsigned int mvoffset;
+ unsigned int vdiv[2];
+};
+
+/**
+ * struct gsc_hwmon_platform_data - platform data for gsc_hwmon driver
+ * @channels: pointer to array of gsc_hwmon_channel structures
+ * describing channels
+ * @nchannels: number of elements in @channels array
+ * @vreference: voltage reference (mV)
+ * @resolution: ADC bit resolution
+ * @fan_base: register base for FAN controller
+ */
+struct gsc_hwmon_platform_data {
+ const struct gsc_hwmon_channel *channels;
+ int nchannels;
+ unsigned int resolution;
+ unsigned int vreference;
+ unsigned int fan_base;
+};
+#endif
* PREEMPT_MASK: 0x000000ff
* SOFTIRQ_MASK: 0x0000ff00
* HARDIRQ_MASK: 0x000f0000
- * NMI_MASK: 0x00100000
+ * NMI_MASK: 0x00f00000
* PREEMPT_NEED_RESCHED: 0x80000000
*/
#define PREEMPT_BITS 8
#define SOFTIRQ_BITS 8
#define HARDIRQ_BITS 4
-#define NMI_BITS 1
+#define NMI_BITS 4
#define PREEMPT_SHIFT 0
#define SOFTIRQ_SHIFT (PREEMPT_SHIFT + PREEMPT_BITS)
printk_once(KERN_NOTICE pr_fmt(fmt), ##__VA_ARGS__)
#define pr_info_once(fmt, ...) \
printk_once(KERN_INFO pr_fmt(fmt), ##__VA_ARGS__)
-#define pr_cont_once(fmt, ...) \
- printk_once(KERN_CONT pr_fmt(fmt), ##__VA_ARGS__)
+/* no pr_cont_once, don't do that... */
#if defined(DEBUG)
#define pr_devel_once(fmt, ...) \
u32 tmr_len; /* In */
} __packed;
+#define SEV_INIT_FLAGS_SEV_ES 0x01
+
/**
* struct sev_data_pek_csr - PEK_CSR command parameters
*
*
* @read_mutex: serializes @open, @read, @close, and @erase callbacks
* @flags: bitfield of frontends the backend can accept writes for
+ * @max_reason: Used when PSTORE_FLAGS_DMESG is set. Contains the
+ * kmsg_dump_reason enum value. KMSG_DUMP_UNDEF means
+ * "use existing kmsg_dump() filtering, based on the
+ * printk.always_kmsg_dump boot param" (which is either
+ * KMSG_DUMP_OOPS when false, or KMSG_DUMP_MAX when
+ * true); see printk.always_kmsg_dump for more details.
* @data: backend-private pointer passed back during callbacks
*
* Callbacks:
*/
struct pstore_info {
struct module *owner;
- char *name;
+ const char *name;
struct semaphore buf_lock;
char *buf;
struct mutex read_mutex;
int flags;
+ int max_reason;
void *data;
int (*open)(struct pstore_info *psi);
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __PSTORE_BLK_H_
+#define __PSTORE_BLK_H_
+
+#include <linux/types.h>
+#include <linux/pstore.h>
+#include <linux/pstore_zone.h>
+
+/**
+ * typedef pstore_blk_panic_write_op - panic write operation to block device
+ *
+ * @buf: the data to write
+ * @start_sect: start sector to block device
+ * @sects: sectors count on buf
+ *
+ * Return: On success, zero should be returned. Others excluding -ENOMSG
+ * mean error. -ENOMSG means to try next zone.
+ *
+ * Panic write to block device must be aligned to SECTOR_SIZE.
+ */
+typedef int (*pstore_blk_panic_write_op)(const char *buf, sector_t start_sect,
+ sector_t sects);
+
+/**
+ * struct pstore_blk_info - pstore/blk registration details
+ *
+ * @major: Which major device number to support with pstore/blk
+ * @flags: The supported PSTORE_FLAGS_* from linux/pstore.h.
+ * @panic_write:The write operation only used for the panic case.
+ * This can be NULL, but is recommended to avoid losing
+ * crash data if the kernel's IO path or work queues are
+ * broken during a panic.
+ * @devt: The dev_t that pstore/blk has attached to.
+ * @nr_sects: Number of sectors on @devt.
+ * @start_sect: Starting sector on @devt.
+ */
+struct pstore_blk_info {
+ unsigned int major;
+ unsigned int flags;
+ pstore_blk_panic_write_op panic_write;
+
+ /* Filled in by pstore/blk after registration. */
+ dev_t devt;
+ sector_t nr_sects;
+ sector_t start_sect;
+};
+
+int register_pstore_blk(struct pstore_blk_info *info);
+void unregister_pstore_blk(unsigned int major);
+
+/**
+ * struct pstore_device_info - back-end pstore/blk driver structure.
+ *
+ * @total_size: The total size in bytes pstore/blk can use. It must be greater
+ * than 4096 and be multiple of 4096.
+ * @flags: Refer to macro starting with PSTORE_FLAGS defined in
+ * linux/pstore.h. It means what front-ends this device support.
+ * Zero means all backends for compatible.
+ * @read: The general read operation. Both of the function parameters
+ * @size and @offset are relative value to bock device (not the
+ * whole disk).
+ * On success, the number of bytes should be returned, others
+ * means error.
+ * @write: The same as @read, but the following error number:
+ * -EBUSY means try to write again later.
+ * -ENOMSG means to try next zone.
+ * @erase: The general erase operation for device with special removing
+ * job. Both of the function parameters @size and @offset are
+ * relative value to storage.
+ * Return 0 on success and others on failure.
+ * @panic_write:The write operation only used for panic case. It's optional
+ * if you do not care panic log. The parameters are relative
+ * value to storage.
+ * On success, the number of bytes should be returned, others
+ * excluding -ENOMSG mean error. -ENOMSG means to try next zone.
+ */
+struct pstore_device_info {
+ unsigned long total_size;
+ unsigned int flags;
+ pstore_zone_read_op read;
+ pstore_zone_write_op write;
+ pstore_zone_erase_op erase;
+ pstore_zone_write_op panic_write;
+};
+
+int register_pstore_device(struct pstore_device_info *dev);
+void unregister_pstore_device(struct pstore_device_info *dev);
+
+/**
+ * struct pstore_blk_config - the pstore_blk backend configuration
+ *
+ * @device: Name of the desired block device
+ * @max_reason: Maximum kmsg dump reason to store to block device
+ * @kmsg_size: Total size of for kmsg dumps
+ * @pmsg_size: Total size of the pmsg storage area
+ * @console_size: Total size of the console storage area
+ * @ftrace_size: Total size for ftrace logging data (for all CPUs)
+ */
+struct pstore_blk_config {
+ char device[80];
+ enum kmsg_dump_reason max_reason;
+ unsigned long kmsg_size;
+ unsigned long pmsg_size;
+ unsigned long console_size;
+ unsigned long ftrace_size;
+};
+
+/**
+ * pstore_blk_get_config - get a copy of the pstore_blk backend configuration
+ *
+ * @info: The sturct pstore_blk_config to be filled in
+ *
+ * Failure returns negative error code, and success returns 0.
+ */
+int pstore_blk_get_config(struct pstore_blk_config *info);
+
+#endif
unsigned long console_size;
unsigned long ftrace_size;
unsigned long pmsg_size;
- int dump_oops;
+ int max_reason;
u32 flags;
struct persistent_ram_ecc_info ecc_info;
};
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#ifndef __PSTORE_ZONE_H_
+#define __PSTORE_ZONE_H_
+
+#include <linux/types.h>
+
+typedef ssize_t (*pstore_zone_read_op)(char *, size_t, loff_t);
+typedef ssize_t (*pstore_zone_write_op)(const char *, size_t, loff_t);
+typedef ssize_t (*pstore_zone_erase_op)(size_t, loff_t);
+/**
+ * struct pstore_zone_info - pstore/zone back-end driver structure
+ *
+ * @owner: Module which is responsible for this back-end driver.
+ * @name: Name of the back-end driver.
+ * @total_size: The total size in bytes pstore/zone can use. It must be greater
+ * than 4096 and be multiple of 4096.
+ * @kmsg_size: The size of oops/panic zone. Zero means disabled, otherwise,
+ * it must be multiple of SECTOR_SIZE(512 Bytes).
+ * @max_reason: Maximum kmsg dump reason to store.
+ * @pmsg_size: The size of pmsg zone which is the same as @kmsg_size.
+ * @console_size:The size of console zone which is the same as @kmsg_size.
+ * @ftrace_size:The size of ftrace zone which is the same as @kmsg_size.
+ * @read: The general read operation. Both of the function parameters
+ * @size and @offset are relative value to storage.
+ * On success, the number of bytes should be returned, others
+ * mean error.
+ * @write: The same as @read, but the following error number:
+ * -EBUSY means try to write again later.
+ * -ENOMSG means to try next zone.
+ * @erase: The general erase operation for device with special removing
+ * job. Both of the function parameters @size and @offset are
+ * relative value to storage.
+ * Return 0 on success and others on failure.
+ * @panic_write:The write operation only used for panic case. It's optional
+ * if you do not care panic log. The parameters are relative
+ * value to storage.
+ * On success, the number of bytes should be returned, others
+ * excluding -ENOMSG mean error. -ENOMSG means to try next zone.
+ */
+struct pstore_zone_info {
+ struct module *owner;
+ const char *name;
+
+ unsigned long total_size;
+ unsigned long kmsg_size;
+ int max_reason;
+ unsigned long pmsg_size;
+ unsigned long console_size;
+ unsigned long ftrace_size;
+ pstore_zone_read_op read;
+ pstore_zone_write_op write;
+ pstore_zone_erase_op erase;
+ pstore_zone_write_op panic_write;
+};
+
+extern int register_pstore_zone(struct pstore_zone_info *info);
+extern void unregister_pstore_zone(struct pstore_zone_info *info);
+
+#endif
* parameter func: the desired function to use.
* parameter chan: the function channel index to use.
*
- * @do_work: Request driver to perform auxiliary (periodic) operations
- * Driver should return delay of the next auxiliary work scheduling
- * time (>=0) or negative value in case further scheduling
- * is not required.
+ * @do_aux_work: Request driver to perform auxiliary (periodic) operations
+ * Driver should return delay of the next auxiliary work
+ * scheduling time (>=0) or negative value in case further
+ * scheduling is not required.
*
* Drivers should embed their ptp_clock_info within a private
* structure, obtaining a reference to it using container_of().
#include <linux/spinlock.h>
#include <linux/types.h>
#include <linux/xarray.h>
+#include <linux/local_lock.h>
/* Keep unconverted code working */
#define radix_tree_root xarray
#define radix_tree_node xa_node
+struct radix_tree_preload {
+ local_lock_t lock;
+ unsigned nr;
+ /* nodes->parent points to next preallocated node */
+ struct radix_tree_node *nodes;
+};
+DECLARE_PER_CPU(struct radix_tree_preload, radix_tree_preloads);
+
/*
* The bottom two bits of the slot determine how the remaining bits in the
* slot are interpreted:
static inline void radix_tree_preload_end(void)
{
- preempt_enable();
+ local_unlock(&radix_tree_preloads.lock);
}
void __rcu **idr_get_free(struct radix_tree_root *root,
* @pos: the type * to use as a loop cursor.
* @head: the head for your list.
* @member: the name of the list_head within the struct.
- * @cond...: optional lockdep expression if called from non-RCU protection.
+ * @cond: optional lockdep expression if called from non-RCU protection.
*
* This list-traversal primitive may safely run concurrently with
* the _rcu list-mutation primitives such as list_add_rcu()
* @pos: the type * to use as a loop cursor.
* @head: the head for your list.
* @member: the name of the hlist_node within the struct.
- * @cond...: optional lockdep expression if called from non-RCU protection.
+ * @cond: optional lockdep expression if called from non-RCU protection.
*
* This list-traversal primitive may safely run concurrently with
* the _rcu list-mutation primitives such as hlist_add_head_rcu()
/* Exported common interfaces */
void call_rcu(struct rcu_head *head, rcu_callback_t func);
void rcu_barrier_tasks(void);
+void rcu_barrier_tasks_rude(void);
void synchronize_rcu(void);
#ifdef CONFIG_PREEMPT_RCU
* Note a quasi-voluntary context switch for RCU-tasks's benefit.
* This is a macro rather than an inline function to avoid #include hell.
*/
-#ifdef CONFIG_TASKS_RCU
-#define rcu_tasks_qs(t) \
- do { \
- if (READ_ONCE((t)->rcu_tasks_holdout)) \
- WRITE_ONCE((t)->rcu_tasks_holdout, false); \
+#ifdef CONFIG_TASKS_RCU_GENERIC
+
+# ifdef CONFIG_TASKS_RCU
+# define rcu_tasks_classic_qs(t, preempt) \
+ do { \
+ if (!(preempt) && READ_ONCE((t)->rcu_tasks_holdout)) \
+ WRITE_ONCE((t)->rcu_tasks_holdout, false); \
} while (0)
-#define rcu_note_voluntary_context_switch(t) rcu_tasks_qs(t)
void call_rcu_tasks(struct rcu_head *head, rcu_callback_t func);
void synchronize_rcu_tasks(void);
+# else
+# define rcu_tasks_classic_qs(t, preempt) do { } while (0)
+# define call_rcu_tasks call_rcu
+# define synchronize_rcu_tasks synchronize_rcu
+# endif
+
+# ifdef CONFIG_TASKS_RCU_TRACE
+# define rcu_tasks_trace_qs(t) \
+ do { \
+ if (!likely(READ_ONCE((t)->trc_reader_checked)) && \
+ !unlikely(READ_ONCE((t)->trc_reader_nesting))) { \
+ smp_store_release(&(t)->trc_reader_checked, true); \
+ smp_mb(); /* Readers partitioned by store. */ \
+ } \
+ } while (0)
+# else
+# define rcu_tasks_trace_qs(t) do { } while (0)
+# endif
+
+#define rcu_tasks_qs(t, preempt) \
+do { \
+ rcu_tasks_classic_qs((t), (preempt)); \
+ rcu_tasks_trace_qs((t)); \
+} while (0)
+
+# ifdef CONFIG_TASKS_RUDE_RCU
+void call_rcu_tasks_rude(struct rcu_head *head, rcu_callback_t func);
+void synchronize_rcu_tasks_rude(void);
+# endif
+
+#define rcu_note_voluntary_context_switch(t) rcu_tasks_qs(t, false)
void exit_tasks_rcu_start(void);
void exit_tasks_rcu_finish(void);
-#else /* #ifdef CONFIG_TASKS_RCU */
-#define rcu_tasks_qs(t) do { } while (0)
+#else /* #ifdef CONFIG_TASKS_RCU_GENERIC */
+#define rcu_tasks_qs(t, preempt) do { } while (0)
#define rcu_note_voluntary_context_switch(t) do { } while (0)
#define call_rcu_tasks call_rcu
#define synchronize_rcu_tasks synchronize_rcu
static inline void exit_tasks_rcu_start(void) { }
static inline void exit_tasks_rcu_finish(void) { }
-#endif /* #else #ifdef CONFIG_TASKS_RCU */
+#endif /* #else #ifdef CONFIG_TASKS_RCU_GENERIC */
/**
* cond_resched_tasks_rcu_qs - Report potential quiescent states to RCU
*/
#define cond_resched_tasks_rcu_qs() \
do { \
- rcu_tasks_qs(current); \
+ rcu_tasks_qs(current, false); \
cond_resched(); \
} while (0)
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0+ */
+/*
+ * Read-Copy Update mechanism for mutual exclusion, adapted for tracing.
+ *
+ * Copyright (C) 2020 Paul E. McKenney.
+ */
+
+#ifndef __LINUX_RCUPDATE_TRACE_H
+#define __LINUX_RCUPDATE_TRACE_H
+
+#include <linux/sched.h>
+#include <linux/rcupdate.h>
+
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+
+extern struct lockdep_map rcu_trace_lock_map;
+
+static inline int rcu_read_lock_trace_held(void)
+{
+ return lock_is_held(&rcu_trace_lock_map);
+}
+
+#else /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
+
+static inline int rcu_read_lock_trace_held(void)
+{
+ return 1;
+}
+
+#endif /* #else #ifdef CONFIG_DEBUG_LOCK_ALLOC */
+
+#ifdef CONFIG_TASKS_TRACE_RCU
+
+void rcu_read_unlock_trace_special(struct task_struct *t, int nesting);
+
+/**
+ * rcu_read_lock_trace - mark beginning of RCU-trace read-side critical section
+ *
+ * When synchronize_rcu_trace() is invoked by one task, then that task
+ * is guaranteed to block until all other tasks exit their read-side
+ * critical sections. Similarly, if call_rcu_trace() is invoked on one
+ * task while other tasks are within RCU read-side critical sections,
+ * invocation of the corresponding RCU callback is deferred until after
+ * the all the other tasks exit their critical sections.
+ *
+ * For more details, please see the documentation for rcu_read_lock().
+ */
+static inline void rcu_read_lock_trace(void)
+{
+ struct task_struct *t = current;
+
+ WRITE_ONCE(t->trc_reader_nesting, READ_ONCE(t->trc_reader_nesting) + 1);
+ if (IS_ENABLED(CONFIG_TASKS_TRACE_RCU_READ_MB) &&
+ t->trc_reader_special.b.need_mb)
+ smp_mb(); // Pairs with update-side barriers
+ rcu_lock_acquire(&rcu_trace_lock_map);
+}
+
+/**
+ * rcu_read_unlock_trace - mark end of RCU-trace read-side critical section
+ *
+ * Pairs with a preceding call to rcu_read_lock_trace(), and nesting is
+ * allowed. Invoking a rcu_read_unlock_trace() when there is no matching
+ * rcu_read_lock_trace() is verboten, and will result in lockdep complaints.
+ *
+ * For more details, please see the documentation for rcu_read_unlock().
+ */
+static inline void rcu_read_unlock_trace(void)
+{
+ int nesting;
+ struct task_struct *t = current;
+
+ rcu_lock_release(&rcu_trace_lock_map);
+ nesting = READ_ONCE(t->trc_reader_nesting) - 1;
+ if (likely(!READ_ONCE(t->trc_reader_special.s)) || nesting) {
+ WRITE_ONCE(t->trc_reader_nesting, nesting);
+ return; // We assume shallow reader nesting.
+ }
+ rcu_read_unlock_trace_special(t, nesting);
+}
+
+void call_rcu_tasks_trace(struct rcu_head *rhp, rcu_callback_t func);
+void synchronize_rcu_tasks_trace(void);
+void rcu_barrier_tasks_trace(void);
+
+#endif /* #ifdef CONFIG_TASKS_TRACE_RCU */
+
+#endif /* __LINUX_RCUPDATE_TRACE_H */
#define wait_rcu_gp(...) _wait_rcu_gp(false, __VA_ARGS__)
+/**
+ * synchronize_rcu_mult - Wait concurrently for multiple grace periods
+ * @...: List of call_rcu() functions for different grace periods to wait on
+ *
+ * This macro waits concurrently for multiple types of RCU grace periods.
+ * For example, synchronize_rcu_mult(call_rcu, call_rcu_tasks) would wait
+ * on concurrent RCU and RCU-tasks grace periods. Waiting on a given SRCU
+ * domain requires you to write a wrapper function for that SRCU domain's
+ * call_srcu() function, with this wrapper supplying the pointer to the
+ * corresponding srcu_struct.
+ *
+ * The first argument tells Tiny RCU's _wait_rcu_gp() not to
+ * bother waiting for RCU. The reason for this is because anywhere
+ * synchronize_rcu_mult() can be called is automatically already a full
+ * grace period.
+ */
+#define synchronize_rcu_mult(...) \
+ _wait_rcu_gp(IS_ENABLED(CONFIG_TINY_RCU), __VA_ARGS__)
+
#endif /* _LINUX_SCHED_RCUPDATE_WAIT_H */
#define rcu_note_context_switch(preempt) \
do { \
rcu_qs(); \
- rcu_tasks_qs(current); \
+ rcu_tasks_qs(current, (preempt)); \
} while (0)
static inline int rcu_needs_cpu(u64 basemono, u64 *nextevt)
static inline void rcu_irq_exit_irqson(void) { }
static inline void rcu_irq_enter_irqson(void) { }
static inline void rcu_irq_exit(void) { }
+static inline void rcu_irq_exit_preempt(void) { }
+static inline void rcu_irq_exit_check_preempt(void) { }
static inline void exit_rcu(void) { }
static inline bool rcu_preempt_need_deferred_qs(struct task_struct *t)
{
static inline void rcu_end_inkernel_boot(void) { }
static inline bool rcu_inkernel_boot_has_ended(void) { return true; }
static inline bool rcu_is_watching(void) { return true; }
+static inline bool __rcu_is_watching(void) { return true; }
static inline void rcu_momentary_dyntick_idle(void) { }
static inline void kfree_rcu_scheduler_running(void) { }
+static inline bool rcu_gp_might_be_stalled(void) { return false; }
/* Avoid RCU read-side critical sections leaking across. */
static inline void rcu_all_qs(void) { barrier(); }
bool rcu_eqs_special_set(int cpu);
void rcu_momentary_dyntick_idle(void);
void kfree_rcu_scheduler_running(void);
+bool rcu_gp_might_be_stalled(void);
unsigned long get_state_synchronize_rcu(void);
void cond_synchronize_rcu(unsigned long oldstate);
void rcu_idle_exit(void);
void rcu_irq_enter(void);
void rcu_irq_exit(void);
+void rcu_irq_exit_preempt(void);
void rcu_irq_enter_irqson(void);
void rcu_irq_exit_irqson(void);
+#ifdef CONFIG_PROVE_RCU
+void rcu_irq_exit_check_preempt(void);
+#else
+static inline void rcu_irq_exit_check_preempt(void) { }
+#endif
+
void exit_rcu(void);
void rcu_scheduler_starting(void);
void rcu_end_inkernel_boot(void);
bool rcu_inkernel_boot_has_ended(void);
bool rcu_is_watching(void);
+bool __rcu_is_watching(void);
#ifndef CONFIG_PREEMPTION
void rcu_all_qs(void);
#endif
#include <linux/err.h>
#include <linux/bug.h>
#include <linux/lockdep.h>
+#include <linux/iopoll.h>
struct module;
struct clk;
struct device;
+struct device_node;
struct i2c_client;
struct i3c_device;
struct irq_domain;
unsigned int delay_us;
};
+#define REG_SEQ(_reg, _def, _delay_us) { \
+ .reg = _reg, \
+ .def = _def, \
+ .delay_us = _delay_us, \
+ }
+#define REG_SEQ0(_reg, _def) REG_SEQ(_reg, _def, 0)
+
#define regmap_update_bits(map, reg, mask, val) \
regmap_update_bits_base(map, reg, mask, val, NULL, false, false)
#define regmap_update_bits_async(map, reg, mask, val)\
*/
#define regmap_read_poll_timeout(map, addr, val, cond, sleep_us, timeout_us) \
({ \
- u64 __timeout_us = (timeout_us); \
- unsigned long __sleep_us = (sleep_us); \
- ktime_t __timeout = ktime_add_us(ktime_get(), __timeout_us); \
- int __ret; \
- might_sleep_if(__sleep_us); \
- for (;;) { \
- __ret = regmap_read((map), (addr), &(val)); \
- if (__ret) \
- break; \
- if (cond) \
- break; \
- if ((__timeout_us) && \
- ktime_compare(ktime_get(), __timeout) > 0) { \
- __ret = regmap_read((map), (addr), &(val)); \
- break; \
- } \
- if (__sleep_us) \
- usleep_range((__sleep_us >> 2) + 1, __sleep_us); \
- } \
- __ret ?: ((cond) ? 0 : -ETIMEDOUT); \
+ int __ret, __tmp; \
+ __tmp = read_poll_timeout(regmap_read, __ret, __ret || (cond), \
+ sleep_us, timeout_us, false, (map), (addr), &(val)); \
+ __ret ?: __tmp; \
})
/**
*/
#define regmap_field_read_poll_timeout(field, val, cond, sleep_us, timeout_us) \
({ \
- u64 __timeout_us = (timeout_us); \
- unsigned long __sleep_us = (sleep_us); \
- ktime_t timeout = ktime_add_us(ktime_get(), __timeout_us); \
- int pollret; \
- might_sleep_if(__sleep_us); \
- for (;;) { \
- pollret = regmap_field_read((field), &(val)); \
- if (pollret) \
- break; \
- if (cond) \
- break; \
- if (__timeout_us && ktime_compare(ktime_get(), timeout) > 0) { \
- pollret = regmap_field_read((field), &(val)); \
- break; \
- } \
- if (__sleep_us) \
- usleep_range((__sleep_us >> 2) + 1, __sleep_us); \
- } \
- pollret ?: ((cond) ? 0 : -ETIMEDOUT); \
+ int __ret, __tmp; \
+ __tmp = read_poll_timeout(regmap_field_read, __ret, __ret || (cond), \
+ sleep_us, timeout_us, false, (field), &(val)); \
+ __ret ?: __tmp; \
})
#ifdef CONFIG_REGMAP
const struct regmap_range *ranges,
unsigned int nranges);
+static inline int regmap_set_bits(struct regmap *map,
+ unsigned int reg, unsigned int bits)
+{
+ return regmap_update_bits_base(map, reg, bits, bits,
+ NULL, false, false);
+}
+
+static inline int regmap_clear_bits(struct regmap *map,
+ unsigned int reg, unsigned int bits)
+{
+ return regmap_update_bits_base(map, reg, bits, 0, NULL, false, false);
+}
+
+int regmap_test_bits(struct regmap *map, unsigned int reg, unsigned int bits);
+
/**
* struct reg_field - Description of an register field
*
.msb = _msb, \
}
+#define REG_FIELD_ID(_reg, _lsb, _msb, _size, _offset) { \
+ .reg = _reg, \
+ .lsb = _lsb, \
+ .msb = _msb, \
+ .id_size = _size, \
+ .id_offset = _offset, \
+ }
+
struct regmap_field *regmap_field_alloc(struct regmap *regmap,
struct reg_field reg_field);
void regmap_field_free(struct regmap_field *field);
int regmap_add_irq_chip(struct regmap *map, int irq, int irq_flags,
int irq_base, const struct regmap_irq_chip *chip,
struct regmap_irq_chip_data **data);
+int regmap_add_irq_chip_np(struct device_node *np, struct regmap *map, int irq,
+ int irq_flags, int irq_base,
+ const struct regmap_irq_chip *chip,
+ struct regmap_irq_chip_data **data);
void regmap_del_irq_chip(int irq, struct regmap_irq_chip_data *data);
int devm_regmap_add_irq_chip(struct device *dev, struct regmap *map, int irq,
int irq_flags, int irq_base,
const struct regmap_irq_chip *chip,
struct regmap_irq_chip_data **data);
+int devm_regmap_add_irq_chip_np(struct device *dev, struct device_node *np,
+ struct regmap *map, int irq, int irq_flags,
+ int irq_base,
+ const struct regmap_irq_chip *chip,
+ struct regmap_irq_chip_data **data);
void devm_regmap_del_irq_chip(struct device *dev, int irq,
struct regmap_irq_chip_data *data);
return -EINVAL;
}
+static inline int regmap_set_bits(struct regmap *map,
+ unsigned int reg, unsigned int bits)
+{
+ WARN_ONCE(1, "regmap API is disabled");
+ return -EINVAL;
+}
+
+static inline int regmap_clear_bits(struct regmap *map,
+ unsigned int reg, unsigned int bits)
+{
+ WARN_ONCE(1, "regmap API is disabled");
+ return -EINVAL;
+}
+
+static inline int regmap_test_bits(struct regmap *map,
+ unsigned int reg, unsigned int bits)
+{
+ WARN_ONCE(1, "regmap API is disabled");
+ return -EINVAL;
+}
+
static inline int regmap_field_update_bits_base(struct regmap_field *field,
unsigned int mask, unsigned int val,
bool *change, bool async, bool force)
int regulator_set_voltage_rdev(struct regulator_dev *rdev,
int min_uV, int max_uV,
suspend_state_t state);
+int regulator_do_balance_voltage(struct regulator_dev *rdev,
+ suspend_state_t state, bool skip_coupled);
#else
static inline int regulator_coupler_register(struct regulator_coupler *coupler)
{
{
return -EINVAL;
}
+static inline int regulator_do_balance_voltage(struct regulator_dev *rdev,
+ suspend_state_t state,
+ bool skip_coupled)
+{
+ return -EINVAL;
+}
#endif
#endif
#define __LINUX_REGULATOR_DRIVER_H_
#include <linux/device.h>
+#include <linux/linear_range.h>
#include <linux/notifier.h>
#include <linux/regulator/consumer.h>
#include <linux/ww_mutex.h>
REGULATOR_STATUS_UNDEFINED,
};
-/**
- * struct regulator_linear_range - specify linear voltage ranges
- *
- * Specify a range of voltages for regulator_map_linear_range() and
- * regulator_list_linear_range().
- *
- * @min_uV: Lowest voltage in range
- * @min_sel: Lowest selector for range
- * @max_sel: Highest selector for range
- * @uV_step: Step size
- */
-struct regulator_linear_range {
- unsigned int min_uV;
- unsigned int min_sel;
- unsigned int max_sel;
- unsigned int uV_step;
-};
-
-/* Initialize struct regulator_linear_range */
+/* Initialize struct linear_range for regulators */
#define REGULATOR_LINEAR_RANGE(_min_uV, _min_sel, _max_sel, _step_uV) \
{ \
- .min_uV = _min_uV, \
+ .min = _min_uV, \
.min_sel = _min_sel, \
.max_sel = _max_sel, \
- .uV_step = _step_uV, \
+ .step = _step_uV, \
}
/**
unsigned int ramp_delay;
int min_dropout_uV;
- const struct regulator_linear_range *linear_ranges;
+ const struct linear_range *linear_ranges;
const unsigned int *linear_range_selectors;
int n_linear_ranges;
u8 blocked;
u8 need_qs;
u8 exp_hint; /* Hint for performance. */
- u8 deferred_qs;
+ u8 need_mb; /* Readers need smp_mb(). */
} b; /* Bits. */
u32 s; /* Set of bits. */
};
struct list_head rcu_tasks_holdout_list;
#endif /* #ifdef CONFIG_TASKS_RCU */
+#ifdef CONFIG_TASKS_TRACE_RCU
+ int trc_reader_nesting;
+ int trc_ipi_to_cpu;
+ union rcu_special trc_reader_special;
+ bool trc_reader_checked;
+ struct list_head trc_holdout_list;
+#endif /* #ifdef CONFIG_TASKS_TRACE_RCU */
+
struct sched_info sched_info;
struct list_head tasks;
unsigned long prev_lowest_stack;
#endif
+#ifdef CONFIG_X86_MCE
+ u64 mce_addr;
+ u64 mce_status;
+ struct callback_head mce_kill_me;
+#endif
+
/*
* New fields for task_struct should be added above here, so that
* they are included in the randomized portion of task_struct.
dst->sg.data[which] = src->sg.data[which];
dst->sg.data[which].length = size;
dst->sg.size += size;
+ src->sg.size -= size;
src->sg.data[which].length -= size;
src->sg.data[which].offset += size;
}
*/
extern void arch_disable_smp_support(void);
-extern void arch_enable_nonboot_cpus_begin(void);
-extern void arch_enable_nonboot_cpus_end(void);
+extern void arch_thaw_secondary_cpus_begin(void);
+extern void arch_thaw_secondary_cpus_end(void);
void smp_setup_processor_id(void);
struct gss_ctx {
struct gss_api_mech *mech_type;
void *internal_ctx_id;
+ unsigned int slack, align;
};
#define GSS_C_NO_BUFFER ((struct xdr_netobj) 0)
u32 gss_unwrap(
struct gss_ctx *ctx_id,
int offset,
+ int len,
struct xdr_buf *inbuf);
u32 gss_delete_sec_context(
struct gss_ctx **ctx_id);
u32 (*gss_unwrap)(
struct gss_ctx *ctx_id,
int offset,
+ int len,
struct xdr_buf *buf);
void (*gss_delete_sec_context)(
void *internal_ctx_id);
u32 (*encrypt_v2) (struct krb5_ctx *kctx, u32 offset,
struct xdr_buf *buf,
struct page **pages); /* v2 encryption function */
- u32 (*decrypt_v2) (struct krb5_ctx *kctx, u32 offset,
+ u32 (*decrypt_v2) (struct krb5_ctx *kctx, u32 offset, u32 len,
struct xdr_buf *buf, u32 *headskip,
u32 *tailskip); /* v2 decryption function */
};
struct xdr_buf *outbuf, struct page **pages);
u32
-gss_unwrap_kerberos(struct gss_ctx *ctx_id, int offset,
+gss_unwrap_kerberos(struct gss_ctx *ctx_id, int offset, int len,
struct xdr_buf *buf);
struct page **pages);
u32
-gss_krb5_aes_decrypt(struct krb5_ctx *kctx, u32 offset,
+gss_krb5_aes_decrypt(struct krb5_ctx *kctx, u32 offset, u32 len,
struct xdr_buf *buf, u32 *plainoffset,
u32 *plainlen);
extern void xdr_shift_buf(struct xdr_buf *, size_t);
extern void xdr_buf_from_iov(struct kvec *, struct xdr_buf *);
extern int xdr_buf_subsegment(struct xdr_buf *, struct xdr_buf *, unsigned int, unsigned int);
+extern void xdr_buf_trim(struct xdr_buf *, unsigned int);
extern int read_bytes_from_xdr_buf(struct xdr_buf *, unsigned int, void *, unsigned int);
extern int write_bytes_to_xdr_buf(struct xdr_buf *, unsigned int, void *, unsigned int);
extern void mark_page_accessed(struct page *);
extern void lru_add_drain(void);
extern void lru_add_drain_cpu(int cpu);
+extern void lru_add_drain_cpu_zone(struct zone *zone);
extern void lru_add_drain_all(void);
extern void rotate_reclaimable_page(struct page *page);
extern void deactivate_file_page(struct page *page);
#define TBOOT_UUID {0xff, 0x8d, 0x3c, 0x66, 0xb3, 0xe8, 0x82, 0x4b, 0xbf,\
0xaa, 0x19, 0xea, 0x4d, 0x5, 0x7a, 0x8}
-extern struct tboot *tboot;
-
-static inline int tboot_enabled(void)
-{
- return tboot != NULL;
-}
-
+bool tboot_enabled(void);
extern void tboot_probe(void);
extern void tboot_shutdown(u32 shutdown_type);
extern struct acpi_table_header *tboot_get_dmar_table(
#ifdef CONFIG_PREEMPTION
#define torture_preempt_schedule() preempt_schedule()
#else
-#define torture_preempt_schedule()
+#define torture_preempt_schedule() do { } while (0)
#endif
#endif /* __LINUX_TORTURE_H */
u32 event_type;
u8 digest[20];
u32 event_size;
- u8 event[0];
+ u8 event[];
} __packed;
struct tcg_event_field {
{
unsigned int gso_type = 0;
unsigned int thlen = 0;
+ unsigned int p_off = 0;
unsigned int ip_proto;
if (hdr->gso_type != VIRTIO_NET_HDR_GSO_NONE) {
if (!skb_partial_csum_set(skb, start, off))
return -EINVAL;
- if (skb_transport_offset(skb) + thlen > skb_headlen(skb))
+ p_off = skb_transport_offset(skb) + thlen;
+ if (p_off > skb_headlen(skb))
return -EINVAL;
} else {
/* gso packets without NEEDS_CSUM do not set transport_offset.
return -EINVAL;
}
- if (keys.control.thoff + thlen > skb_headlen(skb) ||
+ p_off = keys.control.thoff + thlen;
+ if (p_off > skb_headlen(skb) ||
keys.basic.ip_proto != ip_proto)
return -EINVAL;
skb_set_transport_header(skb, keys.control.thoff);
+ } else if (gso_type) {
+ p_off = thlen;
+ if (p_off > skb_headlen(skb))
+ return -EINVAL;
}
}
if (hdr->gso_type != VIRTIO_NET_HDR_GSO_NONE) {
u16 gso_size = __virtio16_to_cpu(little_endian, hdr->gso_size);
+ struct skb_shared_info *shinfo = skb_shinfo(skb);
- skb_shinfo(skb)->gso_size = gso_size;
- skb_shinfo(skb)->gso_type = gso_type;
+ /* Too small packets are not really GSO ones. */
+ if (skb->len - p_off > gso_size) {
+ shinfo->gso_size = gso_size;
+ shinfo->gso_type = gso_type;
- /* Header must be checked, and gso_segs computed. */
- skb_shinfo(skb)->gso_type |= SKB_GSO_DODGY;
- skb_shinfo(skb)->gso_segs = 0;
+ /* Header must be checked, and gso_segs computed. */
+ shinfo->gso_type |= SKB_GSO_DODGY;
+ shinfo->gso_segs = 0;
+ }
}
return 0;
(wait)->flags = 0; \
} while (0)
+bool try_invoke_on_locked_down_task(struct task_struct *p, bool (*func)(struct task_struct *t, void *arg), void *arg);
+
#endif /* _LINUX_WAIT_H */
{
dtm->install = jiffies_to_clock_t(jiffies - stm->install);
dtm->lastuse = jiffies_to_clock_t(jiffies - stm->lastuse);
- dtm->firstuse = jiffies_to_clock_t(jiffies - stm->firstuse);
+ dtm->firstuse = stm->firstuse ?
+ jiffies_to_clock_t(jiffies - stm->firstuse) : 0;
dtm->expires = jiffies_to_clock_t(stm->expires);
}
void rxrpc_kernel_end_call(struct socket *, struct rxrpc_call *);
void rxrpc_kernel_get_peer(struct socket *, struct rxrpc_call *,
struct sockaddr_rxrpc *);
-u64 rxrpc_kernel_get_rtt(struct socket *, struct rxrpc_call *);
+u32 rxrpc_kernel_get_srtt(struct socket *, struct rxrpc_call *);
int rxrpc_kernel_charge_accept(struct socket *, rxrpc_notify_rx_t,
rxrpc_user_attach_call_t, unsigned long, gfp_t,
unsigned int);
struct espintcp_msg partial;
void (*saved_data_ready)(struct sock *sk);
void (*saved_write_space)(struct sock *sk);
+ void (*saved_destruct)(struct sock *sk);
struct work_struct work;
bool tx_running;
};
u32 table_id;
/* filter_set is an optimization that an entry is set */
bool filter_set;
- bool dump_all_families;
bool dump_routes;
bool dump_exceptions;
unsigned char protocol;
#endif
int fib_unmerge(struct net *net);
+static inline bool nhc_l3mdev_matches_dev(const struct fib_nh_common *nhc,
+const struct net_device *dev)
+{
+ if (nhc->nhc_dev == dev ||
+ l3mdev_master_ifindex_rcu(nhc->nhc_dev) == dev->ifindex)
+ return true;
+
+ return false;
+}
+
/* Exported by fib_semantics.c */
int ip_fib_check_default(__be32 gw, struct net_device *dev);
int fib_sync_down_dev(struct net_device *dev, unsigned long event, bool force);
void fib_alias_hw_flags_set(struct net *net, const struct fib_rt_info *fri);
void fib_trie_init(void);
struct fib_table *fib_trie_table(u32 id, struct fib_table *alias);
+bool fib_lookup_good_nhc(const struct fib_nh_common *nhc, int fib_flags,
+ const struct flowi4 *flp);
static inline void fib_combine_itag(u32 *itag, const struct fib_result *res)
{
struct hlist_node nat_bysource;
#endif
/* all members below initialized via memset */
- u8 __nfct_init_offset[0];
+ struct { } __nfct_init_offset;
/* If we were expected by an expectation, this will be it */
struct nf_conn *master;
NF_FLOW_HW_DYING,
NF_FLOW_HW_DEAD,
NF_FLOW_HW_REFRESH,
+ NF_FLOW_HW_PENDING,
};
enum flow_offload_type {
};
struct nh_group {
+ struct nh_group *spare; /* spare group for removals */
u16 num_nh;
bool mpath;
bool has_v4;
{
unsigned int rc = 1;
- if (nexthop_is_multipath(nh)) {
+ if (nh->is_group) {
struct nh_group *nh_grp;
nh_grp = rcu_dereference_rtnl(nh->nh_grp);
- rc = nh_grp->num_nh;
+ if (nh_grp->mpath)
+ rc = nh_grp->num_nh;
}
return rc;
}
static inline
-struct nexthop *nexthop_mpath_select(const struct nexthop *nh, int nhsel)
+struct nexthop *nexthop_mpath_select(const struct nh_group *nhg, int nhsel)
{
- const struct nh_group *nhg = rcu_dereference_rtnl(nh->nh_grp);
-
/* for_nexthops macros in fib_semantics.c grabs a pointer to
* the nexthop before checking nhsel
*/
{
const struct nh_info *nhi;
- if (nexthop_is_multipath(nh)) {
- if (nexthop_num_path(nh) > 1)
- return false;
- nh = nexthop_mpath_select(nh, 0);
- if (!nh)
+ if (nh->is_group) {
+ struct nh_group *nh_grp;
+
+ nh_grp = rcu_dereference_rtnl(nh->nh_grp);
+ if (nh_grp->num_nh > 1)
return false;
+
+ nh = nh_grp->nh_entries[0].nh;
}
nhi = rcu_dereference_rtnl(nh->nh_info);
BUILD_BUG_ON(offsetof(struct fib_nh, nh_common) != 0);
BUILD_BUG_ON(offsetof(struct fib6_nh, nh_common) != 0);
- if (nexthop_is_multipath(nh)) {
- nh = nexthop_mpath_select(nh, nhsel);
- if (!nh)
- return NULL;
+ if (nh->is_group) {
+ struct nh_group *nh_grp;
+
+ nh_grp = rcu_dereference_rtnl(nh->nh_grp);
+ if (nh_grp->mpath) {
+ nh = nexthop_mpath_select(nh_grp, nhsel);
+ if (!nh)
+ return NULL;
+ }
}
nhi = rcu_dereference_rtnl(nh->nh_info);
return &nhi->fib_nhc;
}
+/* called from fib_table_lookup with rcu_lock */
+static inline
+struct fib_nh_common *nexthop_get_nhc_lookup(const struct nexthop *nh,
+ int fib_flags,
+ const struct flowi4 *flp,
+ int *nhsel)
+{
+ struct nh_info *nhi;
+
+ if (nh->is_group) {
+ struct nh_group *nhg = rcu_dereference(nh->nh_grp);
+ int i;
+
+ for (i = 0; i < nhg->num_nh; i++) {
+ struct nexthop *nhe = nhg->nh_entries[i].nh;
+
+ nhi = rcu_dereference(nhe->nh_info);
+ if (fib_lookup_good_nhc(&nhi->fib_nhc, fib_flags, flp)) {
+ *nhsel = i;
+ return &nhi->fib_nhc;
+ }
+ }
+ } else {
+ nhi = rcu_dereference(nh->nh_info);
+ if (fib_lookup_good_nhc(&nhi->fib_nhc, fib_flags, flp)) {
+ *nhsel = 0;
+ return &nhi->fib_nhc;
+ }
+ }
+
+ return NULL;
+}
+
+static inline bool nexthop_uses_dev(const struct nexthop *nh,
+ const struct net_device *dev)
+{
+ struct nh_info *nhi;
+
+ if (nh->is_group) {
+ struct nh_group *nhg = rcu_dereference(nh->nh_grp);
+ int i;
+
+ for (i = 0; i < nhg->num_nh; i++) {
+ struct nexthop *nhe = nhg->nh_entries[i].nh;
+
+ nhi = rcu_dereference(nhe->nh_info);
+ if (nhc_l3mdev_matches_dev(&nhi->fib_nhc, dev))
+ return true;
+ }
+ } else {
+ nhi = rcu_dereference(nh->nh_info);
+ if (nhc_l3mdev_matches_dev(&nhi->fib_nhc, dev))
+ return true;
+ }
+
+ return false;
+}
+
static inline unsigned int fib_info_num_path(const struct fib_info *fi)
{
if (unlikely(fi->nh))
{
struct nh_info *nhi;
- if (nexthop_is_multipath(nh)) {
- nh = nexthop_mpath_select(nh, 0);
+ if (nh->is_group) {
+ struct nh_group *nh_grp;
+
+ nh_grp = rcu_dereference_rtnl(nh->nh_grp);
+ nh = nexthop_mpath_select(nh_grp, 0);
if (!nh)
return NULL;
}
#include <linux/cache.h>
#include <linux/percpu.h>
#include <linux/skbuff.h>
-#include <linux/cryptohash.h>
#include <linux/kref.h>
#include <linux/ktime.h>
rx_opt->num_sacks = 0;
}
-u32 tcp_default_init_rwnd(u32 mss);
void tcp_cwnd_restart(struct sock *sk, s32 delta);
static inline void tcp_slow_start_after_idle_check(struct sock *sk)
return tcp_win_from_space(sk, READ_ONCE(sk->sk_rcvbuf));
}
+/* We provision sk_rcvbuf around 200% of sk_rcvlowat.
+ * If 87.5 % (7/8) of the space has been consumed, we want to override
+ * SO_RCVLOWAT constraint, since we are receiving skbs with too small
+ * len/truesize ratio.
+ */
+static inline bool tcp_rmem_pressure(const struct sock *sk)
+{
+ int rcvbuf = READ_ONCE(sk->sk_rcvbuf);
+ int threshold = rcvbuf - (rcvbuf >> 3);
+
+ return atomic_read(&sk->sk_rmem_alloc) > threshold;
+}
+
extern void tcp_openreq_init_rwin(struct request_sock *req,
const struct sock *sk_listener,
const struct dst_entry *dst);
struct tls_rec *open_rec;
struct list_head tx_list;
atomic_t encrypt_pending;
+ /* protect crypto_wait with encrypt_pending */
+ spinlock_t encrypt_compl_lock;
int async_notify;
u8 async_capable:1;
u8 async_capable:1;
u8 decrypted:1;
atomic_t decrypt_pending;
+ /* protect crypto_wait with decrypt_pending*/
+ spinlock_t decrypt_compl_lock;
bool async_notify;
};
__be16 df, __be16 src_port, __be16 dst_port,
bool xnet, bool nocheck);
-#if IS_ENABLED(CONFIG_IPV6)
int udp_tunnel6_xmit_skb(struct dst_entry *dst, struct sock *sk,
struct sk_buff *skb,
struct net_device *dev, struct in6_addr *saddr,
struct in6_addr *daddr,
__u8 prio, __u8 ttl, __be32 label,
__be16 src_port, __be16 dst_port, bool nocheck);
-#endif
void udp_tunnel_sock_release(struct socket *sock);
static inline void uobj_put_destroy(struct ib_uobject *uobj)
{
- rdma_lookup_put_uobject(uobj, UVERBS_LOOKUP_WRITE);
+ rdma_lookup_put_uobject(uobj, UVERBS_LOOKUP_DESTROY);
}
static inline void uobj_put_read(struct ib_uobject *uobj)
size_t avail_min; /* min avail for wakeup */
size_t avail; /* max used buffer for wakeup */
size_t xruns; /* over/underruns counter */
+ int buffer_ref; /* buffer reference count */
/* misc */
spinlock_t lock;
wait_queue_head_t sleep;
);
+DEFINE_EVENT(regulator_basic, regulator_bypass_enable,
+
+ TP_PROTO(const char *name),
+
+ TP_ARGS(name)
+
+);
+
+DEFINE_EVENT(regulator_basic, regulator_bypass_enable_complete,
+
+ TP_PROTO(const char *name),
+
+ TP_ARGS(name)
+
+);
+
+DEFINE_EVENT(regulator_basic, regulator_bypass_disable,
+
+ TP_PROTO(const char *name),
+
+ TP_ARGS(name)
+
+);
+
+DEFINE_EVENT(regulator_basic, regulator_bypass_disable_complete,
+
+ TP_PROTO(const char *name),
+
+ TP_ARGS(name)
+
+);
+
/*
* Events that take a range of numerical values, mostly for voltages
* and so on.
TRACE_EVENT(rxrpc_rtt_rx,
TP_PROTO(struct rxrpc_call *call, enum rxrpc_rtt_rx_trace why,
rxrpc_serial_t send_serial, rxrpc_serial_t resp_serial,
- s64 rtt, u8 nr, s64 avg),
+ u32 rtt, u32 rto),
- TP_ARGS(call, why, send_serial, resp_serial, rtt, nr, avg),
+ TP_ARGS(call, why, send_serial, resp_serial, rtt, rto),
TP_STRUCT__entry(
__field(unsigned int, call )
__field(enum rxrpc_rtt_rx_trace, why )
- __field(u8, nr )
__field(rxrpc_serial_t, send_serial )
__field(rxrpc_serial_t, resp_serial )
- __field(s64, rtt )
- __field(u64, avg )
+ __field(u32, rtt )
+ __field(u32, rto )
),
TP_fast_assign(
__entry->send_serial = send_serial;
__entry->resp_serial = resp_serial;
__entry->rtt = rtt;
- __entry->nr = nr;
- __entry->avg = avg;
+ __entry->rto = rto;
),
- TP_printk("c=%08x %s sr=%08x rr=%08x rtt=%lld nr=%u avg=%lld",
+ TP_printk("c=%08x %s sr=%08x rr=%08x rtt=%u rto=%u",
__entry->call,
__print_symbolic(__entry->why, rxrpc_rtt_rx_traces),
__entry->send_serial,
__entry->resp_serial,
__entry->rtt,
- __entry->nr,
- __entry->avg)
+ __entry->rto)
);
TRACE_EVENT(rxrpc_timer,
__entry->serial)
);
+TRACE_EVENT(rxrpc_rx_discard_ack,
+ TP_PROTO(unsigned int debug_id, rxrpc_serial_t serial,
+ rxrpc_seq_t first_soft_ack, rxrpc_seq_t call_ackr_first,
+ rxrpc_seq_t prev_pkt, rxrpc_seq_t call_ackr_prev),
+
+ TP_ARGS(debug_id, serial, first_soft_ack, call_ackr_first,
+ prev_pkt, call_ackr_prev),
+
+ TP_STRUCT__entry(
+ __field(unsigned int, debug_id )
+ __field(rxrpc_serial_t, serial )
+ __field(rxrpc_seq_t, first_soft_ack)
+ __field(rxrpc_seq_t, call_ackr_first)
+ __field(rxrpc_seq_t, prev_pkt)
+ __field(rxrpc_seq_t, call_ackr_prev)
+ ),
+
+ TP_fast_assign(
+ __entry->debug_id = debug_id;
+ __entry->serial = serial;
+ __entry->first_soft_ack = first_soft_ack;
+ __entry->call_ackr_first = call_ackr_first;
+ __entry->prev_pkt = prev_pkt;
+ __entry->call_ackr_prev = call_ackr_prev;
+ ),
+
+ TP_printk("c=%08x r=%08x %08x<%08x %08x<%08x",
+ __entry->debug_id,
+ __entry->serial,
+ __entry->first_soft_ack,
+ __entry->call_ackr_first,
+ __entry->prev_pkt,
+ __entry->call_ackr_prev)
+ );
+
#endif /* _TRACE_RXRPC_H */
/* This part must be outside protection */
#define CAP_AUDIT_READ 37
+/*
+ * Allow system performance and observability privileged operations
+ * using perf_events, i915_perf and other kernel subsystems
+ */
+
+#define CAP_PERFMON 38
-#define CAP_LAST_CAP CAP_AUDIT_READ
+#define CAP_LAST_CAP CAP_PERFMON
#define cap_valid(x) ((x) >= 0 && (x) <= CAP_LAST_CAP)
#define FSCRYPT_POLICY_FLAGS_PAD_MASK 0x03
#define FSCRYPT_POLICY_FLAG_DIRECT_KEY 0x04
#define FSCRYPT_POLICY_FLAG_IV_INO_LBLK_64 0x08
-#define FSCRYPT_POLICY_FLAGS_VALID 0x0F
+#define FSCRYPT_POLICY_FLAG_IV_INO_LBLK_32 0x10
+#define FSCRYPT_POLICY_FLAGS_VALID 0x1F
/* Encryption algorithms */
#define FSCRYPT_MODE_AES_256_XTS 1
__u32 guest_count; /* Out */
} __packed;
+#define SEV_STATUS_FLAGS_CONFIG_ES 0x0100
+
/**
* struct sev_user_data_pek_csr - PEK_CSR command parameters
*
__u8 data[0];
};
+/* Maximum number of non-control endpoints in struct usb_raw_eps_info. */
+#define USB_RAW_EPS_NUM_MAX 30
+
+/* Maximum length of UDC endpoint name in struct usb_raw_ep_info. */
+#define USB_RAW_EP_NAME_MAX 16
+
+/* Used as addr in struct usb_raw_ep_info if endpoint accepts any address. */
+#define USB_RAW_EP_ADDR_ANY 0xff
+
+/*
+ * struct usb_raw_ep_caps - exposes endpoint capabilities from struct usb_ep
+ * (technically from its member struct usb_ep_caps).
+ */
+struct usb_raw_ep_caps {
+ __u32 type_control : 1;
+ __u32 type_iso : 1;
+ __u32 type_bulk : 1;
+ __u32 type_int : 1;
+ __u32 dir_in : 1;
+ __u32 dir_out : 1;
+};
+
+/*
+ * struct usb_raw_ep_limits - exposes endpoint limits from struct usb_ep.
+ * @maxpacket_limit: Maximum packet size value supported by this endpoint.
+ * @max_streams: maximum number of streams supported by this endpoint
+ * (actual number is 2^n).
+ * @reserved: Empty, reserved for potential future extensions.
+ */
+struct usb_raw_ep_limits {
+ __u16 maxpacket_limit;
+ __u16 max_streams;
+ __u32 reserved;
+};
+
+/*
+ * struct usb_raw_ep_info - stores information about a gadget endpoint.
+ * @name: Name of the endpoint as it is defined in the UDC driver.
+ * @addr: Address of the endpoint that must be specified in the endpoint
+ * descriptor passed to USB_RAW_IOCTL_EP_ENABLE ioctl.
+ * @caps: Endpoint capabilities.
+ * @limits: Endpoint limits.
+ */
+struct usb_raw_ep_info {
+ __u8 name[USB_RAW_EP_NAME_MAX];
+ __u32 addr;
+ struct usb_raw_ep_caps caps;
+ struct usb_raw_ep_limits limits;
+};
+
+/*
+ * struct usb_raw_eps_info - argument for USB_RAW_IOCTL_EPS_INFO ioctl.
+ * eps: Structures that store information about non-control endpoints.
+ */
+struct usb_raw_eps_info {
+ struct usb_raw_ep_info eps[USB_RAW_EPS_NUM_MAX];
+};
+
/*
* Initializes a Raw Gadget instance.
* Accepts a pointer to the usb_raw_init struct as an argument.
#define USB_RAW_IOCTL_EVENT_FETCH _IOR('U', 2, struct usb_raw_event)
/*
- * Queues an IN (OUT for READ) urb as a response to the last control request
- * received on endpoint 0, provided that was an IN (OUT for READ) request and
- * waits until the urb is completed. Copies received data to user for READ.
+ * Queues an IN (OUT for READ) request as a response to the last setup request
+ * received on endpoint 0 (provided that was an IN (OUT for READ) request), and
+ * waits until the request is completed. Copies received data to user for READ.
* Accepts a pointer to the usb_raw_ep_io struct as an argument.
- * Returns length of trasferred data on success or negative error code on
+ * Returns length of transferred data on success or negative error code on
* failure.
*/
#define USB_RAW_IOCTL_EP0_WRITE _IOW('U', 3, struct usb_raw_ep_io)
#define USB_RAW_IOCTL_EP0_READ _IOWR('U', 4, struct usb_raw_ep_io)
/*
- * Finds an endpoint that supports the transfer type specified in the
- * descriptor and enables it.
- * Accepts a pointer to the usb_endpoint_descriptor struct as an argument.
+ * Finds an endpoint that satisfies the parameters specified in the provided
+ * descriptors (address, transfer type, etc.) and enables it.
+ * Accepts a pointer to the usb_raw_ep_descs struct as an argument.
* Returns enabled endpoint handle on success or negative error code on failure.
*/
#define USB_RAW_IOCTL_EP_ENABLE _IOW('U', 5, struct usb_endpoint_descriptor)
-/* Disables specified endpoint.
+/*
+ * Disables specified endpoint.
* Accepts endpoint handle as an argument.
* Returns 0 on success or negative error code on failure.
*/
#define USB_RAW_IOCTL_EP_DISABLE _IOW('U', 6, __u32)
/*
- * Queues an IN (OUT for READ) urb as a response to the last control request
- * received on endpoint usb_raw_ep_io.ep, provided that was an IN (OUT for READ)
- * request and waits until the urb is completed. Copies received data to user
- * for READ.
+ * Queues an IN (OUT for READ) request as a response to the last setup request
+ * received on endpoint usb_raw_ep_io.ep (provided that was an IN (OUT for READ)
+ * request), and waits until the request is completed. Copies received data to
+ * user for READ.
* Accepts a pointer to the usb_raw_ep_io struct as an argument.
- * Returns length of trasferred data on success or negative error code on
+ * Returns length of transferred data on success or negative error code on
* failure.
*/
#define USB_RAW_IOCTL_EP_WRITE _IOW('U', 7, struct usb_raw_ep_io)
*/
#define USB_RAW_IOCTL_VBUS_DRAW _IOW('U', 10, __u32)
+/*
+ * Fills in the usb_raw_eps_info structure with information about non-control
+ * endpoints available for the currently connected UDC.
+ * Returns the number of available endpoints on success or negative error code
+ * on failure.
+ */
+#define USB_RAW_IOCTL_EPS_INFO _IOR('U', 11, struct usb_raw_eps_info)
+
+/*
+ * Stalls a pending control request on endpoint 0.
+ * Returns 0 on success or negative error code on failure.
+ */
+#define USB_RAW_IOCTL_EP0_STALL _IO('U', 12)
+
+/*
+ * Sets or clears halt or wedge status of the endpoint.
+ * Accepts endpoint handle as an argument.
+ * Returns 0 on success or negative error code on failure.
+ */
+#define USB_RAW_IOCTL_EP_SET_HALT _IOW('U', 13, __u32)
+#define USB_RAW_IOCTL_EP_CLEAR_HALT _IOW('U', 14, __u32)
+#define USB_RAW_IOCTL_EP_SET_WEDGE _IOW('U', 15, __u32)
+
#endif /* _UAPI__LINUX_USB_RAW_GADGET_H */
XFRMA_PROTO, /* __u8 */
XFRMA_ADDRESS_FILTER, /* struct xfrm_address_filter */
XFRMA_PAD,
- XFRMA_OFFLOAD_DEV, /* struct xfrm_state_offload */
+ XFRMA_OFFLOAD_DEV, /* struct xfrm_user_offload */
XFRMA_SET_MARK, /* __u32 */
XFRMA_SET_MARK_MASK, /* __u32 */
XFRMA_IF_ID, /* __u32 */
source "kernel/Kconfig.locks"
+config ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
+ bool
+
config ARCH_HAS_SYNC_CORE_BEFORE_USERMODE
bool
__setup("noinitrd", no_initrd);
-static int __init early_initrd(char *p)
+static int __init early_initrdmem(char *p)
{
phys_addr_t start;
unsigned long size;
}
return 0;
}
+early_param("initrdmem", early_initrdmem);
+
+/*
+ * This is here as the initrd keyword has been in use since 11/2018
+ * on ARM, PowerPC, and MIPS.
+ * It should not be; it is reserved for bootloaders.
+ */
+static int __init early_initrd(char *p)
+{
+ return early_initrdmem(p);
+}
early_param("initrd", early_initrd);
static int init_linuxrc(struct subprocess_info *info, struct cred *new)
.rcu_tasks_holdout_list = LIST_HEAD_INIT(init_task.rcu_tasks_holdout_list),
.rcu_tasks_idle_cpu = -1,
#endif
+#ifdef CONFIG_TASKS_TRACE_RCU
+ .trc_reader_nesting = 0,
+ .trc_reader_special.s = 0,
+ .trc_holdout_list = LIST_HEAD_INIT(init_task.trc_holdout_list),
+#endif
#ifdef CONFIG_CPUSETS
.mems_allowed_seq = SEQCNT_ZERO(init_task.mems_allowed_seq),
#endif
char *data, *copy;
int ret;
+ /* Cut out the bootconfig data even if we have no bootconfig option */
data = get_boot_config_from_initrd(&size, &csum);
- if (!data)
- goto not_found;
strlcpy(tmp_cmdline, boot_command_line, COMMAND_LINE_SIZE);
parse_args("bootconfig", tmp_cmdline, NULL, 0, 0, 0, NULL,
if (!bootconfig_found)
return;
+ if (!data) {
+ pr_err("'bootconfig' found on command line, but no bootconfig found\n");
+ return;
+ }
+
if (size >= XBC_DATA_MAX) {
pr_err("bootconfig size %d greater than max size %d\n",
size, XBC_DATA_MAX);
extra_init_args = xbc_make_cmdline("init");
}
return;
-not_found:
- pr_err("'bootconfig' found on command line, but no bootconfig found\n");
}
#else
/* Do the rest non-__init'ed, we're now alive */
arch_call_rest_init();
+
+ prevent_tail_call_optimization();
}
/* Call all constructor functions linked into the kernel. */
total++;
}
- *new_pos = pos + 1;
+ ipc = NULL;
if (total >= ids->in_use)
- return NULL;
+ goto out;
for (; pos < ipc_mni; pos++) {
ipc = idr_find(&ids->ipcs_idr, pos);
if (ipc != NULL) {
rcu_read_lock();
ipc_lock_object(ipc);
- return ipc;
+ break;
}
}
-
- /* Out of range - return NULL to terminate iteration */
- return NULL;
+out:
+ *new_pos = pos + 1;
+ return ipc;
}
static void *sysvipc_proc_next(struct seq_file *s, void *it, loff_t *pos)
if (!(map->map_flags & BPF_F_MMAPABLE))
return -EINVAL;
- return remap_vmalloc_range(vma, array_map_vmalloc_addr(array), pgoff);
+ if (vma->vm_pgoff * PAGE_SIZE + (vma->vm_end - vma->vm_start) >
+ PAGE_ALIGN((u64)array->map.max_entries * array->elem_size))
+ return -EINVAL;
+
+ return remap_vmalloc_range(vma, array_map_vmalloc_addr(array),
+ vma->vm_pgoff + pgoff);
}
const struct bpf_map_ops array_map_ops = {
int bpf_prog_calc_tag(struct bpf_prog *fp)
{
- const u32 bits_offset = SHA_MESSAGE_BYTES - sizeof(__be64);
+ const u32 bits_offset = SHA1_BLOCK_SIZE - sizeof(__be64);
u32 raw_size = bpf_prog_tag_scratch_size(fp);
- u32 digest[SHA_DIGEST_WORDS];
- u32 ws[SHA_WORKSPACE_WORDS];
+ u32 digest[SHA1_DIGEST_WORDS];
+ u32 ws[SHA1_WORKSPACE_WORDS];
u32 i, bsize, psize, blocks;
struct bpf_insn *dst;
bool was_ld_map;
if (!raw)
return -ENOMEM;
- sha_init(digest);
+ sha1_init(digest);
memset(ws, 0, sizeof(ws));
/* We need to take out the map fd for the digest calculation
memset(&raw[psize], 0, raw_size - psize);
raw[psize++] = 0x80;
- bsize = round_up(psize, SHA_MESSAGE_BYTES);
- blocks = bsize / SHA_MESSAGE_BYTES;
+ bsize = round_up(psize, SHA1_BLOCK_SIZE);
+ blocks = bsize / SHA1_BLOCK_SIZE;
todo = raw;
if (bsize - psize >= sizeof(__be64)) {
bits = (__be64 *)(todo + bsize - sizeof(__be64));
*bits = cpu_to_be64((psize - 1) << 3);
while (blocks--) {
- sha_transform(digest, todo, ws);
- todo += SHA_MESSAGE_BYTES;
+ sha1_transform(digest, todo, ws);
+ todo += SHA1_BLOCK_SIZE;
}
result = (__force __be32 *)digest;
- for (i = 0; i < SHA_DIGEST_WORDS; i++)
+ for (i = 0; i < SHA1_DIGEST_WORDS; i++)
result[i] = cpu_to_be32(digest[i]);
memcpy(fp->tag, result, sizeof(fp->tag));
mutex_lock(&map->freeze_mutex);
- if ((vma->vm_flags & VM_WRITE) && map->frozen) {
- err = -EPERM;
- goto out;
+ if (vma->vm_flags & VM_WRITE) {
+ if (map->frozen) {
+ err = -EPERM;
+ goto out;
+ }
+ /* map is meant to be read-only, so do not allow mapping as
+ * writable, because it's possible to leak a writable page
+ * reference and allows user-space to still modify it after
+ * freezing, while verifier will assume contents do not change
+ */
+ if (map->map_flags & BPF_F_RDONLY_PROG) {
+ err = -EACCES;
+ goto out;
+ }
}
/* set default open/close callbacks */
if (err)
goto free_value;
- if (copy_to_user(uvalue, value, value_size) != 0)
+ if (copy_to_user(uvalue, value, value_size) != 0) {
+ err = -EFAULT;
goto free_value;
+ }
err = 0;
* but must be positive otherwise set to worse case bounds
* and refine later from tnum.
*/
- if (reg->s32_min_value > 0)
- reg->smin_value = reg->s32_min_value;
- else
- reg->smin_value = 0;
- if (reg->s32_max_value > 0)
+ if (reg->s32_min_value >= 0 && reg->s32_max_value >= 0)
reg->smax_value = reg->s32_max_value;
else
reg->smax_value = U32_MAX;
+ if (reg->s32_min_value >= 0)
+ reg->smin_value = reg->s32_min_value;
+ else
+ reg->smin_value = 0;
}
static void __reg_combine_32_into_64(struct bpf_reg_state *reg)
if (ret_type != RET_INTEGER ||
(func_id != BPF_FUNC_get_stack &&
- func_id != BPF_FUNC_probe_read_str))
+ func_id != BPF_FUNC_probe_read_str &&
+ func_id != BPF_FUNC_probe_read_kernel_str &&
+ func_id != BPF_FUNC_probe_read_user_str))
return;
ret_reg->smax_value = meta->msize_max_value;
return 0;
range = tnum_const(0);
break;
+ case BPF_PROG_TYPE_TRACING:
+ switch (env->prog->expected_attach_type) {
+ case BPF_TRACE_FENTRY:
+ case BPF_TRACE_FEXIT:
+ range = tnum_const(0);
+ break;
+ case BPF_TRACE_RAW_TP:
+ case BPF_MODIFY_RETURN:
+ return 0;
+ default:
+ return -ENOTSUPP;
+ }
+ break;
+ case BPF_PROG_TYPE_EXT:
+ /* freplace program can return anything as its return value
+ * depends on the to-be-replaced kernel func or bpf program.
+ */
default:
return 0;
}
}
#define SECURITY_PREFIX "security_"
-static int check_attach_modify_return(struct bpf_verifier_env *env)
+static int check_attach_modify_return(struct bpf_prog *prog, unsigned long addr)
{
- struct bpf_prog *prog = env->prog;
- unsigned long addr = (unsigned long) prog->aux->trampoline->func.addr;
-
- /* This is expected to be cleaned up in the future with the KRSI effort
- * introducing the LSM_HOOK macro for cleaning up lsm_hooks.h.
- */
if (within_error_injection_list(addr) ||
!strncmp(SECURITY_PREFIX, prog->aux->attach_func_name,
sizeof(SECURITY_PREFIX) - 1))
return 0;
- verbose(env, "fmod_ret attach_btf_id %u (%s) is not modifiable\n",
- prog->aux->attach_btf_id, prog->aux->attach_func_name);
-
return -EINVAL;
}
goto out;
}
}
+
+ if (prog->expected_attach_type == BPF_MODIFY_RETURN) {
+ ret = check_attach_modify_return(prog, addr);
+ if (ret)
+ verbose(env, "%s() is not modifiable\n",
+ prog->aux->attach_func_name);
+ }
+
+ if (ret)
+ goto out;
tr->func.addr = (void *)addr;
prog->aux->trampoline = tr;
-
- if (prog->expected_attach_type == BPF_MODIFY_RETURN)
- ret = check_attach_modify_return(env);
out:
mutex_unlock(&tr->mutex);
if (ret)
return;
/*
- * Paired with the one in cgroup_rstat_cpu_pop_updated(). Either we
- * see NULL updated_next or they see our updated stat.
- */
- smp_mb();
-
- /*
+ * Speculative already-on-list test. This may race leading to
+ * temporary inaccuracies, which is fine.
+ *
* Because @parent's updated_children is terminated with @parent
* instead of NULL, we can tell whether @cgrp is on the list by
* testing the next pointer for NULL.
*nextp = rstatc->updated_next;
rstatc->updated_next = NULL;
- /*
- * Paired with the one in cgroup_rstat_cpu_updated().
- * Either they see NULL updated_next or we see their
- * updated stat.
- */
- smp_mb();
-
return pos;
}
/*
* On x86 it's required to boot all logical CPUs at least once so
* that the init code can get a chance to set CR4.MCE on each
- * CPU. Otherwise, a broadacasted MCE observing CR4.MCE=0b on any
+ * CPU. Otherwise, a broadcasted MCE observing CR4.MCE=0b on any
* core will shutdown the machine.
*/
return !cpumask_test_cpu(cpu, &cpus_booted_once_mask);
#ifdef CONFIG_PM_SLEEP_SMP
static cpumask_var_t frozen_cpus;
-int __freeze_secondary_cpus(int primary, bool suspend)
+int freeze_secondary_cpus(int primary)
{
int cpu, error = 0;
if (cpu == primary)
continue;
- if (suspend && pm_wakeup_pending()) {
+ if (pm_wakeup_pending()) {
pr_info("Wakeup pending. Abort CPU freeze\n");
error = -EBUSY;
break;
/*
* Make sure the CPUs won't be enabled by someone else. We need to do
- * this even in case of failure as all disable_nonboot_cpus() users are
- * supposed to do enable_nonboot_cpus() on the failure path.
+ * this even in case of failure as all freeze_secondary_cpus() users are
+ * supposed to do thaw_secondary_cpus() on the failure path.
*/
cpu_hotplug_disabled++;
return error;
}
-void __weak arch_enable_nonboot_cpus_begin(void)
+void __weak arch_thaw_secondary_cpus_begin(void)
{
}
-void __weak arch_enable_nonboot_cpus_end(void)
+void __weak arch_thaw_secondary_cpus_end(void)
{
}
-void enable_nonboot_cpus(void)
+void thaw_secondary_cpus(void)
{
int cpu, error;
pr_info("Enabling non-boot CPUs ...\n");
- arch_enable_nonboot_cpus_begin();
+ arch_thaw_secondary_cpus_begin();
for_each_cpu(cpu, frozen_cpus) {
trace_suspend_resume(TPS("CPU_ON"), cpu, true);
pr_warn("Error taking CPU%d up: %d\n", cpu, error);
}
- arch_enable_nonboot_cpus_end();
+ arch_thaw_secondary_cpus_end();
cpumask_clear(frozen_cpus);
out:
#include <linux/errno.h>
#include <linux/export.h>
-/*
- * If we have booted due to a crash, max_pfn will be a very low value. We need
- * to know the amount of memory that the previous kernel used.
- */
-unsigned long saved_max_pfn;
-
/*
* stores the physical address of elf header of crash image
*
struct callchain_cpus_entries {
struct rcu_head rcu_head;
- struct perf_callchain_entry *cpu_entries[0];
+ struct perf_callchain_entry *cpu_entries[];
};
int sysctl_perf_event_max_stack __read_mostly = PERF_MAX_STACK_DEPTH;
* @info: the function call argument
*
* Calls the function @func when the task is currently running. This might
- * be on the current CPU, which just calls the function directly
+ * be on the current CPU, which just calls the function directly. This will
+ * retry due to any failures in smp_call_function_single(), such as if the
+ * task_cpu() goes offline concurrently.
*
- * returns: @func return value, or
- * -ESRCH - when the process isn't running
- * -EAGAIN - when the process moved away
+ * returns @func return value or -ESRCH when the process isn't running
*/
static int
task_function_call(struct task_struct *p, remote_function_f func, void *info)
};
int ret;
- do {
- ret = smp_call_function_single(task_cpu(p), remote_function, &data, 1);
- if (!ret)
- ret = data.ret;
- } while (ret == -EAGAIN);
+ for (;;) {
+ ret = smp_call_function_single(task_cpu(p), remote_function,
+ &data, 1);
+ ret = !ret ? data.ret : -EAGAIN;
+
+ if (ret != -EAGAIN)
+ break;
+
+ cond_resched();
+ }
return ret;
}
if (event->attr.type != perf_kprobe.type)
return -ENOENT;
- if (!capable(CAP_SYS_ADMIN))
+ if (!perfmon_capable())
return -EACCES;
/*
if (event->attr.type != perf_uprobe.type)
return -ENOENT;
- if (!capable(CAP_SYS_ADMIN))
+ if (!perfmon_capable())
return -EACCES;
/*
}
if (attr.namespaces) {
- if (!capable(CAP_SYS_ADMIN))
+ if (!perfmon_capable())
return -EACCES;
}
void *aux_priv;
struct perf_event_mmap_page *user_page;
- void *data_pages[0];
+ void *data_pages[];
};
extern void rb_free(struct perf_buffer *rb);
INIT_LIST_HEAD(&p->rcu_tasks_holdout_list);
p->rcu_tasks_idle_cpu = -1;
#endif /* #ifdef CONFIG_TASKS_RCU */
+#ifdef CONFIG_TASKS_TRACE_RCU
+ p->trc_reader_nesting = 0;
+ p->trc_reader_special.s = 0;
+ INIT_LIST_HEAD(&p->trc_holdout_list);
+#endif /* #ifdef CONFIG_TASKS_TRACE_RCU */
}
struct pid *pidfd_pid(const struct file *file)
int __user *child_tidptr)
{
struct kernel_clone_args args = {
- .flags = (clone_flags & ~CSIGNAL),
+ .flags = (lower_32_bits(clone_flags) & ~CSIGNAL),
.pidfd = parent_tidptr,
.child_tid = child_tidptr,
.parent_tid = parent_tidptr,
- .exit_signal = (clone_flags & CSIGNAL),
+ .exit_signal = (lower_32_bits(clone_flags) & CSIGNAL),
.stack = stack_start,
.stack_size = stack_size,
};
pid_t kernel_thread(int (*fn)(void *), void *arg, unsigned long flags)
{
struct kernel_clone_args args = {
- .flags = ((flags | CLONE_VM | CLONE_UNTRACED) & ~CSIGNAL),
- .exit_signal = (flags & CSIGNAL),
+ .flags = ((lower_32_bits(flags) | CLONE_VM |
+ CLONE_UNTRACED) & ~CSIGNAL),
+ .exit_signal = (lower_32_bits(flags) & CSIGNAL),
.stack = (unsigned long)fn,
.stack_size = (unsigned long)arg,
};
#endif
{
struct kernel_clone_args args = {
- .flags = (clone_flags & ~CSIGNAL),
+ .flags = (lower_32_bits(clone_flags) & ~CSIGNAL),
.pidfd = parent_tidptr,
.child_tid = child_tidptr,
.parent_tid = parent_tidptr,
- .exit_signal = (clone_flags & CSIGNAL),
+ .exit_signal = (lower_32_bits(clone_flags) & CSIGNAL),
.stack = newsp,
.tls = tls,
};
return 0;
}
+/* Remove all symbols in given area from kprobe blacklist */
+static void kprobe_remove_area_blacklist(unsigned long start, unsigned long end)
+{
+ struct kprobe_blacklist_entry *ent, *n;
+
+ list_for_each_entry_safe(ent, n, &kprobe_blacklist, list) {
+ if (ent->start_addr < start || ent->start_addr >= end)
+ continue;
+ list_del(&ent->list);
+ kfree(ent);
+ }
+}
+
+static void kprobe_remove_ksym_blacklist(unsigned long entry)
+{
+ kprobe_remove_area_blacklist(entry, entry + 1);
+}
+
int __init __weak arch_populate_kprobe_blacklist(void)
{
return 0;
/* Symbols in __kprobes_text are blacklisted */
ret = kprobe_add_area_blacklist((unsigned long)__kprobes_text_start,
(unsigned long)__kprobes_text_end);
+ if (ret)
+ return ret;
+
+ /* Symbols in noinstr section are blacklisted */
+ ret = kprobe_add_area_blacklist((unsigned long)__noinstr_text_start,
+ (unsigned long)__noinstr_text_end);
return ret ? : arch_populate_kprobe_blacklist();
}
+static void add_module_kprobe_blacklist(struct module *mod)
+{
+ unsigned long start, end;
+ int i;
+
+ if (mod->kprobe_blacklist) {
+ for (i = 0; i < mod->num_kprobe_blacklist; i++)
+ kprobe_add_ksym_blacklist(mod->kprobe_blacklist[i]);
+ }
+
+ start = (unsigned long)mod->kprobes_text_start;
+ if (start) {
+ end = start + mod->kprobes_text_size;
+ kprobe_add_area_blacklist(start, end);
+ }
+
+ start = (unsigned long)mod->noinstr_text_start;
+ if (start) {
+ end = start + mod->noinstr_text_size;
+ kprobe_add_area_blacklist(start, end);
+ }
+}
+
+static void remove_module_kprobe_blacklist(struct module *mod)
+{
+ unsigned long start, end;
+ int i;
+
+ if (mod->kprobe_blacklist) {
+ for (i = 0; i < mod->num_kprobe_blacklist; i++)
+ kprobe_remove_ksym_blacklist(mod->kprobe_blacklist[i]);
+ }
+
+ start = (unsigned long)mod->kprobes_text_start;
+ if (start) {
+ end = start + mod->kprobes_text_size;
+ kprobe_remove_area_blacklist(start, end);
+ }
+
+ start = (unsigned long)mod->noinstr_text_start;
+ if (start) {
+ end = start + mod->noinstr_text_size;
+ kprobe_remove_area_blacklist(start, end);
+ }
+}
+
/* Module notifier call back, checking kprobes on the module */
static int kprobes_module_callback(struct notifier_block *nb,
unsigned long val, void *data)
unsigned int i;
int checkcore = (val == MODULE_STATE_GOING);
+ if (val == MODULE_STATE_COMING) {
+ mutex_lock(&kprobe_mutex);
+ add_module_kprobe_blacklist(mod);
+ mutex_unlock(&kprobe_mutex);
+ }
if (val != MODULE_STATE_GOING && val != MODULE_STATE_LIVE)
return NOTIFY_DONE;
kill_kprobe(p);
}
}
+ if (val == MODULE_STATE_GOING)
+ remove_module_kprobe_blacklist(mod);
mutex_unlock(&kprobe_mutex);
return NOTIFY_DONE;
}
/* kprobes/blacklist -- shows which functions can not be probed */
static void *kprobe_blacklist_seq_start(struct seq_file *m, loff_t *pos)
{
+ mutex_lock(&kprobe_mutex);
return seq_list_start(&kprobe_blacklist, *pos);
}
return 0;
}
+static void kprobe_blacklist_seq_stop(struct seq_file *f, void *v)
+{
+ mutex_unlock(&kprobe_mutex);
+}
+
static const struct seq_operations kprobe_blacklist_seq_ops = {
.start = kprobe_blacklist_seq_start,
.next = kprobe_blacklist_seq_next,
- .stop = kprobe_seq_stop, /* Reuse void function */
+ .stop = kprobe_blacklist_seq_stop,
.show = kprobe_blacklist_seq_show,
};
task->lockdep_recursion = 0;
}
-/*
- * Split the recrursion counter in two to readily detect 'off' vs recursion.
- */
-#define LOCKDEP_RECURSION_BITS 16
-#define LOCKDEP_OFF (1U << LOCKDEP_RECURSION_BITS)
-#define LOCKDEP_RECURSION_MASK (LOCKDEP_OFF - 1)
-
-void lockdep_off(void)
-{
- current->lockdep_recursion += LOCKDEP_OFF;
-}
-EXPORT_SYMBOL(lockdep_off);
-
-void lockdep_on(void)
-{
- current->lockdep_recursion -= LOCKDEP_OFF;
-}
-EXPORT_SYMBOL(lockdep_on);
-
static inline void lockdep_recursion_finish(void)
{
if (WARN_ON_ONCE(--current->lockdep_recursion))
struct hlist_node hash_entry;
u32 hash;
u32 nr_entries;
- unsigned long entries[0] __aligned(sizeof(unsigned long));
+ unsigned long entries[] __aligned(sizeof(unsigned long));
};
#define LOCK_TRACE_SIZE_IN_LONGS \
(sizeof(struct lock_trace) / sizeof(unsigned long))
* set up.
*/
#ifndef CONFIG_DEBUG_RT_MUTEXES
-# define rt_mutex_cmpxchg_relaxed(l,c,n) (cmpxchg_relaxed(&l->owner, c, n) == c)
# define rt_mutex_cmpxchg_acquire(l,c,n) (cmpxchg_acquire(&l->owner, c, n) == c)
# define rt_mutex_cmpxchg_release(l,c,n) (cmpxchg_release(&l->owner, c, n) == c)
}
#else
-# define rt_mutex_cmpxchg_relaxed(l,c,n) (0)
# define rt_mutex_cmpxchg_acquire(l,c,n) (0)
# define rt_mutex_cmpxchg_release(l,c,n) (0)
}
#endif
+ mod->noinstr_text_start = section_objs(info, ".noinstr.text", 1,
+ &mod->noinstr_text_size);
+
#ifdef CONFIG_TRACEPOINTS
mod->tracepoints_ptrs = section_objs(info, "__tracepoints_ptrs",
sizeof(*mod->tracepoints_ptrs),
mod->ei_funcs = section_objs(info, "_error_injection_whitelist",
sizeof(*mod->ei_funcs),
&mod->num_ei_funcs);
+#endif
+#ifdef CONFIG_KPROBES
+ mod->kprobes_text_start = section_objs(info, ".kprobes.text", 1,
+ &mod->kprobes_text_size);
+ mod->kprobe_blacklist = section_objs(info, "_kprobe_blacklist",
+ sizeof(unsigned long),
+ &mod->num_kprobe_blacklist);
#endif
mod->extable = section_objs(info, "__ex_table",
sizeof(*mod->extable), &mod->num_exentries);
struct padata_instance *pinst;
int ret;
- pinst = hlist_entry_safe(node, struct padata_instance, node);
+ pinst = hlist_entry_safe(node, struct padata_instance, cpu_online_node);
if (!pinst_has_cpu(pinst, cpu))
return 0;
struct padata_instance *pinst;
int ret;
- pinst = hlist_entry_safe(node, struct padata_instance, node);
+ pinst = hlist_entry_safe(node, struct padata_instance, cpu_dead_node);
if (!pinst_has_cpu(pinst, cpu))
return 0;
static void __padata_free(struct padata_instance *pinst)
{
#ifdef CONFIG_HOTPLUG_CPU
- cpuhp_state_remove_instance_nocalls(CPUHP_PADATA_DEAD, &pinst->node);
- cpuhp_state_remove_instance_nocalls(hp_online, &pinst->node);
+ cpuhp_state_remove_instance_nocalls(CPUHP_PADATA_DEAD,
+ &pinst->cpu_dead_node);
+ cpuhp_state_remove_instance_nocalls(hp_online, &pinst->cpu_online_node);
#endif
WARN_ON(!list_empty(&pinst->pslist));
mutex_init(&pinst->lock);
#ifdef CONFIG_HOTPLUG_CPU
- cpuhp_state_add_instance_nocalls_cpuslocked(hp_online, &pinst->node);
+ cpuhp_state_add_instance_nocalls_cpuslocked(hp_online,
+ &pinst->cpu_online_node);
cpuhp_state_add_instance_nocalls_cpuslocked(CPUHP_PADATA_DEAD,
- &pinst->node);
+ &pinst->cpu_dead_node);
#endif
put_online_cpus();
{
char name[16]; /* Name of the driver */
int index; /* Minor dev. to use */
+ bool user_specified; /* Specified by command line vs. platform */
char *options; /* Options for the driver */
#ifdef CONFIG_A11Y_BRAILLE_CONSOLE
char *brl_options; /* Options for braille driver */
#ifdef CONFIG_PRINTK
-#define PRINTK_SAFE_CONTEXT_MASK 0x3fffffff
-#define PRINTK_NMI_DIRECT_CONTEXT_MASK 0x40000000
-#define PRINTK_NMI_CONTEXT_MASK 0x80000000
+#define PRINTK_SAFE_CONTEXT_MASK 0x007ffffff
+#define PRINTK_NMI_DIRECT_CONTEXT_MASK 0x008000000
+#define PRINTK_NMI_CONTEXT_MASK 0xff0000000
+
+#define PRINTK_NMI_CONTEXT_OFFSET 0x010000000
extern raw_spinlock_t logbuf_lock;
static struct console_cmdline console_cmdline[MAX_CMDLINECONSOLES];
static int preferred_console = -1;
+static bool has_preferred_console;
int console_set_on_cmdline;
EXPORT_SYMBOL(console_set_on_cmdline);
user->idx = log_next_idx;
user->seq = log_next_seq;
break;
+ case SEEK_CUR:
+ /*
+ * It isn't supported due to the record nature of this
+ * interface: _SET _DATA and _END point to very specific
+ * record positions, while _CUR would be more useful in case
+ * of a byte-based log. Because of that, return the default
+ * errno value for invalid seek operation.
+ */
+ ret = -ESPIPE;
+ break;
default:
ret = -EINVAL;
}
#endif
static int __add_preferred_console(char *name, int idx, char *options,
- char *brl_options)
+ char *brl_options, bool user_specified)
{
struct console_cmdline *c;
int i;
if (strcmp(c->name, name) == 0 && c->index == idx) {
if (!brl_options)
preferred_console = i;
+ if (user_specified)
+ c->user_specified = true;
return 0;
}
}
preferred_console = i;
strlcpy(c->name, name, sizeof(c->name));
c->options = options;
+ c->user_specified = user_specified;
braille_set_options(c, brl_options);
c->index = idx;
char *s, *options, *brl_options = NULL;
int idx;
+ if (str[0] == 0)
+ return 1;
+
if (_braille_console_setup(&str, &brl_options))
return 1;
idx = simple_strtoul(s, NULL, 10);
*s = 0;
- __add_preferred_console(buf, idx, options, brl_options);
+ __add_preferred_console(buf, idx, options, brl_options, true);
console_set_on_cmdline = 1;
return 1;
}
*/
int add_preferred_console(char *name, int idx, char *options)
{
- return __add_preferred_console(name, idx, options, NULL);
+ return __add_preferred_console(name, idx, options, NULL, false);
}
bool console_suspend_enabled = true;
printk_safe_enter_irqsave(flags);
raw_spin_lock(&logbuf_lock);
if (console_seq < log_first_seq) {
- len = sprintf(text,
- "** %llu printk messages dropped **\n",
- log_first_seq - console_seq);
+ len = snprintf(text, sizeof(text),
+ "** %llu printk messages dropped **\n",
+ log_first_seq - console_seq);
/* messages are gone, move to first one */
console_seq = log_first_seq;
early_param("keep_bootcon", keep_bootcon_setup);
+/*
+ * This is called by register_console() to try to match
+ * the newly registered console with any of the ones selected
+ * by either the command line or add_preferred_console() and
+ * setup/enable it.
+ *
+ * Care need to be taken with consoles that are statically
+ * enabled such as netconsole
+ */
+static int try_enable_new_console(struct console *newcon, bool user_specified)
+{
+ struct console_cmdline *c;
+ int i;
+
+ for (i = 0, c = console_cmdline;
+ i < MAX_CMDLINECONSOLES && c->name[0];
+ i++, c++) {
+ if (c->user_specified != user_specified)
+ continue;
+ if (!newcon->match ||
+ newcon->match(newcon, c->name, c->index, c->options) != 0) {
+ /* default matching */
+ BUILD_BUG_ON(sizeof(c->name) != sizeof(newcon->name));
+ if (strcmp(c->name, newcon->name) != 0)
+ continue;
+ if (newcon->index >= 0 &&
+ newcon->index != c->index)
+ continue;
+ if (newcon->index < 0)
+ newcon->index = c->index;
+
+ if (_braille_register_console(newcon, c))
+ return 0;
+
+ if (newcon->setup &&
+ newcon->setup(newcon, c->options) != 0)
+ return -EIO;
+ }
+ newcon->flags |= CON_ENABLED;
+ if (i == preferred_console) {
+ newcon->flags |= CON_CONSDEV;
+ has_preferred_console = true;
+ }
+ return 0;
+ }
+
+ /*
+ * Some consoles, such as pstore and netconsole, can be enabled even
+ * without matching. Accept the pre-enabled consoles only when match()
+ * and setup() had a change to be called.
+ */
+ if (newcon->flags & CON_ENABLED && c->user_specified == user_specified)
+ return 0;
+
+ return -ENOENT;
+}
+
/*
* The console driver calls this routine during kernel initialization
* to register the console printing procedure with printk() and to
*/
void register_console(struct console *newcon)
{
- int i;
unsigned long flags;
struct console *bcon = NULL;
- struct console_cmdline *c;
- static bool has_preferred;
+ int err;
for_each_console(bcon) {
if (WARN(bcon == newcon, "console '%s%d' already registered\n",
if (console_drivers && console_drivers->flags & CON_BOOT)
bcon = console_drivers;
- if (!has_preferred || bcon || !console_drivers)
- has_preferred = preferred_console >= 0;
+ if (!has_preferred_console || bcon || !console_drivers)
+ has_preferred_console = preferred_console >= 0;
/*
* See if we want to use this console driver. If we
* didn't select a console we take the first one
* that registers here.
*/
- if (!has_preferred) {
+ if (!has_preferred_console) {
if (newcon->index < 0)
newcon->index = 0;
if (newcon->setup == NULL ||
newcon->flags |= CON_ENABLED;
if (newcon->device) {
newcon->flags |= CON_CONSDEV;
- has_preferred = true;
+ has_preferred_console = true;
}
}
}
- /*
- * See if this console matches one we selected on
- * the command line.
- */
- for (i = 0, c = console_cmdline;
- i < MAX_CMDLINECONSOLES && c->name[0];
- i++, c++) {
- if (!newcon->match ||
- newcon->match(newcon, c->name, c->index, c->options) != 0) {
- /* default matching */
- BUILD_BUG_ON(sizeof(c->name) != sizeof(newcon->name));
- if (strcmp(c->name, newcon->name) != 0)
- continue;
- if (newcon->index >= 0 &&
- newcon->index != c->index)
- continue;
- if (newcon->index < 0)
- newcon->index = c->index;
-
- if (_braille_register_console(newcon, c))
- return;
-
- if (newcon->setup &&
- newcon->setup(newcon, c->options) != 0)
- break;
- }
+ /* See if this console matches one we selected on the command line */
+ err = try_enable_new_console(newcon, true);
- newcon->flags |= CON_ENABLED;
- if (i == preferred_console) {
- newcon->flags |= CON_CONSDEV;
- has_preferred = true;
- }
- break;
- }
+ /* If not, try to match against the platform default(s) */
+ if (err == -ENOENT)
+ err = try_enable_new_console(newcon, false);
- if (!(newcon->flags & CON_ENABLED))
+ /* printk() messages are not printed to the Braille console. */
+ if (err || newcon->flags & CON_BRL)
return;
/*
console_drivers = newcon;
if (newcon->next)
newcon->next->flags &= ~CON_CONSDEV;
+ /* Ensure this flag is always set for the head of the list */
+ newcon->flags |= CON_CONSDEV;
} else {
newcon->next = console_drivers->next;
console_drivers->next = newcon;
static bool always_kmsg_dump;
module_param_named(always_kmsg_dump, always_kmsg_dump, bool, S_IRUGO | S_IWUSR);
+const char *kmsg_dump_reason_str(enum kmsg_dump_reason reason)
+{
+ switch (reason) {
+ case KMSG_DUMP_PANIC:
+ return "Panic";
+ case KMSG_DUMP_OOPS:
+ return "Oops";
+ case KMSG_DUMP_EMERG:
+ return "Emergency";
+ case KMSG_DUMP_SHUTDOWN:
+ return "Shutdown";
+ default:
+ return "Unknown";
+ }
+}
+EXPORT_SYMBOL_GPL(kmsg_dump_reason_str);
+
/**
* kmsg_dump - dump kernel log to kernel message dumpers.
* @reason: the reason (oops, panic etc) for dumping
struct kmsg_dumper *dumper;
unsigned long flags;
- if ((reason > KMSG_DUMP_OOPS) && !always_kmsg_dump)
- return;
-
rcu_read_lock();
list_for_each_entry_rcu(dumper, &dump_list, list) {
- if (dumper->max_reason && reason > dumper->max_reason)
+ enum kmsg_dump_reason max_reason = dumper->max_reason;
+
+ /*
+ * If client has not provided a specific max_reason, default
+ * to KMSG_DUMP_OOPS, unless always_kmsg_dump was set.
+ */
+ if (max_reason == KMSG_DUMP_UNDEF) {
+ max_reason = always_kmsg_dump ? KMSG_DUMP_MAX :
+ KMSG_DUMP_OOPS;
+ }
+ if (reason > max_reason)
continue;
/* initialize iterator with data about the stored records */
EXPORT_SYMBOL_GPL(kmsg_dump_get_buffer);
/**
- * kmsg_dump_rewind_nolock - reset the interator (unlocked version)
+ * kmsg_dump_rewind_nolock - reset the iterator (unlocked version)
* @dumper: registered kmsg dumper
*
* Reset the dumper's iterator so that kmsg_dump_get_line() and
}
/**
- * kmsg_dump_rewind - reset the interator
+ * kmsg_dump_rewind - reset the iterator
* @dumper: registered kmsg dumper
*
* Reset the dumper's iterator so that kmsg_dump_get_line() and
#include <linux/cpumask.h>
#include <linux/irq_work.h>
#include <linux/printk.h>
+#include <linux/kprobes.h>
#include "internal.h"
return printk_safe_log_store(s, fmt, args);
}
-void notrace printk_nmi_enter(void)
+void noinstr printk_nmi_enter(void)
{
- this_cpu_or(printk_context, PRINTK_NMI_CONTEXT_MASK);
+ this_cpu_add(printk_context, PRINTK_NMI_CONTEXT_OFFSET);
}
-void notrace printk_nmi_exit(void)
+void noinstr printk_nmi_exit(void)
{
- this_cpu_and(printk_context, ~PRINTK_NMI_CONTEXT_MASK);
+ this_cpu_sub(printk_context, PRINTK_NMI_CONTEXT_OFFSET);
}
/*
help
This option selects the full-fledged version of SRCU.
+config TASKS_RCU_GENERIC
+ def_bool TASKS_RCU || TASKS_RUDE_RCU || TASKS_TRACE_RCU
+ select SRCU
+ help
+ This option enables generic infrastructure code supporting
+ task-based RCU implementations. Not for manual selection.
+
config TASKS_RCU
def_bool PREEMPTION
- select SRCU
help
This option enables a task-based RCU implementation that uses
only voluntary context switch (not preemption!), idle, and
- user-mode execution as quiescent states.
+ user-mode execution as quiescent states. Not for manual selection.
+
+config TASKS_RUDE_RCU
+ def_bool 0
+ help
+ This option enables a task-based RCU implementation that uses
+ only context switch (including preemption) and user-mode
+ execution as quiescent states. It forces IPIs and context
+ switches on all online CPUs, including idle ones, so use
+ with caution.
+
+config TASKS_TRACE_RCU
+ def_bool 0
+ help
+ This option enables a task-based RCU implementation that uses
+ explicit rcu_read_lock_trace() read-side markers, and allows
+ these readers to appear in the idle loop as well as on the CPU
+ hotplug code paths. It can force IPIs on online CPUs, including
+ idle ones, so use with caution.
config RCU_STALL_COMMON
def_bool TREE_RCU
Say Y here if you want to help to debug reduced OS jitter.
Say N here if you are unsure.
+config TASKS_TRACE_RCU_READ_MB
+ bool "Tasks Trace RCU readers use memory barriers in user and idle"
+ depends on RCU_EXPERT
+ default PREEMPT_RT || NR_CPUS < 8
+ help
+ Use this option to further reduce the number of IPIs sent
+ to CPUs executing in userspace or idle during tasks trace
+ RCU grace periods. Given that a reasonable setting of
+ the rcupdate.rcu_task_ipi_delay kernel boot parameter
+ eliminates such IPIs for many workloads, proper setting
+ of this Kconfig option is important mostly for aggressive
+ real-time installations and for battery-powered devices,
+ hence the default chosen above.
+
+ Say Y here if you hate IPIs.
+ Say N here if you hate read-side memory barriers.
+ Take the default if you are unsure.
+
endmenu # "RCU Subsystem"
select TORTURE_TEST
select SRCU
select TASKS_RCU
+ select TASKS_RUDE_RCU
+ select TASKS_TRACE_RCU
default n
help
This option provides a kernel module that runs performance
select TORTURE_TEST
select SRCU
select TASKS_RCU
+ select TASKS_RUDE_RCU
+ select TASKS_TRACE_RCU
default n
help
This option provides a kernel module that runs torture tests
void rcu_expedite_gp(void);
void rcu_unexpedite_gp(void);
void rcupdate_announce_bootup_oddness(void);
+void show_rcu_tasks_gp_kthreads(void);
void rcu_request_urgent_qs_task(struct task_struct *t);
#endif /* #else #ifdef CONFIG_TINY_RCU */
enum rcutorture_type {
RCU_FLAVOR,
RCU_TASKS_FLAVOR,
+ RCU_TASKS_RUDE_FLAVOR,
+ RCU_TASKS_TRACING_FLAVOR,
RCU_TRIVIAL_FLAVOR,
SRCU_FLAVOR,
INVALID_RCU_FLAVOR
unsigned long secs,
unsigned long c_old,
unsigned long c);
+void rcu_gp_set_torture_wait(int duration);
#else
static inline void rcutorture_get_gp_data(enum rcutorture_type test_type,
int *flags, unsigned long *gp_seq)
#define do_trace_rcu_torture_read(rcutorturename, rhp, secs, c_old, c) \
do { } while (0)
#endif
+static inline void rcu_gp_set_torture_wait(int duration) { }
#endif
#if IS_ENABLED(CONFIG_RCU_TORTURE_TEST) || IS_MODULE(CONFIG_RCU_TORTURE_TEST)
#endif
#ifdef CONFIG_TINY_RCU
+static inline bool rcu_dynticks_zero_in_eqs(int cpu, int *vp) { return false; }
static inline unsigned long rcu_get_gp_seq(void) { return 0; }
static inline unsigned long rcu_exp_batches_completed(void) { return 0; }
static inline unsigned long
static inline int rcu_get_gp_kthreads_prio(void) { return 0; }
static inline void rcu_fwd_progress_check(unsigned long j) { }
#else /* #ifdef CONFIG_TINY_RCU */
+bool rcu_dynticks_zero_in_eqs(int cpu, int *vp);
unsigned long rcu_get_gp_seq(void);
unsigned long rcu_exp_batches_completed(void);
unsigned long srcu_batches_completed(struct srcu_struct *sp);
torture_param(int, verbose, 1, "Enable verbose debugging printk()s");
torture_param(int, writer_holdoff, 0, "Holdoff (us) between GPs, zero to disable");
torture_param(int, kfree_rcu_test, 0, "Do we run a kfree_rcu() perf test?");
+torture_param(int, kfree_mult, 1, "Multiple of kfree_obj size to allocate.");
static char *perf_type = "rcu";
module_param(perf_type, charp, 0444);
}
for (i = 0; i < kfree_alloc_num; i++) {
- alloc_ptr = kmalloc(sizeof(struct kfree_obj), GFP_KERNEL);
+ alloc_ptr = kmalloc(kfree_mult * sizeof(struct kfree_obj), GFP_KERNEL);
if (!alloc_ptr)
return -ENOMEM;
schedule_timeout_uninterruptible(1);
}
+ pr_alert("kfree object size=%lu\n", kfree_mult * sizeof(struct kfree_obj));
+
kfree_reader_tasks = kcalloc(kfree_nrealthreads, sizeof(kfree_reader_tasks[0]),
GFP_KERNEL);
if (kfree_reader_tasks == NULL) {
#include <linux/err.h>
#include <linux/spinlock.h>
#include <linux/smp.h>
-#include <linux/rcupdate.h>
+#include <linux/rcupdate_wait.h>
#include <linux/interrupt.h>
#include <linux/sched/signal.h>
#include <uapi/linux/sched/types.h>
#include <linux/sched/sysctl.h>
#include <linux/oom.h>
#include <linux/tick.h>
+#include <linux/rcupdate_trace.h>
#include "rcu.h"
MODULE_LICENSE("GPL");
MODULE_AUTHOR("Paul E. McKenney <paulmck@linux.ibm.com> and Josh Triplett <josh@joshtriplett.org>");
+#ifndef data_race
+#define data_race(expr) \
+ ({ \
+ expr; \
+ })
+#endif
+#ifndef ASSERT_EXCLUSIVE_WRITER
+#define ASSERT_EXCLUSIVE_WRITER(var) do { } while (0)
+#endif
+#ifndef ASSERT_EXCLUSIVE_ACCESS
+#define ASSERT_EXCLUSIVE_ACCESS(var) do { } while (0)
+#endif
/* Bits for ->extendables field, extendables param, and related definitions. */
#define RCUTORTURE_RDR_SHIFT 8 /* Put SRCU index in upper bits. */
torture_param(int, stall_cpu_holdoff, 10,
"Time to wait before starting stall (s).");
torture_param(int, stall_cpu_irqsoff, 0, "Disable interrupts while stalling.");
+torture_param(int, stall_cpu_block, 0, "Sleep while stalling.");
+torture_param(int, stall_gp_kthread, 0,
+ "Grace-period kthread stall duration (s).");
torture_param(int, stat_interval, 60,
"Number of seconds between stats printk()s");
torture_param(int, stutter, 5, "Number of seconds to run/halt test");
call_rcu_tasks(&p->rtort_rcu, rcu_torture_cb);
}
+static void synchronize_rcu_mult_test(void)
+{
+ synchronize_rcu_mult(call_rcu_tasks, call_rcu);
+}
+
static struct rcu_torture_ops tasks_ops = {
.ttype = RCU_TASKS_FLAVOR,
.init = rcu_sync_torture_init,
.get_gp_seq = rcu_no_completed,
.deferred_free = rcu_tasks_torture_deferred_free,
.sync = synchronize_rcu_tasks,
- .exp_sync = synchronize_rcu_tasks,
+ .exp_sync = synchronize_rcu_mult_test,
.call = call_rcu_tasks,
.cb_barrier = rcu_barrier_tasks,
.fqs = NULL,
.name = "trivial"
};
+/*
+ * Definitions for rude RCU-tasks torture testing.
+ */
+
+static void rcu_tasks_rude_torture_deferred_free(struct rcu_torture *p)
+{
+ call_rcu_tasks_rude(&p->rtort_rcu, rcu_torture_cb);
+}
+
+static struct rcu_torture_ops tasks_rude_ops = {
+ .ttype = RCU_TASKS_RUDE_FLAVOR,
+ .init = rcu_sync_torture_init,
+ .readlock = rcu_torture_read_lock_trivial,
+ .read_delay = rcu_read_delay, /* just reuse rcu's version. */
+ .readunlock = rcu_torture_read_unlock_trivial,
+ .get_gp_seq = rcu_no_completed,
+ .deferred_free = rcu_tasks_rude_torture_deferred_free,
+ .sync = synchronize_rcu_tasks_rude,
+ .exp_sync = synchronize_rcu_tasks_rude,
+ .call = call_rcu_tasks_rude,
+ .cb_barrier = rcu_barrier_tasks_rude,
+ .fqs = NULL,
+ .stats = NULL,
+ .irq_capable = 1,
+ .name = "tasks-rude"
+};
+
+/*
+ * Definitions for tracing RCU-tasks torture testing.
+ */
+
+static int tasks_tracing_torture_read_lock(void)
+{
+ rcu_read_lock_trace();
+ return 0;
+}
+
+static void tasks_tracing_torture_read_unlock(int idx)
+{
+ rcu_read_unlock_trace();
+}
+
+static void rcu_tasks_tracing_torture_deferred_free(struct rcu_torture *p)
+{
+ call_rcu_tasks_trace(&p->rtort_rcu, rcu_torture_cb);
+}
+
+static struct rcu_torture_ops tasks_tracing_ops = {
+ .ttype = RCU_TASKS_TRACING_FLAVOR,
+ .init = rcu_sync_torture_init,
+ .readlock = tasks_tracing_torture_read_lock,
+ .read_delay = srcu_read_delay, /* just reuse srcu's version. */
+ .readunlock = tasks_tracing_torture_read_unlock,
+ .get_gp_seq = rcu_no_completed,
+ .deferred_free = rcu_tasks_tracing_torture_deferred_free,
+ .sync = synchronize_rcu_tasks_trace,
+ .exp_sync = synchronize_rcu_tasks_trace,
+ .call = call_rcu_tasks_trace,
+ .cb_barrier = rcu_barrier_tasks_trace,
+ .fqs = NULL,
+ .stats = NULL,
+ .irq_capable = 1,
+ .slow_gps = 1,
+ .name = "tasks-tracing"
+};
+
static unsigned long rcutorture_seq_diff(unsigned long new, unsigned long old)
{
if (!cur_ops->gp_diff)
static bool __maybe_unused torturing_tasks(void)
{
- return cur_ops == &tasks_ops;
+ return cur_ops == &tasks_ops || cur_ops == &tasks_rude_ops;
}
/*
/* Wait for the next test interval. */
oldstarttime = boost_starttime;
- while (ULONG_CMP_LT(jiffies, oldstarttime)) {
+ while (time_before(jiffies, oldstarttime)) {
schedule_timeout_interruptible(oldstarttime - jiffies);
stutter_wait("rcu_torture_boost");
if (torture_must_stop())
/* Do one boost-test interval. */
endtime = oldstarttime + test_boost_duration * HZ;
call_rcu_time = jiffies;
- while (ULONG_CMP_LT(jiffies, endtime)) {
+ while (time_before(jiffies, endtime)) {
/* If we don't have a callback in flight, post one. */
if (!smp_load_acquire(&rbi.inflight)) {
/* RCU core before ->inflight = 1. */
VERBOSE_TOROUT_STRING("rcu_torture_fqs task started");
do {
fqs_resume_time = jiffies + fqs_stutter * HZ;
- while (ULONG_CMP_LT(jiffies, fqs_resume_time) &&
+ while (time_before(jiffies, fqs_resume_time) &&
!kthread_should_stop()) {
schedule_timeout_interruptible(1);
}
struct torture_random_state *trsp,
struct rt_read_seg *rtrsp)
{
+ unsigned long flags;
int idxnew = -1;
int idxold = *readstate;
int statesnew = ~*readstate & newstate;
rcu_read_unlock_bh();
if (statesold & RCUTORTURE_RDR_SCHED)
rcu_read_unlock_sched();
- if (statesold & RCUTORTURE_RDR_RCU)
+ if (statesold & RCUTORTURE_RDR_RCU) {
+ bool lockit = !statesnew && !(torture_random(trsp) & 0xffff);
+
+ if (lockit)
+ raw_spin_lock_irqsave(¤t->pi_lock, flags);
cur_ops->readunlock(idxold >> RCUTORTURE_RDR_SHIFT);
+ if (lockit)
+ raw_spin_unlock_irqrestore(¤t->pi_lock, flags);
+ }
/* Delay if neither beginning nor end and there was a change. */
if ((statesnew || statesold) && *readstate && newstate)
rcu_read_lock_bh_held() ||
rcu_read_lock_sched_held() ||
srcu_read_lock_held(srcu_ctlp) ||
+ rcu_read_lock_trace_held() ||
torturing_tasks());
if (p == NULL) {
/* Wait for rcu_torture_writer to get underway */
atomic_long_read(&n_rcu_torture_timers));
torture_onoff_stats();
pr_cont("barrier: %ld/%ld:%ld\n",
- n_barrier_successes,
- n_barrier_attempts,
- n_rcu_torture_barrier_error);
+ data_race(n_barrier_successes),
+ data_race(n_barrier_attempts),
+ data_race(n_rcu_torture_barrier_error));
pr_alert("%s%s ", torture_type, TORTURE_FLAG);
if (atomic_read(&n_rcu_torture_mberror) ||
"test_boost=%d/%d test_boost_interval=%d "
"test_boost_duration=%d shutdown_secs=%d "
"stall_cpu=%d stall_cpu_holdoff=%d stall_cpu_irqsoff=%d "
+ "stall_cpu_block=%d "
"n_barrier_cbs=%d "
"onoff_interval=%d onoff_holdoff=%d\n",
torture_type, tag, nrealreaders, nfakewriters,
test_boost, cur_ops->can_boost,
test_boost_interval, test_boost_duration, shutdown_secs,
stall_cpu, stall_cpu_holdoff, stall_cpu_irqsoff,
+ stall_cpu_block,
n_barrier_cbs,
onoff_interval, onoff_holdoff);
}
*/
static int rcu_torture_stall(void *args)
{
+ int idx;
unsigned long stop_at;
VERBOSE_TOROUT_STRING("rcu_torture_stall task started");
schedule_timeout_interruptible(stall_cpu_holdoff * HZ);
VERBOSE_TOROUT_STRING("rcu_torture_stall end holdoff");
}
- if (!kthread_should_stop()) {
+ if (!kthread_should_stop() && stall_gp_kthread > 0) {
+ VERBOSE_TOROUT_STRING("rcu_torture_stall begin GP stall");
+ rcu_gp_set_torture_wait(stall_gp_kthread * HZ);
+ for (idx = 0; idx < stall_gp_kthread + 2; idx++) {
+ if (kthread_should_stop())
+ break;
+ schedule_timeout_uninterruptible(HZ);
+ }
+ }
+ if (!kthread_should_stop() && stall_cpu > 0) {
+ VERBOSE_TOROUT_STRING("rcu_torture_stall begin CPU stall");
stop_at = ktime_get_seconds() + stall_cpu;
/* RCU CPU stall is expected behavior in following code. */
- rcu_read_lock();
+ idx = cur_ops->readlock();
if (stall_cpu_irqsoff)
local_irq_disable();
- else
+ else if (!stall_cpu_block)
preempt_disable();
pr_alert("rcu_torture_stall start on CPU %d.\n",
- smp_processor_id());
+ raw_smp_processor_id());
while (ULONG_CMP_LT((unsigned long)ktime_get_seconds(),
stop_at))
- continue; /* Induce RCU CPU stall warning. */
+ if (stall_cpu_block)
+ schedule_timeout_uninterruptible(HZ);
if (stall_cpu_irqsoff)
local_irq_enable();
- else
+ else if (!stall_cpu_block)
preempt_enable();
- rcu_read_unlock();
- pr_alert("rcu_torture_stall end.\n");
+ cur_ops->readunlock(idx);
}
+ pr_alert("rcu_torture_stall end.\n");
torture_shutdown_absorb("rcu_torture_stall");
while (!kthread_should_stop())
schedule_timeout_interruptible(10 * HZ);
/* Spawn CPU-stall kthread, if stall_cpu specified. */
static int __init rcu_torture_stall_init(void)
{
- if (stall_cpu <= 0)
+ if (stall_cpu <= 0 && stall_gp_kthread <= 0)
return 0;
return torture_create_kthread(rcu_torture_stall, NULL, stall_task);
}
unsigned long rcu_launder_gp_seq_start;
};
-struct rcu_fwd *rcu_fwds;
-bool rcu_fwd_emergency_stop;
+static struct rcu_fwd *rcu_fwds;
+static bool rcu_fwd_emergency_stop;
static void rcu_torture_fwd_cb_hist(struct rcu_fwd *rfp)
{
int firsterr = 0;
static struct rcu_torture_ops *torture_ops[] = {
&rcu_ops, &rcu_busted_ops, &srcu_ops, &srcud_ops,
- &busted_srcud_ops, &tasks_ops, &trivial_ops,
+ &busted_srcud_ops, &tasks_ops, &tasks_rude_ops,
+ &tasks_tracing_ops, &trivial_ops,
};
if (!torture_init_begin(torture_type, verbose))
#include "rcu.h"
#include "rcu_segcblist.h"
+#ifndef data_race
+#define data_race(expr) \
+ ({ \
+ expr; \
+ })
+#endif
+#ifndef ASSERT_EXCLUSIVE_WRITER
+#define ASSERT_EXCLUSIVE_WRITER(var) do { } while (0)
+#endif
+#ifndef ASSERT_EXCLUSIVE_ACCESS
+#define ASSERT_EXCLUSIVE_ACCESS(var) do { } while (0)
+#endif
+
/* Holdoff in nanoseconds for auto-expediting. */
#define DEFAULT_SRCU_EXP_HOLDOFF (25 * 1000)
static ulong exp_holdoff = DEFAULT_SRCU_EXP_HOLDOFF;
struct srcu_data *sdp;
sdp = per_cpu_ptr(ssp->sda, cpu);
- u0 = sdp->srcu_unlock_count[!idx];
- u1 = sdp->srcu_unlock_count[idx];
+ u0 = data_race(sdp->srcu_unlock_count[!idx]);
+ u1 = data_race(sdp->srcu_unlock_count[idx]);
/*
* Make sure that a lock is always counted if the corresponding
*/
smp_rmb();
- l0 = sdp->srcu_lock_count[!idx];
- l1 = sdp->srcu_lock_count[idx];
+ l0 = data_race(sdp->srcu_lock_count[!idx]);
+ l1 = data_race(sdp->srcu_lock_count[idx]);
c0 = l0 - u0;
c1 = l1 - u1;
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0+ */
+/*
+ * Task-based RCU implementations.
+ *
+ * Copyright (C) 2020 Paul E. McKenney
+ */
+
+#ifdef CONFIG_TASKS_RCU_GENERIC
+
+////////////////////////////////////////////////////////////////////////
+//
+// Generic data structures.
+
+struct rcu_tasks;
+typedef void (*rcu_tasks_gp_func_t)(struct rcu_tasks *rtp);
+typedef void (*pregp_func_t)(void);
+typedef void (*pertask_func_t)(struct task_struct *t, struct list_head *hop);
+typedef void (*postscan_func_t)(struct list_head *hop);
+typedef void (*holdouts_func_t)(struct list_head *hop, bool ndrpt, bool *frptp);
+typedef void (*postgp_func_t)(struct rcu_tasks *rtp);
+
+/**
+ * Definition for a Tasks-RCU-like mechanism.
+ * @cbs_head: Head of callback list.
+ * @cbs_tail: Tail pointer for callback list.
+ * @cbs_wq: Wait queue allowning new callback to get kthread's attention.
+ * @cbs_lock: Lock protecting callback list.
+ * @kthread_ptr: This flavor's grace-period/callback-invocation kthread.
+ * @gp_func: This flavor's grace-period-wait function.
+ * @gp_state: Grace period's most recent state transition (debugging).
+ * @gp_jiffies: Time of last @gp_state transition.
+ * @gp_start: Most recent grace-period start in jiffies.
+ * @n_gps: Number of grace periods completed since boot.
+ * @n_ipis: Number of IPIs sent to encourage grace periods to end.
+ * @n_ipis_fails: Number of IPI-send failures.
+ * @pregp_func: This flavor's pre-grace-period function (optional).
+ * @pertask_func: This flavor's per-task scan function (optional).
+ * @postscan_func: This flavor's post-task scan function (optional).
+ * @holdout_func: This flavor's holdout-list scan function (optional).
+ * @postgp_func: This flavor's post-grace-period function (optional).
+ * @call_func: This flavor's call_rcu()-equivalent function.
+ * @name: This flavor's textual name.
+ * @kname: This flavor's kthread name.
+ */
+struct rcu_tasks {
+ struct rcu_head *cbs_head;
+ struct rcu_head **cbs_tail;
+ struct wait_queue_head cbs_wq;
+ raw_spinlock_t cbs_lock;
+ int gp_state;
+ unsigned long gp_jiffies;
+ unsigned long gp_start;
+ unsigned long n_gps;
+ unsigned long n_ipis;
+ unsigned long n_ipis_fails;
+ struct task_struct *kthread_ptr;
+ rcu_tasks_gp_func_t gp_func;
+ pregp_func_t pregp_func;
+ pertask_func_t pertask_func;
+ postscan_func_t postscan_func;
+ holdouts_func_t holdouts_func;
+ postgp_func_t postgp_func;
+ call_rcu_func_t call_func;
+ char *name;
+ char *kname;
+};
+
+#define DEFINE_RCU_TASKS(rt_name, gp, call, n) \
+static struct rcu_tasks rt_name = \
+{ \
+ .cbs_tail = &rt_name.cbs_head, \
+ .cbs_wq = __WAIT_QUEUE_HEAD_INITIALIZER(rt_name.cbs_wq), \
+ .cbs_lock = __RAW_SPIN_LOCK_UNLOCKED(rt_name.cbs_lock), \
+ .gp_func = gp, \
+ .call_func = call, \
+ .name = n, \
+ .kname = #rt_name, \
+}
+
+/* Track exiting tasks in order to allow them to be waited for. */
+DEFINE_STATIC_SRCU(tasks_rcu_exit_srcu);
+
+/* Avoid IPIing CPUs early in the grace period. */
+#define RCU_TASK_IPI_DELAY (HZ / 2)
+static int rcu_task_ipi_delay __read_mostly = RCU_TASK_IPI_DELAY;
+module_param(rcu_task_ipi_delay, int, 0644);
+
+/* Control stall timeouts. Disable with <= 0, otherwise jiffies till stall. */
+#define RCU_TASK_STALL_TIMEOUT (HZ * 60 * 10)
+static int rcu_task_stall_timeout __read_mostly = RCU_TASK_STALL_TIMEOUT;
+module_param(rcu_task_stall_timeout, int, 0644);
+
+/* RCU tasks grace-period state for debugging. */
+#define RTGS_INIT 0
+#define RTGS_WAIT_WAIT_CBS 1
+#define RTGS_WAIT_GP 2
+#define RTGS_PRE_WAIT_GP 3
+#define RTGS_SCAN_TASKLIST 4
+#define RTGS_POST_SCAN_TASKLIST 5
+#define RTGS_WAIT_SCAN_HOLDOUTS 6
+#define RTGS_SCAN_HOLDOUTS 7
+#define RTGS_POST_GP 8
+#define RTGS_WAIT_READERS 9
+#define RTGS_INVOKE_CBS 10
+#define RTGS_WAIT_CBS 11
+static const char * const rcu_tasks_gp_state_names[] = {
+ "RTGS_INIT",
+ "RTGS_WAIT_WAIT_CBS",
+ "RTGS_WAIT_GP",
+ "RTGS_PRE_WAIT_GP",
+ "RTGS_SCAN_TASKLIST",
+ "RTGS_POST_SCAN_TASKLIST",
+ "RTGS_WAIT_SCAN_HOLDOUTS",
+ "RTGS_SCAN_HOLDOUTS",
+ "RTGS_POST_GP",
+ "RTGS_WAIT_READERS",
+ "RTGS_INVOKE_CBS",
+ "RTGS_WAIT_CBS",
+};
+
+////////////////////////////////////////////////////////////////////////
+//
+// Generic code.
+
+/* Record grace-period phase and time. */
+static void set_tasks_gp_state(struct rcu_tasks *rtp, int newstate)
+{
+ rtp->gp_state = newstate;
+ rtp->gp_jiffies = jiffies;
+}
+
+/* Return state name. */
+static const char *tasks_gp_state_getname(struct rcu_tasks *rtp)
+{
+ int i = data_race(rtp->gp_state); // Let KCSAN detect update races
+ int j = READ_ONCE(i); // Prevent the compiler from reading twice
+
+ if (j >= ARRAY_SIZE(rcu_tasks_gp_state_names))
+ return "???";
+ return rcu_tasks_gp_state_names[j];
+}
+
+// Enqueue a callback for the specified flavor of Tasks RCU.
+static void call_rcu_tasks_generic(struct rcu_head *rhp, rcu_callback_t func,
+ struct rcu_tasks *rtp)
+{
+ unsigned long flags;
+ bool needwake;
+
+ rhp->next = NULL;
+ rhp->func = func;
+ raw_spin_lock_irqsave(&rtp->cbs_lock, flags);
+ needwake = !rtp->cbs_head;
+ WRITE_ONCE(*rtp->cbs_tail, rhp);
+ rtp->cbs_tail = &rhp->next;
+ raw_spin_unlock_irqrestore(&rtp->cbs_lock, flags);
+ /* We can't create the thread unless interrupts are enabled. */
+ if (needwake && READ_ONCE(rtp->kthread_ptr))
+ wake_up(&rtp->cbs_wq);
+}
+
+// Wait for a grace period for the specified flavor of Tasks RCU.
+static void synchronize_rcu_tasks_generic(struct rcu_tasks *rtp)
+{
+ /* Complain if the scheduler has not started. */
+ RCU_LOCKDEP_WARN(rcu_scheduler_active == RCU_SCHEDULER_INACTIVE,
+ "synchronize_rcu_tasks called too soon");
+
+ /* Wait for the grace period. */
+ wait_rcu_gp(rtp->call_func);
+}
+
+/* RCU-tasks kthread that detects grace periods and invokes callbacks. */
+static int __noreturn rcu_tasks_kthread(void *arg)
+{
+ unsigned long flags;
+ struct rcu_head *list;
+ struct rcu_head *next;
+ struct rcu_tasks *rtp = arg;
+
+ /* Run on housekeeping CPUs by default. Sysadm can move if desired. */
+ housekeeping_affine(current, HK_FLAG_RCU);
+ WRITE_ONCE(rtp->kthread_ptr, current); // Let GPs start!
+
+ /*
+ * Each pass through the following loop makes one check for
+ * newly arrived callbacks, and, if there are some, waits for
+ * one RCU-tasks grace period and then invokes the callbacks.
+ * This loop is terminated by the system going down. ;-)
+ */
+ for (;;) {
+
+ /* Pick up any new callbacks. */
+ raw_spin_lock_irqsave(&rtp->cbs_lock, flags);
+ smp_mb__after_spinlock(); // Order updates vs. GP.
+ list = rtp->cbs_head;
+ rtp->cbs_head = NULL;
+ rtp->cbs_tail = &rtp->cbs_head;
+ raw_spin_unlock_irqrestore(&rtp->cbs_lock, flags);
+
+ /* If there were none, wait a bit and start over. */
+ if (!list) {
+ wait_event_interruptible(rtp->cbs_wq,
+ READ_ONCE(rtp->cbs_head));
+ if (!rtp->cbs_head) {
+ WARN_ON(signal_pending(current));
+ set_tasks_gp_state(rtp, RTGS_WAIT_WAIT_CBS);
+ schedule_timeout_interruptible(HZ/10);
+ }
+ continue;
+ }
+
+ // Wait for one grace period.
+ set_tasks_gp_state(rtp, RTGS_WAIT_GP);
+ rtp->gp_start = jiffies;
+ rtp->gp_func(rtp);
+ rtp->n_gps++;
+
+ /* Invoke the callbacks. */
+ set_tasks_gp_state(rtp, RTGS_INVOKE_CBS);
+ while (list) {
+ next = list->next;
+ local_bh_disable();
+ list->func(list);
+ local_bh_enable();
+ list = next;
+ cond_resched();
+ }
+ /* Paranoid sleep to keep this from entering a tight loop */
+ schedule_timeout_uninterruptible(HZ/10);
+
+ set_tasks_gp_state(rtp, RTGS_WAIT_CBS);
+ }
+}
+
+/* Spawn RCU-tasks grace-period kthread, e.g., at core_initcall() time. */
+static void __init rcu_spawn_tasks_kthread_generic(struct rcu_tasks *rtp)
+{
+ struct task_struct *t;
+
+ t = kthread_run(rcu_tasks_kthread, rtp, "%s_kthread", rtp->kname);
+ if (WARN_ONCE(IS_ERR(t), "%s: Could not start %s grace-period kthread, OOM is now expected behavior\n", __func__, rtp->name))
+ return;
+ smp_mb(); /* Ensure others see full kthread. */
+}
+
+#ifndef CONFIG_TINY_RCU
+
+/*
+ * Print any non-default Tasks RCU settings.
+ */
+static void __init rcu_tasks_bootup_oddness(void)
+{
+#if defined(CONFIG_TASKS_RCU) || defined(CONFIG_TASKS_TRACE_RCU)
+ if (rcu_task_stall_timeout != RCU_TASK_STALL_TIMEOUT)
+ pr_info("\tTasks-RCU CPU stall warnings timeout set to %d (rcu_task_stall_timeout).\n", rcu_task_stall_timeout);
+#endif /* #ifdef CONFIG_TASKS_RCU */
+#ifdef CONFIG_TASKS_RCU
+ pr_info("\tTrampoline variant of Tasks RCU enabled.\n");
+#endif /* #ifdef CONFIG_TASKS_RCU */
+#ifdef CONFIG_TASKS_RUDE_RCU
+ pr_info("\tRude variant of Tasks RCU enabled.\n");
+#endif /* #ifdef CONFIG_TASKS_RUDE_RCU */
+#ifdef CONFIG_TASKS_TRACE_RCU
+ pr_info("\tTracing variant of Tasks RCU enabled.\n");
+#endif /* #ifdef CONFIG_TASKS_TRACE_RCU */
+}
+
+#endif /* #ifndef CONFIG_TINY_RCU */
+
+/* Dump out rcutorture-relevant state common to all RCU-tasks flavors. */
+static void show_rcu_tasks_generic_gp_kthread(struct rcu_tasks *rtp, char *s)
+{
+ pr_info("%s: %s(%d) since %lu g:%lu i:%lu/%lu %c%c %s\n",
+ rtp->kname,
+ tasks_gp_state_getname(rtp), data_race(rtp->gp_state),
+ jiffies - data_race(rtp->gp_jiffies),
+ data_race(rtp->n_gps),
+ data_race(rtp->n_ipis_fails), data_race(rtp->n_ipis),
+ ".k"[!!data_race(rtp->kthread_ptr)],
+ ".C"[!!data_race(rtp->cbs_head)],
+ s);
+}
+
+static void exit_tasks_rcu_finish_trace(struct task_struct *t);
+
+#if defined(CONFIG_TASKS_RCU) || defined(CONFIG_TASKS_TRACE_RCU)
+
+////////////////////////////////////////////////////////////////////////
+//
+// Shared code between task-list-scanning variants of Tasks RCU.
+
+/* Wait for one RCU-tasks grace period. */
+static void rcu_tasks_wait_gp(struct rcu_tasks *rtp)
+{
+ struct task_struct *g, *t;
+ unsigned long lastreport;
+ LIST_HEAD(holdouts);
+ int fract;
+
+ set_tasks_gp_state(rtp, RTGS_PRE_WAIT_GP);
+ rtp->pregp_func();
+
+ /*
+ * There were callbacks, so we need to wait for an RCU-tasks
+ * grace period. Start off by scanning the task list for tasks
+ * that are not already voluntarily blocked. Mark these tasks
+ * and make a list of them in holdouts.
+ */
+ set_tasks_gp_state(rtp, RTGS_SCAN_TASKLIST);
+ rcu_read_lock();
+ for_each_process_thread(g, t)
+ rtp->pertask_func(t, &holdouts);
+ rcu_read_unlock();
+
+ set_tasks_gp_state(rtp, RTGS_POST_SCAN_TASKLIST);
+ rtp->postscan_func(&holdouts);
+
+ /*
+ * Each pass through the following loop scans the list of holdout
+ * tasks, removing any that are no longer holdouts. When the list
+ * is empty, we are done.
+ */
+ lastreport = jiffies;
+
+ /* Start off with HZ/10 wait and slowly back off to 1 HZ wait. */
+ fract = 10;
+
+ for (;;) {
+ bool firstreport;
+ bool needreport;
+ int rtst;
+
+ if (list_empty(&holdouts))
+ break;
+
+ /* Slowly back off waiting for holdouts */
+ set_tasks_gp_state(rtp, RTGS_WAIT_SCAN_HOLDOUTS);
+ schedule_timeout_interruptible(HZ/fract);
+
+ if (fract > 1)
+ fract--;
+
+ rtst = READ_ONCE(rcu_task_stall_timeout);
+ needreport = rtst > 0 && time_after(jiffies, lastreport + rtst);
+ if (needreport)
+ lastreport = jiffies;
+ firstreport = true;
+ WARN_ON(signal_pending(current));
+ set_tasks_gp_state(rtp, RTGS_SCAN_HOLDOUTS);
+ rtp->holdouts_func(&holdouts, needreport, &firstreport);
+ }
+
+ set_tasks_gp_state(rtp, RTGS_POST_GP);
+ rtp->postgp_func(rtp);
+}
+
+#endif /* #if defined(CONFIG_TASKS_RCU) || defined(CONFIG_TASKS_TRACE_RCU) */
+
+#ifdef CONFIG_TASKS_RCU
+
+////////////////////////////////////////////////////////////////////////
+//
+// Simple variant of RCU whose quiescent states are voluntary context
+// switch, cond_resched_rcu_qs(), user-space execution, and idle.
+// As such, grace periods can take one good long time. There are no
+// read-side primitives similar to rcu_read_lock() and rcu_read_unlock()
+// because this implementation is intended to get the system into a safe
+// state for some of the manipulations involved in tracing and the like.
+// Finally, this implementation does not support high call_rcu_tasks()
+// rates from multiple CPUs. If this is required, per-CPU callback lists
+// will be needed.
+
+/* Pre-grace-period preparation. */
+static void rcu_tasks_pregp_step(void)
+{
+ /*
+ * Wait for all pre-existing t->on_rq and t->nvcsw transitions
+ * to complete. Invoking synchronize_rcu() suffices because all
+ * these transitions occur with interrupts disabled. Without this
+ * synchronize_rcu(), a read-side critical section that started
+ * before the grace period might be incorrectly seen as having
+ * started after the grace period.
+ *
+ * This synchronize_rcu() also dispenses with the need for a
+ * memory barrier on the first store to t->rcu_tasks_holdout,
+ * as it forces the store to happen after the beginning of the
+ * grace period.
+ */
+ synchronize_rcu();
+}
+
+/* Per-task initial processing. */
+static void rcu_tasks_pertask(struct task_struct *t, struct list_head *hop)
+{
+ if (t != current && READ_ONCE(t->on_rq) && !is_idle_task(t)) {
+ get_task_struct(t);
+ t->rcu_tasks_nvcsw = READ_ONCE(t->nvcsw);
+ WRITE_ONCE(t->rcu_tasks_holdout, true);
+ list_add(&t->rcu_tasks_holdout_list, hop);
+ }
+}
+
+/* Processing between scanning taskslist and draining the holdout list. */
+void rcu_tasks_postscan(struct list_head *hop)
+{
+ /*
+ * Wait for tasks that are in the process of exiting. This
+ * does only part of the job, ensuring that all tasks that were
+ * previously exiting reach the point where they have disabled
+ * preemption, allowing the later synchronize_rcu() to finish
+ * the job.
+ */
+ synchronize_srcu(&tasks_rcu_exit_srcu);
+}
+
+/* See if tasks are still holding out, complain if so. */
+static void check_holdout_task(struct task_struct *t,
+ bool needreport, bool *firstreport)
+{
+ int cpu;
+
+ if (!READ_ONCE(t->rcu_tasks_holdout) ||
+ t->rcu_tasks_nvcsw != READ_ONCE(t->nvcsw) ||
+ !READ_ONCE(t->on_rq) ||
+ (IS_ENABLED(CONFIG_NO_HZ_FULL) &&
+ !is_idle_task(t) && t->rcu_tasks_idle_cpu >= 0)) {
+ WRITE_ONCE(t->rcu_tasks_holdout, false);
+ list_del_init(&t->rcu_tasks_holdout_list);
+ put_task_struct(t);
+ return;
+ }
+ rcu_request_urgent_qs_task(t);
+ if (!needreport)
+ return;
+ if (*firstreport) {
+ pr_err("INFO: rcu_tasks detected stalls on tasks:\n");
+ *firstreport = false;
+ }
+ cpu = task_cpu(t);
+ pr_alert("%p: %c%c nvcsw: %lu/%lu holdout: %d idle_cpu: %d/%d\n",
+ t, ".I"[is_idle_task(t)],
+ "N."[cpu < 0 || !tick_nohz_full_cpu(cpu)],
+ t->rcu_tasks_nvcsw, t->nvcsw, t->rcu_tasks_holdout,
+ t->rcu_tasks_idle_cpu, cpu);
+ sched_show_task(t);
+}
+
+/* Scan the holdout lists for tasks no longer holding out. */
+static void check_all_holdout_tasks(struct list_head *hop,
+ bool needreport, bool *firstreport)
+{
+ struct task_struct *t, *t1;
+
+ list_for_each_entry_safe(t, t1, hop, rcu_tasks_holdout_list) {
+ check_holdout_task(t, needreport, firstreport);
+ cond_resched();
+ }
+}
+
+/* Finish off the Tasks-RCU grace period. */
+static void rcu_tasks_postgp(struct rcu_tasks *rtp)
+{
+ /*
+ * Because ->on_rq and ->nvcsw are not guaranteed to have a full
+ * memory barriers prior to them in the schedule() path, memory
+ * reordering on other CPUs could cause their RCU-tasks read-side
+ * critical sections to extend past the end of the grace period.
+ * However, because these ->nvcsw updates are carried out with
+ * interrupts disabled, we can use synchronize_rcu() to force the
+ * needed ordering on all such CPUs.
+ *
+ * This synchronize_rcu() also confines all ->rcu_tasks_holdout
+ * accesses to be within the grace period, avoiding the need for
+ * memory barriers for ->rcu_tasks_holdout accesses.
+ *
+ * In addition, this synchronize_rcu() waits for exiting tasks
+ * to complete their final preempt_disable() region of execution,
+ * cleaning up after the synchronize_srcu() above.
+ */
+ synchronize_rcu();
+}
+
+void call_rcu_tasks(struct rcu_head *rhp, rcu_callback_t func);
+DEFINE_RCU_TASKS(rcu_tasks, rcu_tasks_wait_gp, call_rcu_tasks, "RCU Tasks");
+
+/**
+ * call_rcu_tasks() - Queue an RCU for invocation task-based grace period
+ * @rhp: structure to be used for queueing the RCU updates.
+ * @func: actual callback function to be invoked after the grace period
+ *
+ * The callback function will be invoked some time after a full grace
+ * period elapses, in other words after all currently executing RCU
+ * read-side critical sections have completed. call_rcu_tasks() assumes
+ * that the read-side critical sections end at a voluntary context
+ * switch (not a preemption!), cond_resched_rcu_qs(), entry into idle,
+ * or transition to usermode execution. As such, there are no read-side
+ * primitives analogous to rcu_read_lock() and rcu_read_unlock() because
+ * this primitive is intended to determine that all tasks have passed
+ * through a safe state, not so much for data-strcuture synchronization.
+ *
+ * See the description of call_rcu() for more detailed information on
+ * memory ordering guarantees.
+ */
+void call_rcu_tasks(struct rcu_head *rhp, rcu_callback_t func)
+{
+ call_rcu_tasks_generic(rhp, func, &rcu_tasks);
+}
+EXPORT_SYMBOL_GPL(call_rcu_tasks);
+
+/**
+ * synchronize_rcu_tasks - wait until an rcu-tasks grace period has elapsed.
+ *
+ * Control will return to the caller some time after a full rcu-tasks
+ * grace period has elapsed, in other words after all currently
+ * executing rcu-tasks read-side critical sections have elapsed. These
+ * read-side critical sections are delimited by calls to schedule(),
+ * cond_resched_tasks_rcu_qs(), idle execution, userspace execution, calls
+ * to synchronize_rcu_tasks(), and (in theory, anyway) cond_resched().
+ *
+ * This is a very specialized primitive, intended only for a few uses in
+ * tracing and other situations requiring manipulation of function
+ * preambles and profiling hooks. The synchronize_rcu_tasks() function
+ * is not (yet) intended for heavy use from multiple CPUs.
+ *
+ * See the description of synchronize_rcu() for more detailed information
+ * on memory ordering guarantees.
+ */
+void synchronize_rcu_tasks(void)
+{
+ synchronize_rcu_tasks_generic(&rcu_tasks);
+}
+EXPORT_SYMBOL_GPL(synchronize_rcu_tasks);
+
+/**
+ * rcu_barrier_tasks - Wait for in-flight call_rcu_tasks() callbacks.
+ *
+ * Although the current implementation is guaranteed to wait, it is not
+ * obligated to, for example, if there are no pending callbacks.
+ */
+void rcu_barrier_tasks(void)
+{
+ /* There is only one callback queue, so this is easy. ;-) */
+ synchronize_rcu_tasks();
+}
+EXPORT_SYMBOL_GPL(rcu_barrier_tasks);
+
+static int __init rcu_spawn_tasks_kthread(void)
+{
+ rcu_tasks.pregp_func = rcu_tasks_pregp_step;
+ rcu_tasks.pertask_func = rcu_tasks_pertask;
+ rcu_tasks.postscan_func = rcu_tasks_postscan;
+ rcu_tasks.holdouts_func = check_all_holdout_tasks;
+ rcu_tasks.postgp_func = rcu_tasks_postgp;
+ rcu_spawn_tasks_kthread_generic(&rcu_tasks);
+ return 0;
+}
+core_initcall(rcu_spawn_tasks_kthread);
+
+static void show_rcu_tasks_classic_gp_kthread(void)
+{
+ show_rcu_tasks_generic_gp_kthread(&rcu_tasks, "");
+}
+
+/* Do the srcu_read_lock() for the above synchronize_srcu(). */
+void exit_tasks_rcu_start(void) __acquires(&tasks_rcu_exit_srcu)
+{
+ preempt_disable();
+ current->rcu_tasks_idx = __srcu_read_lock(&tasks_rcu_exit_srcu);
+ preempt_enable();
+}
+
+/* Do the srcu_read_unlock() for the above synchronize_srcu(). */
+void exit_tasks_rcu_finish(void) __releases(&tasks_rcu_exit_srcu)
+{
+ struct task_struct *t = current;
+
+ preempt_disable();
+ __srcu_read_unlock(&tasks_rcu_exit_srcu, t->rcu_tasks_idx);
+ preempt_enable();
+ exit_tasks_rcu_finish_trace(t);
+}
+
+#else /* #ifdef CONFIG_TASKS_RCU */
+static void show_rcu_tasks_classic_gp_kthread(void) { }
+void exit_tasks_rcu_start(void) { }
+void exit_tasks_rcu_finish(void) { exit_tasks_rcu_finish_trace(current); }
+#endif /* #else #ifdef CONFIG_TASKS_RCU */
+
+#ifdef CONFIG_TASKS_RUDE_RCU
+
+////////////////////////////////////////////////////////////////////////
+//
+// "Rude" variant of Tasks RCU, inspired by Steve Rostedt's trick of
+// passing an empty function to schedule_on_each_cpu(). This approach
+// provides an asynchronous call_rcu_tasks_rude() API and batching
+// of concurrent calls to the synchronous synchronize_rcu_rude() API.
+// This sends IPIs far and wide and induces otherwise unnecessary context
+// switches on all online CPUs, whether idle or not.
+
+// Empty function to allow workqueues to force a context switch.
+static void rcu_tasks_be_rude(struct work_struct *work)
+{
+}
+
+// Wait for one rude RCU-tasks grace period.
+static void rcu_tasks_rude_wait_gp(struct rcu_tasks *rtp)
+{
+ rtp->n_ipis += cpumask_weight(cpu_online_mask);
+ schedule_on_each_cpu(rcu_tasks_be_rude);
+}
+
+void call_rcu_tasks_rude(struct rcu_head *rhp, rcu_callback_t func);
+DEFINE_RCU_TASKS(rcu_tasks_rude, rcu_tasks_rude_wait_gp, call_rcu_tasks_rude,
+ "RCU Tasks Rude");
+
+/**
+ * call_rcu_tasks_rude() - Queue a callback rude task-based grace period
+ * @rhp: structure to be used for queueing the RCU updates.
+ * @func: actual callback function to be invoked after the grace period
+ *
+ * The callback function will be invoked some time after a full grace
+ * period elapses, in other words after all currently executing RCU
+ * read-side critical sections have completed. call_rcu_tasks_rude()
+ * assumes that the read-side critical sections end at context switch,
+ * cond_resched_rcu_qs(), or transition to usermode execution. As such,
+ * there are no read-side primitives analogous to rcu_read_lock() and
+ * rcu_read_unlock() because this primitive is intended to determine
+ * that all tasks have passed through a safe state, not so much for
+ * data-strcuture synchronization.
+ *
+ * See the description of call_rcu() for more detailed information on
+ * memory ordering guarantees.
+ */
+void call_rcu_tasks_rude(struct rcu_head *rhp, rcu_callback_t func)
+{
+ call_rcu_tasks_generic(rhp, func, &rcu_tasks_rude);
+}
+EXPORT_SYMBOL_GPL(call_rcu_tasks_rude);
+
+/**
+ * synchronize_rcu_tasks_rude - wait for a rude rcu-tasks grace period
+ *
+ * Control will return to the caller some time after a rude rcu-tasks
+ * grace period has elapsed, in other words after all currently
+ * executing rcu-tasks read-side critical sections have elapsed. These
+ * read-side critical sections are delimited by calls to schedule(),
+ * cond_resched_tasks_rcu_qs(), userspace execution, and (in theory,
+ * anyway) cond_resched().
+ *
+ * This is a very specialized primitive, intended only for a few uses in
+ * tracing and other situations requiring manipulation of function preambles
+ * and profiling hooks. The synchronize_rcu_tasks_rude() function is not
+ * (yet) intended for heavy use from multiple CPUs.
+ *
+ * See the description of synchronize_rcu() for more detailed information
+ * on memory ordering guarantees.
+ */
+void synchronize_rcu_tasks_rude(void)
+{
+ synchronize_rcu_tasks_generic(&rcu_tasks_rude);
+}
+EXPORT_SYMBOL_GPL(synchronize_rcu_tasks_rude);
+
+/**
+ * rcu_barrier_tasks_rude - Wait for in-flight call_rcu_tasks_rude() callbacks.
+ *
+ * Although the current implementation is guaranteed to wait, it is not
+ * obligated to, for example, if there are no pending callbacks.
+ */
+void rcu_barrier_tasks_rude(void)
+{
+ /* There is only one callback queue, so this is easy. ;-) */
+ synchronize_rcu_tasks_rude();
+}
+EXPORT_SYMBOL_GPL(rcu_barrier_tasks_rude);
+
+static int __init rcu_spawn_tasks_rude_kthread(void)
+{
+ rcu_spawn_tasks_kthread_generic(&rcu_tasks_rude);
+ return 0;
+}
+core_initcall(rcu_spawn_tasks_rude_kthread);
+
+static void show_rcu_tasks_rude_gp_kthread(void)
+{
+ show_rcu_tasks_generic_gp_kthread(&rcu_tasks_rude, "");
+}
+
+#else /* #ifdef CONFIG_TASKS_RUDE_RCU */
+static void show_rcu_tasks_rude_gp_kthread(void) {}
+#endif /* #else #ifdef CONFIG_TASKS_RUDE_RCU */
+
+////////////////////////////////////////////////////////////////////////
+//
+// Tracing variant of Tasks RCU. This variant is designed to be used
+// to protect tracing hooks, including those of BPF. This variant
+// therefore:
+//
+// 1. Has explicit read-side markers to allow finite grace periods
+// in the face of in-kernel loops for PREEMPT=n builds.
+//
+// 2. Protects code in the idle loop, exception entry/exit, and
+// CPU-hotplug code paths, similar to the capabilities of SRCU.
+//
+// 3. Avoids expensive read-side instruction, having overhead similar
+// to that of Preemptible RCU.
+//
+// There are of course downsides. The grace-period code can send IPIs to
+// CPUs, even when those CPUs are in the idle loop or in nohz_full userspace.
+// It is necessary to scan the full tasklist, much as for Tasks RCU. There
+// is a single callback queue guarded by a single lock, again, much as for
+// Tasks RCU. If needed, these downsides can be at least partially remedied.
+//
+// Perhaps most important, this variant of RCU does not affect the vanilla
+// flavors, rcu_preempt and rcu_sched. The fact that RCU Tasks Trace
+// readers can operate from idle, offline, and exception entry/exit in no
+// way allows rcu_preempt and rcu_sched readers to also do so.
+
+// The lockdep state must be outside of #ifdef to be useful.
+#ifdef CONFIG_DEBUG_LOCK_ALLOC
+static struct lock_class_key rcu_lock_trace_key;
+struct lockdep_map rcu_trace_lock_map =
+ STATIC_LOCKDEP_MAP_INIT("rcu_read_lock_trace", &rcu_lock_trace_key);
+EXPORT_SYMBOL_GPL(rcu_trace_lock_map);
+#endif /* #ifdef CONFIG_DEBUG_LOCK_ALLOC */
+
+#ifdef CONFIG_TASKS_TRACE_RCU
+
+atomic_t trc_n_readers_need_end; // Number of waited-for readers.
+DECLARE_WAIT_QUEUE_HEAD(trc_wait); // List of holdout tasks.
+
+// Record outstanding IPIs to each CPU. No point in sending two...
+static DEFINE_PER_CPU(bool, trc_ipi_to_cpu);
+
+// The number of detections of task quiescent state relying on
+// heavyweight readers executing explicit memory barriers.
+unsigned long n_heavy_reader_attempts;
+unsigned long n_heavy_reader_updates;
+unsigned long n_heavy_reader_ofl_updates;
+
+void call_rcu_tasks_trace(struct rcu_head *rhp, rcu_callback_t func);
+DEFINE_RCU_TASKS(rcu_tasks_trace, rcu_tasks_wait_gp, call_rcu_tasks_trace,
+ "RCU Tasks Trace");
+
+/*
+ * This irq_work handler allows rcu_read_unlock_trace() to be invoked
+ * while the scheduler locks are held.
+ */
+static void rcu_read_unlock_iw(struct irq_work *iwp)
+{
+ wake_up(&trc_wait);
+}
+static DEFINE_IRQ_WORK(rcu_tasks_trace_iw, rcu_read_unlock_iw);
+
+/* If we are the last reader, wake up the grace-period kthread. */
+void rcu_read_unlock_trace_special(struct task_struct *t, int nesting)
+{
+ int nq = t->trc_reader_special.b.need_qs;
+
+ if (IS_ENABLED(CONFIG_TASKS_TRACE_RCU_READ_MB) &&
+ t->trc_reader_special.b.need_mb)
+ smp_mb(); // Pairs with update-side barriers.
+ // Update .need_qs before ->trc_reader_nesting for irq/NMI handlers.
+ if (nq)
+ WRITE_ONCE(t->trc_reader_special.b.need_qs, false);
+ WRITE_ONCE(t->trc_reader_nesting, nesting);
+ if (nq && atomic_dec_and_test(&trc_n_readers_need_end))
+ irq_work_queue(&rcu_tasks_trace_iw);
+}
+EXPORT_SYMBOL_GPL(rcu_read_unlock_trace_special);
+
+/* Add a task to the holdout list, if it is not already on the list. */
+static void trc_add_holdout(struct task_struct *t, struct list_head *bhp)
+{
+ if (list_empty(&t->trc_holdout_list)) {
+ get_task_struct(t);
+ list_add(&t->trc_holdout_list, bhp);
+ }
+}
+
+/* Remove a task from the holdout list, if it is in fact present. */
+static void trc_del_holdout(struct task_struct *t)
+{
+ if (!list_empty(&t->trc_holdout_list)) {
+ list_del_init(&t->trc_holdout_list);
+ put_task_struct(t);
+ }
+}
+
+/* IPI handler to check task state. */
+static void trc_read_check_handler(void *t_in)
+{
+ struct task_struct *t = current;
+ struct task_struct *texp = t_in;
+
+ // If the task is no longer running on this CPU, leave.
+ if (unlikely(texp != t)) {
+ if (WARN_ON_ONCE(atomic_dec_and_test(&trc_n_readers_need_end)))
+ wake_up(&trc_wait);
+ goto reset_ipi; // Already on holdout list, so will check later.
+ }
+
+ // If the task is not in a read-side critical section, and
+ // if this is the last reader, awaken the grace-period kthread.
+ if (likely(!t->trc_reader_nesting)) {
+ if (WARN_ON_ONCE(atomic_dec_and_test(&trc_n_readers_need_end)))
+ wake_up(&trc_wait);
+ // Mark as checked after decrement to avoid false
+ // positives on the above WARN_ON_ONCE().
+ WRITE_ONCE(t->trc_reader_checked, true);
+ goto reset_ipi;
+ }
+ WRITE_ONCE(t->trc_reader_checked, true);
+
+ // Get here if the task is in a read-side critical section. Set
+ // its state so that it will awaken the grace-period kthread upon
+ // exit from that critical section.
+ WARN_ON_ONCE(t->trc_reader_special.b.need_qs);
+ WRITE_ONCE(t->trc_reader_special.b.need_qs, true);
+
+reset_ipi:
+ // Allow future IPIs to be sent on CPU and for task.
+ // Also order this IPI handler against any later manipulations of
+ // the intended task.
+ smp_store_release(&per_cpu(trc_ipi_to_cpu, smp_processor_id()), false); // ^^^
+ smp_store_release(&texp->trc_ipi_to_cpu, -1); // ^^^
+}
+
+/* Callback function for scheduler to check locked-down task. */
+static bool trc_inspect_reader(struct task_struct *t, void *arg)
+{
+ int cpu = task_cpu(t);
+ bool in_qs = false;
+ bool ofl = cpu_is_offline(cpu);
+
+ if (task_curr(t)) {
+ WARN_ON_ONCE(ofl & !is_idle_task(t));
+
+ // If no chance of heavyweight readers, do it the hard way.
+ if (!ofl && !IS_ENABLED(CONFIG_TASKS_TRACE_RCU_READ_MB))
+ return false;
+
+ // If heavyweight readers are enabled on the remote task,
+ // we can inspect its state despite its currently running.
+ // However, we cannot safely change its state.
+ n_heavy_reader_attempts++;
+ if (!ofl && // Check for "running" idle tasks on offline CPUs.
+ !rcu_dynticks_zero_in_eqs(cpu, &t->trc_reader_nesting))
+ return false; // No quiescent state, do it the hard way.
+ n_heavy_reader_updates++;
+ if (ofl)
+ n_heavy_reader_ofl_updates++;
+ in_qs = true;
+ } else {
+ in_qs = likely(!t->trc_reader_nesting);
+ }
+
+ // Mark as checked. Because this is called from the grace-period
+ // kthread, also remove the task from the holdout list.
+ t->trc_reader_checked = true;
+ trc_del_holdout(t);
+
+ if (in_qs)
+ return true; // Already in quiescent state, done!!!
+
+ // The task is in a read-side critical section, so set up its
+ // state so that it will awaken the grace-period kthread upon exit
+ // from that critical section.
+ atomic_inc(&trc_n_readers_need_end); // One more to wait on.
+ WARN_ON_ONCE(t->trc_reader_special.b.need_qs);
+ WRITE_ONCE(t->trc_reader_special.b.need_qs, true);
+ return true;
+}
+
+/* Attempt to extract the state for the specified task. */
+static void trc_wait_for_one_reader(struct task_struct *t,
+ struct list_head *bhp)
+{
+ int cpu;
+
+ // If a previous IPI is still in flight, let it complete.
+ if (smp_load_acquire(&t->trc_ipi_to_cpu) != -1) // Order IPI
+ return;
+
+ // The current task had better be in a quiescent state.
+ if (t == current) {
+ t->trc_reader_checked = true;
+ trc_del_holdout(t);
+ WARN_ON_ONCE(t->trc_reader_nesting);
+ return;
+ }
+
+ // Attempt to nail down the task for inspection.
+ get_task_struct(t);
+ if (try_invoke_on_locked_down_task(t, trc_inspect_reader, NULL)) {
+ put_task_struct(t);
+ return;
+ }
+ put_task_struct(t);
+
+ // If currently running, send an IPI, either way, add to list.
+ trc_add_holdout(t, bhp);
+ if (task_curr(t) && time_after(jiffies, rcu_tasks_trace.gp_start + rcu_task_ipi_delay)) {
+ // The task is currently running, so try IPIing it.
+ cpu = task_cpu(t);
+
+ // If there is already an IPI outstanding, let it happen.
+ if (per_cpu(trc_ipi_to_cpu, cpu) || t->trc_ipi_to_cpu >= 0)
+ return;
+
+ atomic_inc(&trc_n_readers_need_end);
+ per_cpu(trc_ipi_to_cpu, cpu) = true;
+ t->trc_ipi_to_cpu = cpu;
+ rcu_tasks_trace.n_ipis++;
+ if (smp_call_function_single(cpu,
+ trc_read_check_handler, t, 0)) {
+ // Just in case there is some other reason for
+ // failure than the target CPU being offline.
+ rcu_tasks_trace.n_ipis_fails++;
+ per_cpu(trc_ipi_to_cpu, cpu) = false;
+ t->trc_ipi_to_cpu = cpu;
+ if (atomic_dec_and_test(&trc_n_readers_need_end)) {
+ WARN_ON_ONCE(1);
+ wake_up(&trc_wait);
+ }
+ }
+ }
+}
+
+/* Initialize for a new RCU-tasks-trace grace period. */
+static void rcu_tasks_trace_pregp_step(void)
+{
+ int cpu;
+
+ // Allow for fast-acting IPIs.
+ atomic_set(&trc_n_readers_need_end, 1);
+
+ // There shouldn't be any old IPIs, but...
+ for_each_possible_cpu(cpu)
+ WARN_ON_ONCE(per_cpu(trc_ipi_to_cpu, cpu));
+
+ // Disable CPU hotplug across the tasklist scan.
+ // This also waits for all readers in CPU-hotplug code paths.
+ cpus_read_lock();
+}
+
+/* Do first-round processing for the specified task. */
+static void rcu_tasks_trace_pertask(struct task_struct *t,
+ struct list_head *hop)
+{
+ WRITE_ONCE(t->trc_reader_special.b.need_qs, false);
+ WRITE_ONCE(t->trc_reader_checked, false);
+ t->trc_ipi_to_cpu = -1;
+ trc_wait_for_one_reader(t, hop);
+}
+
+/*
+ * Do intermediate processing between task and holdout scans and
+ * pick up the idle tasks.
+ */
+static void rcu_tasks_trace_postscan(struct list_head *hop)
+{
+ int cpu;
+
+ for_each_possible_cpu(cpu)
+ rcu_tasks_trace_pertask(idle_task(cpu), hop);
+
+ // Re-enable CPU hotplug now that the tasklist scan has completed.
+ cpus_read_unlock();
+
+ // Wait for late-stage exiting tasks to finish exiting.
+ // These might have passed the call to exit_tasks_rcu_finish().
+ synchronize_rcu();
+ // Any tasks that exit after this point will set ->trc_reader_checked.
+}
+
+/* Show the state of a task stalling the current RCU tasks trace GP. */
+static void show_stalled_task_trace(struct task_struct *t, bool *firstreport)
+{
+ int cpu;
+
+ if (*firstreport) {
+ pr_err("INFO: rcu_tasks_trace detected stalls on tasks:\n");
+ *firstreport = false;
+ }
+ // FIXME: This should attempt to use try_invoke_on_nonrunning_task().
+ cpu = task_cpu(t);
+ pr_alert("P%d: %c%c%c nesting: %d%c cpu: %d\n",
+ t->pid,
+ ".I"[READ_ONCE(t->trc_ipi_to_cpu) > 0],
+ ".i"[is_idle_task(t)],
+ ".N"[cpu > 0 && tick_nohz_full_cpu(cpu)],
+ t->trc_reader_nesting,
+ " N"[!!t->trc_reader_special.b.need_qs],
+ cpu);
+ sched_show_task(t);
+}
+
+/* List stalled IPIs for RCU tasks trace. */
+static void show_stalled_ipi_trace(void)
+{
+ int cpu;
+
+ for_each_possible_cpu(cpu)
+ if (per_cpu(trc_ipi_to_cpu, cpu))
+ pr_alert("\tIPI outstanding to CPU %d\n", cpu);
+}
+
+/* Do one scan of the holdout list. */
+static void check_all_holdout_tasks_trace(struct list_head *hop,
+ bool needreport, bool *firstreport)
+{
+ struct task_struct *g, *t;
+
+ // Disable CPU hotplug across the holdout list scan.
+ cpus_read_lock();
+
+ list_for_each_entry_safe(t, g, hop, trc_holdout_list) {
+ // If safe and needed, try to check the current task.
+ if (READ_ONCE(t->trc_ipi_to_cpu) == -1 &&
+ !READ_ONCE(t->trc_reader_checked))
+ trc_wait_for_one_reader(t, hop);
+
+ // If check succeeded, remove this task from the list.
+ if (READ_ONCE(t->trc_reader_checked))
+ trc_del_holdout(t);
+ else if (needreport)
+ show_stalled_task_trace(t, firstreport);
+ }
+
+ // Re-enable CPU hotplug now that the holdout list scan has completed.
+ cpus_read_unlock();
+
+ if (needreport) {
+ if (firstreport)
+ pr_err("INFO: rcu_tasks_trace detected stalls? (Late IPI?)\n");
+ show_stalled_ipi_trace();
+ }
+}
+
+/* Wait for grace period to complete and provide ordering. */
+static void rcu_tasks_trace_postgp(struct rcu_tasks *rtp)
+{
+ bool firstreport;
+ struct task_struct *g, *t;
+ LIST_HEAD(holdouts);
+ long ret;
+
+ // Remove the safety count.
+ smp_mb__before_atomic(); // Order vs. earlier atomics
+ atomic_dec(&trc_n_readers_need_end);
+ smp_mb__after_atomic(); // Order vs. later atomics
+
+ // Wait for readers.
+ set_tasks_gp_state(rtp, RTGS_WAIT_READERS);
+ for (;;) {
+ ret = wait_event_idle_exclusive_timeout(
+ trc_wait,
+ atomic_read(&trc_n_readers_need_end) == 0,
+ READ_ONCE(rcu_task_stall_timeout));
+ if (ret)
+ break; // Count reached zero.
+ // Stall warning time, so make a list of the offenders.
+ for_each_process_thread(g, t)
+ if (READ_ONCE(t->trc_reader_special.b.need_qs))
+ trc_add_holdout(t, &holdouts);
+ firstreport = true;
+ list_for_each_entry_safe(t, g, &holdouts, trc_holdout_list)
+ if (READ_ONCE(t->trc_reader_special.b.need_qs)) {
+ show_stalled_task_trace(t, &firstreport);
+ trc_del_holdout(t);
+ }
+ if (firstreport)
+ pr_err("INFO: rcu_tasks_trace detected stalls? (Counter/taskslist mismatch?)\n");
+ show_stalled_ipi_trace();
+ pr_err("\t%d holdouts\n", atomic_read(&trc_n_readers_need_end));
+ }
+ smp_mb(); // Caller's code must be ordered after wakeup.
+ // Pairs with pretty much every ordering primitive.
+}
+
+/* Report any needed quiescent state for this exiting task. */
+static void exit_tasks_rcu_finish_trace(struct task_struct *t)
+{
+ WRITE_ONCE(t->trc_reader_checked, true);
+ WARN_ON_ONCE(t->trc_reader_nesting);
+ WRITE_ONCE(t->trc_reader_nesting, 0);
+ if (WARN_ON_ONCE(READ_ONCE(t->trc_reader_special.b.need_qs)))
+ rcu_read_unlock_trace_special(t, 0);
+}
+
+/**
+ * call_rcu_tasks_trace() - Queue a callback trace task-based grace period
+ * @rhp: structure to be used for queueing the RCU updates.
+ * @func: actual callback function to be invoked after the grace period
+ *
+ * The callback function will be invoked some time after a full grace
+ * period elapses, in other words after all currently executing RCU
+ * read-side critical sections have completed. call_rcu_tasks_trace()
+ * assumes that the read-side critical sections end at context switch,
+ * cond_resched_rcu_qs(), or transition to usermode execution. As such,
+ * there are no read-side primitives analogous to rcu_read_lock() and
+ * rcu_read_unlock() because this primitive is intended to determine
+ * that all tasks have passed through a safe state, not so much for
+ * data-strcuture synchronization.
+ *
+ * See the description of call_rcu() for more detailed information on
+ * memory ordering guarantees.
+ */
+void call_rcu_tasks_trace(struct rcu_head *rhp, rcu_callback_t func)
+{
+ call_rcu_tasks_generic(rhp, func, &rcu_tasks_trace);
+}
+EXPORT_SYMBOL_GPL(call_rcu_tasks_trace);
+
+/**
+ * synchronize_rcu_tasks_trace - wait for a trace rcu-tasks grace period
+ *
+ * Control will return to the caller some time after a trace rcu-tasks
+ * grace period has elapsed, in other words after all currently
+ * executing rcu-tasks read-side critical sections have elapsed. These
+ * read-side critical sections are delimited by calls to schedule(),
+ * cond_resched_tasks_rcu_qs(), userspace execution, and (in theory,
+ * anyway) cond_resched().
+ *
+ * This is a very specialized primitive, intended only for a few uses in
+ * tracing and other situations requiring manipulation of function preambles
+ * and profiling hooks. The synchronize_rcu_tasks_trace() function is not
+ * (yet) intended for heavy use from multiple CPUs.
+ *
+ * See the description of synchronize_rcu() for more detailed information
+ * on memory ordering guarantees.
+ */
+void synchronize_rcu_tasks_trace(void)
+{
+ RCU_LOCKDEP_WARN(lock_is_held(&rcu_trace_lock_map), "Illegal synchronize_rcu_tasks_trace() in RCU Tasks Trace read-side critical section");
+ synchronize_rcu_tasks_generic(&rcu_tasks_trace);
+}
+EXPORT_SYMBOL_GPL(synchronize_rcu_tasks_trace);
+
+/**
+ * rcu_barrier_tasks_trace - Wait for in-flight call_rcu_tasks_trace() callbacks.
+ *
+ * Although the current implementation is guaranteed to wait, it is not
+ * obligated to, for example, if there are no pending callbacks.
+ */
+void rcu_barrier_tasks_trace(void)
+{
+ /* There is only one callback queue, so this is easy. ;-) */
+ synchronize_rcu_tasks_trace();
+}
+EXPORT_SYMBOL_GPL(rcu_barrier_tasks_trace);
+
+static int __init rcu_spawn_tasks_trace_kthread(void)
+{
+ rcu_tasks_trace.pregp_func = rcu_tasks_trace_pregp_step;
+ rcu_tasks_trace.pertask_func = rcu_tasks_trace_pertask;
+ rcu_tasks_trace.postscan_func = rcu_tasks_trace_postscan;
+ rcu_tasks_trace.holdouts_func = check_all_holdout_tasks_trace;
+ rcu_tasks_trace.postgp_func = rcu_tasks_trace_postgp;
+ rcu_spawn_tasks_kthread_generic(&rcu_tasks_trace);
+ return 0;
+}
+core_initcall(rcu_spawn_tasks_trace_kthread);
+
+static void show_rcu_tasks_trace_gp_kthread(void)
+{
+ char buf[64];
+
+ sprintf(buf, "N%d h:%lu/%lu/%lu", atomic_read(&trc_n_readers_need_end),
+ data_race(n_heavy_reader_ofl_updates),
+ data_race(n_heavy_reader_updates),
+ data_race(n_heavy_reader_attempts));
+ show_rcu_tasks_generic_gp_kthread(&rcu_tasks_trace, buf);
+}
+
+#else /* #ifdef CONFIG_TASKS_TRACE_RCU */
+static void exit_tasks_rcu_finish_trace(struct task_struct *t) { }
+static inline void show_rcu_tasks_trace_gp_kthread(void) {}
+#endif /* #else #ifdef CONFIG_TASKS_TRACE_RCU */
+
+void show_rcu_tasks_gp_kthreads(void)
+{
+ show_rcu_tasks_classic_gp_kthread();
+ show_rcu_tasks_rude_gp_kthread();
+ show_rcu_tasks_trace_gp_kthread();
+}
+
+#else /* #ifdef CONFIG_TASKS_RCU_GENERIC */
+static inline void rcu_tasks_bootup_oddness(void) {}
+void show_rcu_tasks_gp_kthreads(void) {}
+#endif /* #else #ifdef CONFIG_TASKS_RCU_GENERIC */
#endif
#define MODULE_PARAM_PREFIX "rcutree."
+#ifndef data_race
+#define data_race(expr) \
+ ({ \
+ expr; \
+ })
+#endif
+#ifndef ASSERT_EXCLUSIVE_WRITER
+#define ASSERT_EXCLUSIVE_WRITER(var) do { } while (0)
+#endif
+#ifndef ASSERT_EXCLUSIVE_ACCESS
+#define ASSERT_EXCLUSIVE_ACCESS(var) do { } while (0)
+#endif
+
/* Data structures. */
/*
*/
#define RCU_DYNTICK_CTRL_MASK 0x1
#define RCU_DYNTICK_CTRL_CTR (RCU_DYNTICK_CTRL_MASK + 1)
-#ifndef rcu_eqs_special_exit
-#define rcu_eqs_special_exit() do { } while (0)
-#endif
static DEFINE_PER_CPU_SHARED_ALIGNED(struct rcu_data, rcu_data) = {
.dynticks_nesting = 1,
static bool dump_tree;
module_param(dump_tree, bool, 0444);
/* By default, use RCU_SOFTIRQ instead of rcuc kthreads. */
-static bool use_softirq = 1;
+static bool use_softirq = true;
module_param(use_softirq, bool, 0444);
/* Control rcu_node-tree auto-balancing at boot time. */
static bool rcu_fanout_exact;
/*
* Record entry into an extended quiescent state. This is only to be
- * called when not already in an extended quiescent state.
+ * called when not already in an extended quiescent state, that is,
+ * RCU is watching prior to the call to this function and is no longer
+ * watching upon return.
*/
-static void rcu_dynticks_eqs_enter(void)
+static noinstr void rcu_dynticks_eqs_enter(void)
{
struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
int seq;
* critical sections, and we also must force ordering with the
* next idle sojourn.
*/
+ rcu_dynticks_task_trace_enter(); // Before ->dynticks update!
seq = atomic_add_return(RCU_DYNTICK_CTRL_CTR, &rdp->dynticks);
- /* Better be in an extended quiescent state! */
+ // RCU is no longer watching. Better be in extended quiescent state!
WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
(seq & RCU_DYNTICK_CTRL_CTR));
/* Better not have special action (TLB flush) pending! */
/*
* Record exit from an extended quiescent state. This is only to be
- * called from an extended quiescent state.
+ * called from an extended quiescent state, that is, RCU is not watching
+ * prior to the call to this function and is watching upon return.
*/
-static void rcu_dynticks_eqs_exit(void)
+static noinstr void rcu_dynticks_eqs_exit(void)
{
struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
int seq;
* critical section.
*/
seq = atomic_add_return(RCU_DYNTICK_CTRL_CTR, &rdp->dynticks);
+ // RCU is now watching. Better not be in an extended quiescent state!
+ rcu_dynticks_task_trace_exit(); // After ->dynticks update!
WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
!(seq & RCU_DYNTICK_CTRL_CTR));
if (seq & RCU_DYNTICK_CTRL_MASK) {
atomic_andnot(RCU_DYNTICK_CTRL_MASK, &rdp->dynticks);
smp_mb__after_atomic(); /* _exit after clearing mask. */
- /* Prefer duplicate flushes to losing a flush. */
- rcu_eqs_special_exit();
}
}
*
* No ordering, as we are sampling CPU-local information.
*/
-static bool rcu_dynticks_curr_cpu_in_eqs(void)
+static __always_inline bool rcu_dynticks_curr_cpu_in_eqs(void)
{
struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
return snap != rcu_dynticks_snap(rdp);
}
+/*
+ * Return true if the referenced integer is zero while the specified
+ * CPU remains within a single extended quiescent state.
+ */
+bool rcu_dynticks_zero_in_eqs(int cpu, int *vp)
+{
+ struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
+ int snap;
+
+ // If not quiescent, force back to earlier extended quiescent state.
+ snap = atomic_read(&rdp->dynticks) & ~(RCU_DYNTICK_CTRL_MASK |
+ RCU_DYNTICK_CTRL_CTR);
+
+ smp_rmb(); // Order ->dynticks and *vp reads.
+ if (READ_ONCE(*vp))
+ return false; // Non-zero, so report failure;
+ smp_rmb(); // Order *vp read and ->dynticks re-read.
+
+ // If still in the same extended quiescent state, we are good!
+ return snap == (atomic_read(&rdp->dynticks) & ~RCU_DYNTICK_CTRL_MASK);
+}
+
/*
* Set the special (bottom) bit of the specified CPU so that it
* will take special action (such as flushing its TLB) on the
EXPORT_SYMBOL_GPL(rcu_momentary_dyntick_idle);
/**
- * rcu_is_cpu_rrupt_from_idle - see if interrupted from idle
+ * rcu_is_cpu_rrupt_from_idle - see if 'interrupted' from idle
*
* If the current CPU is idle and running at a first-level (not nested)
- * interrupt from idle, return true. The caller must have at least
- * disabled preemption.
+ * interrupt, or directly, from idle, return true.
+ *
+ * The caller must have at least disabled IRQs.
*/
static int rcu_is_cpu_rrupt_from_idle(void)
{
- /* Called only from within the scheduling-clock interrupt */
- lockdep_assert_in_irq();
+ long nesting;
+
+ /*
+ * Usually called from the tick; but also used from smp_function_call()
+ * for expedited grace periods. This latter can result in running from
+ * the idle task, instead of an actual IPI.
+ */
+ lockdep_assert_irqs_disabled();
/* Check for counter underflows */
RCU_LOCKDEP_WARN(__this_cpu_read(rcu_data.dynticks_nesting) < 0,
"RCU dynticks_nmi_nesting counter underflow/zero!");
/* Are we at first interrupt nesting level? */
- if (__this_cpu_read(rcu_data.dynticks_nmi_nesting) != 1)
+ nesting = __this_cpu_read(rcu_data.dynticks_nmi_nesting);
+ if (nesting > 1)
return false;
+ /*
+ * If we're not in an interrupt, we must be in the idle task!
+ */
+ WARN_ON_ONCE(!nesting && !is_idle_task(current));
+
/* Does CPU appear to be idle from an RCU standpoint? */
return __this_cpu_read(rcu_data.dynticks_nesting) == 0;
}
* the possibility of usermode upcalls having messed up our count
* of interrupt nesting level during the prior busy period.
*/
-static void rcu_eqs_enter(bool user)
+static noinstr void rcu_eqs_enter(bool user)
{
struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) &&
rdp->dynticks_nesting == 0);
if (rdp->dynticks_nesting != 1) {
+ // RCU will still be watching, so just do accounting and leave.
rdp->dynticks_nesting--;
return;
}
lockdep_assert_irqs_disabled();
+ instrumentation_begin();
trace_rcu_dyntick(TPS("Start"), rdp->dynticks_nesting, 0, atomic_read(&rdp->dynticks));
WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
rdp = this_cpu_ptr(&rcu_data);
do_nocb_deferred_wakeup(rdp);
rcu_prepare_for_idle();
rcu_preempt_deferred_qs(current);
+ instrumentation_end();
WRITE_ONCE(rdp->dynticks_nesting, 0); /* Avoid irq-access tearing. */
+ // RCU is watching here ...
rcu_dynticks_eqs_enter();
+ // ... but is no longer watching here.
rcu_dynticks_task_enter();
}
* If you add or remove a call to rcu_user_enter(), be sure to test with
* CONFIG_RCU_EQS_DEBUG=y.
*/
-void rcu_user_enter(void)
+noinstr void rcu_user_enter(void)
{
lockdep_assert_irqs_disabled();
rcu_eqs_enter(true);
}
#endif /* CONFIG_NO_HZ_FULL */
-/*
+/**
+ * rcu_nmi_exit - inform RCU of exit from NMI context
+ *
* If we are returning from the outermost NMI handler that interrupted an
* RCU-idle period, update rdp->dynticks and rdp->dynticks_nmi_nesting
* to let the RCU grace-period handling know that the CPU is back to
* being RCU-idle.
*
- * If you add or remove a call to rcu_nmi_exit_common(), be sure to test
+ * If you add or remove a call to rcu_nmi_exit(), be sure to test
* with CONFIG_RCU_EQS_DEBUG=y.
*/
-static __always_inline void rcu_nmi_exit_common(bool irq)
+noinstr void rcu_nmi_exit(void)
{
struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
* leave it in non-RCU-idle state.
*/
if (rdp->dynticks_nmi_nesting != 1) {
+ instrumentation_begin();
trace_rcu_dyntick(TPS("--="), rdp->dynticks_nmi_nesting, rdp->dynticks_nmi_nesting - 2,
atomic_read(&rdp->dynticks));
WRITE_ONCE(rdp->dynticks_nmi_nesting, /* No store tearing. */
rdp->dynticks_nmi_nesting - 2);
+ instrumentation_end();
return;
}
+ instrumentation_begin();
/* This NMI interrupted an RCU-idle CPU, restore RCU-idleness. */
trace_rcu_dyntick(TPS("Startirq"), rdp->dynticks_nmi_nesting, 0, atomic_read(&rdp->dynticks));
WRITE_ONCE(rdp->dynticks_nmi_nesting, 0); /* Avoid store tearing. */
- if (irq)
+ if (!in_nmi())
rcu_prepare_for_idle();
+ instrumentation_end();
+ // RCU is watching here ...
rcu_dynticks_eqs_enter();
+ // ... but is no longer watching here.
- if (irq)
+ if (!in_nmi())
rcu_dynticks_task_enter();
}
-/**
- * rcu_nmi_exit - inform RCU of exit from NMI context
- *
- * If you add or remove a call to rcu_nmi_exit(), be sure to test
- * with CONFIG_RCU_EQS_DEBUG=y.
- */
-void rcu_nmi_exit(void)
-{
- rcu_nmi_exit_common(false);
-}
-
/**
* rcu_irq_exit - inform RCU that current CPU is exiting irq towards idle
*
* If you add or remove a call to rcu_irq_exit(), be sure to test with
* CONFIG_RCU_EQS_DEBUG=y.
*/
-void rcu_irq_exit(void)
+void noinstr rcu_irq_exit(void)
+{
+ lockdep_assert_irqs_disabled();
+ rcu_nmi_exit();
+}
+
+/**
+ * rcu_irq_exit_preempt - Inform RCU that current CPU is exiting irq
+ * towards in kernel preemption
+ *
+ * Same as rcu_irq_exit() but has a sanity check that scheduling is safe
+ * from RCU point of view. Invoked from return from interrupt before kernel
+ * preemption.
+ */
+void rcu_irq_exit_preempt(void)
{
lockdep_assert_irqs_disabled();
- rcu_nmi_exit_common(true);
+ rcu_nmi_exit();
+
+ RCU_LOCKDEP_WARN(__this_cpu_read(rcu_data.dynticks_nesting) <= 0,
+ "RCU dynticks_nesting counter underflow/zero!");
+ RCU_LOCKDEP_WARN(__this_cpu_read(rcu_data.dynticks_nmi_nesting) !=
+ DYNTICK_IRQ_NONIDLE,
+ "Bad RCU dynticks_nmi_nesting counter\n");
+ RCU_LOCKDEP_WARN(rcu_dynticks_curr_cpu_in_eqs(),
+ "RCU in extended quiescent state!");
}
+#ifdef CONFIG_PROVE_RCU
+/**
+ * rcu_irq_exit_check_preempt - Validate that scheduling is possible
+ */
+void rcu_irq_exit_check_preempt(void)
+{
+ lockdep_assert_irqs_disabled();
+
+ RCU_LOCKDEP_WARN(__this_cpu_read(rcu_data.dynticks_nesting) <= 0,
+ "RCU dynticks_nesting counter underflow/zero!");
+ RCU_LOCKDEP_WARN(__this_cpu_read(rcu_data.dynticks_nmi_nesting) !=
+ DYNTICK_IRQ_NONIDLE,
+ "Bad RCU dynticks_nmi_nesting counter\n");
+ RCU_LOCKDEP_WARN(rcu_dynticks_curr_cpu_in_eqs(),
+ "RCU in extended quiescent state!");
+}
+#endif /* #ifdef CONFIG_PROVE_RCU */
+
/*
* Wrapper for rcu_irq_exit() where interrupts are enabled.
*
* allow for the possibility of usermode upcalls messing up our count of
* interrupt nesting level during the busy period that is just now starting.
*/
-static void rcu_eqs_exit(bool user)
+static void noinstr rcu_eqs_exit(bool user)
{
struct rcu_data *rdp;
long oldval;
oldval = rdp->dynticks_nesting;
WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && oldval < 0);
if (oldval) {
+ // RCU was already watching, so just do accounting and leave.
rdp->dynticks_nesting++;
return;
}
rcu_dynticks_task_exit();
+ // RCU is not watching here ...
rcu_dynticks_eqs_exit();
+ // ... but is watching here.
+ instrumentation_begin();
rcu_cleanup_after_idle();
trace_rcu_dyntick(TPS("End"), rdp->dynticks_nesting, 1, atomic_read(&rdp->dynticks));
WARN_ON_ONCE(IS_ENABLED(CONFIG_RCU_EQS_DEBUG) && !user && !is_idle_task(current));
WRITE_ONCE(rdp->dynticks_nesting, 1);
WARN_ON_ONCE(rdp->dynticks_nmi_nesting);
WRITE_ONCE(rdp->dynticks_nmi_nesting, DYNTICK_IRQ_NONIDLE);
+ instrumentation_end();
}
/**
* If you add or remove a call to rcu_user_exit(), be sure to test with
* CONFIG_RCU_EQS_DEBUG=y.
*/
-void rcu_user_exit(void)
+void noinstr rcu_user_exit(void)
{
rcu_eqs_exit(1);
}
+
+/**
+ * __rcu_irq_enter_check_tick - Enable scheduler tick on CPU if RCU needs it.
+ *
+ * The scheduler tick is not normally enabled when CPUs enter the kernel
+ * from nohz_full userspace execution. After all, nohz_full userspace
+ * execution is an RCU quiescent state and the time executing in the kernel
+ * is quite short. Except of course when it isn't. And it is not hard to
+ * cause a large system to spend tens of seconds or even minutes looping
+ * in the kernel, which can cause a number of problems, include RCU CPU
+ * stall warnings.
+ *
+ * Therefore, if a nohz_full CPU fails to report a quiescent state
+ * in a timely manner, the RCU grace-period kthread sets that CPU's
+ * ->rcu_urgent_qs flag with the expectation that the next interrupt or
+ * exception will invoke this function, which will turn on the scheduler
+ * tick, which will enable RCU to detect that CPU's quiescent states,
+ * for example, due to cond_resched() calls in CONFIG_PREEMPT=n kernels.
+ * The tick will be disabled once a quiescent state is reported for
+ * this CPU.
+ *
+ * Of course, in carefully tuned systems, there might never be an
+ * interrupt or exception. In that case, the RCU grace-period kthread
+ * will eventually cause one to happen. However, in less carefully
+ * controlled environments, this function allows RCU to get what it
+ * needs without creating otherwise useless interruptions.
+ */
+void __rcu_irq_enter_check_tick(void)
+{
+ struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
+
+ // Enabling the tick is unsafe in NMI handlers.
+ if (WARN_ON_ONCE(in_nmi()))
+ return;
+
+ RCU_LOCKDEP_WARN(rcu_dynticks_curr_cpu_in_eqs(),
+ "Illegal rcu_irq_enter_check_tick() from extended quiescent state");
+
+ if (!tick_nohz_full_cpu(rdp->cpu) ||
+ !READ_ONCE(rdp->rcu_urgent_qs) ||
+ READ_ONCE(rdp->rcu_forced_tick)) {
+ // RCU doesn't need nohz_full help from this CPU, or it is
+ // already getting that help.
+ return;
+ }
+
+ // We get here only when not in an extended quiescent state and
+ // from interrupts (as opposed to NMIs). Therefore, (1) RCU is
+ // already watching and (2) The fact that we are in an interrupt
+ // handler and that the rcu_node lock is an irq-disabled lock
+ // prevents self-deadlock. So we can safely recheck under the lock.
+ // Note that the nohz_full state currently cannot change.
+ raw_spin_lock_rcu_node(rdp->mynode);
+ if (rdp->rcu_urgent_qs && !rdp->rcu_forced_tick) {
+ // A nohz_full CPU is in the kernel and RCU needs a
+ // quiescent state. Turn on the tick!
+ WRITE_ONCE(rdp->rcu_forced_tick, true);
+ tick_dep_set_cpu(rdp->cpu, TICK_DEP_BIT_RCU);
+ }
+ raw_spin_unlock_rcu_node(rdp->mynode);
+}
#endif /* CONFIG_NO_HZ_FULL */
/**
- * rcu_nmi_enter_common - inform RCU of entry to NMI context
+ * rcu_nmi_enter - inform RCU of entry to NMI context
* @irq: Is this call from rcu_irq_enter?
*
* If the CPU was idle from RCU's viewpoint, update rdp->dynticks and
* long as the nesting level does not overflow an int. (You will probably
* run out of stack space first.)
*
- * If you add or remove a call to rcu_nmi_enter_common(), be sure to test
+ * If you add or remove a call to rcu_nmi_enter(), be sure to test
* with CONFIG_RCU_EQS_DEBUG=y.
*/
-static __always_inline void rcu_nmi_enter_common(bool irq)
+noinstr void rcu_nmi_enter(void)
{
long incby = 2;
struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
*/
if (rcu_dynticks_curr_cpu_in_eqs()) {
- if (irq)
+ if (!in_nmi())
rcu_dynticks_task_exit();
+ // RCU is not watching here ...
rcu_dynticks_eqs_exit();
+ // ... but is watching here.
- if (irq)
+ if (!in_nmi())
rcu_cleanup_after_idle();
incby = 1;
- } else if (irq && tick_nohz_full_cpu(rdp->cpu) &&
- rdp->dynticks_nmi_nesting == DYNTICK_IRQ_NONIDLE &&
- READ_ONCE(rdp->rcu_urgent_qs) &&
- !READ_ONCE(rdp->rcu_forced_tick)) {
- raw_spin_lock_rcu_node(rdp->mynode);
- // Recheck under lock.
- if (rdp->rcu_urgent_qs && !rdp->rcu_forced_tick) {
- WRITE_ONCE(rdp->rcu_forced_tick, true);
- tick_dep_set_cpu(rdp->cpu, TICK_DEP_BIT_RCU);
- }
- raw_spin_unlock_rcu_node(rdp->mynode);
+ } else if (!in_nmi()) {
+ instrumentation_begin();
+ rcu_irq_enter_check_tick();
+ instrumentation_end();
}
+ instrumentation_begin();
trace_rcu_dyntick(incby == 1 ? TPS("Endirq") : TPS("++="),
rdp->dynticks_nmi_nesting,
rdp->dynticks_nmi_nesting + incby, atomic_read(&rdp->dynticks));
+ instrumentation_end();
WRITE_ONCE(rdp->dynticks_nmi_nesting, /* Prevent store tearing. */
rdp->dynticks_nmi_nesting + incby);
barrier();
}
-/**
- * rcu_nmi_enter - inform RCU of entry to NMI context
- */
-void rcu_nmi_enter(void)
-{
- rcu_nmi_enter_common(false);
-}
-NOKPROBE_SYMBOL(rcu_nmi_enter);
-
/**
* rcu_irq_enter - inform RCU that current CPU is entering irq away from idle
*
* If you add or remove a call to rcu_irq_enter(), be sure to test with
* CONFIG_RCU_EQS_DEBUG=y.
*/
-void rcu_irq_enter(void)
+noinstr void rcu_irq_enter(void)
{
lockdep_assert_irqs_disabled();
- rcu_nmi_enter_common(true);
+ rcu_nmi_enter();
}
/*
}
}
+noinstr bool __rcu_is_watching(void)
+{
+ return !rcu_dynticks_curr_cpu_in_eqs();
+}
+
/**
* rcu_is_watching - see if RCU thinks that the current CPU is not idle
*
* if the current CPU is not in its idle loop or is in an interrupt or
* NMI handler, return true.
*/
-bool notrace rcu_is_watching(void)
+bool rcu_is_watching(void)
{
bool ret;
if (in_nmi() || !rcu_scheduler_fully_active)
return true;
- preempt_disable();
+ preempt_disable_notrace();
rdp = this_cpu_ptr(&rcu_data);
rnp = rdp->mynode;
if (rdp->grpmask & rcu_rnp_online_cpus(rnp))
ret = true;
- preempt_enable();
+ preempt_enable_notrace();
return ret;
}
EXPORT_SYMBOL_GPL(rcu_lockdep_current_cpu_online);
trace_rcu_this_gp(rnp, rdp, gp_seq_req, TPS("NoGPkthread"));
goto unlock_out;
}
- trace_rcu_grace_period(rcu_state.name, rcu_state.gp_seq, TPS("newreq"));
+ trace_rcu_grace_period(rcu_state.name, data_race(rcu_state.gp_seq), TPS("newreq"));
ret = true; /* Caller must wake GP kthread. */
unlock_out:
/* Push furthest requested GP to leaf node and rcu_data structure. */
schedule_timeout_uninterruptible(delay);
}
+static unsigned long sleep_duration;
+
+/* Allow rcutorture to stall the grace-period kthread. */
+void rcu_gp_set_torture_wait(int duration)
+{
+ if (IS_ENABLED(CONFIG_RCU_TORTURE_TEST) && duration > 0)
+ WRITE_ONCE(sleep_duration, duration);
+}
+EXPORT_SYMBOL_GPL(rcu_gp_set_torture_wait);
+
+/* Actually implement the aforementioned wait. */
+static void rcu_gp_torture_wait(void)
+{
+ unsigned long duration;
+
+ if (!IS_ENABLED(CONFIG_RCU_TORTURE_TEST))
+ return;
+ duration = xchg(&sleep_duration, 0UL);
+ if (duration > 0) {
+ pr_alert("%s: Waiting %lu jiffies\n", __func__, duration);
+ schedule_timeout_uninterruptible(duration);
+ pr_alert("%s: Wait complete\n", __func__);
+ }
+}
+
/*
* Initialize a new grace period. Return false if no grace period required.
*/
record_gp_stall_check_time();
/* Record GP times before starting GP, hence rcu_seq_start(). */
rcu_seq_start(&rcu_state.gp_seq);
+ ASSERT_EXCLUSIVE_WRITER(rcu_state.gp_seq);
trace_rcu_grace_period(rcu_state.name, rcu_state.gp_seq, TPS("start"));
raw_spin_unlock_irq_rcu_node(rnp);
{
struct rcu_node *rnp = rcu_get_root();
- /* Someone like call_rcu() requested a force-quiescent-state scan. */
+ // If under overload conditions, force an immediate FQS scan.
+ if (*gfp & RCU_GP_FLAG_OVLD)
+ return true;
+
+ // Someone like call_rcu() requested a force-quiescent-state scan.
*gfp = READ_ONCE(rcu_state.gp_flags);
if (*gfp & RCU_GP_FLAG_FQS)
return true;
- /* The current grace period has completed. */
+ // The current grace period has completed.
if (!READ_ONCE(rnp->qsmask) && !rcu_preempt_blocked_readers_cgp(rnp))
return true;
static void rcu_gp_fqs_loop(void)
{
bool first_gp_fqs;
- int gf;
+ int gf = 0;
unsigned long j;
int ret;
struct rcu_node *rnp = rcu_get_root();
first_gp_fqs = true;
j = READ_ONCE(jiffies_till_first_fqs);
+ if (rcu_state.cbovld)
+ gf = RCU_GP_FLAG_OVLD;
ret = 0;
for (;;) {
if (!ret) {
rcu_state.gp_state = RCU_GP_WAIT_FQS;
ret = swait_event_idle_timeout_exclusive(
rcu_state.gp_wq, rcu_gp_fqs_check_wake(&gf), j);
+ rcu_gp_torture_wait();
rcu_state.gp_state = RCU_GP_DOING_FQS;
/* Locking provides needed memory barriers. */
/* If grace period done, leave loop. */
!rcu_preempt_blocked_readers_cgp(rnp))
break;
/* If time for quiescent-state forcing, do it. */
- if (ULONG_CMP_GE(jiffies, rcu_state.jiffies_force_qs) ||
+ if (!time_after(rcu_state.jiffies_force_qs, jiffies) ||
(gf & RCU_GP_FLAG_FQS)) {
trace_rcu_grace_period(rcu_state.name, rcu_state.gp_seq,
TPS("fqsstart"));
rcu_gp_fqs(first_gp_fqs);
- first_gp_fqs = false;
+ gf = 0;
+ if (first_gp_fqs) {
+ first_gp_fqs = false;
+ gf = rcu_state.cbovld ? RCU_GP_FLAG_OVLD : 0;
+ }
trace_rcu_grace_period(rcu_state.name, rcu_state.gp_seq,
TPS("fqsend"));
cond_resched_tasks_rcu_qs();
j = 1;
else
j = rcu_state.jiffies_force_qs - j;
+ gf = 0;
}
}
}
/* Declare grace period done, trace first to use old GP number. */
trace_rcu_grace_period(rcu_state.name, rcu_state.gp_seq, TPS("end"));
rcu_seq_end(&rcu_state.gp_seq);
+ ASSERT_EXCLUSIVE_WRITER(rcu_state.gp_seq);
rcu_state.gp_state = RCU_GP_IDLE;
/* Check for GP requests since above loop. */
rdp = this_cpu_ptr(&rcu_data);
swait_event_idle_exclusive(rcu_state.gp_wq,
READ_ONCE(rcu_state.gp_flags) &
RCU_GP_FLAG_INIT);
+ rcu_gp_torture_wait();
rcu_state.gp_state = RCU_GP_DONE_GPS;
/* Locking provides needed memory barrier. */
if (rcu_gp_init())
struct delayed_work monitor_work;
bool monitor_todo;
bool initialized;
+ // Number of objects for which GP not started
+ int count;
};
static DEFINE_PER_CPU(struct kfree_rcu_cpu, krc);
krcp->head = NULL;
}
+ WRITE_ONCE(krcp->count, 0);
+
/*
* One work is per one batch, so there are two "free channels",
* "bhead_free" and "head_free" the batch can handle. It can be
krcp->head = head;
}
+ WRITE_ONCE(krcp->count, krcp->count + 1);
+
// Set timer to drain after KFREE_DRAIN_JIFFIES.
if (rcu_scheduler_active == RCU_SCHEDULER_RUNNING &&
!krcp->monitor_todo) {
}
EXPORT_SYMBOL_GPL(kfree_call_rcu);
+static unsigned long
+kfree_rcu_shrink_count(struct shrinker *shrink, struct shrink_control *sc)
+{
+ int cpu;
+ unsigned long count = 0;
+
+ /* Snapshot count of all CPUs */
+ for_each_online_cpu(cpu) {
+ struct kfree_rcu_cpu *krcp = per_cpu_ptr(&krc, cpu);
+
+ count += READ_ONCE(krcp->count);
+ }
+
+ return count;
+}
+
+static unsigned long
+kfree_rcu_shrink_scan(struct shrinker *shrink, struct shrink_control *sc)
+{
+ int cpu, freed = 0;
+ unsigned long flags;
+
+ for_each_online_cpu(cpu) {
+ int count;
+ struct kfree_rcu_cpu *krcp = per_cpu_ptr(&krc, cpu);
+
+ count = krcp->count;
+ spin_lock_irqsave(&krcp->lock, flags);
+ if (krcp->monitor_todo)
+ kfree_rcu_drain_unlock(krcp, flags);
+ else
+ spin_unlock_irqrestore(&krcp->lock, flags);
+
+ sc->nr_to_scan -= count;
+ freed += count;
+
+ if (sc->nr_to_scan <= 0)
+ break;
+ }
+
+ return freed;
+}
+
+static struct shrinker kfree_rcu_shrinker = {
+ .count_objects = kfree_rcu_shrink_count,
+ .scan_objects = kfree_rcu_shrink_scan,
+ .batch = 0,
+ .seeks = DEFAULT_SEEKS,
+};
+
void __init kfree_rcu_scheduler_running(void)
{
int cpu;
nbits = bitmap_weight(&oldmask, BITS_PER_LONG);
/* Allow lockless access for expedited grace periods. */
smp_store_release(&rcu_state.ncpus, rcu_state.ncpus + nbits); /* ^^^ */
+ ASSERT_EXCLUSIVE_WRITER(rcu_state.ncpus);
rcu_gpnum_ovf(rnp, rdp); /* Offline-induced counter wrap? */
rdp->rcu_onl_gp_seq = READ_ONCE(rcu_state.gp_seq);
rdp->rcu_onl_gp_flags = READ_ONCE(rcu_state.gp_flags);
INIT_DELAYED_WORK(&krcp->monitor_work, kfree_rcu_monitor);
krcp->initialized = true;
}
+ if (register_shrinker(&kfree_rcu_shrinker))
+ pr_err("Failed to register kfree_rcu() shrinker!\n");
}
void __init rcu_init(void)
/* Values for rcu_state structure's gp_flags field. */
#define RCU_GP_FLAG_INIT 0x1 /* Need grace-period initialization. */
#define RCU_GP_FLAG_FQS 0x2 /* Need grace-period quiescent-state forcing. */
+#define RCU_GP_FLAG_OVLD 0x4 /* Experiencing callback overload. */
/* Values for rcu_state structure's gp_state field. */
#define RCU_GP_IDLE 0 /* Initial state and no GP in progress. */
static bool rcu_nohz_full_cpu(void);
static void rcu_dynticks_task_enter(void);
static void rcu_dynticks_task_exit(void);
+static void rcu_dynticks_task_trace_enter(void);
+static void rcu_dynticks_task_trace_exit(void);
/* Forward declarations for tree_stall.h */
static void record_gp_stall_check_time(void);
static bool sync_rcu_exp_done(struct rcu_node *rnp)
{
raw_lockdep_assert_held_rcu_node(rnp);
- return rnp->exp_tasks == NULL &&
+ return READ_ONCE(rnp->exp_tasks) == NULL &&
READ_ONCE(rnp->expmask) == 0;
}
* until such time as the ->expmask bits are cleared.
*/
if (rcu_preempt_has_tasks(rnp))
- rnp->exp_tasks = rnp->blkd_tasks.next;
+ WRITE_ONCE(rnp->exp_tasks, rnp->blkd_tasks.next);
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
/* IPI the remaining CPUs for expedited quiescent state. */
}
pr_cont(" } %lu jiffies s: %lu root: %#lx/%c\n",
jiffies - jiffies_start, rcu_state.expedited_sequence,
- READ_ONCE(rnp_root->expmask),
- ".T"[!!rnp_root->exp_tasks]);
+ data_race(rnp_root->expmask),
+ ".T"[!!data_race(rnp_root->exp_tasks)]);
if (ndetected) {
pr_err("blocking rcu_node structures:");
rcu_for_each_node_breadth_first(rnp) {
continue;
pr_cont(" l=%u:%d-%d:%#lx/%c",
rnp->level, rnp->grplo, rnp->grphi,
- READ_ONCE(rnp->expmask),
- ".T"[!!rnp->exp_tasks]);
+ data_race(rnp->expmask),
+ ".T"[!!data_race(rnp->exp_tasks)]);
}
pr_cont("\n");
}
*/
static void rcu_exp_handler(void *unused)
{
+ int depth = rcu_preempt_depth();
unsigned long flags;
struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
struct rcu_node *rnp = rdp->mynode;
* critical section. If also enabled or idle, immediately
* report the quiescent state, otherwise defer.
*/
- if (!rcu_preempt_depth()) {
+ if (!depth) {
if (!(preempt_count() & (PREEMPT_MASK | SOFTIRQ_MASK)) ||
rcu_dynticks_curr_cpu_in_eqs()) {
rcu_report_exp_rdp(rdp);
* can have caused this quiescent state to already have been
* reported, so we really do need to check ->expmask.
*/
- if (rcu_preempt_depth() > 0) {
+ if (depth > 0) {
raw_spin_lock_irqsave_rcu_node(rnp, flags);
if (rnp->expmask & rdp->grpmask) {
rdp->exp_deferred_qs = true;
return;
}
- /*
- * The final and least likely case is where the interrupted
- * code was just about to or just finished exiting the RCU-preempt
- * read-side critical section, and no, we can't tell which.
- * So either way, set ->deferred_qs to flag later code that
- * a quiescent state is required.
- *
- * If the CPU is fully enabled (or if some buggy RCU-preempt
- * read-side critical section is being used from idle), just
- * invoke rcu_preempt_deferred_qs() to immediately report the
- * quiescent state. We cannot use rcu_read_unlock_special()
- * because we are in an interrupt handler, which will cause that
- * function to take an early exit without doing anything.
- *
- * Otherwise, force a context switch after the CPU enables everything.
- */
- rdp->exp_deferred_qs = true;
- if (!(preempt_count() & (PREEMPT_MASK | SOFTIRQ_MASK)) ||
- WARN_ON_ONCE(rcu_dynticks_curr_cpu_in_eqs())) {
- rcu_preempt_deferred_qs(t);
- } else {
- set_tsk_need_resched(t);
- set_preempt_need_resched();
- }
+ // Finally, negative nesting depth should not happen.
+ WARN_ON_ONCE(1);
}
/* PREEMPTION=y, so no PREEMPTION=n expedited grace period to clean up after. */
*/
static int rcu_print_task_exp_stall(struct rcu_node *rnp)
{
- struct task_struct *t;
+ unsigned long flags;
int ndetected = 0;
+ struct task_struct *t;
- if (!rnp->exp_tasks)
+ if (!READ_ONCE(rnp->exp_tasks))
return 0;
+ raw_spin_lock_irqsave_rcu_node(rnp, flags);
t = list_entry(rnp->exp_tasks->prev,
struct task_struct, rcu_node_entry);
list_for_each_entry_continue(t, &rnp->blkd_tasks, rcu_node_entry) {
pr_cont(" P%d", t->pid);
ndetected++;
}
+ raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
return ndetected;
}
WARN_ON_ONCE(rnp->completedqs == rnp->gp_seq);
}
if (!rnp->exp_tasks && (blkd_state & RCU_EXP_BLKD))
- rnp->exp_tasks = &t->rcu_node_entry;
+ WRITE_ONCE(rnp->exp_tasks, &t->rcu_node_entry);
WARN_ON_ONCE(!(blkd_state & RCU_GP_BLKD) !=
!(rnp->qsmask & rdp->grpmask));
WARN_ON_ONCE(!(blkd_state & RCU_EXP_BLKD) !=
rcu_qs();
if (rdp->exp_deferred_qs)
rcu_report_exp_rdp(rdp);
+ rcu_tasks_qs(current, preempt);
trace_rcu_utilization(TPS("End context switch"));
}
EXPORT_SYMBOL_GPL(rcu_note_context_switch);
return READ_ONCE(rnp->gp_tasks) != NULL;
}
-/* Bias and limit values for ->rcu_read_lock_nesting. */
-#define RCU_NEST_BIAS INT_MAX
-#define RCU_NEST_NMAX (-INT_MAX / 2)
+/* limit value for ->rcu_read_lock_nesting. */
#define RCU_NEST_PMAX (INT_MAX / 2)
static void rcu_preempt_read_enter(void)
current->rcu_read_lock_nesting++;
}
-static void rcu_preempt_read_exit(void)
+static int rcu_preempt_read_exit(void)
{
- current->rcu_read_lock_nesting--;
+ return --current->rcu_read_lock_nesting;
}
static void rcu_preempt_depth_set(int val)
{
struct task_struct *t = current;
- if (rcu_preempt_depth() != 1) {
- rcu_preempt_read_exit();
- } else {
+ if (rcu_preempt_read_exit() == 0) {
barrier(); /* critical section before exit code. */
- rcu_preempt_depth_set(-RCU_NEST_BIAS);
- barrier(); /* assign before ->rcu_read_unlock_special load */
if (unlikely(READ_ONCE(t->rcu_read_unlock_special.s)))
rcu_read_unlock_special(t);
- barrier(); /* ->rcu_read_unlock_special load before assign */
- rcu_preempt_depth_set(0);
}
if (IS_ENABLED(CONFIG_PROVE_LOCKING)) {
int rrln = rcu_preempt_depth();
- WARN_ON_ONCE(rrln < 0 && rrln > RCU_NEST_NMAX);
+ WARN_ON_ONCE(rrln < 0 || rrln > RCU_NEST_PMAX);
}
}
EXPORT_SYMBOL_GPL(__rcu_read_unlock);
if (&t->rcu_node_entry == rnp->gp_tasks)
WRITE_ONCE(rnp->gp_tasks, np);
if (&t->rcu_node_entry == rnp->exp_tasks)
- rnp->exp_tasks = np;
+ WRITE_ONCE(rnp->exp_tasks, np);
if (IS_ENABLED(CONFIG_RCU_BOOST)) {
/* Snapshot ->boost_mtx ownership w/rnp->lock held. */
drop_boost_mutex = rt_mutex_owner(&rnp->boost_mtx) == t;
if (&t->rcu_node_entry == rnp->boost_tasks)
- rnp->boost_tasks = np;
+ WRITE_ONCE(rnp->boost_tasks, np);
}
/*
{
return (__this_cpu_read(rcu_data.exp_deferred_qs) ||
READ_ONCE(t->rcu_read_unlock_special.s)) &&
- rcu_preempt_depth() <= 0;
+ rcu_preempt_depth() == 0;
}
/*
static void rcu_preempt_deferred_qs(struct task_struct *t)
{
unsigned long flags;
- bool couldrecurse = rcu_preempt_depth() >= 0;
if (!rcu_preempt_need_deferred_qs(t))
return;
- if (couldrecurse)
- rcu_preempt_depth_set(rcu_preempt_depth() - RCU_NEST_BIAS);
local_irq_save(flags);
rcu_preempt_deferred_qs_irqrestore(t, flags);
- if (couldrecurse)
- rcu_preempt_depth_set(rcu_preempt_depth() + RCU_NEST_BIAS);
}
/*
struct rcu_data *rdp = this_cpu_ptr(&rcu_data);
struct rcu_node *rnp = rdp->mynode;
- exp = (t->rcu_blocked_node && t->rcu_blocked_node->exp_tasks) ||
- (rdp->grpmask & READ_ONCE(rnp->expmask)) ||
- tick_nohz_full_cpu(rdp->cpu);
+ exp = (t->rcu_blocked_node &&
+ READ_ONCE(t->rcu_blocked_node->exp_tasks)) ||
+ (rdp->grpmask & READ_ONCE(rnp->expmask));
// Need to defer quiescent state until everything is enabled.
- if (irqs_were_disabled && use_softirq &&
- (in_interrupt() ||
- (exp && !t->rcu_read_unlock_special.b.deferred_qs))) {
- // Using softirq, safe to awaken, and we get
- // no help from enabling irqs, unlike bh/preempt.
+ if (use_softirq && (in_irq() || (exp && !irqs_were_disabled))) {
+ // Using softirq, safe to awaken, and either the
+ // wakeup is free or there is an expedited GP.
raise_softirq_irqoff(RCU_SOFTIRQ);
} else {
// Enabling BH or preempt does reschedule, so...
- // Also if no expediting or NO_HZ_FULL, slow is OK.
+ // Also if no expediting, slow is OK.
+ // Plus nohz_full CPUs eventually get tick enabled.
set_tsk_need_resched(current);
set_preempt_need_resched();
if (IS_ENABLED(CONFIG_IRQ_WORK) && irqs_were_disabled &&
irq_work_queue_on(&rdp->defer_qs_iw, rdp->cpu);
}
}
- t->rcu_read_unlock_special.b.deferred_qs = true;
local_irq_restore(flags);
return;
}
} else if (rcu_preempt_need_deferred_qs(t)) {
rcu_preempt_deferred_qs(t); /* Report deferred QS. */
return;
- } else if (!rcu_preempt_depth()) {
+ } else if (!WARN_ON_ONCE(rcu_preempt_depth())) {
rcu_qs(); /* Report immediate QS. */
return;
}
pr_info("%s: %d:%d ->qsmask %#lx ->qsmaskinit %#lx ->qsmaskinitnext %#lx\n",
__func__, rnp1->grplo, rnp1->grphi, rnp1->qsmask, rnp1->qsmaskinit, rnp1->qsmaskinitnext);
pr_info("%s: ->gp_tasks %p ->boost_tasks %p ->exp_tasks %p\n",
- __func__, READ_ONCE(rnp->gp_tasks), rnp->boost_tasks,
- rnp->exp_tasks);
+ __func__, READ_ONCE(rnp->gp_tasks), data_race(rnp->boost_tasks),
+ READ_ONCE(rnp->exp_tasks));
pr_info("%s: ->blkd_tasks", __func__);
i = 0;
list_for_each(lhp, &rnp->blkd_tasks) {
this_cpu_write(rcu_data.rcu_urgent_qs, false);
if (unlikely(raw_cpu_read(rcu_data.rcu_need_heavy_qs)))
rcu_momentary_dyntick_idle();
- if (!preempt)
- rcu_tasks_qs(current);
+ rcu_tasks_qs(current, preempt);
out:
trace_rcu_utilization(TPS("End context switch"));
}
for (;;) {
WRITE_ONCE(rnp->boost_kthread_status, RCU_KTHREAD_WAITING);
trace_rcu_utilization(TPS("End boost kthread@rcu_wait"));
- rcu_wait(rnp->boost_tasks || rnp->exp_tasks);
+ rcu_wait(READ_ONCE(rnp->boost_tasks) ||
+ READ_ONCE(rnp->exp_tasks));
trace_rcu_utilization(TPS("Start boost kthread@rcu_wait"));
WRITE_ONCE(rnp->boost_kthread_status, RCU_KTHREAD_RUNNING);
more2boost = rcu_boost(rnp);
(rnp->gp_tasks != NULL &&
rnp->boost_tasks == NULL &&
rnp->qsmask == 0 &&
- (ULONG_CMP_GE(jiffies, rnp->boost_time) || rcu_state.cbovld))) {
+ (!time_after(rnp->boost_time, jiffies) || rcu_state.cbovld))) {
if (rnp->exp_tasks == NULL)
- rnp->boost_tasks = rnp->gp_tasks;
+ WRITE_ONCE(rnp->boost_tasks, rnp->gp_tasks);
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
rcu_wake_cond(rnp->boost_kthread_task,
READ_ONCE(rnp->boost_kthread_status));
#ifdef CONFIG_NO_HZ_FULL
if (tick_nohz_full_cpu(smp_processor_id()) &&
(!rcu_gp_in_progress() ||
- ULONG_CMP_LT(jiffies, READ_ONCE(rcu_state.gp_start) + HZ)))
+ time_before(jiffies, READ_ONCE(rcu_state.gp_start) + HZ)))
return true;
#endif /* #ifdef CONFIG_NO_HZ_FULL */
return false;
}
/* Record the current task on dyntick-idle entry. */
-static void rcu_dynticks_task_enter(void)
+static void noinstr rcu_dynticks_task_enter(void)
{
#if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL)
WRITE_ONCE(current->rcu_tasks_idle_cpu, smp_processor_id());
}
/* Record no current task on dyntick-idle exit. */
-static void rcu_dynticks_task_exit(void)
+static void noinstr rcu_dynticks_task_exit(void)
{
#if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL)
WRITE_ONCE(current->rcu_tasks_idle_cpu, -1);
#endif /* #if defined(CONFIG_TASKS_RCU) && defined(CONFIG_NO_HZ_FULL) */
}
+
+/* Turn on heavyweight RCU tasks trace readers on idle/user entry. */
+static void rcu_dynticks_task_trace_enter(void)
+{
+#ifdef CONFIG_TASKS_RCU_TRACE
+ if (IS_ENABLED(CONFIG_TASKS_TRACE_RCU_READ_MB))
+ current->trc_reader_special.b.need_mb = true;
+#endif /* #ifdef CONFIG_TASKS_RCU_TRACE */
+}
+
+/* Turn off heavyweight RCU tasks trace readers on idle/user exit. */
+static void rcu_dynticks_task_trace_exit(void)
+{
+#ifdef CONFIG_TASKS_RCU_TRACE
+ if (IS_ENABLED(CONFIG_TASKS_TRACE_RCU_READ_MB))
+ current->trc_reader_special.b.need_mb = false;
+#endif /* #ifdef CONFIG_TASKS_RCU_TRACE */
+}
int sysctl_panic_on_rcu_stall __read_mostly;
#ifdef CONFIG_PROVE_RCU
-#define RCU_STALL_DELAY_DELTA (5 * HZ)
+#define RCU_STALL_DELAY_DELTA (5 * HZ)
#else
-#define RCU_STALL_DELAY_DELTA 0
+#define RCU_STALL_DELAY_DELTA 0
#endif
+#define RCU_STALL_MIGHT_DIV 8
+#define RCU_STALL_MIGHT_MIN (2 * HZ)
/* Limit-check stall timeouts specified at boottime and runtime. */
int rcu_jiffies_till_stall_check(void)
}
EXPORT_SYMBOL_GPL(rcu_jiffies_till_stall_check);
+/**
+ * rcu_gp_might_be_stalled - Is it likely that the grace period is stalled?
+ *
+ * Returns @true if the current grace period is sufficiently old that
+ * it is reasonable to assume that it might be stalled. This can be
+ * useful when deciding whether to allocate memory to enable RCU-mediated
+ * freeing on the one hand or just invoking synchronize_rcu() on the other.
+ * The latter is preferable when the grace period is stalled.
+ *
+ * Note that sampling of the .gp_start and .gp_seq fields must be done
+ * carefully to avoid false positives at the beginnings and ends of
+ * grace periods.
+ */
+bool rcu_gp_might_be_stalled(void)
+{
+ unsigned long d = rcu_jiffies_till_stall_check() / RCU_STALL_MIGHT_DIV;
+ unsigned long j = jiffies;
+
+ if (d < RCU_STALL_MIGHT_MIN)
+ d = RCU_STALL_MIGHT_MIN;
+ smp_mb(); // jiffies before .gp_seq to avoid false positives.
+ if (!rcu_gp_in_progress())
+ return false;
+ // Long delays at this point avoids false positive, but a delay
+ // of ULONG_MAX/4 jiffies voids your no-false-positive warranty.
+ smp_mb(); // .gp_seq before second .gp_start
+ // And ditto here.
+ return !time_before(j, READ_ONCE(rcu_state.gp_start) + d);
+}
+
/* Don't do RCU CPU stall warnings during long sysrq printouts. */
void rcu_sysrq_start(void)
{
WRITE_ONCE(rcu_state.gp_start, j);
j1 = rcu_jiffies_till_stall_check();
- /* Record ->gp_start before ->jiffies_stall. */
- smp_store_release(&rcu_state.jiffies_stall, j + j1); /* ^^^ */
+ smp_mb(); // ->gp_start before ->jiffies_stall and caller's ->gp_seq.
+ WRITE_ONCE(rcu_state.jiffies_stall, j + j1);
rcu_state.jiffies_resched = j + j1 / 2;
rcu_state.n_force_qs_gpstart = READ_ONCE(rcu_state.n_force_qs);
}
raw_spin_unlock_irqrestore_rcu_node(rnp, flags);
}
+// Communicate task state back to the RCU CPU stall warning request.
+struct rcu_stall_chk_rdr {
+ int nesting;
+ union rcu_special rs;
+ bool on_blkd_list;
+};
+
+/*
+ * Report out the state of a not-running task that is stalling the
+ * current RCU grace period.
+ */
+static bool check_slow_task(struct task_struct *t, void *arg)
+{
+ struct rcu_node *rnp;
+ struct rcu_stall_chk_rdr *rscrp = arg;
+
+ if (task_curr(t))
+ return false; // It is running, so decline to inspect it.
+ rscrp->nesting = t->rcu_read_lock_nesting;
+ rscrp->rs = t->rcu_read_unlock_special;
+ rnp = t->rcu_blocked_node;
+ rscrp->on_blkd_list = !list_empty(&t->rcu_node_entry);
+ return true;
+}
+
/*
* Scan the current list of tasks blocked within RCU read-side critical
* sections, printing out the tid of each.
*/
static int rcu_print_task_stall(struct rcu_node *rnp)
{
- struct task_struct *t;
int ndetected = 0;
+ struct rcu_stall_chk_rdr rscr;
+ struct task_struct *t;
if (!rcu_preempt_blocked_readers_cgp(rnp))
return 0;
t = list_entry(rnp->gp_tasks->prev,
struct task_struct, rcu_node_entry);
list_for_each_entry_continue(t, &rnp->blkd_tasks, rcu_node_entry) {
- pr_cont(" P%d", t->pid);
+ if (!try_invoke_on_locked_down_task(t, check_slow_task, &rscr))
+ pr_cont(" P%d", t->pid);
+ else
+ pr_cont(" P%d/%d:%c%c%c%c",
+ t->pid, rscr.nesting,
+ ".b"[rscr.rs.b.blocked],
+ ".q"[rscr.rs.b.need_qs],
+ ".e"[rscr.rs.b.exp_hint],
+ ".l"[rscr.on_blkd_list]);
ndetected++;
}
pr_cont("\n");
return gp_state_names[gs];
}
+/* Is the RCU grace-period kthread being starved of CPU time? */
+static bool rcu_is_gp_kthread_starving(unsigned long *jp)
+{
+ unsigned long j = jiffies - READ_ONCE(rcu_state.gp_activity);
+
+ if (jp)
+ *jp = j;
+ return j > 2 * HZ;
+}
+
/*
* Print out diagnostic information for the specified stalled CPU.
*
static void print_cpu_stall_info(int cpu)
{
unsigned long delta;
+ bool falsepositive;
char fast_no_hz[72];
struct rcu_data *rdp = per_cpu_ptr(&rcu_data, cpu);
char *ticks_title;
}
print_cpu_stall_fast_no_hz(fast_no_hz, cpu);
delta = rcu_seq_ctr(rdp->mynode->gp_seq - rdp->rcu_iw_gp_seq);
- pr_err("\t%d-%c%c%c%c: (%lu %s) idle=%03x/%ld/%#lx softirq=%u/%u fqs=%ld %s\n",
+ falsepositive = rcu_is_gp_kthread_starving(NULL) &&
+ rcu_dynticks_in_eqs(rcu_dynticks_snap(rdp));
+ pr_err("\t%d-%c%c%c%c: (%lu %s) idle=%03x/%ld/%#lx softirq=%u/%u fqs=%ld %s%s\n",
cpu,
"O."[!!cpu_online(cpu)],
"o."[!!(rdp->grpmask & rdp->mynode->qsmaskinit)],
rcu_dynticks_snap(rdp) & 0xfff,
rdp->dynticks_nesting, rdp->dynticks_nmi_nesting,
rdp->softirq_snap, kstat_softirqs_cpu(RCU_SOFTIRQ, cpu),
- READ_ONCE(rcu_state.n_force_qs) - rcu_state.n_force_qs_gpstart,
- fast_no_hz);
+ data_race(rcu_state.n_force_qs) - rcu_state.n_force_qs_gpstart,
+ fast_no_hz,
+ falsepositive ? " (false positive?)" : "");
}
/* Complain about starvation of grace-period kthread. */
struct task_struct *gpk = rcu_state.gp_kthread;
unsigned long j;
- j = jiffies - READ_ONCE(rcu_state.gp_activity);
- if (j > 2 * HZ) {
+ if (rcu_is_gp_kthread_starving(&j)) {
pr_err("%s kthread starved for %ld jiffies! g%ld f%#x %s(%d) ->state=%#lx ->cpu=%d\n",
rcu_state.name, j,
(long)rcu_seq_current(&rcu_state.gp_seq),
- READ_ONCE(rcu_state.gp_flags),
+ data_race(rcu_state.gp_flags),
gp_state_getname(rcu_state.gp_state), rcu_state.gp_state,
gpk ? gpk->state : ~0, gpk ? task_cpu(gpk) : -1);
if (gpk) {
+ pr_err("\tUnless %s kthread gets sufficient CPU time, OOM is now expected behavior.\n", rcu_state.name);
pr_err("RCU grace-period kthread stack dump:\n");
sched_show_task(gpk);
wake_up_process(gpk);
}
}
-static void print_other_cpu_stall(unsigned long gp_seq)
+static void print_other_cpu_stall(unsigned long gp_seq, unsigned long gps)
{
int cpu;
unsigned long flags;
for_each_possible_cpu(cpu)
totqlen += rcu_get_n_cbs_cpu(cpu);
pr_cont("\t(detected by %d, t=%ld jiffies, g=%ld, q=%lu)\n",
- smp_processor_id(), (long)(jiffies - rcu_state.gp_start),
+ smp_processor_id(), (long)(jiffies - gps),
(long)rcu_seq_current(&rcu_state.gp_seq), totqlen);
if (ndetected) {
rcu_dump_cpu_stacks();
pr_err("INFO: Stall ended before state dump start\n");
} else {
j = jiffies;
- gpa = READ_ONCE(rcu_state.gp_activity);
+ gpa = data_race(rcu_state.gp_activity);
pr_err("All QSes seen, last %s kthread activity %ld (%ld-%ld), jiffies_till_next_fqs=%ld, root ->qsmask %#lx\n",
rcu_state.name, j - gpa, j, gpa,
- READ_ONCE(jiffies_till_next_fqs),
+ data_race(jiffies_till_next_fqs),
rcu_get_root()->qsmask);
- /* In this case, the current CPU might be at fault. */
- sched_show_task(current);
}
}
/* Rewrite if needed in case of slow consoles. */
rcu_force_quiescent_state(); /* Kick them all. */
}
-static void print_cpu_stall(void)
+static void print_cpu_stall(unsigned long gps)
{
int cpu;
unsigned long flags;
for_each_possible_cpu(cpu)
totqlen += rcu_get_n_cbs_cpu(cpu);
pr_cont("\t(t=%lu jiffies g=%ld q=%lu)\n",
- jiffies - rcu_state.gp_start,
+ jiffies - gps,
(long)rcu_seq_current(&rcu_state.gp_seq), totqlen);
rcu_check_gp_kthread_starvation();
cmpxchg(&rcu_state.jiffies_stall, js, jn) == js) {
/* We haven't checked in, so go dump stack. */
- print_cpu_stall();
+ print_cpu_stall(gps);
if (rcu_cpu_stall_ftrace_dump)
rcu_ftrace_dump(DUMP_ALL);
cmpxchg(&rcu_state.jiffies_stall, js, jn) == js) {
/* They had a few time units to dump stack, so complain. */
- print_other_cpu_stall(gs2);
+ print_other_cpu_stall(gs2, gps);
if (rcu_cpu_stall_ftrace_dump)
rcu_ftrace_dump(DUMP_ALL);
}
struct task_struct *t = READ_ONCE(rcu_state.gp_kthread);
j = jiffies;
- ja = j - READ_ONCE(rcu_state.gp_activity);
- jr = j - READ_ONCE(rcu_state.gp_req_activity);
- jw = j - READ_ONCE(rcu_state.gp_wake_time);
+ ja = j - data_race(rcu_state.gp_activity);
+ jr = j - data_race(rcu_state.gp_req_activity);
+ jw = j - data_race(rcu_state.gp_wake_time);
pr_info("%s: wait state: %s(%d) ->state: %#lx delta ->gp_activity %lu ->gp_req_activity %lu ->gp_wake_time %lu ->gp_wake_seq %ld ->gp_seq %ld ->gp_seq_needed %ld ->gp_flags %#x\n",
rcu_state.name, gp_state_getname(rcu_state.gp_state),
rcu_state.gp_state, t ? t->state : 0x1ffffL,
- ja, jr, jw, (long)READ_ONCE(rcu_state.gp_wake_seq),
- (long)READ_ONCE(rcu_state.gp_seq),
- (long)READ_ONCE(rcu_get_root()->gp_seq_needed),
- READ_ONCE(rcu_state.gp_flags));
+ ja, jr, jw, (long)data_race(rcu_state.gp_wake_seq),
+ (long)data_race(rcu_state.gp_seq),
+ (long)data_race(rcu_get_root()->gp_seq_needed),
+ data_race(rcu_state.gp_flags));
rcu_for_each_node_breadth_first(rnp) {
if (ULONG_CMP_GE(READ_ONCE(rcu_state.gp_seq),
READ_ONCE(rnp->gp_seq_needed)))
continue;
pr_info("\trcu_node %d:%d ->gp_seq %ld ->gp_seq_needed %ld\n",
- rnp->grplo, rnp->grphi, (long)READ_ONCE(rnp->gp_seq),
- (long)READ_ONCE(rnp->gp_seq_needed));
+ rnp->grplo, rnp->grphi, (long)data_race(rnp->gp_seq),
+ (long)data_race(rnp->gp_seq_needed));
if (!rcu_is_leaf_node(rnp))
continue;
for_each_leaf_node_possible_cpu(rnp, cpu) {
READ_ONCE(rdp->gp_seq_needed)))
continue;
pr_info("\tcpu %d ->gp_seq_needed %ld\n",
- cpu, (long)READ_ONCE(rdp->gp_seq_needed));
+ cpu, (long)data_race(rdp->gp_seq_needed));
}
}
for_each_possible_cpu(cpu) {
if (rcu_segcblist_is_offloaded(&rdp->cblist))
show_rcu_nocb_state(rdp);
}
- /* sched_show_task(rcu_state.gp_kthread); */
+ show_rcu_tasks_gp_kthreads();
}
EXPORT_SYMBOL_GPL(show_rcu_gp_kthreads);
#include <linux/sched/isolation.h>
#include <linux/kprobes.h>
#include <linux/slab.h>
+#include <linux/irq_work.h>
#define CREATE_TRACE_POINTS
#endif
#define MODULE_PARAM_PREFIX "rcupdate."
+#ifndef data_race
+#define data_race(expr) \
+ ({ \
+ expr; \
+ })
+#endif
+#ifndef ASSERT_EXCLUSIVE_WRITER
+#define ASSERT_EXCLUSIVE_WRITER(var) do { } while (0)
+#endif
+#ifndef ASSERT_EXCLUSIVE_ACCESS
+#define ASSERT_EXCLUSIVE_ACCESS(var) do { } while (0)
+#endif
+
#ifndef CONFIG_TINY_RCU
module_param(rcu_expedited, int, 0);
module_param(rcu_normal, int, 0);
* rcu_read_lock_held_common() - might we be in RCU-sched read-side critical section?
* @ret: Best guess answer if lockdep cannot be relied on
*
- * Returns true if lockdep must be ignored, in which case *ret contains
+ * Returns true if lockdep must be ignored, in which case ``*ret`` contains
* the best guess described below. Otherwise returns false, in which
- * case *ret tells the caller nothing and the caller should instead
+ * case ``*ret`` tells the caller nothing and the caller should instead
* consult lockdep.
*
- * If CONFIG_DEBUG_LOCK_ALLOC is selected, set *ret to nonzero iff in an
+ * If CONFIG_DEBUG_LOCK_ALLOC is selected, set ``*ret`` to nonzero iff in an
* RCU-sched read-side critical section. In absence of
* CONFIG_DEBUG_LOCK_ALLOC, this assumes we are in an RCU-sched read-side
* critical section unless it can prove otherwise. Note that disabling
*
* Note that if the CPU is in the idle loop from an RCU point of view (ie:
* that we are in the section between rcu_idle_enter() and rcu_idle_exit())
- * then rcu_read_lock_held() sets *ret to false even if the CPU did an
+ * then rcu_read_lock_held() sets ``*ret`` to false even if the CPU did an
* rcu_read_lock(). The reason for this is that RCU ignores CPUs that are
* in such a section, considering these as in extended quiescent state,
* so such a CPU is effectively never in an RCU read-side critical section
static bool rcu_read_lock_held_common(bool *ret)
{
if (!debug_lockdep_rcu_enabled()) {
- *ret = 1;
+ *ret = true;
return true;
}
if (!rcu_is_watching()) {
- *ret = 0;
+ *ret = false;
return true;
}
if (!rcu_lockdep_current_cpu_online()) {
- *ret = 0;
+ *ret = false;
return true;
}
return false;
STATIC_LOCKDEP_MAP_INIT("rcu_callback", &rcu_callback_key);
EXPORT_SYMBOL_GPL(rcu_callback_map);
-int notrace debug_lockdep_rcu_enabled(void)
+noinstr int notrace debug_lockdep_rcu_enabled(void)
{
return rcu_scheduler_active != RCU_SCHEDULER_INACTIVE && debug_locks &&
current->lockdep_recursion == 0;
}
EXPORT_SYMBOL_GPL(debug_lockdep_rcu_enabled);
-NOKPROBE_SYMBOL(debug_lockdep_rcu_enabled);
/**
* rcu_read_lock_held() - might we be in RCU read-side critical section?
EXPORT_SYMBOL_GPL(rcu_cpu_stall_suppress_at_boot);
module_param(rcu_cpu_stall_suppress_at_boot, int, 0444);
-#ifdef CONFIG_TASKS_RCU
-
-/*
- * Simple variant of RCU whose quiescent states are voluntary context
- * switch, cond_resched_rcu_qs(), user-space execution, and idle.
- * As such, grace periods can take one good long time. There are no
- * read-side primitives similar to rcu_read_lock() and rcu_read_unlock()
- * because this implementation is intended to get the system into a safe
- * state for some of the manipulations involved in tracing and the like.
- * Finally, this implementation does not support high call_rcu_tasks()
- * rates from multiple CPUs. If this is required, per-CPU callback lists
- * will be needed.
- */
-
-/* Global list of callbacks and associated lock. */
-static struct rcu_head *rcu_tasks_cbs_head;
-static struct rcu_head **rcu_tasks_cbs_tail = &rcu_tasks_cbs_head;
-static DECLARE_WAIT_QUEUE_HEAD(rcu_tasks_cbs_wq);
-static DEFINE_RAW_SPINLOCK(rcu_tasks_cbs_lock);
-
-/* Track exiting tasks in order to allow them to be waited for. */
-DEFINE_STATIC_SRCU(tasks_rcu_exit_srcu);
-
-/* Control stall timeouts. Disable with <= 0, otherwise jiffies till stall. */
-#define RCU_TASK_STALL_TIMEOUT (HZ * 60 * 10)
-static int rcu_task_stall_timeout __read_mostly = RCU_TASK_STALL_TIMEOUT;
-module_param(rcu_task_stall_timeout, int, 0644);
-
-static struct task_struct *rcu_tasks_kthread_ptr;
-
-/**
- * call_rcu_tasks() - Queue an RCU for invocation task-based grace period
- * @rhp: structure to be used for queueing the RCU updates.
- * @func: actual callback function to be invoked after the grace period
- *
- * The callback function will be invoked some time after a full grace
- * period elapses, in other words after all currently executing RCU
- * read-side critical sections have completed. call_rcu_tasks() assumes
- * that the read-side critical sections end at a voluntary context
- * switch (not a preemption!), cond_resched_rcu_qs(), entry into idle,
- * or transition to usermode execution. As such, there are no read-side
- * primitives analogous to rcu_read_lock() and rcu_read_unlock() because
- * this primitive is intended to determine that all tasks have passed
- * through a safe state, not so much for data-strcuture synchronization.
- *
- * See the description of call_rcu() for more detailed information on
- * memory ordering guarantees.
- */
-void call_rcu_tasks(struct rcu_head *rhp, rcu_callback_t func)
-{
- unsigned long flags;
- bool needwake;
-
- rhp->next = NULL;
- rhp->func = func;
- raw_spin_lock_irqsave(&rcu_tasks_cbs_lock, flags);
- needwake = !rcu_tasks_cbs_head;
- WRITE_ONCE(*rcu_tasks_cbs_tail, rhp);
- rcu_tasks_cbs_tail = &rhp->next;
- raw_spin_unlock_irqrestore(&rcu_tasks_cbs_lock, flags);
- /* We can't create the thread unless interrupts are enabled. */
- if (needwake && READ_ONCE(rcu_tasks_kthread_ptr))
- wake_up(&rcu_tasks_cbs_wq);
-}
-EXPORT_SYMBOL_GPL(call_rcu_tasks);
-
-/**
- * synchronize_rcu_tasks - wait until an rcu-tasks grace period has elapsed.
- *
- * Control will return to the caller some time after a full rcu-tasks
- * grace period has elapsed, in other words after all currently
- * executing rcu-tasks read-side critical sections have elapsed. These
- * read-side critical sections are delimited by calls to schedule(),
- * cond_resched_tasks_rcu_qs(), idle execution, userspace execution, calls
- * to synchronize_rcu_tasks(), and (in theory, anyway) cond_resched().
- *
- * This is a very specialized primitive, intended only for a few uses in
- * tracing and other situations requiring manipulation of function
- * preambles and profiling hooks. The synchronize_rcu_tasks() function
- * is not (yet) intended for heavy use from multiple CPUs.
- *
- * Note that this guarantee implies further memory-ordering guarantees.
- * On systems with more than one CPU, when synchronize_rcu_tasks() returns,
- * each CPU is guaranteed to have executed a full memory barrier since the
- * end of its last RCU-tasks read-side critical section whose beginning
- * preceded the call to synchronize_rcu_tasks(). In addition, each CPU
- * having an RCU-tasks read-side critical section that extends beyond
- * the return from synchronize_rcu_tasks() is guaranteed to have executed
- * a full memory barrier after the beginning of synchronize_rcu_tasks()
- * and before the beginning of that RCU-tasks read-side critical section.
- * Note that these guarantees include CPUs that are offline, idle, or
- * executing in user mode, as well as CPUs that are executing in the kernel.
- *
- * Furthermore, if CPU A invoked synchronize_rcu_tasks(), which returned
- * to its caller on CPU B, then both CPU A and CPU B are guaranteed
- * to have executed a full memory barrier during the execution of
- * synchronize_rcu_tasks() -- even if CPU A and CPU B are the same CPU
- * (but again only if the system has more than one CPU).
- */
-void synchronize_rcu_tasks(void)
-{
- /* Complain if the scheduler has not started. */
- RCU_LOCKDEP_WARN(rcu_scheduler_active == RCU_SCHEDULER_INACTIVE,
- "synchronize_rcu_tasks called too soon");
-
- /* Wait for the grace period. */
- wait_rcu_gp(call_rcu_tasks);
-}
-EXPORT_SYMBOL_GPL(synchronize_rcu_tasks);
-
-/**
- * rcu_barrier_tasks - Wait for in-flight call_rcu_tasks() callbacks.
- *
- * Although the current implementation is guaranteed to wait, it is not
- * obligated to, for example, if there are no pending callbacks.
- */
-void rcu_barrier_tasks(void)
-{
- /* There is only one callback queue, so this is easy. ;-) */
- synchronize_rcu_tasks();
-}
-EXPORT_SYMBOL_GPL(rcu_barrier_tasks);
-
-/* See if tasks are still holding out, complain if so. */
-static void check_holdout_task(struct task_struct *t,
- bool needreport, bool *firstreport)
-{
- int cpu;
-
- if (!READ_ONCE(t->rcu_tasks_holdout) ||
- t->rcu_tasks_nvcsw != READ_ONCE(t->nvcsw) ||
- !READ_ONCE(t->on_rq) ||
- (IS_ENABLED(CONFIG_NO_HZ_FULL) &&
- !is_idle_task(t) && t->rcu_tasks_idle_cpu >= 0)) {
- WRITE_ONCE(t->rcu_tasks_holdout, false);
- list_del_init(&t->rcu_tasks_holdout_list);
- put_task_struct(t);
- return;
- }
- rcu_request_urgent_qs_task(t);
- if (!needreport)
- return;
- if (*firstreport) {
- pr_err("INFO: rcu_tasks detected stalls on tasks:\n");
- *firstreport = false;
- }
- cpu = task_cpu(t);
- pr_alert("%p: %c%c nvcsw: %lu/%lu holdout: %d idle_cpu: %d/%d\n",
- t, ".I"[is_idle_task(t)],
- "N."[cpu < 0 || !tick_nohz_full_cpu(cpu)],
- t->rcu_tasks_nvcsw, t->nvcsw, t->rcu_tasks_holdout,
- t->rcu_tasks_idle_cpu, cpu);
- sched_show_task(t);
-}
-
-/* RCU-tasks kthread that detects grace periods and invokes callbacks. */
-static int __noreturn rcu_tasks_kthread(void *arg)
-{
- unsigned long flags;
- struct task_struct *g, *t;
- unsigned long lastreport;
- struct rcu_head *list;
- struct rcu_head *next;
- LIST_HEAD(rcu_tasks_holdouts);
- int fract;
-
- /* Run on housekeeping CPUs by default. Sysadm can move if desired. */
- housekeeping_affine(current, HK_FLAG_RCU);
-
- /*
- * Each pass through the following loop makes one check for
- * newly arrived callbacks, and, if there are some, waits for
- * one RCU-tasks grace period and then invokes the callbacks.
- * This loop is terminated by the system going down. ;-)
- */
- for (;;) {
-
- /* Pick up any new callbacks. */
- raw_spin_lock_irqsave(&rcu_tasks_cbs_lock, flags);
- list = rcu_tasks_cbs_head;
- rcu_tasks_cbs_head = NULL;
- rcu_tasks_cbs_tail = &rcu_tasks_cbs_head;
- raw_spin_unlock_irqrestore(&rcu_tasks_cbs_lock, flags);
-
- /* If there were none, wait a bit and start over. */
- if (!list) {
- wait_event_interruptible(rcu_tasks_cbs_wq,
- READ_ONCE(rcu_tasks_cbs_head));
- if (!rcu_tasks_cbs_head) {
- WARN_ON(signal_pending(current));
- schedule_timeout_interruptible(HZ/10);
- }
- continue;
- }
-
- /*
- * Wait for all pre-existing t->on_rq and t->nvcsw
- * transitions to complete. Invoking synchronize_rcu()
- * suffices because all these transitions occur with
- * interrupts disabled. Without this synchronize_rcu(),
- * a read-side critical section that started before the
- * grace period might be incorrectly seen as having started
- * after the grace period.
- *
- * This synchronize_rcu() also dispenses with the
- * need for a memory barrier on the first store to
- * ->rcu_tasks_holdout, as it forces the store to happen
- * after the beginning of the grace period.
- */
- synchronize_rcu();
-
- /*
- * There were callbacks, so we need to wait for an
- * RCU-tasks grace period. Start off by scanning
- * the task list for tasks that are not already
- * voluntarily blocked. Mark these tasks and make
- * a list of them in rcu_tasks_holdouts.
- */
- rcu_read_lock();
- for_each_process_thread(g, t) {
- if (t != current && READ_ONCE(t->on_rq) &&
- !is_idle_task(t)) {
- get_task_struct(t);
- t->rcu_tasks_nvcsw = READ_ONCE(t->nvcsw);
- WRITE_ONCE(t->rcu_tasks_holdout, true);
- list_add(&t->rcu_tasks_holdout_list,
- &rcu_tasks_holdouts);
- }
- }
- rcu_read_unlock();
-
- /*
- * Wait for tasks that are in the process of exiting.
- * This does only part of the job, ensuring that all
- * tasks that were previously exiting reach the point
- * where they have disabled preemption, allowing the
- * later synchronize_rcu() to finish the job.
- */
- synchronize_srcu(&tasks_rcu_exit_srcu);
-
- /*
- * Each pass through the following loop scans the list
- * of holdout tasks, removing any that are no longer
- * holdouts. When the list is empty, we are done.
- */
- lastreport = jiffies;
-
- /* Start off with HZ/10 wait and slowly back off to 1 HZ wait*/
- fract = 10;
-
- for (;;) {
- bool firstreport;
- bool needreport;
- int rtst;
- struct task_struct *t1;
-
- if (list_empty(&rcu_tasks_holdouts))
- break;
-
- /* Slowly back off waiting for holdouts */
- schedule_timeout_interruptible(HZ/fract);
-
- if (fract > 1)
- fract--;
-
- rtst = READ_ONCE(rcu_task_stall_timeout);
- needreport = rtst > 0 &&
- time_after(jiffies, lastreport + rtst);
- if (needreport)
- lastreport = jiffies;
- firstreport = true;
- WARN_ON(signal_pending(current));
- list_for_each_entry_safe(t, t1, &rcu_tasks_holdouts,
- rcu_tasks_holdout_list) {
- check_holdout_task(t, needreport, &firstreport);
- cond_resched();
- }
- }
-
- /*
- * Because ->on_rq and ->nvcsw are not guaranteed
- * to have a full memory barriers prior to them in the
- * schedule() path, memory reordering on other CPUs could
- * cause their RCU-tasks read-side critical sections to
- * extend past the end of the grace period. However,
- * because these ->nvcsw updates are carried out with
- * interrupts disabled, we can use synchronize_rcu()
- * to force the needed ordering on all such CPUs.
- *
- * This synchronize_rcu() also confines all
- * ->rcu_tasks_holdout accesses to be within the grace
- * period, avoiding the need for memory barriers for
- * ->rcu_tasks_holdout accesses.
- *
- * In addition, this synchronize_rcu() waits for exiting
- * tasks to complete their final preempt_disable() region
- * of execution, cleaning up after the synchronize_srcu()
- * above.
- */
- synchronize_rcu();
-
- /* Invoke the callbacks. */
- while (list) {
- next = list->next;
- local_bh_disable();
- list->func(list);
- local_bh_enable();
- list = next;
- cond_resched();
- }
- /* Paranoid sleep to keep this from entering a tight loop */
- schedule_timeout_uninterruptible(HZ/10);
- }
-}
-
-/* Spawn rcu_tasks_kthread() at core_initcall() time. */
-static int __init rcu_spawn_tasks_kthread(void)
-{
- struct task_struct *t;
-
- t = kthread_run(rcu_tasks_kthread, NULL, "rcu_tasks_kthread");
- if (WARN_ONCE(IS_ERR(t), "%s: Could not start Tasks-RCU grace-period kthread, OOM is now expected behavior\n", __func__))
- return 0;
- smp_mb(); /* Ensure others see full kthread. */
- WRITE_ONCE(rcu_tasks_kthread_ptr, t);
- return 0;
-}
-core_initcall(rcu_spawn_tasks_kthread);
-
-/* Do the srcu_read_lock() for the above synchronize_srcu(). */
-void exit_tasks_rcu_start(void) __acquires(&tasks_rcu_exit_srcu)
-{
- preempt_disable();
- current->rcu_tasks_idx = __srcu_read_lock(&tasks_rcu_exit_srcu);
- preempt_enable();
-}
-
-/* Do the srcu_read_unlock() for the above synchronize_srcu(). */
-void exit_tasks_rcu_finish(void) __releases(&tasks_rcu_exit_srcu)
-{
- preempt_disable();
- __srcu_read_unlock(&tasks_rcu_exit_srcu, current->rcu_tasks_idx);
- preempt_enable();
-}
-
-#endif /* #ifdef CONFIG_TASKS_RCU */
-
-#ifndef CONFIG_TINY_RCU
-
-/*
- * Print any non-default Tasks RCU settings.
- */
-static void __init rcu_tasks_bootup_oddness(void)
-{
-#ifdef CONFIG_TASKS_RCU
- if (rcu_task_stall_timeout != RCU_TASK_STALL_TIMEOUT)
- pr_info("\tTasks-RCU CPU stall warnings timeout set to %d (rcu_task_stall_timeout).\n", rcu_task_stall_timeout);
- else
- pr_info("\tTasks RCU enabled.\n");
-#endif /* #ifdef CONFIG_TASKS_RCU */
-}
-
-#endif /* #ifndef CONFIG_TINY_RCU */
-
#ifdef CONFIG_PROVE_RCU
/*
void rcu_early_boot_tests(void) {}
#endif /* CONFIG_PROVE_RCU */
+#include "tasks.h"
+
#ifndef CONFIG_TINY_RCU
/*
pr_emerg("Restarting system\n");
else
pr_emerg("Restarting system with command '%s'\n", cmd);
- kmsg_dump(KMSG_DUMP_RESTART);
+ kmsg_dump(KMSG_DUMP_SHUTDOWN);
machine_restart(cmd);
}
EXPORT_SYMBOL_GPL(kernel_restart);
migrate_to_reboot_cpu();
syscore_shutdown();
pr_emerg("System halted\n");
- kmsg_dump(KMSG_DUMP_HALT);
+ kmsg_dump(KMSG_DUMP_SHUTDOWN);
machine_halt();
}
EXPORT_SYMBOL_GPL(kernel_halt);
migrate_to_reboot_cpu();
syscore_shutdown();
pr_emerg("Power down\n");
- kmsg_dump(KMSG_DUMP_POWEROFF);
+ kmsg_dump(KMSG_DUMP_SHUTDOWN);
machine_power_off();
}
EXPORT_SYMBOL_GPL(kernel_power_off);
*
* Pairs with the LOCK+smp_mb__after_spinlock() on rq->lock in
* __schedule(). See the comment for smp_mb__after_spinlock().
+ *
+ * A similar smb_rmb() lives in try_invoke_on_locked_down_task().
*/
smp_rmb();
if (p->on_rq && ttwu_remote(p, wake_flags))
return success;
}
+/**
+ * try_invoke_on_locked_down_task - Invoke a function on task in fixed state
+ * @p: Process for which the function is to be invoked.
+ * @func: Function to invoke.
+ * @arg: Argument to function.
+ *
+ * If the specified task can be quickly locked into a definite state
+ * (either sleeping or on a given runqueue), arrange to keep it in that
+ * state while invoking @func(@arg). This function can use ->on_rq and
+ * task_curr() to work out what the state is, if required. Given that
+ * @func can be invoked with a runqueue lock held, it had better be quite
+ * lightweight.
+ *
+ * Returns:
+ * @false if the task slipped out from under the locks.
+ * @true if the task was locked onto a runqueue or is sleeping.
+ * However, @func can override this by returning @false.
+ */
+bool try_invoke_on_locked_down_task(struct task_struct *p, bool (*func)(struct task_struct *t, void *arg), void *arg)
+{
+ bool ret = false;
+ struct rq_flags rf;
+ struct rq *rq;
+
+ lockdep_assert_irqs_enabled();
+ raw_spin_lock_irq(&p->pi_lock);
+ if (p->on_rq) {
+ rq = __task_rq_lock(p, &rf);
+ if (task_rq(p) == rq)
+ ret = func(p, arg);
+ rq_unlock(rq, &rf);
+ } else {
+ switch (p->state) {
+ case TASK_RUNNING:
+ case TASK_WAKING:
+ break;
+ default:
+ smp_rmb(); // See smp_rmb() comment in try_to_wake_up().
+ if (!p->on_rq)
+ ret = func(p, arg);
+ }
+ }
+ raw_spin_unlock_irq(&p->pi_lock);
+ return ret;
+}
+
/**
* wake_up_process - Wake up a specific process
* @p: The process to be woken up.
P(se.avg.util_est.enqueued);
#endif
#ifdef CONFIG_UCLAMP_TASK
- __PS("uclamp.min", p->uclamp[UCLAMP_MIN].value);
- __PS("uclamp.max", p->uclamp[UCLAMP_MAX].value);
+ __PS("uclamp.min", p->uclamp_req[UCLAMP_MIN].value);
+ __PS("uclamp.max", p->uclamp_req[UCLAMP_MAX].value);
__PS("effective uclamp.min", uclamp_eff_value(p, UCLAMP_MIN));
__PS("effective uclamp.max", uclamp_eff_value(p, UCLAMP_MAX));
#endif
/*
* We don't care about NUMA placement if we don't have memory.
*/
- if (!curr->mm || (curr->flags & PF_EXITING) || work->next != work)
+ if ((curr->flags & (PF_EXITING | PF_KTHREAD)) || work->next != work)
return;
/*
struct rq *rq = rq_of(cfs_rq);
struct cfs_bandwidth *cfs_b = tg_cfs_bandwidth(cfs_rq->tg);
struct sched_entity *se;
- int enqueue = 1;
long task_delta, idle_task_delta;
se = cfs_rq->tg->se[cpu_of(rq)];
idle_task_delta = cfs_rq->idle_h_nr_running;
for_each_sched_entity(se) {
if (se->on_rq)
- enqueue = 0;
+ break;
+ cfs_rq = cfs_rq_of(se);
+ enqueue_entity(cfs_rq, se, ENQUEUE_WAKEUP);
+
+ cfs_rq->h_nr_running += task_delta;
+ cfs_rq->idle_h_nr_running += idle_task_delta;
+
+ /* end evaluation on encountering a throttled cfs_rq */
+ if (cfs_rq_throttled(cfs_rq))
+ goto unthrottle_throttle;
+ }
+ for_each_sched_entity(se) {
cfs_rq = cfs_rq_of(se);
- if (enqueue) {
- enqueue_entity(cfs_rq, se, ENQUEUE_WAKEUP);
- } else {
- update_load_avg(cfs_rq, se, 0);
- se_update_runnable(se);
- }
+
+ update_load_avg(cfs_rq, se, UPDATE_TG);
+ se_update_runnable(se);
cfs_rq->h_nr_running += task_delta;
cfs_rq->idle_h_nr_running += idle_task_delta;
+
+ /* end evaluation on encountering a throttled cfs_rq */
if (cfs_rq_throttled(cfs_rq))
- break;
+ goto unthrottle_throttle;
+
+ /*
+ * One parent has been throttled and cfs_rq removed from the
+ * list. Add it back to not break the leaf list.
+ */
+ if (throttled_hierarchy(cfs_rq))
+ list_add_leaf_cfs_rq(cfs_rq);
}
- if (!se)
- add_nr_running(rq, task_delta);
+ /* At this point se is NULL and we are at root level*/
+ add_nr_running(rq, task_delta);
+unthrottle_throttle:
/*
* The cfs_rq_throttled() breaks in the above iteration can result in
* incomplete leaf list maintenance, resulting in triggering the
for_each_sched_entity(se) {
cfs_rq = cfs_rq_of(se);
- list_add_leaf_cfs_rq(cfs_rq);
+ if (list_add_leaf_cfs_rq(cfs_rq))
+ break;
}
assert_list_leaf_cfs_rq(rq);
/* end evaluation on encountering a throttled cfs_rq */
if (cfs_rq_throttled(cfs_rq))
goto enqueue_throttle;
+
+ /*
+ * One parent has been throttled and cfs_rq removed from the
+ * list. Add it back to not break the leaf list.
+ */
+ if (throttled_hierarchy(cfs_rq))
+ list_add_leaf_cfs_rq(cfs_rq);
}
enqueue_throttle:
* early_boot_irqs_disabled is set. Use local_irq_save/restore() instead
* of local_irq_disable/enable().
*/
-void on_each_cpu(void (*func) (void *info), void *info, int wait)
+void on_each_cpu(smp_call_func_t func, void *info, int wait)
{
unsigned long flags;
config NOP_TRACER
bool
-config HAVE_FTRACE_NMI_ENTER
- bool
- help
- See Documentation/trace/ftrace-design.rst
-
config HAVE_FUNCTION_TRACER
bool
help
select TRACE_CLOCK
select IRQ_WORK
-config FTRACE_NMI_ENTER
- bool
- depends on HAVE_FTRACE_NMI_ENTER
- default y
-
config EVENT_TRACING
select CONTEXT_SWITCH_TRACER
select GLOB
select CONTEXT_SWITCH_TRACER
select GLOB
select TASKS_RCU if PREEMPTION
+ select TASKS_RUDE_RCU
help
Enable the kernel to trace every kernel function. This is done
by using a compiler feature to insert a small, 5-byte No-Operation
/*
* Only limited trace_printk() conversion specifiers allowed:
- * %d %i %u %x %ld %li %lu %lx %lld %lli %llu %llx %p %s
+ * %d %i %u %x %ld %li %lu %lx %lld %lli %llu %llx %p %pks %pus %s
*/
BPF_CALL_5(bpf_trace_printk, char *, fmt, u32, fmt_size, u64, arg1,
u64, arg2, u64, arg3)
{
+ int i, mod[3] = {}, fmt_cnt = 0;
+ char buf[64], fmt_ptype;
+ void *unsafe_ptr = NULL;
bool str_seen = false;
- int mod[3] = {};
- int fmt_cnt = 0;
- u64 unsafe_addr;
- char buf[64];
- int i;
/*
* bpf_check()->check_func_arg()->check_stack_boundary()
if (fmt[i] == 'l') {
mod[fmt_cnt]++;
i++;
- } else if (fmt[i] == 'p' || fmt[i] == 's') {
+ } else if (fmt[i] == 'p') {
mod[fmt_cnt]++;
+ if ((fmt[i + 1] == 'k' ||
+ fmt[i + 1] == 'u') &&
+ fmt[i + 2] == 's') {
+ fmt_ptype = fmt[i + 1];
+ i += 2;
+ goto fmt_str;
+ }
+
/* disallow any further format extensions */
if (fmt[i + 1] != 0 &&
!isspace(fmt[i + 1]) &&
!ispunct(fmt[i + 1]))
return -EINVAL;
- fmt_cnt++;
- if (fmt[i] == 's') {
- if (str_seen)
- /* allow only one '%s' per fmt string */
- return -EINVAL;
- str_seen = true;
-
- switch (fmt_cnt) {
- case 1:
- unsafe_addr = arg1;
- arg1 = (long) buf;
- break;
- case 2:
- unsafe_addr = arg2;
- arg2 = (long) buf;
- break;
- case 3:
- unsafe_addr = arg3;
- arg3 = (long) buf;
- break;
- }
- buf[0] = 0;
- strncpy_from_unsafe(buf,
- (void *) (long) unsafe_addr,
+
+ goto fmt_next;
+ } else if (fmt[i] == 's') {
+ mod[fmt_cnt]++;
+ fmt_ptype = fmt[i];
+fmt_str:
+ if (str_seen)
+ /* allow only one '%s' per fmt string */
+ return -EINVAL;
+ str_seen = true;
+
+ if (fmt[i + 1] != 0 &&
+ !isspace(fmt[i + 1]) &&
+ !ispunct(fmt[i + 1]))
+ return -EINVAL;
+
+ switch (fmt_cnt) {
+ case 0:
+ unsafe_ptr = (void *)(long)arg1;
+ arg1 = (long)buf;
+ break;
+ case 1:
+ unsafe_ptr = (void *)(long)arg2;
+ arg2 = (long)buf;
+ break;
+ case 2:
+ unsafe_ptr = (void *)(long)arg3;
+ arg3 = (long)buf;
+ break;
+ }
+
+ buf[0] = 0;
+ switch (fmt_ptype) {
+ case 's':
+#ifdef CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
+ strncpy_from_unsafe(buf, unsafe_ptr,
sizeof(buf));
+ break;
+#endif
+ case 'k':
+ strncpy_from_unsafe_strict(buf, unsafe_ptr,
+ sizeof(buf));
+ break;
+ case 'u':
+ strncpy_from_unsafe_user(buf,
+ (__force void __user *)unsafe_ptr,
+ sizeof(buf));
+ break;
}
- continue;
+ goto fmt_next;
}
if (fmt[i] == 'l') {
if (fmt[i] != 'i' && fmt[i] != 'd' &&
fmt[i] != 'u' && fmt[i] != 'x')
return -EINVAL;
+fmt_next:
fmt_cnt++;
}
return &bpf_probe_read_user_proto;
case BPF_FUNC_probe_read_kernel:
return &bpf_probe_read_kernel_proto;
- case BPF_FUNC_probe_read:
- return &bpf_probe_read_compat_proto;
case BPF_FUNC_probe_read_user_str:
return &bpf_probe_read_user_str_proto;
case BPF_FUNC_probe_read_kernel_str:
return &bpf_probe_read_kernel_str_proto;
+#ifdef CONFIG_ARCH_HAS_NON_OVERLAPPING_ADDRESS_SPACE
+ case BPF_FUNC_probe_read:
+ return &bpf_probe_read_compat_proto;
case BPF_FUNC_probe_read_str:
return &bpf_probe_read_compat_str_proto;
+#endif
#ifdef CONFIG_CGROUPS
case BPF_FUNC_get_current_cgroup_id:
return &bpf_get_current_cgroup_id_proto;
u32 *ids, prog_cnt, ids_len;
int ret;
- if (!capable(CAP_SYS_ADMIN))
+ if (!perfmon_capable())
return -EPERM;
if (event->attr.type != PERF_TYPE_TRACEPOINT)
return -EINVAL;
op->saved_func(ip, parent_ip, op, regs);
}
-static void ftrace_sync(struct work_struct *work)
-{
- /*
- * This function is just a stub to implement a hard force
- * of synchronize_rcu(). This requires synchronizing
- * tasks even in userspace and idle.
- *
- * Yes, function tracing is rude.
- */
-}
-
static void ftrace_sync_ipi(void *data)
{
/* Probably not needed, but do it anyway */
* Make sure all CPUs see this. Yes this is slow, but static
* tracing is slow and nasty to have enabled.
*/
- schedule_on_each_cpu(ftrace_sync);
+ synchronize_rcu_tasks_rude();
/* Now all cpus are using the list ops. */
function_trace_op = set_function_trace_op;
/* Make sure the function_trace_op is visible on all CPUs */
* infrastructure to do the synchronization, thus we must do it
* ourselves.
*/
- schedule_on_each_cpu(ftrace_sync);
+ synchronize_rcu_tasks_rude();
/*
* When the kernel is preeptive, tasks can be preempted
* infrastructure to do the synchronization, thus we must do it
* ourselves.
*/
- schedule_on_each_cpu(ftrace_sync);
+ synchronize_rcu_tasks_rude();
free_ftrace_hash(old_hash);
}
#ifdef CONFIG_FUNCTION_TRACER
-/*
- * Traverse the ftrace_global_list, invoking all entries. The reason that we
- * can use rcu_dereference_raw_check() is that elements removed from this list
- * are simply leaked, so there is no need to interact with a grace-period
- * mechanism. The rcu_dereference_raw_check() calls are needed to handle
- * concurrent insertions into the ftrace_global_list.
- *
- * Silly Alpha and silly pointer-speculation compiler optimizations!
- */
-#define do_for_each_ftrace_op(op, list) \
- op = rcu_dereference_raw_check(list); \
- do
-
-/*
- * Optimized for just a single item in the list (as that is the normal case).
- */
-#define while_for_each_ftrace_op(op) \
- while (likely(op = rcu_dereference_raw_check((op)->next)) && \
- unlikely((op) != &ftrace_list_end))
-
-extern struct ftrace_ops __rcu *ftrace_ops_list;
-extern struct ftrace_ops ftrace_list_end;
extern struct mutex ftrace_lock;
extern struct ftrace_ops global_ops;
#include <linux/printk.h>
#include <linux/string.h>
#include <linux/sysfs.h>
+#include <linux/completion.h>
static ulong delay = 100;
static char test_mode[12] = "irq";
MODULE_PARM_DESC(test_mode, "Mode of the test such as preempt, irq, or alternate (default irq)");
MODULE_PARM_DESC(burst_size, "The size of a burst (default 1)");
+static struct completion done;
+
#define MIN(x, y) ((x) < (y) ? (x) : (y))
static void busy_wait(ulong time)
for (i = 0; i < s; i++)
(testfuncs[i])(i);
+ complete(&done);
+
set_current_state(TASK_INTERRUPTIBLE);
while (!kthread_should_stop()) {
schedule();
static int preemptirq_run_test(void)
{
struct task_struct *task;
-
char task_name[50];
+ init_completion(&done);
+
snprintf(task_name, sizeof(task_name), "%s_test", test_mode);
task = kthread_run(preemptirq_delay_run, NULL, task_name);
if (IS_ERR(task))
return PTR_ERR(task);
- if (task)
+ if (task) {
+ wait_for_completion(&done);
kthread_stop(task);
+ }
return 0;
}
case RINGBUF_TYPE_DATA:
return rb_event_data_length(event);
default:
- BUG();
+ WARN_ON_ONCE(1);
}
/* not hit */
return 0;
{
if (extended_time(event))
event = skip_time_extend(event);
- BUG_ON(event->type_len > RINGBUF_TYPE_DATA_TYPE_LEN_MAX);
+ WARN_ON_ONCE(event->type_len > RINGBUF_TYPE_DATA_TYPE_LEN_MAX);
/* If length is in len field, then array[0] has the data */
if (event->type_len)
return (void *)&event->array[0];
return;
default:
- BUG();
+ RB_WARN_ON(cpu_buffer, 1);
}
return;
}
return;
default:
- BUG();
+ RB_WARN_ON(iter->cpu_buffer, 1);
}
return;
}
return event;
default:
- BUG();
+ RB_WARN_ON(cpu_buffer, 1);
}
return NULL;
struct ring_buffer_per_cpu *cpu_buffer;
struct ring_buffer_event *event;
int nr_loops = 0;
- bool failed = false;
if (ts)
*ts = 0;
return NULL;
/*
- * We repeat when a time extend is encountered or we hit
- * the end of the page. Since the time extend is always attached
- * to a data event, we should never loop more than three times.
- * Once for going to next page, once on time extend, and
- * finally once to get the event.
- * We should never hit the following condition more than thrice,
- * unless the buffer is very small, and there's a writer
- * that is causing the reader to fail getting an event.
+ * As the writer can mess with what the iterator is trying
+ * to read, just give up if we fail to get an event after
+ * three tries. The iterator is not as reliable when reading
+ * the ring buffer with an active write as the consumer is.
+ * Do not warn if the three failures is reached.
*/
- if (++nr_loops > 3) {
- RB_WARN_ON(cpu_buffer, !failed);
+ if (++nr_loops > 3)
return NULL;
- }
if (rb_per_cpu_empty(cpu_buffer))
return NULL;
}
event = rb_iter_head_event(iter);
- if (!event) {
- failed = true;
+ if (!event)
goto again;
- }
switch (event->type_len) {
case RINGBUF_TYPE_PADDING:
return event;
default:
- BUG();
+ RB_WARN_ON(cpu_buffer, 1);
}
return NULL;
{
struct umh_info *umh_info = info->data;
+ /* cleanup if umh_pipe_setup() was successful but exec failed */
+ if (info->pid && info->retval) {
+ fput(umh_info->pipe_to_umh);
+ fput(umh_info->pipe_from_umh);
+ }
+
argv_free(info->argv);
umh_info->pid = info->pid;
}
Benchmark all available RAID6 PQ functions on init and choose the
fastest one.
+config LINEAR_RANGES
+ tristate
+
config PACKING
bool "Generic bitfield packing and unpacking"
default n
For more information, see
tools/objtool/Documentation/stack-validation.txt.
+config VMLINUX_VALIDATION
+ bool
+ depends on STACK_VALIDATION && DEBUG_ENTRY && !PARAVIRT
+ default y
+
config DEBUG_FORCE_WEAK_PER_CPU
bool "Force weak per-cpu definitions"
depends on DEBUG_KERNEL
If unsure, say N.
+config LINEAR_RANGES_TEST
+ tristate "KUnit test for linear_ranges"
+ depends on KUNIT
+ select LINEAR_RANGES
+ help
+ This builds the linear_ranges unit test, which runs on boot.
+ Tests the linear_ranges logic correctness.
+ For more information on KUnit and unit tests in general please refer
+ to the KUnit documentation in Documentation/dev-tools/kunit/.
+
+ If unsure, say N.
+
config TEST_UDELAY
tristate "udelay test driver"
help
obj-$(CONFIG_DEBUG_OBJECTS) += debugobjects.o
obj-$(CONFIG_BITREVERSE) += bitrev.o
+obj-$(CONFIG_LINEAR_RANGES) += linear_ranges.o
obj-$(CONFIG_PACKING) += packing.o
obj-$(CONFIG_CRC_CCITT) += crc-ccitt.o
obj-$(CONFIG_CRC16) += crc16.o
# KUnit tests
obj-$(CONFIG_LIST_KUNIT_TEST) += list-test.o
+obj-$(CONFIG_LINEAR_RANGES_TEST) += test_linear_ranges.o
#include <linux/export.h>
#include <linux/bitops.h>
#include <linux/string.h>
-#include <linux/cryptohash.h>
#include <asm/unaligned.h>
#include <crypto/chacha.h>
memzero_explicit(W, 64 * sizeof(u32));
}
-int sha256_update(struct sha256_state *sctx, const u8 *data, unsigned int len)
+void sha256_update(struct sha256_state *sctx, const u8 *data, unsigned int len)
{
unsigned int partial, done;
const u8 *src;
partial = 0;
}
memcpy(sctx->buf + partial, src, len - done);
-
- return 0;
}
EXPORT_SYMBOL(sha256_update);
-int sha224_update(struct sha256_state *sctx, const u8 *data, unsigned int len)
+void sha224_update(struct sha256_state *sctx, const u8 *data, unsigned int len)
{
- return sha256_update(sctx, data, len);
+ sha256_update(sctx, data, len);
}
EXPORT_SYMBOL(sha224_update);
-static int __sha256_final(struct sha256_state *sctx, u8 *out, int digest_words)
+static void __sha256_final(struct sha256_state *sctx, u8 *out, int digest_words)
{
__be32 *dst = (__be32 *)out;
__be64 bits;
/* Zeroize sensitive information. */
memset(sctx, 0, sizeof(*sctx));
-
- return 0;
}
-int sha256_final(struct sha256_state *sctx, u8 *out)
+void sha256_final(struct sha256_state *sctx, u8 *out)
{
- return __sha256_final(sctx, out, 8);
+ __sha256_final(sctx, out, 8);
}
EXPORT_SYMBOL(sha256_final);
-int sha224_final(struct sha256_state *sctx, u8 *out)
+void sha224_final(struct sha256_state *sctx, u8 *out)
{
- return __sha256_final(sctx, out, 7);
+ __sha256_final(sctx, out, 7);
}
EXPORT_SYMBOL(sha224_final);
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * helpers to map values in a linear range to range index
+ *
+ * Original idea borrowed from regulator framework
+ *
+ * It might be useful if we could support also inversely proportional ranges?
+ * Copyright 2020 ROHM Semiconductors
+ */
+
+#include <linux/errno.h>
+#include <linux/export.h>
+#include <linux/kernel.h>
+#include <linux/linear_range.h>
+#include <linux/module.h>
+
+/**
+ * linear_range_values_in_range - return the amount of values in a range
+ * @r: pointer to linear range where values are counted
+ *
+ * Compute the amount of values in range pointed by @r. Note, values can
+ * be all equal - range with selectors 0,...,2 with step 0 still contains
+ * 3 values even though they are all equal.
+ *
+ * Return: the amount of values in range pointed by @r
+ */
+unsigned int linear_range_values_in_range(const struct linear_range *r)
+{
+ if (!r)
+ return 0;
+ return r->max_sel - r->min_sel + 1;
+}
+EXPORT_SYMBOL_GPL(linear_range_values_in_range);
+
+/**
+ * linear_range_values_in_range_array - return the amount of values in ranges
+ * @r: pointer to array of linear ranges where values are counted
+ * @ranges: amount of ranges we include in computation.
+ *
+ * Compute the amount of values in ranges pointed by @r. Note, values can
+ * be all equal - range with selectors 0,...,2 with step 0 still contains
+ * 3 values even though they are all equal.
+ *
+ * Return: the amount of values in first @ranges ranges pointed by @r
+ */
+unsigned int linear_range_values_in_range_array(const struct linear_range *r,
+ int ranges)
+{
+ int i, values_in_range = 0;
+
+ for (i = 0; i < ranges; i++) {
+ int values;
+
+ values = linear_range_values_in_range(&r[i]);
+ if (!values)
+ return values;
+
+ values_in_range += values;
+ }
+ return values_in_range;
+}
+EXPORT_SYMBOL_GPL(linear_range_values_in_range_array);
+
+/**
+ * linear_range_get_max_value - return the largest value in a range
+ * @r: pointer to linear range where value is looked from
+ *
+ * Return: the largest value in the given range
+ */
+unsigned int linear_range_get_max_value(const struct linear_range *r)
+{
+ return r->min + (r->max_sel - r->min_sel) * r->step;
+}
+EXPORT_SYMBOL_GPL(linear_range_get_max_value);
+
+/**
+ * linear_range_get_value - fetch a value from given range
+ * @r: pointer to linear range where value is looked from
+ * @selector: selector for which the value is searched
+ * @val: address where found value is updated
+ *
+ * Search given ranges for value which matches given selector.
+ *
+ * Return: 0 on success, -EINVAL given selector is not found from any of the
+ * ranges.
+ */
+int linear_range_get_value(const struct linear_range *r, unsigned int selector,
+ unsigned int *val)
+{
+ if (r->min_sel > selector || r->max_sel < selector)
+ return -EINVAL;
+
+ *val = r->min + (selector - r->min_sel) * r->step;
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(linear_range_get_value);
+
+/**
+ * linear_range_get_value_array - fetch a value from array of ranges
+ * @r: pointer to array of linear ranges where value is looked from
+ * @ranges: amount of ranges in an array
+ * @selector: selector for which the value is searched
+ * @val: address where found value is updated
+ *
+ * Search through an array of ranges for value which matches given selector.
+ *
+ * Return: 0 on success, -EINVAL given selector is not found from any of the
+ * ranges.
+ */
+int linear_range_get_value_array(const struct linear_range *r, int ranges,
+ unsigned int selector, unsigned int *val)
+{
+ int i;
+
+ for (i = 0; i < ranges; i++)
+ if (r[i].min_sel <= selector && r[i].max_sel >= selector)
+ return linear_range_get_value(&r[i], selector, val);
+
+ return -EINVAL;
+}
+EXPORT_SYMBOL_GPL(linear_range_get_value_array);
+
+/**
+ * linear_range_get_selector_low - return linear range selector for value
+ * @r: pointer to linear range where selector is looked from
+ * @val: value for which the selector is searched
+ * @selector: address where found selector value is updated
+ * @found: flag to indicate that given value was in the range
+ *
+ * Return selector which which range value is closest match for given
+ * input value. Value is matching if it is equal or smaller than given
+ * value. If given value is in the range, then @found is set true.
+ *
+ * Return: 0 on success, -EINVAL if range is invalid or does not contain
+ * value smaller or equal to given value
+ */
+int linear_range_get_selector_low(const struct linear_range *r,
+ unsigned int val, unsigned int *selector,
+ bool *found)
+{
+ *found = false;
+
+ if (r->min > val)
+ return -EINVAL;
+
+ if (linear_range_get_max_value(r) < val) {
+ *selector = r->max_sel;
+ return 0;
+ }
+
+ *found = true;
+
+ if (r->step == 0)
+ *selector = r->min_sel;
+ else
+ *selector = (val - r->min) / r->step + r->min_sel;
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(linear_range_get_selector_low);
+
+/**
+ * linear_range_get_selector_low_array - return linear range selector for value
+ * @r: pointer to array of linear ranges where selector is looked from
+ * @ranges: amount of ranges to scan from array
+ * @val: value for which the selector is searched
+ * @selector: address where found selector value is updated
+ * @found: flag to indicate that given value was in the range
+ *
+ * Scan array of ranges for selector which which range value matches given
+ * input value. Value is matching if it is equal or smaller than given
+ * value. If given value is found to be in a range scanning is stopped and
+ * @found is set true. If a range with values smaller than given value is found
+ * but the range max is being smaller than given value, then the ranges
+ * biggest selector is updated to @selector but scanning ranges is continued
+ * and @found is set to false.
+ *
+ * Return: 0 on success, -EINVAL if range array is invalid or does not contain
+ * range with a value smaller or equal to given value
+ */
+int linear_range_get_selector_low_array(const struct linear_range *r,
+ int ranges, unsigned int val,
+ unsigned int *selector, bool *found)
+{
+ int i;
+ int ret = -EINVAL;
+
+ for (i = 0; i < ranges; i++) {
+ int tmpret;
+
+ tmpret = linear_range_get_selector_low(&r[i], val, selector,
+ found);
+ if (!tmpret)
+ ret = 0;
+
+ if (*found)
+ break;
+ }
+
+ return ret;
+}
+EXPORT_SYMBOL_GPL(linear_range_get_selector_low_array);
+
+/**
+ * linear_range_get_selector_high - return linear range selector for value
+ * @r: pointer to linear range where selector is looked from
+ * @val: value for which the selector is searched
+ * @selector: address where found selector value is updated
+ * @found: flag to indicate that given value was in the range
+ *
+ * Return selector which which range value is closest match for given
+ * input value. Value is matching if it is equal or higher than given
+ * value. If given value is in the range, then @found is set true.
+ *
+ * Return: 0 on success, -EINVAL if range is invalid or does not contain
+ * value greater or equal to given value
+ */
+int linear_range_get_selector_high(const struct linear_range *r,
+ unsigned int val, unsigned int *selector,
+ bool *found)
+{
+ *found = false;
+
+ if (linear_range_get_max_value(r) < val)
+ return -EINVAL;
+
+ if (r->min > val) {
+ *selector = r->min_sel;
+ return 0;
+ }
+
+ *found = true;
+
+ if (r->step == 0)
+ *selector = r->max_sel;
+ else
+ *selector = DIV_ROUND_UP(val - r->min, r->step) + r->min_sel;
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(linear_range_get_selector_high);
+
+MODULE_DESCRIPTION("linear-ranges helper");
+MODULE_LICENSE("GPL");
************** MIPS/64 **************
***************************************/
#if (defined(__mips) && __mips >= 3) && W_TYPE_SIZE == 64
-#if defined(__mips_isa_rev) && __mips_isa_rev >= 6
+#if defined(__mips_isa_rev) && __mips_isa_rev >= 6 && defined(CONFIG_CC_IS_GCC)
/*
* GCC ends up emitting a __multi3 intrinsic call for MIPS64r6 with the plain C
* code below, so we special case MIPS64r6 until the compiler can do better.
#include <linux/kernel.h>
#include <linux/kmemleak.h>
#include <linux/percpu.h>
+#include <linux/local_lock.h>
#include <linux/preempt.h> /* in_interrupt() */
#include <linux/radix-tree.h>
#include <linux/rcupdate.h>
#include <linux/string.h>
#include <linux/xarray.h>
-
/*
* Radix tree node cache.
*/
/*
* Per-cpu pool of preloaded nodes
*/
-struct radix_tree_preload {
- unsigned nr;
- /* nodes->parent points to next preallocated node */
- struct radix_tree_node *nodes;
+DEFINE_PER_CPU(struct radix_tree_preload, radix_tree_preloads) = {
+ .lock = INIT_LOCAL_LOCK(lock),
};
-static DEFINE_PER_CPU(struct radix_tree_preload, radix_tree_preloads) = { 0, };
+EXPORT_PER_CPU_SYMBOL_GPL(radix_tree_preloads);
static inline struct radix_tree_node *entry_to_node(void *ptr)
{
*/
gfp_mask &= ~__GFP_ACCOUNT;
- preempt_disable();
+ local_lock(&radix_tree_preloads.lock);
rtp = this_cpu_ptr(&radix_tree_preloads);
while (rtp->nr < nr) {
- preempt_enable();
+ local_unlock(&radix_tree_preloads.lock);
node = kmem_cache_alloc(radix_tree_node_cachep, gfp_mask);
if (node == NULL)
goto out;
- preempt_disable();
+ local_lock(&radix_tree_preloads.lock);
rtp = this_cpu_ptr(&radix_tree_preloads);
if (rtp->nr < nr) {
node->parent = rtp->nodes;
if (gfpflags_allow_blocking(gfp_mask))
return __radix_tree_preload(gfp_mask, RADIX_TREE_PRELOAD_SIZE);
/* Preloading doesn't help anything with this gfp mask, skip it */
- preempt_disable();
+ local_lock(&radix_tree_preloads.lock);
return 0;
}
EXPORT_SYMBOL(radix_tree_maybe_preload);
void idr_preload(gfp_t gfp_mask)
{
if (__radix_tree_preload(gfp_mask, IDR_PRELOAD_SIZE))
- preempt_disable();
+ local_lock(&radix_tree_preloads.lock);
}
EXPORT_SYMBOL(idr_preload);
#include <linux/kernel.h>
#include <linux/export.h>
#include <linux/bitops.h>
-#include <linux/cryptohash.h>
+#include <crypto/sha.h>
#include <asm/unaligned.h>
/*
#define T_60_79(t, A, B, C, D, E) SHA_ROUND(t, SHA_MIX, (B^C^D) , 0xca62c1d6, A, B, C, D, E )
/**
- * sha_transform - single block SHA1 transform
+ * sha1_transform - single block SHA1 transform (deprecated)
*
* @digest: 160 bit digest to update
* @data: 512 bits of data to hash
* @array: 16 words of workspace (see note)
*
- * This function generates a SHA1 digest for a single 512-bit block.
- * Be warned, it does not handle padding and message digest, do not
- * confuse it with the full FIPS 180-1 digest algorithm for variable
- * length messages.
+ * This function executes SHA-1's internal compression function. It updates the
+ * 160-bit internal state (@digest) with a single 512-bit data block (@data).
+ *
+ * Don't use this function. SHA-1 is no longer considered secure. And even if
+ * you do have to use SHA-1, this isn't the correct way to hash something with
+ * SHA-1 as this doesn't handle padding and finalization.
*
* Note: If the hash is security sensitive, the caller should be sure
* to clear the workspace. This is left to the caller to avoid
* unnecessary clears between chained hashing operations.
*/
-void sha_transform(__u32 *digest, const char *data, __u32 *array)
+void sha1_transform(__u32 *digest, const char *data, __u32 *array)
{
__u32 A, B, C, D, E;
digest[3] += D;
digest[4] += E;
}
-EXPORT_SYMBOL(sha_transform);
+EXPORT_SYMBOL(sha1_transform);
/**
- * sha_init - initialize the vectors for a SHA1 digest
+ * sha1_init - initialize the vectors for a SHA1 digest
* @buf: vector to initialize
*/
-void sha_init(__u32 *buf)
+void sha1_init(__u32 *buf)
{
buf[0] = 0x67452301;
buf[1] = 0xefcdab89;
buf[3] = 0x10325476;
buf[4] = 0xc3d2e1f0;
}
-EXPORT_SYMBOL(sha_init);
+EXPORT_SYMBOL(sha1_init);
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * KUnit test for the linear_ranges helper.
+ *
+ * Copyright (C) 2020, ROHM Semiconductors.
+ * Author: Matti Vaittinen <matti.vaittien@fi.rohmeurope.com>
+ */
+#include <kunit/test.h>
+
+#include <linux/linear_range.h>
+
+/* First things first. I deeply dislike unit-tests. I have seen all the hell
+ * breaking loose when people who think the unit tests are "the silver bullet"
+ * to kill bugs get to decide how a company should implement testing strategy...
+ *
+ * Believe me, it may get _really_ ridiculous. It is tempting to think that
+ * walking through all the possible execution branches will nail down 100% of
+ * bugs. This may lead to ideas about demands to get certain % of "test
+ * coverage" - measured as line coverage. And that is one of the worst things
+ * you can do.
+ *
+ * Ask people to provide line coverage and they do. I've seen clever tools
+ * which generate test cases to test the existing functions - and by default
+ * these tools expect code to be correct and just generate checks which are
+ * passing when ran against current code-base. Run this generator and you'll get
+ * tests that do not test code is correct but just verify nothing changes.
+ * Problem is that testing working code is pointless. And if it is not
+ * working, your test must not assume it is working. You won't catch any bugs
+ * by such tests. What you can do is to generate a huge amount of tests.
+ * Especially if you were are asked to proivde 100% line-coverage x_x. So what
+ * does these tests - which are not finding any bugs now - do?
+ *
+ * They add inertia to every future development. I think it was Terry Pratchet
+ * who wrote someone having same impact as thick syrup has to chronometre.
+ * Excessive amount of unit-tests have this effect to development. If you do
+ * actually find _any_ bug from code in such environment and try fixing it...
+ * ...chances are you also need to fix the test cases. In sunny day you fix one
+ * test. But I've done refactoring which resulted 500+ broken tests (which had
+ * really zero value other than proving to managers that we do do "quality")...
+ *
+ * After this being said - there are situations where UTs can be handy. If you
+ * have algorithms which take some input and should produce output - then you
+ * can implement few, carefully selected simple UT-cases which test this. I've
+ * previously used this for example for netlink and device-tree data parsing
+ * functions. Feed some data examples to functions and verify the output is as
+ * expected. I am not covering all the cases but I will see the logic should be
+ * working.
+ *
+ * Here we also do some minor testing. I don't want to go through all branches
+ * or test more or less obvious things - but I want to see the main logic is
+ * working. And I definitely don't want to add 500+ test cases that break when
+ * some simple fix is done x_x. So - let's only add few, well selected tests
+ * which ensure as much logic is good as possible.
+ */
+
+/*
+ * Test Range 1:
+ * selectors: 2 3 4 5 6
+ * values (5): 10 20 30 40 50
+ *
+ * Test Range 2:
+ * selectors: 7 8 9 10
+ * values (4): 100 150 200 250
+ */
+
+#define RANGE1_MIN 10
+#define RANGE1_MIN_SEL 2
+#define RANGE1_STEP 10
+
+/* 2, 3, 4, 5, 6 */
+static const unsigned int range1_sels[] = { RANGE1_MIN_SEL, RANGE1_MIN_SEL + 1,
+ RANGE1_MIN_SEL + 2,
+ RANGE1_MIN_SEL + 3,
+ RANGE1_MIN_SEL + 4 };
+/* 10, 20, 30, 40, 50 */
+static const unsigned int range1_vals[] = { RANGE1_MIN, RANGE1_MIN +
+ RANGE1_STEP,
+ RANGE1_MIN + RANGE1_STEP * 2,
+ RANGE1_MIN + RANGE1_STEP * 3,
+ RANGE1_MIN + RANGE1_STEP * 4 };
+
+#define RANGE2_MIN 100
+#define RANGE2_MIN_SEL 7
+#define RANGE2_STEP 50
+
+/* 7, 8, 9, 10 */
+static const unsigned int range2_sels[] = { RANGE2_MIN_SEL, RANGE2_MIN_SEL + 1,
+ RANGE2_MIN_SEL + 2,
+ RANGE2_MIN_SEL + 3 };
+/* 100, 150, 200, 250 */
+static const unsigned int range2_vals[] = { RANGE2_MIN, RANGE2_MIN +
+ RANGE2_STEP,
+ RANGE2_MIN + RANGE2_STEP * 2,
+ RANGE2_MIN + RANGE2_STEP * 3 };
+
+#define RANGE1_NUM_VALS (ARRAY_SIZE(range1_vals))
+#define RANGE2_NUM_VALS (ARRAY_SIZE(range2_vals))
+#define RANGE_NUM_VALS (RANGE1_NUM_VALS + RANGE2_NUM_VALS)
+
+#define RANGE1_MAX_SEL (RANGE1_MIN_SEL + RANGE1_NUM_VALS - 1)
+#define RANGE1_MAX_VAL (range1_vals[RANGE1_NUM_VALS - 1])
+
+#define RANGE2_MAX_SEL (RANGE2_MIN_SEL + RANGE2_NUM_VALS - 1)
+#define RANGE2_MAX_VAL (range2_vals[RANGE2_NUM_VALS - 1])
+
+#define SMALLEST_SEL RANGE1_MIN_SEL
+#define SMALLEST_VAL RANGE1_MIN
+
+static struct linear_range testr[] = {
+ {
+ .min = RANGE1_MIN,
+ .min_sel = RANGE1_MIN_SEL,
+ .max_sel = RANGE1_MAX_SEL,
+ .step = RANGE1_STEP,
+ }, {
+ .min = RANGE2_MIN,
+ .min_sel = RANGE2_MIN_SEL,
+ .max_sel = RANGE2_MAX_SEL,
+ .step = RANGE2_STEP
+ },
+};
+
+static void range_test_get_value(struct kunit *test)
+{
+ int ret, i;
+ unsigned int sel, val;
+
+ for (i = 0; i < RANGE1_NUM_VALS; i++) {
+ sel = range1_sels[i];
+ ret = linear_range_get_value_array(&testr[0], 2, sel, &val);
+ KUNIT_EXPECT_EQ(test, 0, ret);
+ KUNIT_EXPECT_EQ(test, val, range1_vals[i]);
+ }
+ for (i = 0; i < RANGE2_NUM_VALS; i++) {
+ sel = range2_sels[i];
+ ret = linear_range_get_value_array(&testr[0], 2, sel, &val);
+ KUNIT_EXPECT_EQ(test, 0, ret);
+ KUNIT_EXPECT_EQ(test, val, range2_vals[i]);
+ }
+ ret = linear_range_get_value_array(&testr[0], 2, sel + 1, &val);
+ KUNIT_EXPECT_NE(test, 0, ret);
+}
+
+static void range_test_get_selector_high(struct kunit *test)
+{
+ int ret, i;
+ unsigned int sel;
+ bool found;
+
+ for (i = 0; i < RANGE1_NUM_VALS; i++) {
+ ret = linear_range_get_selector_high(&testr[0], range1_vals[i],
+ &sel, &found);
+ KUNIT_EXPECT_EQ(test, 0, ret);
+ KUNIT_EXPECT_EQ(test, sel, range1_sels[i]);
+ KUNIT_EXPECT_TRUE(test, found);
+ }
+
+ ret = linear_range_get_selector_high(&testr[0], RANGE1_MAX_VAL + 1,
+ &sel, &found);
+ KUNIT_EXPECT_LE(test, ret, 0);
+
+ ret = linear_range_get_selector_high(&testr[0], RANGE1_MIN - 1,
+ &sel, &found);
+ KUNIT_EXPECT_EQ(test, 0, ret);
+ KUNIT_EXPECT_FALSE(test, found);
+ KUNIT_EXPECT_EQ(test, sel, range1_sels[0]);
+}
+
+static void range_test_get_value_amount(struct kunit *test)
+{
+ int ret;
+
+ ret = linear_range_values_in_range_array(&testr[0], 2);
+ KUNIT_EXPECT_EQ(test, (int)RANGE_NUM_VALS, ret);
+}
+
+static void range_test_get_selector_low(struct kunit *test)
+{
+ int i, ret;
+ unsigned int sel;
+ bool found;
+
+ for (i = 0; i < RANGE1_NUM_VALS; i++) {
+ ret = linear_range_get_selector_low_array(&testr[0], 2,
+ range1_vals[i], &sel,
+ &found);
+ KUNIT_EXPECT_EQ(test, 0, ret);
+ KUNIT_EXPECT_EQ(test, sel, range1_sels[i]);
+ KUNIT_EXPECT_TRUE(test, found);
+ }
+ for (i = 0; i < RANGE2_NUM_VALS; i++) {
+ ret = linear_range_get_selector_low_array(&testr[0], 2,
+ range2_vals[i], &sel,
+ &found);
+ KUNIT_EXPECT_EQ(test, 0, ret);
+ KUNIT_EXPECT_EQ(test, sel, range2_sels[i]);
+ KUNIT_EXPECT_TRUE(test, found);
+ }
+
+ /*
+ * Seek value greater than range max => get_selector_*_low should
+ * return Ok - but set found to false as value is not in range
+ */
+ ret = linear_range_get_selector_low_array(&testr[0], 2,
+ range2_vals[RANGE2_NUM_VALS - 1] + 1,
+ &sel, &found);
+
+ KUNIT_EXPECT_EQ(test, 0, ret);
+ KUNIT_EXPECT_EQ(test, sel, range2_sels[RANGE2_NUM_VALS - 1]);
+ KUNIT_EXPECT_FALSE(test, found);
+}
+
+static struct kunit_case range_test_cases[] = {
+ KUNIT_CASE(range_test_get_value_amount),
+ KUNIT_CASE(range_test_get_selector_high),
+ KUNIT_CASE(range_test_get_selector_low),
+ KUNIT_CASE(range_test_get_value),
+ {},
+};
+
+static struct kunit_suite range_test_module = {
+ .name = "linear-ranges-test",
+ .test_cases = range_test_cases,
+};
+
+kunit_test_suites(&range_test_module);
+
+MODULE_LICENSE("GPL");
#define PTR_STR "ffff0123456789ab"
#define PTR_VAL_NO_CRNG "(____ptrval____)"
#define ZEROS "00000000" /* hex 32 zero bits */
+#define ONES "ffffffff" /* hex 32 one bits */
static int __init
plain_format(void)
#define PTR_STR "456789ab"
#define PTR_VAL_NO_CRNG "(ptrval)"
#define ZEROS ""
+#define ONES ""
static int __init
plain_format(void)
test(buf, fmt, p);
}
+/*
+ * NULL pointers aren't hashed.
+ */
static void __init
null_pointer(void)
{
- test_hashed("%p", NULL);
+ test(ZEROS "00000000", "%p", NULL);
test(ZEROS "00000000", "%px", NULL);
test("(null)", "%pE", NULL);
}
+/*
+ * Error pointers aren't hashed.
+ */
+static void __init
+error_pointer(void)
+{
+ test(ONES "fffffff5", "%p", ERR_PTR(-11));
+ test(ONES "fffffff5", "%px", ERR_PTR(-11));
+ test("(efault)", "%pE", ERR_PTR(-11));
+}
+
#define PTR_INVALID ((void *)0x000000ab)
static void __init
}
static void __init
-struct_rtc_time(void)
+time_and_date(void)
{
/* 1543210543 */
const struct rtc_time tm = {
.tm_mon = 10,
.tm_year = 118,
};
+ /* 2019-01-04T15:32:23 */
+ time64_t t = 1546615943;
- test("(%ptR?)", "%pt", &tm);
+ test("(%pt?)", "%pt", &tm);
test("2018-11-26T05:35:43", "%ptR", &tm);
test("0118-10-26T05:35:43", "%ptRr", &tm);
test("05:35:43|2018-11-26", "%ptRt|%ptRd", &tm, &tm);
test("05:35:43|0118-10-26", "%ptRtr|%ptRdr", &tm, &tm);
test("05:35:43|2018-11-26", "%ptRttr|%ptRdtr", &tm, &tm);
test("05:35:43 tr|2018-11-26 tr", "%ptRt tr|%ptRd tr", &tm, &tm);
+
+ test("2019-01-04T15:32:23", "%ptT", &t);
+ test("0119-00-04T15:32:23", "%ptTr", &t);
+ test("15:32:23|2019-01-04", "%ptTt|%ptTd", &t, &t);
+ test("15:32:23|0119-00-04", "%ptTtr|%ptTdr", &t, &t);
}
static void __init
{
plain();
null_pointer();
+ error_pointer();
invalid_pointer();
symbol_ptr();
kernel_ptr();
uuid();
dentry();
struct_va_format();
- struct_rtc_time();
+ time_and_date();
struct_clk();
bitmap();
netdev_features();
#include <linux/dcache.h>
#include <linux/cred.h>
#include <linux/rtc.h>
+#include <linux/time.h>
#include <linux/uuid.h>
#include <linux/of.h>
#include <net/addrconf.h>
* @endp: A pointer to the end of the parsed string will be placed here
* @base: The number base to use
*
- * This function is obsolete. Please use kstrtoull instead.
+ * This function has caveats. Please use kstrtoull instead.
*/
unsigned long long simple_strtoull(const char *cp, char **endp, unsigned int base)
{
* @endp: A pointer to the end of the parsed string will be placed here
* @base: The number base to use
*
- * This function is obsolete. Please use kstrtoul instead.
+ * This function has caveats. Please use kstrtoul instead.
*/
unsigned long simple_strtoul(const char *cp, char **endp, unsigned int base)
{
* @endp: A pointer to the end of the parsed string will be placed here
* @base: The number base to use
*
- * This function is obsolete. Please use kstrtol instead.
+ * This function has caveats. Please use kstrtol instead.
*/
long simple_strtol(const char *cp, char **endp, unsigned int base)
{
* @endp: A pointer to the end of the parsed string will be placed here
* @base: The number base to use
*
- * This function is obsolete. Please use kstrtoll instead.
+ * This function has caveats. Please use kstrtoll instead.
*/
long long simple_strtoll(const char *cp, char **endp, unsigned int base)
{
unsigned long hashval;
int ret;
+ /*
+ * Print the real pointer value for NULL and error pointers,
+ * as they are not actual addresses.
+ */
+ if (IS_ERR_OR_NULL(ptr))
+ return pointer_string(buf, end, ptr, spec);
+
/* When debugging early boot use non-cryptographically secure hash. */
if (unlikely(debug_boot_weak_hash)) {
hashval = hash_long((unsigned long)ptr, 32);
return buf;
}
+static noinline_for_stack
+char *time64_str(char *buf, char *end, const time64_t time,
+ struct printf_spec spec, const char *fmt)
+{
+ struct rtc_time rtc_time;
+ struct tm tm;
+
+ time64_to_tm(time, 0, &tm);
+
+ rtc_time.tm_sec = tm.tm_sec;
+ rtc_time.tm_min = tm.tm_min;
+ rtc_time.tm_hour = tm.tm_hour;
+ rtc_time.tm_mday = tm.tm_mday;
+ rtc_time.tm_mon = tm.tm_mon;
+ rtc_time.tm_year = tm.tm_year;
+ rtc_time.tm_wday = tm.tm_wday;
+ rtc_time.tm_yday = tm.tm_yday;
+
+ rtc_time.tm_isdst = 0;
+
+ return rtc_str(buf, end, &rtc_time, spec, fmt);
+}
+
static noinline_for_stack
char *time_and_date(char *buf, char *end, void *ptr, struct printf_spec spec,
const char *fmt)
switch (fmt[1]) {
case 'R':
return rtc_str(buf, end, (const struct rtc_time *)ptr, spec, fmt);
+ case 'T':
+ return time64_str(buf, end, *(const time64_t *)ptr, spec, fmt);
default:
- return error_string(buf, end, "(%ptR?)", spec);
+ return error_string(buf, end, "(%pt?)", spec);
}
}
* - 'd[234]' For a dentry name (optionally 2-4 last components)
* - 'D[234]' Same as 'd' but for a struct file
* - 'g' For block_device name (gendisk + partition number)
- * - 't[R][dt][r]' For time and date as represented:
+ * - 't[RT][dt][r]' For time and date as represented by:
* R struct rtc_time
+ * T time64_t
* - 'C' For a clock, it prints the name (Common Clock Framework) or address
* (legacy clock framework) of the clock
* - 'Cn' For a clock, it prints the name (Common Clock Framework) or address
* f full name
* P node name, including a possible unit address
* - 'x' For printing the address. Equivalent to "%lx".
+ * - '[ku]s' For a BPF/tracing related format specifier, e.g. used out of
+ * bpf_trace_printk() where [ku] prefix specifies either kernel (k)
+ * or user (u) memory to probe, and:
+ * s a string, equivalent to "%s" on direct vsnprintf() use
*
* ** When making changes please also update:
* Documentation/core-api/printk-formats.rst
if (!IS_ERR(ptr))
break;
return err_ptr(buf, end, ptr, spec);
+ case 'u':
+ case 'k':
+ switch (fmt[1]) {
+ case 's':
+ return string(buf, end, ptr, spec);
+ default:
+ return error_string(buf, end, "(einval)", spec);
+ }
}
/* default is to _not_ leak addresses, hash before printing */
* would succeed.
*/
if (cc->order > 0 && last_migrated_pfn) {
- int cpu;
unsigned long current_block_start =
block_start_pfn(cc->migrate_pfn, cc->order);
if (last_migrated_pfn < current_block_start) {
- cpu = get_cpu();
- lru_add_drain_cpu(cpu);
- drain_local_pages(cc->zone);
- put_cpu();
+ lru_add_drain_cpu_zone(cc->zone);
/* No more flushing until we migrate again */
last_migrated_pfn = 0;
}
if (!vma_permits_fault(vma, fault_flags))
return -EFAULT;
+ if ((fault_flags & FAULT_FLAG_KILLABLE) &&
+ fatal_signal_pending(current))
+ return -EINTR;
+
ret = handle_mm_fault(vma, address, fault_flags);
major |= ret & VM_FAULT_MAJOR;
if (ret & VM_FAULT_ERROR) {
if (ret & VM_FAULT_RETRY) {
down_read(&mm->mmap_sem);
- if (!(fault_flags & FAULT_FLAG_TRIED)) {
- *unlocked = true;
- fault_flags |= FAULT_FLAG_TRIED;
- goto retry;
- }
+ *unlocked = true;
+ fault_flags |= FAULT_FLAG_TRIED;
+ goto retry;
}
if (tsk) {
# SPDX-License-Identifier: GPL-2.0
KASAN_SANITIZE := n
-UBSAN_SANITIZE_common.o := n
-UBSAN_SANITIZE_generic.o := n
-UBSAN_SANITIZE_generic_report.o := n
-UBSAN_SANITIZE_tags.o := n
+UBSAN_SANITIZE := n
KCOV_INSTRUMENT := n
+# Disable ftrace to avoid recursion.
CFLAGS_REMOVE_common.o = $(CC_FLAGS_FTRACE)
CFLAGS_REMOVE_generic.o = $(CC_FLAGS_FTRACE)
CFLAGS_REMOVE_generic_report.o = $(CC_FLAGS_FTRACE)
+CFLAGS_REMOVE_init.o = $(CC_FLAGS_FTRACE)
+CFLAGS_REMOVE_quarantine.o = $(CC_FLAGS_FTRACE)
+CFLAGS_REMOVE_report.o = $(CC_FLAGS_FTRACE)
CFLAGS_REMOVE_tags.o = $(CC_FLAGS_FTRACE)
+CFLAGS_REMOVE_tags_report.o = $(CC_FLAGS_FTRACE)
# Function splitter causes unnecessary splits in __asan_load1/__asan_store1
# see: https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63533
-
-CFLAGS_common.o := $(call cc-option, -fno-conserve-stack -fno-stack-protector)
-CFLAGS_generic.o := $(call cc-option, -fno-conserve-stack -fno-stack-protector)
-CFLAGS_generic_report.o := $(call cc-option, -fno-conserve-stack -fno-stack-protector)
-CFLAGS_tags.o := $(call cc-option, -fno-conserve-stack -fno-stack-protector)
+CFLAGS_common.o := $(call cc-option, -fno-conserve-stack -fno-stack-protector) -DDISABLE_BRANCH_PROFILING
+CFLAGS_generic.o := $(call cc-option, -fno-conserve-stack -fno-stack-protector) -DDISABLE_BRANCH_PROFILING
+CFLAGS_generic_report.o := $(call cc-option, -fno-conserve-stack -fno-stack-protector) -DDISABLE_BRANCH_PROFILING
+CFLAGS_init.o := $(call cc-option, -fno-conserve-stack -fno-stack-protector) -DDISABLE_BRANCH_PROFILING
+CFLAGS_quarantine.o := $(call cc-option, -fno-conserve-stack -fno-stack-protector) -DDISABLE_BRANCH_PROFILING
+CFLAGS_report.o := $(call cc-option, -fno-conserve-stack -fno-stack-protector) -DDISABLE_BRANCH_PROFILING
+CFLAGS_tags.o := $(call cc-option, -fno-conserve-stack -fno-stack-protector) -DDISABLE_BRANCH_PROFILING
+CFLAGS_tags_report.o := $(call cc-option, -fno-conserve-stack -fno-stack-protector) -DDISABLE_BRANCH_PROFILING
obj-$(CONFIG_KASAN) := common.o init.o report.o
obj-$(CONFIG_KASAN_GENERIC) += generic.o generic_report.o quarantine.o
*/
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
-#define DISABLE_BRANCH_PROFILING
#include <linux/export.h>
#include <linux/interrupt.h>
asmlinkage void kasan_unpoison_task_stack_below(const void *watermark);
void __asan_register_globals(struct kasan_global *globals, size_t size);
void __asan_unregister_globals(struct kasan_global *globals, size_t size);
-void __asan_loadN(unsigned long addr, size_t size);
-void __asan_storeN(unsigned long addr, size_t size);
void __asan_handle_no_return(void);
void __asan_alloca_poison(unsigned long addr, size_t size);
void __asan_allocas_unpoison(const void *stack_top, const void *stack_bottom);
void __asan_store8(unsigned long addr);
void __asan_load16(unsigned long addr);
void __asan_store16(unsigned long addr);
+void __asan_loadN(unsigned long addr, size_t size);
+void __asan_storeN(unsigned long addr, size_t size);
void __asan_load1_noabort(unsigned long addr);
void __asan_store1_noabort(unsigned long addr);
void __asan_store8_noabort(unsigned long addr);
void __asan_load16_noabort(unsigned long addr);
void __asan_store16_noabort(unsigned long addr);
+void __asan_loadN_noabort(unsigned long addr, size_t size);
+void __asan_storeN_noabort(unsigned long addr, size_t size);
+
+void __asan_report_load1_noabort(unsigned long addr);
+void __asan_report_store1_noabort(unsigned long addr);
+void __asan_report_load2_noabort(unsigned long addr);
+void __asan_report_store2_noabort(unsigned long addr);
+void __asan_report_load4_noabort(unsigned long addr);
+void __asan_report_store4_noabort(unsigned long addr);
+void __asan_report_load8_noabort(unsigned long addr);
+void __asan_report_store8_noabort(unsigned long addr);
+void __asan_report_load16_noabort(unsigned long addr);
+void __asan_report_store16_noabort(unsigned long addr);
+void __asan_report_load_n_noabort(unsigned long addr, size_t size);
+void __asan_report_store_n_noabort(unsigned long addr, size_t size);
void __asan_set_shadow_00(const void *addr, size_t size);
void __asan_set_shadow_f1(const void *addr, size_t size);
void __asan_set_shadow_f5(const void *addr, size_t size);
void __asan_set_shadow_f8(const void *addr, size_t size);
+void __hwasan_load1_noabort(unsigned long addr);
+void __hwasan_store1_noabort(unsigned long addr);
+void __hwasan_load2_noabort(unsigned long addr);
+void __hwasan_store2_noabort(unsigned long addr);
+void __hwasan_load4_noabort(unsigned long addr);
+void __hwasan_store4_noabort(unsigned long addr);
+void __hwasan_load8_noabort(unsigned long addr);
+void __hwasan_store8_noabort(unsigned long addr);
+void __hwasan_load16_noabort(unsigned long addr);
+void __hwasan_store16_noabort(unsigned long addr);
+void __hwasan_loadN_noabort(unsigned long addr, size_t size);
+void __hwasan_storeN_noabort(unsigned long addr, size_t size);
+
+void __hwasan_tag_memory(unsigned long addr, u8 tag, unsigned long size);
+
#endif
*/
#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
-#define DISABLE_BRANCH_PROFILING
#include <linux/export.h>
#include <linux/interrupt.h>
if (page_has_private(page) &&
!try_to_release_page(page, GFP_KERNEL)) {
result = SCAN_PAGE_HAS_PRIVATE;
+ putback_lru_page(page);
goto out_unlock;
}
if (locked && new_len > old_len)
mm_populate(new_addr + old_len, new_len - old_len);
userfaultfd_unmap_complete(mm, &uf_unmap_early);
- mremap_userfaultfd_complete(&uf, addr, new_addr, old_len);
+ mremap_userfaultfd_complete(&uf, addr, ret, old_len);
userfaultfd_unmap_complete(mm, &uf_unmap);
return ret;
}
#include <linux/uio.h>
#include <linux/hugetlb.h>
#include <linux/page_idle.h>
+#include <linux/local_lock.h>
#include "internal.h"
/* How many pages do we try to swap or page in/out together? */
int page_cluster;
-static DEFINE_PER_CPU(struct pagevec, lru_add_pvec);
-static DEFINE_PER_CPU(struct pagevec, lru_rotate_pvecs);
-static DEFINE_PER_CPU(struct pagevec, lru_deactivate_file_pvecs);
-static DEFINE_PER_CPU(struct pagevec, lru_deactivate_pvecs);
-static DEFINE_PER_CPU(struct pagevec, lru_lazyfree_pvecs);
+/* Protecting only lru_rotate.pvec which requires disabling interrupts */
+struct lru_rotate {
+ local_lock_t lock;
+ struct pagevec pvec;
+};
+static DEFINE_PER_CPU(struct lru_rotate, lru_rotate) = {
+ .lock = INIT_LOCAL_LOCK(lock),
+};
+
+/*
+ * The following struct pagevec are grouped together because they are protected
+ * by disabling preemption (and interrupts remain enabled).
+ */
+struct lru_pvecs {
+ local_lock_t lock;
+ struct pagevec lru_add;
+ struct pagevec lru_deactivate_file;
+ struct pagevec lru_deactivate;
+ struct pagevec lru_lazyfree;
#ifdef CONFIG_SMP
-static DEFINE_PER_CPU(struct pagevec, activate_page_pvecs);
+ struct pagevec activate_page;
#endif
+};
+static DEFINE_PER_CPU(struct lru_pvecs, lru_pvecs) = {
+ .lock = INIT_LOCAL_LOCK(lock),
+};
/*
* This path almost never happens for VM activity - pages are normally
unsigned long flags;
get_page(page);
- local_irq_save(flags);
- pvec = this_cpu_ptr(&lru_rotate_pvecs);
+ local_lock_irqsave(&lru_rotate.lock, flags);
+ pvec = this_cpu_ptr(&lru_rotate.pvec);
if (!pagevec_add(pvec, page) || PageCompound(page))
pagevec_move_tail(pvec);
- local_irq_restore(flags);
+ local_unlock_irqrestore(&lru_rotate.lock, flags);
}
}
#ifdef CONFIG_SMP
static void activate_page_drain(int cpu)
{
- struct pagevec *pvec = &per_cpu(activate_page_pvecs, cpu);
+ struct pagevec *pvec = &per_cpu(lru_pvecs.activate_page, cpu);
if (pagevec_count(pvec))
pagevec_lru_move_fn(pvec, __activate_page, NULL);
static bool need_activate_page_drain(int cpu)
{
- return pagevec_count(&per_cpu(activate_page_pvecs, cpu)) != 0;
+ return pagevec_count(&per_cpu(lru_pvecs.activate_page, cpu)) != 0;
}
void activate_page(struct page *page)
{
page = compound_head(page);
if (PageLRU(page) && !PageActive(page) && !PageUnevictable(page)) {
- struct pagevec *pvec = &get_cpu_var(activate_page_pvecs);
+ struct pagevec *pvec;
+ local_lock(&lru_pvecs.lock);
+ pvec = this_cpu_ptr(&lru_pvecs.activate_page);
get_page(page);
if (!pagevec_add(pvec, page) || PageCompound(page))
pagevec_lru_move_fn(pvec, __activate_page, NULL);
- put_cpu_var(activate_page_pvecs);
+ local_unlock(&lru_pvecs.lock);
}
}
static void __lru_cache_activate_page(struct page *page)
{
- struct pagevec *pvec = &get_cpu_var(lru_add_pvec);
+ struct pagevec *pvec;
int i;
+ local_lock(&lru_pvecs.lock);
+ pvec = this_cpu_ptr(&lru_pvecs.lru_add);
+
/*
* Search backwards on the optimistic assumption that the page being
* activated has just been added to this pagevec. Note that only
}
}
- put_cpu_var(lru_add_pvec);
+ local_unlock(&lru_pvecs.lock);
}
/*
} else if (!PageActive(page)) {
/*
* If the page is on the LRU, queue it for activation via
- * activate_page_pvecs. Otherwise, assume the page is on a
+ * lru_pvecs.activate_page. Otherwise, assume the page is on a
* pagevec, mark it active and it'll be moved to the active
* LRU on the next drain.
*/
static void __lru_cache_add(struct page *page)
{
- struct pagevec *pvec = &get_cpu_var(lru_add_pvec);
+ struct pagevec *pvec;
+ local_lock(&lru_pvecs.lock);
+ pvec = this_cpu_ptr(&lru_pvecs.lru_add);
get_page(page);
if (!pagevec_add(pvec, page) || PageCompound(page))
__pagevec_lru_add(pvec);
- put_cpu_var(lru_add_pvec);
+ local_unlock(&lru_pvecs.lock);
}
/**
*/
void lru_add_drain_cpu(int cpu)
{
- struct pagevec *pvec = &per_cpu(lru_add_pvec, cpu);
+ struct pagevec *pvec = &per_cpu(lru_pvecs.lru_add, cpu);
if (pagevec_count(pvec))
__pagevec_lru_add(pvec);
- pvec = &per_cpu(lru_rotate_pvecs, cpu);
+ pvec = &per_cpu(lru_rotate.pvec, cpu);
if (pagevec_count(pvec)) {
unsigned long flags;
/* No harm done if a racing interrupt already did this */
- local_irq_save(flags);
+ local_lock_irqsave(&lru_rotate.lock, flags);
pagevec_move_tail(pvec);
- local_irq_restore(flags);
+ local_unlock_irqrestore(&lru_rotate.lock, flags);
}
- pvec = &per_cpu(lru_deactivate_file_pvecs, cpu);
+ pvec = &per_cpu(lru_pvecs.lru_deactivate_file, cpu);
if (pagevec_count(pvec))
pagevec_lru_move_fn(pvec, lru_deactivate_file_fn, NULL);
- pvec = &per_cpu(lru_deactivate_pvecs, cpu);
+ pvec = &per_cpu(lru_pvecs.lru_deactivate, cpu);
if (pagevec_count(pvec))
pagevec_lru_move_fn(pvec, lru_deactivate_fn, NULL);
- pvec = &per_cpu(lru_lazyfree_pvecs, cpu);
+ pvec = &per_cpu(lru_pvecs.lru_lazyfree, cpu);
if (pagevec_count(pvec))
pagevec_lru_move_fn(pvec, lru_lazyfree_fn, NULL);
return;
if (likely(get_page_unless_zero(page))) {
- struct pagevec *pvec = &get_cpu_var(lru_deactivate_file_pvecs);
+ struct pagevec *pvec;
+
+ local_lock(&lru_pvecs.lock);
+ pvec = this_cpu_ptr(&lru_pvecs.lru_deactivate_file);
if (!pagevec_add(pvec, page) || PageCompound(page))
pagevec_lru_move_fn(pvec, lru_deactivate_file_fn, NULL);
- put_cpu_var(lru_deactivate_file_pvecs);
+ local_unlock(&lru_pvecs.lock);
}
}
void deactivate_page(struct page *page)
{
if (PageLRU(page) && PageActive(page) && !PageUnevictable(page)) {
- struct pagevec *pvec = &get_cpu_var(lru_deactivate_pvecs);
+ struct pagevec *pvec;
+ local_lock(&lru_pvecs.lock);
+ pvec = this_cpu_ptr(&lru_pvecs.lru_deactivate);
get_page(page);
if (!pagevec_add(pvec, page) || PageCompound(page))
pagevec_lru_move_fn(pvec, lru_deactivate_fn, NULL);
- put_cpu_var(lru_deactivate_pvecs);
+ local_unlock(&lru_pvecs.lock);
}
}
{
if (PageLRU(page) && PageAnon(page) && PageSwapBacked(page) &&
!PageSwapCache(page) && !PageUnevictable(page)) {
- struct pagevec *pvec = &get_cpu_var(lru_lazyfree_pvecs);
+ struct pagevec *pvec;
+ local_lock(&lru_pvecs.lock);
+ pvec = this_cpu_ptr(&lru_pvecs.lru_lazyfree);
get_page(page);
if (!pagevec_add(pvec, page) || PageCompound(page))
pagevec_lru_move_fn(pvec, lru_lazyfree_fn, NULL);
- put_cpu_var(lru_lazyfree_pvecs);
+ local_unlock(&lru_pvecs.lock);
}
}
void lru_add_drain(void)
{
- lru_add_drain_cpu(get_cpu());
- put_cpu();
+ local_lock(&lru_pvecs.lock);
+ lru_add_drain_cpu(smp_processor_id());
+ local_unlock(&lru_pvecs.lock);
+}
+
+void lru_add_drain_cpu_zone(struct zone *zone)
+{
+ local_lock(&lru_pvecs.lock);
+ lru_add_drain_cpu(smp_processor_id());
+ drain_local_pages(zone);
+ local_unlock(&lru_pvecs.lock);
}
#ifdef CONFIG_SMP
for_each_online_cpu(cpu) {
struct work_struct *work = &per_cpu(lru_add_drain_work, cpu);
- if (pagevec_count(&per_cpu(lru_add_pvec, cpu)) ||
- pagevec_count(&per_cpu(lru_rotate_pvecs, cpu)) ||
- pagevec_count(&per_cpu(lru_deactivate_file_pvecs, cpu)) ||
- pagevec_count(&per_cpu(lru_deactivate_pvecs, cpu)) ||
- pagevec_count(&per_cpu(lru_lazyfree_pvecs, cpu)) ||
+ if (pagevec_count(&per_cpu(lru_pvecs.lru_add, cpu)) ||
+ pagevec_count(&per_cpu(lru_rotate.pvec, cpu)) ||
+ pagevec_count(&per_cpu(lru_pvecs.lru_deactivate_file, cpu)) ||
+ pagevec_count(&per_cpu(lru_pvecs.lru_deactivate, cpu)) ||
+ pagevec_count(&per_cpu(lru_pvecs.lru_lazyfree, cpu)) ||
need_activate_page_drain(cpu)) {
INIT_WORK(work, lru_add_drain_per_cpu);
queue_work_on(cpu, mm_percpu_wq, work);
#include <linux/spinlock.h>
#include <linux/zpool.h>
#include <linux/magic.h>
+#include <linux/kmemleak.h>
/*
* NCHUNKS_ORDER determines the internal allocation granularity, effectively
(gfp & ~(__GFP_HIGHMEM | __GFP_MOVABLE)));
if (slots) {
+ /* It will be freed separately in free_handle(). */
+ kmemleak_not_leak(slots);
memset(slots->slot, 0, sizeof(slots->slot));
slots->pool = (unsigned long)pool;
rwlock_init(&slots->lock);
slots = handle_to_slots(handle);
write_lock(&slots->lock);
*(unsigned long *)handle = 0;
- write_unlock(&slots->lock);
- if (zhdr->slots == slots)
+ if (zhdr->slots == slots) {
+ write_unlock(&slots->lock);
return; /* simple case, nothing else to do */
+ }
/* we are freeing a foreign handle if we are here */
zhdr->foreign_handles--;
is_free = true;
- read_lock(&slots->lock);
if (!test_bit(HANDLES_ORPHANED, &slots->pool)) {
- read_unlock(&slots->lock);
+ write_unlock(&slots->lock);
return;
}
for (i = 0; i <= BUDDY_MASK; i++) {
break;
}
}
- read_unlock(&slots->lock);
+ write_unlock(&slots->lock);
if (is_free) {
struct z3fold_pool *pool = slots_to_pool(slots);
zhdr->start_middle = 0;
zhdr->cpu = -1;
zhdr->foreign_handles = 0;
+ zhdr->mapped_count = 0;
zhdr->slots = slots;
zhdr->pool = pool;
INIT_LIST_HEAD(&zhdr->buddy);
break;
case SO_BINDTODEVICE:
- if (optlen > IFNAMSIZ)
- optlen = IFNAMSIZ;
+ if (optlen > IFNAMSIZ - 1)
+ optlen = IFNAMSIZ - 1;
+
+ memset(devname, 0, sizeof(devname));
if (copy_from_user(devname, optval, optlen)) {
res = -EFAULT;
size_t len, u8 mac[16])
{
uint8_t tmp[16], mac_msb[16], msg_msb[CMAC_MSG_MAX];
- SHASH_DESC_ON_STACK(desc, tfm);
int err;
if (len > CMAC_MSG_MAX)
return -EINVAL;
}
- desc->tfm = tfm;
-
/* Swap key and message from LSB to MSB */
swap_buf(k, tmp, 16);
swap_buf(m, msg_msb, len);
return err;
}
- err = crypto_shash_digest(desc, msg_msb, len, mac_msb);
- shash_desc_zero(desc);
+ err = crypto_shash_tfm_digest(tfm, msg_msb, len, mac_msb);
if (err) {
BT_ERR("Hash computation error %d", err);
return err;
free_percpu(br->mcast_stats);
}
-static void mcast_stats_add_dir(u64 *dst, u64 *src)
+/* noinline for https://bugs.llvm.org/show_bug.cgi?id=45802#c9 */
+static noinline_for_stack void mcast_stats_add_dir(u64 *dst, u64 *src)
{
dst[BR_MCAST_DIR_RX] += src[BR_MCAST_DIR_RX];
dst[BR_MCAST_DIR_TX] += src[BR_MCAST_DIR_TX];
ether_addr_copy(eth->h_dest, eth_hdr(oldskb)->h_source);
eth->h_proto = eth_hdr(oldskb)->h_proto;
skb_pull(nskb, ETH_HLEN);
+
+ if (skb_vlan_tag_present(oldskb)) {
+ u16 vid = skb_vlan_tag_get(oldskb);
+
+ __vlan_hwaccel_put_tag(nskb, oldskb->vlan_proto, vid);
+ }
}
static int nft_bridge_iphdr_validate(struct sk_buff *skb)
* supported.
*/
req->r_t.target_oloc.pool = m.redirect.oloc.pool;
- req->r_flags |= CEPH_OSD_FLAG_REDIRECTED;
+ req->r_flags |= CEPH_OSD_FLAG_REDIRECTED |
+ CEPH_OSD_FLAG_IGNORE_OVERLAY |
+ CEPH_OSD_FLAG_IGNORE_CACHE;
req->r_tid = 0;
__submit_request(req, false);
goto out_unlock_osdc;
return 0;
}
-static int __netif_receive_skb_core(struct sk_buff *skb, bool pfmemalloc,
+static int __netif_receive_skb_core(struct sk_buff **pskb, bool pfmemalloc,
struct packet_type **ppt_prev)
{
struct packet_type *ptype, *pt_prev;
rx_handler_func_t *rx_handler;
+ struct sk_buff *skb = *pskb;
struct net_device *orig_dev;
bool deliver_exact = false;
int ret = NET_RX_DROP;
ret2 = do_xdp_generic(rcu_dereference(skb->dev->xdp_prog), skb);
preempt_enable();
- if (ret2 != XDP_PASS)
- return NET_RX_DROP;
+ if (ret2 != XDP_PASS) {
+ ret = NET_RX_DROP;
+ goto out;
+ }
skb_reset_mac_len(skb);
}
}
out:
+ /* The invariant here is that if *ppt_prev is not NULL
+ * then skb should also be non-NULL.
+ *
+ * Apparently *ppt_prev assignment above holds this invariant due to
+ * skb dereferencing near it.
+ */
+ *pskb = skb;
return ret;
}
struct packet_type *pt_prev = NULL;
int ret;
- ret = __netif_receive_skb_core(skb, pfmemalloc, &pt_prev);
+ ret = __netif_receive_skb_core(&skb, pfmemalloc, &pt_prev);
if (pt_prev)
ret = INDIRECT_CALL_INET(pt_prev->func, ipv6_rcv, ip_rcv, skb,
skb->dev, pt_prev, orig_dev);
struct packet_type *pt_prev = NULL;
skb_list_del_init(skb);
- __netif_receive_skb_core(skb, pfmemalloc, &pt_prev);
+ __netif_receive_skb_core(&skb, pfmemalloc, &pt_prev);
if (!pt_prev)
continue;
if (pt_curr != pt_prev || od_curr != orig_dev) {
netdev_dbg(upper, "Disabling feature %pNF on lower dev %s.\n",
&feature, lower->name);
lower->wanted_features &= ~feature;
- netdev_update_features(lower);
+ __netdev_update_features(lower);
if (unlikely(lower->features & feature))
netdev_WARN(upper, "failed to disable %pNF on %s!\n",
&feature, lower->name);
+ else
+ netdev_features_change(lower);
}
}
}
}
pop = 0;
} else if (pop >= sge->length - a) {
- sge->length = a;
pop -= (sge->length - a);
+ sge->length = a;
}
}
return ret;
}
-int skb_flow_dissector_bpf_prog_detach(const union bpf_attr *attr)
+static int flow_dissector_bpf_prog_detach(struct net *net)
{
struct bpf_prog *attached;
- struct net *net;
- net = current->nsproxy->net_ns;
mutex_lock(&flow_dissector_mutex);
attached = rcu_dereference_protected(net->flow_dissector_prog,
lockdep_is_held(&flow_dissector_mutex));
return 0;
}
+int skb_flow_dissector_bpf_prog_detach(const union bpf_attr *attr)
+{
+ return flow_dissector_bpf_prog_detach(current->nsproxy->net_ns);
+}
+
+static void __net_exit flow_dissector_pernet_pre_exit(struct net *net)
+{
+ /* We're not racing with attach/detach because there are no
+ * references to netns left when pre_exit gets called.
+ */
+ if (rcu_access_pointer(net->flow_dissector_prog))
+ flow_dissector_bpf_prog_detach(net);
+}
+
+static struct pernet_operations flow_dissector_pernet_ops __net_initdata = {
+ .pre_exit = flow_dissector_pernet_pre_exit,
+};
+
/**
* __skb_flow_get_ports - extract the upper layer ports and return them
* @skb: sk_buff to extract the ports from
skb_flow_dissector_init(&flow_keys_basic_dissector,
flow_keys_basic_dissector_keys,
ARRAY_SIZE(flow_keys_basic_dissector_keys));
- return 0;
-}
+ return register_pernet_subsys(&flow_dissector_pernet_ops);
+}
core_initcall(init_default_flow_dissectors);
}
if (neigh->nud_state & NUD_IN_TIMER) {
- if (time_before(next, jiffies + HZ/2))
- next = jiffies + HZ/2;
+ if (time_before(next, jiffies + HZ/100))
+ next = jiffies + HZ/100;
if (!mod_timer(&neigh->timer, next))
neigh_hold(neigh);
}
struct task_struct *p;
struct cgroup_subsys_state *css;
+ cgroup_sk_alloc_disable();
+
cgroup_taskset_for_each(p, css, tset) {
void *v = (void *)(unsigned long)css->id;
#include <linux/kernel.h>
#include <linux/init.h>
-#include <linux/cryptohash.h>
#include <linux/module.h>
#include <linux/cache.h>
#include <linux/random.h>
if (ds->ops->port_vlan_add && ds->ops->port_vlan_del)
slave_dev->features |= NETIF_F_HW_VLAN_CTAG_FILTER;
slave_dev->hw_features |= NETIF_F_HW_TC;
+ slave_dev->features |= NETIF_F_LLTX;
slave_dev->ethtool_ops = &dsa_slave_ethtool_ops;
if (!IS_ERR_OR_NULL(port->mac))
ether_addr_copy(slave_dev->dev_addr, port->mac);
#define MTK_HDR_XMIT_TAGGED_TPID_8100 1
#define MTK_HDR_RECV_SOURCE_PORT_MASK GENMASK(2, 0)
#define MTK_HDR_XMIT_DP_BIT_MASK GENMASK(5, 0)
+#define MTK_HDR_XMIT_SA_DIS BIT(6)
static struct sk_buff *mtk_tag_xmit(struct sk_buff *skb,
struct net_device *dev)
struct dsa_port *dp = dsa_slave_to_port(dev);
u8 *mtk_tag;
bool is_vlan_skb = true;
+ unsigned char *dest = eth_hdr(skb)->h_dest;
+ bool is_multicast_skb = is_multicast_ether_addr(dest) &&
+ !is_broadcast_ether_addr(dest);
/* Build the special tag after the MAC Source Address. If VLAN header
* is present, it's required that VLAN header and special tag is
MTK_HDR_XMIT_UNTAGGED;
mtk_tag[1] = (1 << dp->index) & MTK_HDR_XMIT_DP_BIT_MASK;
+ /* Disable SA learning for multicast frames */
+ if (unlikely(is_multicast_skb))
+ mtk_tag[1] |= MTK_HDR_XMIT_SA_DIS;
+
/* Tag control information is kept for 802.1Q */
if (!is_vlan_skb) {
mtk_tag[2] = 0;
{
int port;
__be16 *phdr, hdr;
+ unsigned char *dest = eth_hdr(skb)->h_dest;
+ bool is_multicast_skb = is_multicast_ether_addr(dest) &&
+ !is_broadcast_ether_addr(dest);
if (unlikely(!pskb_may_pull(skb, MTK_HDR_LEN)))
return NULL;
if (!skb->dev)
return NULL;
+ /* Only unicast or broadcast frames are offloaded */
+ if (likely(!is_multicast_skb))
+ skb->offload_fwd_mark = 1;
+
return skb;
}
ret = ops->reply_size(req_info, reply_data);
if (ret < 0)
goto err_cleanup;
- reply_len = ret;
+ reply_len = ret + ethnl_reply_header_size();
ret = -ENOMEM;
rskb = ethnl_reply_init(reply_len, req_info->dev, ops->reply_cmd,
ops->hdr_attr, info, &reply_payload);
ret = ops->reply_size(req_info, reply_data);
if (ret < 0)
goto err_cleanup;
- reply_len = ret;
+ reply_len = ret + ethnl_reply_header_size();
ret = -ENOMEM;
skb = genlmsg_new(reply_len, GFP_KERNEL);
if (!skb)
int len = 0;
int ret;
- len += ethnl_reply_header_size();
for (i = 0; i < ETH_SS_COUNT; i++) {
const struct strset_info *set_info = &data->sets[i];
return ret_val;
}
- secattr->flags |= NETLBL_SECATTR_MLS_CAT;
+ if (secattr->attr.mls.cat)
+ secattr->flags |= NETLBL_SECATTR_MLS_CAT;
}
return 0;
return ret_val;
}
- secattr->flags |= NETLBL_SECATTR_MLS_CAT;
+ if (secattr->attr.mls.cat)
+ secattr->flags |= NETLBL_SECATTR_MLS_CAT;
}
return 0;
err = devinet_sysctl_register(in_dev);
if (err) {
in_dev->dead = 1;
+ neigh_parms_release(&arp_tbl, in_dev->arp_parms);
in_dev_put(in_dev);
in_dev = NULL;
goto out;
sp->olen++;
xo = xfrm_offload(skb);
- if (!xo) {
- xfrm_state_put(x);
+ if (!xo)
goto out_reset;
- }
}
xo->flags |= XFRM_GRO;
struct xfrm_offload *xo = xfrm_offload(skb);
struct sk_buff *segs = ERR_PTR(-EINVAL);
const struct net_offload *ops;
- int proto = xo->proto;
+ u8 proto = xo->proto;
skb->transport_header += x->props.header_len;
- if (proto == IPPROTO_BEETPH) {
- struct ip_beet_phdr *ph = (struct ip_beet_phdr *)skb->data;
+ if (x->sel.family != AF_INET6) {
+ if (proto == IPPROTO_BEETPH) {
+ struct ip_beet_phdr *ph =
+ (struct ip_beet_phdr *)skb->data;
+
+ skb->transport_header += ph->hdrlen * 8;
+ proto = ph->nexthdr;
+ } else {
+ skb->transport_header -= IPV4_BEET_PHMAXLEN;
+ }
+ } else {
+ __be16 frag;
- skb->transport_header += ph->hdrlen * 8;
- proto = ph->nexthdr;
- } else if (x->sel.family != AF_INET6) {
- skb->transport_header -= IPV4_BEET_PHMAXLEN;
- } else if (proto == IPPROTO_TCP) {
- skb_shinfo(skb)->gso_type |= SKB_GSO_TCPV4;
+ skb->transport_header +=
+ ipv6_skip_exthdr(skb, 0, &proto, &frag);
+ if (proto == IPPROTO_TCP)
+ skb_shinfo(skb)->gso_type |= SKB_GSO_TCPV4;
}
__skb_pull(skb, skb_transport_offset(skb));
{
bool dev_match = false;
#ifdef CONFIG_IP_ROUTE_MULTIPATH
- int ret;
+ if (unlikely(fi->nh)) {
+ dev_match = nexthop_uses_dev(fi->nh, dev);
+ } else {
+ int ret;
- for (ret = 0; ret < fib_info_num_path(fi); ret++) {
- const struct fib_nh_common *nhc = fib_info_nhc(fi, ret);
+ for (ret = 0; ret < fib_info_num_path(fi); ret++) {
+ const struct fib_nh_common *nhc = fib_info_nhc(fi, ret);
- if (nhc->nhc_dev == dev) {
- dev_match = true;
- break;
- } else if (l3mdev_master_ifindex_rcu(nhc->nhc_dev) == dev->ifindex) {
- dev_match = true;
- break;
+ if (nhc_l3mdev_matches_dev(nhc, dev)) {
+ dev_match = true;
+ break;
+ }
}
}
#else
else
filter->dump_exceptions = false;
- filter->dump_all_families = (rtm->rtm_family == AF_UNSPEC);
filter->flags = rtm->rtm_flags;
filter->protocol = rtm->rtm_protocol;
filter->rt_type = rtm->rtm_type;
if (filter.table_id) {
tb = fib_get_table(net, filter.table_id);
if (!tb) {
- if (filter.dump_all_families)
+ if (rtnl_msg_family(cb->nlh) != PF_INET)
return skb->len;
NL_SET_ERR_MSG(cb->extack, "ipv4: FIB table does not exist");
return (key ^ prefix) & (prefix | -prefix);
}
+bool fib_lookup_good_nhc(const struct fib_nh_common *nhc, int fib_flags,
+ const struct flowi4 *flp)
+{
+ if (nhc->nhc_flags & RTNH_F_DEAD)
+ return false;
+
+ if (ip_ignore_linkdown(nhc->nhc_dev) &&
+ nhc->nhc_flags & RTNH_F_LINKDOWN &&
+ !(fib_flags & FIB_LOOKUP_IGNORE_LINKSTATE))
+ return false;
+
+ if (!(flp->flowi4_flags & FLOWI_FLAG_SKIP_NH_OIF)) {
+ if (flp->flowi4_oif &&
+ flp->flowi4_oif != nhc->nhc_oif)
+ return false;
+ }
+
+ return true;
+}
+
/* should be called with rcu_read_lock */
int fib_table_lookup(struct fib_table *tb, const struct flowi4 *flp,
struct fib_result *res, int fib_flags)
/* Step 3: Process the leaf, if that fails fall back to backtracing */
hlist_for_each_entry_rcu(fa, &n->leaf, fa_list) {
struct fib_info *fi = fa->fa_info;
+ struct fib_nh_common *nhc;
int nhsel, err;
if ((BITS_PER_LONG > KEYLENGTH) || (fa->fa_slen < KEYLENGTH)) {
if (fi->fib_flags & RTNH_F_DEAD)
continue;
- if (unlikely(fi->nh && nexthop_is_blackhole(fi->nh))) {
- err = fib_props[RTN_BLACKHOLE].error;
- goto out_reject;
+ if (unlikely(fi->nh)) {
+ if (nexthop_is_blackhole(fi->nh)) {
+ err = fib_props[RTN_BLACKHOLE].error;
+ goto out_reject;
+ }
+
+ nhc = nexthop_get_nhc_lookup(fi->nh, fib_flags, flp,
+ &nhsel);
+ if (nhc)
+ goto set_result;
+ goto miss;
}
for (nhsel = 0; nhsel < fib_info_num_path(fi); nhsel++) {
- struct fib_nh_common *nhc = fib_info_nhc(fi, nhsel);
+ nhc = fib_info_nhc(fi, nhsel);
- if (nhc->nhc_flags & RTNH_F_DEAD)
+ if (!fib_lookup_good_nhc(nhc, fib_flags, flp))
continue;
- if (ip_ignore_linkdown(nhc->nhc_dev) &&
- nhc->nhc_flags & RTNH_F_LINKDOWN &&
- !(fib_flags & FIB_LOOKUP_IGNORE_LINKSTATE))
- continue;
- if (!(flp->flowi4_flags & FLOWI_FLAG_SKIP_NH_OIF)) {
- if (flp->flowi4_oif &&
- flp->flowi4_oif != nhc->nhc_oif)
- continue;
- }
-
+set_result:
if (!(fib_flags & FIB_LOOKUP_NOREF))
refcount_inc(&fi->fib_clntref);
return err;
}
}
+miss:
#ifdef CONFIG_IP_FIB_TRIE_STATS
this_cpu_inc(stats->semantic_match_miss);
#endif
#include <net/addrconf.h>
#if IS_ENABLED(CONFIG_IPV6)
-/* match_wildcard == true: IPV6_ADDR_ANY equals to any IPv6 addresses if IPv6
- * only, and any IPv4 addresses if not IPv6 only
- * match_wildcard == false: addresses must be exactly the same, i.e.
- * IPV6_ADDR_ANY only equals to IPV6_ADDR_ANY,
- * and 0.0.0.0 equals to 0.0.0.0 only
+/* match_sk*_wildcard == true: IPV6_ADDR_ANY equals to any IPv6 addresses
+ * if IPv6 only, and any IPv4 addresses
+ * if not IPv6 only
+ * match_sk*_wildcard == false: addresses must be exactly the same, i.e.
+ * IPV6_ADDR_ANY only equals to IPV6_ADDR_ANY,
+ * and 0.0.0.0 equals to 0.0.0.0 only
*/
static bool ipv6_rcv_saddr_equal(const struct in6_addr *sk1_rcv_saddr6,
const struct in6_addr *sk2_rcv_saddr6,
__be32 sk1_rcv_saddr, __be32 sk2_rcv_saddr,
bool sk1_ipv6only, bool sk2_ipv6only,
- bool match_wildcard)
+ bool match_sk1_wildcard,
+ bool match_sk2_wildcard)
{
int addr_type = ipv6_addr_type(sk1_rcv_saddr6);
int addr_type2 = sk2_rcv_saddr6 ? ipv6_addr_type(sk2_rcv_saddr6) : IPV6_ADDR_MAPPED;
if (!sk2_ipv6only) {
if (sk1_rcv_saddr == sk2_rcv_saddr)
return true;
- if (!sk1_rcv_saddr || !sk2_rcv_saddr)
- return match_wildcard;
+ return (match_sk1_wildcard && !sk1_rcv_saddr) ||
+ (match_sk2_wildcard && !sk2_rcv_saddr);
}
return false;
}
if (addr_type == IPV6_ADDR_ANY && addr_type2 == IPV6_ADDR_ANY)
return true;
- if (addr_type2 == IPV6_ADDR_ANY && match_wildcard &&
+ if (addr_type2 == IPV6_ADDR_ANY && match_sk2_wildcard &&
!(sk2_ipv6only && addr_type == IPV6_ADDR_MAPPED))
return true;
- if (addr_type == IPV6_ADDR_ANY && match_wildcard &&
+ if (addr_type == IPV6_ADDR_ANY && match_sk1_wildcard &&
!(sk1_ipv6only && addr_type2 == IPV6_ADDR_MAPPED))
return true;
}
#endif
-/* match_wildcard == true: 0.0.0.0 equals to any IPv4 addresses
- * match_wildcard == false: addresses must be exactly the same, i.e.
- * 0.0.0.0 only equals to 0.0.0.0
+/* match_sk*_wildcard == true: 0.0.0.0 equals to any IPv4 addresses
+ * match_sk*_wildcard == false: addresses must be exactly the same, i.e.
+ * 0.0.0.0 only equals to 0.0.0.0
*/
static bool ipv4_rcv_saddr_equal(__be32 sk1_rcv_saddr, __be32 sk2_rcv_saddr,
- bool sk2_ipv6only, bool match_wildcard)
+ bool sk2_ipv6only, bool match_sk1_wildcard,
+ bool match_sk2_wildcard)
{
if (!sk2_ipv6only) {
if (sk1_rcv_saddr == sk2_rcv_saddr)
return true;
- if (!sk1_rcv_saddr || !sk2_rcv_saddr)
- return match_wildcard;
+ return (match_sk1_wildcard && !sk1_rcv_saddr) ||
+ (match_sk2_wildcard && !sk2_rcv_saddr);
}
return false;
}
sk2->sk_rcv_saddr,
ipv6_only_sock(sk),
ipv6_only_sock(sk2),
+ match_wildcard,
match_wildcard);
#endif
return ipv4_rcv_saddr_equal(sk->sk_rcv_saddr, sk2->sk_rcv_saddr,
- ipv6_only_sock(sk2), match_wildcard);
+ ipv6_only_sock(sk2), match_wildcard,
+ match_wildcard);
}
EXPORT_SYMBOL(inet_rcv_saddr_equal);
tb->fast_rcv_saddr,
sk->sk_rcv_saddr,
tb->fast_ipv6_only,
- ipv6_only_sock(sk), true);
+ ipv6_only_sock(sk), true, false);
#endif
return ipv4_rcv_saddr_equal(tb->fast_rcv_saddr, sk->sk_rcv_saddr,
- ipv6_only_sock(sk), true);
+ ipv6_only_sock(sk), true, false);
}
/* Obtain a reference to a local port for the given sock,
static int vti_rcv_tunnel(struct sk_buff *skb)
{
- return vti_rcv(skb, ip_hdr(skb)->saddr, true);
+ struct ip_tunnel_net *itn = net_generic(dev_net(skb->dev), vti_net_id);
+ const struct iphdr *iph = ip_hdr(skb);
+ struct ip_tunnel *tunnel;
+
+ tunnel = ip_tunnel_lookup(itn, skb->dev->ifindex, TUNNEL_NO_KEY,
+ iph->saddr, iph->daddr, 0);
+ if (tunnel) {
+ struct tnl_ptk_info tpi = {
+ .proto = htons(ETH_P_IP),
+ };
+
+ if (!xfrm4_policy_check(NULL, XFRM_POLICY_IN, skb))
+ goto drop;
+ if (iptunnel_pull_header(skb, 0, tpi.proto, false))
+ goto drop;
+ return ip_tunnel_rcv(tunnel, skb, &tpi, NULL, false);
+ }
+
+ return -EINVAL;
+drop:
+ kfree_skb(skb);
+ return 0;
}
static int vti_rcv_cb(struct sk_buff *skb, int err)
rtnl_link_failed:
#if IS_ENABLED(CONFIG_MPLS)
- xfrm4_tunnel_deregister(&mplsip_handler, AF_INET);
+ xfrm4_tunnel_deregister(&mplsip_handler, AF_MPLS);
xfrm_tunnel_mplsip_failed:
#endif
static void ipmr_expire_process(struct timer_list *t);
#ifdef CONFIG_IP_MROUTE_MULTIPLE_TABLES
-#define ipmr_for_each_table(mrt, net) \
- list_for_each_entry_rcu(mrt, &net->ipv4.mr_tables, list)
+#define ipmr_for_each_table(mrt, net) \
+ list_for_each_entry_rcu(mrt, &net->ipv4.mr_tables, list, \
+ lockdep_rtnl_is_held() || \
+ list_empty(&net->ipv4.mr_tables))
static struct mr_table *ipmr_mr_table_iter(struct net *net,
struct mr_table *mrt)
mrt = ipmr_get_table(sock_net(skb->sk), filter.table_id);
if (!mrt) {
- if (filter.dump_all_families)
+ if (rtnl_msg_family(cb->nlh) != RTNL_FAMILY_IPMR)
return skb->len;
NL_SET_ERR_MSG(cb->extack, "ipv4: MR table does not exist");
break;
default:
pr_debug("unknown outbound packet 0x%04x:%s\n", msg,
- msg <= PPTP_MSG_MAX ? pptp_msg_name[msg] :
- pptp_msg_name[0]);
+ pptp_msg_name(msg));
fallthrough;
case PPTP_SET_LINK_INFO:
/* only need to NAT in case PAC is behind NAT box */
pcid_off = offsetof(union pptp_ctrl_union, setlink.peersCallID);
break;
default:
- pr_debug("unknown inbound packet %s\n",
- msg <= PPTP_MSG_MAX ? pptp_msg_name[msg] :
- pptp_msg_name[0]);
+ pr_debug("unknown inbound packet %s\n", pptp_msg_name(msg));
fallthrough;
case PPTP_START_SESSION_REQUEST:
case PPTP_START_SESSION_REPLY:
int i;
nhg = rcu_dereference_raw(nh->nh_grp);
- for (i = 0; i < nhg->num_nh; ++i)
- WARN_ON(nhg->nh_entries[i].nh);
+ for (i = 0; i < nhg->num_nh; ++i) {
+ struct nh_grp_entry *nhge = &nhg->nh_entries[i];
+
+ WARN_ON(!list_empty(&nhge->nh_list));
+ nexthop_put(nhge->nh);
+ }
+
+ WARN_ON(nhg->spare == nhg);
+ kfree(nhg->spare);
kfree(nhg);
}
return 0;
nla_put_failure:
+ nlmsg_cancel(skb, nlh);
return -EMSGSIZE;
}
if (!valid_group_nh(nh, len, extack))
return -EINVAL;
}
- for (i = NHA_GROUP + 1; i < __NHA_MAX; ++i) {
+ for (i = NHA_GROUP_TYPE + 1; i < __NHA_MAX; ++i) {
if (!tb[i])
continue;
}
}
-static void remove_nh_grp_entry(struct nh_grp_entry *nhge,
- struct nh_group *nhg,
+static void remove_nh_grp_entry(struct net *net, struct nh_grp_entry *nhge,
struct nl_info *nlinfo)
{
+ struct nh_grp_entry *nhges, *new_nhges;
+ struct nexthop *nhp = nhge->nh_parent;
struct nexthop *nh = nhge->nh;
- struct nh_grp_entry *nhges;
- bool found = false;
- int i;
+ struct nh_group *nhg, *newg;
+ int i, j;
WARN_ON(!nh);
- nhges = nhg->nh_entries;
- for (i = 0; i < nhg->num_nh; ++i) {
- if (found) {
- nhges[i-1].nh = nhges[i].nh;
- nhges[i-1].weight = nhges[i].weight;
- list_del(&nhges[i].nh_list);
- list_add(&nhges[i-1].nh_list, &nhges[i-1].nh->grp_list);
- } else if (nhg->nh_entries[i].nh == nh) {
- found = true;
- }
- }
+ nhg = rtnl_dereference(nhp->nh_grp);
+ newg = nhg->spare;
- if (WARN_ON(!found))
+ /* last entry, keep it visible and remove the parent */
+ if (nhg->num_nh == 1) {
+ remove_nexthop(net, nhp, nlinfo);
return;
+ }
+
+ newg->has_v4 = nhg->has_v4;
+ newg->mpath = nhg->mpath;
+ newg->num_nh = nhg->num_nh;
- nhg->num_nh--;
- nhg->nh_entries[nhg->num_nh].nh = NULL;
+ /* copy old entries to new except the one getting removed */
+ nhges = nhg->nh_entries;
+ new_nhges = newg->nh_entries;
+ for (i = 0, j = 0; i < nhg->num_nh; ++i) {
+ /* current nexthop getting removed */
+ if (nhg->nh_entries[i].nh == nh) {
+ newg->num_nh--;
+ continue;
+ }
- nh_group_rebalance(nhg);
+ list_del(&nhges[i].nh_list);
+ new_nhges[j].nh_parent = nhges[i].nh_parent;
+ new_nhges[j].nh = nhges[i].nh;
+ new_nhges[j].weight = nhges[i].weight;
+ list_add(&new_nhges[j].nh_list, &new_nhges[j].nh->grp_list);
+ j++;
+ }
- nexthop_put(nh);
+ nh_group_rebalance(newg);
+ rcu_assign_pointer(nhp->nh_grp, newg);
+
+ list_del(&nhge->nh_list);
+ nexthop_put(nhge->nh);
if (nlinfo)
- nexthop_notify(RTM_NEWNEXTHOP, nhge->nh_parent, nlinfo);
+ nexthop_notify(RTM_NEWNEXTHOP, nhp, nlinfo);
}
static void remove_nexthop_from_groups(struct net *net, struct nexthop *nh,
{
struct nh_grp_entry *nhge, *tmp;
- list_for_each_entry_safe(nhge, tmp, &nh->grp_list, nh_list) {
- struct nh_group *nhg;
-
- list_del(&nhge->nh_list);
- nhg = rtnl_dereference(nhge->nh_parent->nh_grp);
- remove_nh_grp_entry(nhge, nhg, nlinfo);
+ list_for_each_entry_safe(nhge, tmp, &nh->grp_list, nh_list)
+ remove_nh_grp_entry(net, nhge, nlinfo);
- /* if this group has no more entries then remove it */
- if (!nhg->num_nh)
- remove_nexthop(net, nhge->nh_parent, nlinfo);
- }
+ /* make sure all see the newly published array before releasing rtnl */
+ synchronize_rcu();
}
static void remove_nexthop_group(struct nexthop *nh, struct nl_info *nlinfo)
if (WARN_ON(!nhge->nh))
continue;
- list_del(&nhge->nh_list);
- nexthop_put(nhge->nh);
- nhge->nh = NULL;
- nhg->num_nh--;
+ list_del_init(&nhge->nh_list);
}
}
{
struct nlattr *grps_attr = cfg->nh_grp;
struct nexthop_grp *entry = nla_data(grps_attr);
+ u16 num_nh = nla_len(grps_attr) / sizeof(*entry);
struct nh_group *nhg;
struct nexthop *nh;
int i;
nh->is_group = 1;
- nhg = nexthop_grp_alloc(nla_len(grps_attr) / sizeof(*entry));
+ nhg = nexthop_grp_alloc(num_nh);
if (!nhg) {
kfree(nh);
return ERR_PTR(-ENOMEM);
}
+ /* spare group used for removals */
+ nhg->spare = nexthop_grp_alloc(num_nh);
+ if (!nhg) {
+ kfree(nhg);
+ kfree(nh);
+ return NULL;
+ }
+ nhg->spare->spare = nhg;
+
for (i = 0; i < nhg->num_nh; ++i) {
struct nexthop *nhe;
struct nh_info *nhi;
for (; i >= 0; --i)
nexthop_put(nhg->nh_entries[i].nh);
+ kfree(nhg->spare);
kfree(nhg);
kfree(nh);
atomic_t *p_id = ip_idents + hash % IP_IDENTS_SZ;
u32 old = READ_ONCE(*p_tstamp);
u32 now = (u32)jiffies;
- u32 new, delta = 0;
+ u32 delta = 0;
if (old != now && cmpxchg(p_tstamp, old, now) == old)
delta = prandom_u32_max(now - old);
- /* Do not use atomic_add_return() as it makes UBSAN unhappy */
- do {
- old = (u32)atomic_read(p_id);
- new = old + delta + segs;
- } while (atomic_cmpxchg(p_id, old, new) != old);
-
- return new - segs;
+ /* If UBSAN reports an error there, please make sure your compiler
+ * supports -fno-strict-overflow before reporting it that was a bug
+ * in UBSAN, and it has been fixed in GCC-8.
+ */
+ return atomic_add_return(segs + delta, p_id) - segs;
}
EXPORT_SYMBOL(ip_idents_reserve);
/* Check for load limit; set rate_last to the latest sent
* redirect.
*/
- if (peer->rate_tokens == 0 ||
+ if (peer->n_redirects == 0 ||
time_after(jiffies,
(peer->rate_last +
(ip_rt_redirect_load << peer->n_redirects)))) {
static inline bool tcp_stream_is_readable(const struct tcp_sock *tp,
int target, struct sock *sk)
{
- return (READ_ONCE(tp->rcv_nxt) - READ_ONCE(tp->copied_seq) >= target) ||
- (sk->sk_prot->stream_memory_read ?
- sk->sk_prot->stream_memory_read(sk) : false);
+ int avail = READ_ONCE(tp->rcv_nxt) - READ_ONCE(tp->copied_seq);
+
+ if (avail > 0) {
+ if (avail >= target)
+ return true;
+ if (tcp_rmem_pressure(sk))
+ return true;
+ }
+ if (sk->sk_prot->stream_memory_read)
+ return sk->sk_prot->stream_memory_read(sk);
+ return false;
}
/*
down_read(¤t->mm->mmap_sem);
- ret = -EINVAL;
vma = find_vma(current->mm, address);
- if (!vma || vma->vm_start > address || vma->vm_ops != &tcp_vm_ops)
- goto out;
+ if (!vma || vma->vm_start > address || vma->vm_ops != &tcp_vm_ops) {
+ up_read(¤t->mm->mmap_sem);
+ return -EINVAL;
+ }
zc->length = min_t(unsigned long, zc->length, vma->vm_end - address);
tp = tcp_sk(sk);
tp->urg_data = 0;
tcp_fast_path_check(sk);
}
- if (used + offset < skb->len)
- continue;
if (TCP_SKB_CB(skb)->has_rxtstamp) {
tcp_update_recv_tstamps(skb, &tss);
cmsg_flags |= 2;
}
+
+ if (used + offset < skb->len)
+ continue;
+
if (TCP_SKB_CB(skb)->tcp_flags & TCPHDR_FIN)
goto found_fin_ok;
if (!(flags & MSG_PEEK))
if (!ret) {
msg->sg.start = i;
- msg->sg.size -= apply_bytes;
sk_psock_queue_msg(psock, tmp);
sk_psock_data_ready(sk, psock);
} else {
struct sk_psock *psock;
int copied, ret;
+ if (unlikely(flags & MSG_ERRQUEUE))
+ return inet_recv_error(sk, msg, len, addr_len);
+
psock = sk_psock_get(sk);
if (unlikely(!psock))
return tcp_recvmsg(sk, msg, len, nonblock, flags, addr_len);
- if (unlikely(flags & MSG_ERRQUEUE))
- return inet_recv_error(sk, msg, len, addr_len);
if (!skb_queue_empty(&sk->sk_receive_queue) &&
- sk_psock_queue_empty(psock))
+ sk_psock_queue_empty(psock)) {
+ sk_psock_put(sk, psock);
return tcp_recvmsg(sk, msg, len, nonblock, flags, addr_len);
+ }
lock_sock(sk);
msg_bytes_ready:
copied = __tcp_bpf_recvmsg(sk, psock, msg, len, flags);
const struct tcp_sock *tp = tcp_sk(sk);
int avail = tp->rcv_nxt - tp->copied_seq;
- if (avail < sk->sk_rcvlowat && !sock_flag(sk, SOCK_DONE))
+ if (avail < sk->sk_rcvlowat && !tcp_rmem_pressure(sk) &&
+ !sock_flag(sk, SOCK_DONE))
return;
sk->sk_data_ready(sk);
const struct inet6_dev *idev)
{
static DEFINE_SPINLOCK(lock);
- static __u32 digest[SHA_DIGEST_WORDS];
- static __u32 workspace[SHA_WORKSPACE_WORDS];
+ static __u32 digest[SHA1_DIGEST_WORDS];
+ static __u32 workspace[SHA1_WORKSPACE_WORDS];
static union {
- char __data[SHA_MESSAGE_BYTES];
+ char __data[SHA1_BLOCK_SIZE];
struct {
struct in6_addr secret;
__be32 prefix[2];
retry:
spin_lock_bh(&lock);
- sha_init(digest);
+ sha1_init(digest);
memset(&data, 0, sizeof(data));
memset(workspace, 0, sizeof(workspace));
memcpy(data.hwaddr, idev->dev->perm_addr, idev->dev->addr_len);
data.secret = secret;
data.dad_count = dad_count;
- sha_transform(digest, data.__data, workspace);
+ sha1_transform(digest, data.__data, workspace);
temp = *address;
temp.s6_addr32[2] = (__force __be32)digest[0];
goto getattr_return;
}
- secattr->flags |= NETLBL_SECATTR_MLS_CAT;
+ if (secattr->attr.mls.cat)
+ secattr->flags |= NETLBL_SECATTR_MLS_CAT;
}
secattr->type = NETLBL_NLTYPE_CALIPSO;
sp->olen++;
xo = xfrm_offload(skb);
- if (!xo) {
- xfrm_state_put(x);
+ if (!xo)
goto out_reset;
- }
}
xo->flags |= XFRM_GRO;
struct ip_esp_hdr *esph;
struct ipv6hdr *iph = ipv6_hdr(skb);
struct xfrm_offload *xo = xfrm_offload(skb);
- int proto = iph->nexthdr;
+ u8 proto = iph->nexthdr;
skb_push(skb, -skb_network_offset(skb));
+
+ if (x->outer_mode.encap == XFRM_MODE_TRANSPORT) {
+ __be16 frag;
+
+ ipv6_skip_exthdr(skb, sizeof(struct ipv6hdr), &proto, &frag);
+ }
+
esph = ip_esp_hdr(skb);
*skb_mac_header(skb) = IPPROTO_ESP;
struct xfrm_offload *xo = xfrm_offload(skb);
struct sk_buff *segs = ERR_PTR(-EINVAL);
const struct net_offload *ops;
- int proto = xo->proto;
+ u8 proto = xo->proto;
skb->transport_header += x->props.header_len;
- if (proto == IPPROTO_BEETPH) {
- struct ip_beet_phdr *ph = (struct ip_beet_phdr *)skb->data;
-
- skb->transport_header += ph->hdrlen * 8;
- proto = ph->nexthdr;
- }
-
if (x->sel.family != AF_INET6) {
skb->transport_header -=
(sizeof(struct ipv6hdr) - sizeof(struct iphdr));
+ if (proto == IPPROTO_BEETPH) {
+ struct ip_beet_phdr *ph =
+ (struct ip_beet_phdr *)skb->data;
+
+ skb->transport_header += ph->hdrlen * 8;
+ proto = ph->nexthdr;
+ } else {
+ skb->transport_header -= IPV4_BEET_PHMAXLEN;
+ }
+
if (proto == IPPROTO_TCP)
skb_shinfo(skb)->gso_type |= SKB_GSO_TCPV6;
+ } else {
+ __be16 frag;
+
+ skb->transport_header +=
+ ipv6_skip_exthdr(skb, 0, &proto, &frag);
}
__skb_pull(skb, skb_transport_offset(skb));
if (arg.filter.table_id) {
tb = fib6_get_table(net, arg.filter.table_id);
if (!tb) {
- if (arg.filter.dump_all_families)
+ if (rtnl_msg_family(cb->nlh) != PF_INET6)
goto out;
NL_SET_ERR_MSG_MOD(cb->extack, "FIB table does not exist");
#ifdef CONFIG_IPV6_MROUTE_MULTIPLE_TABLES
#define ip6mr_for_each_table(mrt, net) \
list_for_each_entry_rcu(mrt, &net->ipv6.mr6_tables, list, \
- lockdep_rtnl_is_held())
+ lockdep_rtnl_is_held() || \
+ list_empty(&net->ipv6.mr6_tables))
static struct mr_table *ip6mr_mr_table_iter(struct net *net,
struct mr_table *mrt)
mrt = ip6mr_get_table(sock_net(skb->sk), filter.table_id);
if (!mrt) {
- if (filter.dump_all_families)
+ if (rtnl_msg_family(cb->nlh) != RTNL_FAMILY_IP6MR)
return skb->len;
NL_SET_ERR_MSG_MOD(cb->extack, "MR table does not exist");
const struct in6_addr *daddr, *saddr;
struct rt6_info *rt6 = (struct rt6_info *)dst;
- if (dst_metric_locked(dst, RTAX_MTU))
- return;
+ /* Note: do *NOT* check dst_metric_locked(dst, RTAX_MTU)
+ * IPv6 pmtu discovery isn't optional, so 'mtu lock' cannot disable it.
+ * [see also comment in rt6_mtu_change_route()]
+ */
if (iph) {
daddr = &iph->daddr;
#include <net/addrconf.h>
#include <net/xfrm.h>
-#include <linux/cryptohash.h>
#include <crypto/hash.h>
#include <crypto/sha.h>
#include <net/seg6.h>
if (sk->sk_type != SOCK_DGRAM)
return -EPROTONOSUPPORT;
+ if (sk->sk_family != PF_INET && sk->sk_family != PF_INET6)
+ return -EPROTONOSUPPORT;
+
if ((encap == L2TP_ENCAPTYPE_UDP && sk->sk_protocol != IPPROTO_UDP) ||
(encap == L2TP_ENCAPTYPE_IP && sk->sk_protocol != IPPROTO_L2TP))
return -EPROTONOSUPPORT;
#include <net/icmp.h>
#include <net/udp.h>
#include <net/inet_common.h>
-#include <net/inet_hashtables.h>
#include <net/tcp_states.h>
#include <net/protocol.h>
#include <net/xfrm.h>
return 0;
}
-static int l2tp_ip_open(struct sock *sk)
+static int l2tp_ip_hash(struct sock *sk)
{
- /* Prevent autobind. We don't have ports. */
- inet_sk(sk)->inet_num = IPPROTO_L2TP;
+ if (sk_unhashed(sk)) {
+ write_lock_bh(&l2tp_ip_lock);
+ sk_add_node(sk, &l2tp_ip_table);
+ write_unlock_bh(&l2tp_ip_lock);
+ }
+ return 0;
+}
+static void l2tp_ip_unhash(struct sock *sk)
+{
+ if (sk_unhashed(sk))
+ return;
write_lock_bh(&l2tp_ip_lock);
- sk_add_node(sk, &l2tp_ip_table);
+ sk_del_node_init(sk);
write_unlock_bh(&l2tp_ip_lock);
+}
+
+static int l2tp_ip_open(struct sock *sk)
+{
+ /* Prevent autobind. We don't have ports. */
+ inet_sk(sk)->inet_num = IPPROTO_L2TP;
+ l2tp_ip_hash(sk);
return 0;
}
.sendmsg = l2tp_ip_sendmsg,
.recvmsg = l2tp_ip_recvmsg,
.backlog_rcv = l2tp_ip_backlog_recv,
- .hash = inet_hash,
- .unhash = inet_unhash,
+ .hash = l2tp_ip_hash,
+ .unhash = l2tp_ip_unhash,
.obj_size = sizeof(struct l2tp_ip_sock),
#ifdef CONFIG_COMPAT
.compat_setsockopt = compat_ip_setsockopt,
#include <net/icmp.h>
#include <net/udp.h>
#include <net/inet_common.h>
-#include <net/inet_hashtables.h>
-#include <net/inet6_hashtables.h>
#include <net/tcp_states.h>
#include <net/protocol.h>
#include <net/xfrm.h>
return 0;
}
-static int l2tp_ip6_open(struct sock *sk)
+static int l2tp_ip6_hash(struct sock *sk)
{
- /* Prevent autobind. We don't have ports. */
- inet_sk(sk)->inet_num = IPPROTO_L2TP;
+ if (sk_unhashed(sk)) {
+ write_lock_bh(&l2tp_ip6_lock);
+ sk_add_node(sk, &l2tp_ip6_table);
+ write_unlock_bh(&l2tp_ip6_lock);
+ }
+ return 0;
+}
+static void l2tp_ip6_unhash(struct sock *sk)
+{
+ if (sk_unhashed(sk))
+ return;
write_lock_bh(&l2tp_ip6_lock);
- sk_add_node(sk, &l2tp_ip6_table);
+ sk_del_node_init(sk);
write_unlock_bh(&l2tp_ip6_lock);
+}
+
+static int l2tp_ip6_open(struct sock *sk)
+{
+ /* Prevent autobind. We don't have ports. */
+ inet_sk(sk)->inet_num = IPPROTO_L2TP;
+ l2tp_ip6_hash(sk);
return 0;
}
.sendmsg = l2tp_ip6_sendmsg,
.recvmsg = l2tp_ip6_recvmsg,
.backlog_rcv = l2tp_ip6_backlog_recv,
- .hash = inet6_hash,
- .unhash = inet_unhash,
+ .hash = l2tp_ip6_hash,
+ .unhash = l2tp_ip6_unhash,
.obj_size = sizeof(struct l2tp_ip6_sock),
#ifdef CONFIG_COMPAT
.compat_setsockopt = compat_ipv6_setsockopt,
mesh_path_sel_frame_tx(MPATH_PREQ, 0, sdata->vif.addr, ifmsh->sn,
target_flags, mpath->dst, mpath->sn, da, 0,
ttl, lifetime, 0, ifmsh->preq_id++, sdata);
+
+ spin_lock_bh(&mpath->state_lock);
+ if (mpath->flags & MESH_PATH_DELETED) {
+ spin_unlock_bh(&mpath->state_lock);
+ goto enddiscovery;
+ }
mod_timer(&mpath->timer, jiffies + mpath->discovery_timeout);
+ spin_unlock_bh(&mpath->state_lock);
enddiscovery:
rcu_read_unlock();
void mptcp_crypto_hmac_sha(u64 key1, u64 key2, u8 *msg, int len, void *hmac)
{
u8 input[SHA256_BLOCK_SIZE + SHA256_DIGEST_SIZE];
- __be32 mptcp_hashed_key[SHA256_DIGEST_WORDS];
- __be32 *hash_out = (__force __be32 *)hmac;
struct sha256_state state;
u8 key1be[8];
u8 key2be[8];
put_unaligned_be64(key2, key2be);
/* Generate key xored with ipad */
- memset(input, 0x36, SHA_MESSAGE_BYTES);
+ memset(input, 0x36, SHA256_BLOCK_SIZE);
for (i = 0; i < 8; i++)
input[i] ^= key1be[i];
for (i = 0; i < 8; i++)
sha256_final(&state, &input[SHA256_BLOCK_SIZE]);
/* Prepare second part of hmac */
- memset(input, 0x5C, SHA_MESSAGE_BYTES);
+ memset(input, 0x5C, SHA256_BLOCK_SIZE);
for (i = 0; i < 8; i++)
input[i] ^= key1be[i];
for (i = 0; i < 8; i++)
sha256_init(&state);
sha256_update(&state, input, SHA256_BLOCK_SIZE + SHA256_DIGEST_SIZE);
- sha256_final(&state, (u8 *)mptcp_hashed_key);
-
- /* takes only first 160 bits */
- for (i = 0; i < 5; i++)
- hash_out[i] = mptcp_hashed_key[i];
+ sha256_final(&state, (u8 *)hmac);
}
#ifdef CONFIG_MPTCP_HMAC_TEST
};
/* we can't reuse RFC 4231 test vectors, as we have constraint on the
- * input and key size, and we truncate the output.
+ * input and key size.
*/
static struct test_cast tests[] = {
{
.key = "0b0b0b0b0b0b0b0b",
.msg = "48692054",
- .result = "8385e24fb4235ac37556b6b886db106284a1da67",
+ .result = "8385e24fb4235ac37556b6b886db106284a1da671699f46db1f235ec622dcafa",
},
{
.key = "aaaaaaaaaaaaaaaa",
.msg = "dddddddd",
- .result = "2c5e219164ff1dca1c4a92318d847bb6b9d44492",
+ .result = "2c5e219164ff1dca1c4a92318d847bb6b9d44492984e1eb71aff9022f71046e9",
},
{
.key = "0102030405060708",
.msg = "cdcdcdcd",
- .result = "e73b9ba9969969cefb04aa0d6df18ec2fcc075b6",
+ .result = "e73b9ba9969969cefb04aa0d6df18ec2fcc075b6f23b4d8c4da736a5dbbc6e7d",
},
};
static int __init test_mptcp_crypto(void)
{
- char hmac[20], hmac_hex[41];
+ char hmac[32], hmac_hex[65];
u32 nonce1, nonce2;
u64 key1, key2;
u8 msg[8];
put_unaligned_be32(nonce2, &msg[4]);
mptcp_crypto_hmac_sha(key1, key2, msg, 8, hmac);
- for (j = 0; j < 20; ++j)
+ for (j = 0; j < 32; ++j)
sprintf(&hmac_hex[j << 1], "%02x", hmac[j] & 0xff);
- hmac_hex[40] = 0;
+ hmac_hex[64] = 0;
- if (memcmp(hmac_hex, tests[i].result, 40))
+ if (memcmp(hmac_hex, tests[i].result, 64))
pr_err("test %d failed, got %s expected %s", i,
hmac_hex, tests[i].result);
else
#define pr_fmt(fmt) "MPTCP: " fmt
#include <linux/kernel.h>
+#include <crypto/sha.h>
#include <net/tcp.h>
#include <net/mptcp.h>
#include "protocol.h"
static u64 add_addr_generate_hmac(u64 key1, u64 key2, u8 addr_id,
struct in_addr *addr)
{
- u8 hmac[MPTCP_ADDR_HMAC_LEN];
+ u8 hmac[SHA256_DIGEST_SIZE];
u8 msg[7];
msg[0] = addr_id;
mptcp_crypto_hmac_sha(key1, key2, msg, 7, hmac);
- return get_unaligned_be64(hmac);
+ return get_unaligned_be64(&hmac[SHA256_DIGEST_SIZE - sizeof(u64)]);
}
#if IS_ENABLED(CONFIG_MPTCP_IPV6)
static u64 add_addr6_generate_hmac(u64 key1, u64 key2, u8 addr_id,
struct in6_addr *addr)
{
- u8 hmac[MPTCP_ADDR_HMAC_LEN];
+ u8 hmac[SHA256_DIGEST_SIZE];
u8 msg[19];
msg[0] = addr_id;
mptcp_crypto_hmac_sha(key1, key2, msg, 19, hmac);
- return get_unaligned_be64(hmac);
+ return get_unaligned_be64(&hmac[SHA256_DIGEST_SIZE - sizeof(u64)]);
}
#endif
pr_debug("block timeout %ld", timeo);
mptcp_wait_data(sk, &timeo);
- if (unlikely(__mptcp_tcp_fallback(msk)))
+ ssock = __mptcp_tcp_fallback(msk);
+ if (unlikely(ssock))
goto fallback;
}
lock_sock(sk);
- mptcp_token_destroy(msk->token);
inet_sk_state_store(sk, TCP_CLOSE);
- __mptcp_flush_join_list(msk);
-
+ /* be sure to always acquire the join list lock, to sync vs
+ * mptcp_finish_join().
+ */
+ spin_lock_bh(&msk->join_list_lock);
+ list_splice_tail_init(&msk->join_list, &msk->conn_list);
+ spin_unlock_bh(&msk->join_list_lock);
list_splice_init(&msk->conn_list, &conn_list);
data_fin_tx_seq = msk->write_seq;
{
struct mptcp_sock *msk = mptcp_sk(sk);
+ mptcp_token_destroy(msk->token);
if (msk->cached_ext)
__skb_ext_put(msk->cached_ext);
if (!msk->pm.server_side)
return true;
- /* passive connection, attach to msk socket */
+ if (!mptcp_pm_allow_new_subflow(msk))
+ return false;
+
+ /* active connections are already on conn_list, and we can't acquire
+ * msk lock here.
+ * use the join list lock as synchronization point and double-check
+ * msk status to avoid racing with mptcp_close()
+ */
+ spin_lock_bh(&msk->join_list_lock);
+ ret = inet_sk_state_load(parent) == TCP_ESTABLISHED;
+ if (ret && !WARN_ON_ONCE(!list_empty(&subflow->node)))
+ list_add_tail(&subflow->node, &msk->join_list);
+ spin_unlock_bh(&msk->join_list_lock);
+ if (!ret)
+ return false;
+
+ /* attach to msk socket only after we are sure he will deal with us
+ * at close time
+ */
parent_sock = READ_ONCE(parent->sk_socket);
if (parent_sock && !sk->sk_socket)
mptcp_sock_graft(sk, parent_sock);
-
- ret = mptcp_pm_allow_new_subflow(msk);
- if (ret) {
- /* active connections are already on conn_list */
- spin_lock_bh(&msk->join_list_lock);
- if (!WARN_ON_ONCE(!list_empty(&subflow->node)))
- list_add_tail(&subflow->node, &msk->join_list);
- spin_unlock_bh(&msk->join_list_lock);
- }
- return ret;
+ subflow->map_seq = msk->ack_seq;
+ return true;
}
bool mptcp_sk_is_subflow(const struct sock *sk)
int err;
lock_sock(sock->sk);
+ if (sock->state != SS_UNCONNECTED && msk->subflow) {
+ /* pending connection or invalid state, let existing subflow
+ * cope with that
+ */
+ ssock = msk->subflow;
+ goto do_connect;
+ }
+
ssock = __mptcp_socket_create(msk, TCP_SYN_SENT);
if (IS_ERR(ssock)) {
err = PTR_ERR(ssock);
mptcp_subflow_ctx(ssock->sk)->request_mptcp = 0;
#endif
+do_connect:
err = ssock->ops->connect(ssock, uaddr, addr_len, flags);
- inet_sk_state_store(sock->sk, inet_sk_state_load(ssock->sk));
- mptcp_copy_inaddrs(sock->sk, ssock->sk);
+ sock->state = ssock->state;
+
+ /* on successful connect, the msk state will be moved to established by
+ * subflow_finish_connect()
+ */
+ if (!err || err == EINPROGRESS)
+ mptcp_copy_inaddrs(sock->sk, ssock->sk);
+ else
+ inet_sk_state_store(sock->sk, inet_sk_state_load(ssock->sk));
unlock:
release_sock(sock->sk);
/* MPTCP ADD_ADDR flags */
#define MPTCP_ADDR_ECHO BIT(0)
-#define MPTCP_ADDR_HMAC_LEN 20
#define MPTCP_ADDR_IPVERSION_4 4
#define MPTCP_ADDR_IPVERSION_6 6
#include <linux/module.h>
#include <linux/netdevice.h>
#include <crypto/algapi.h>
+#include <crypto/sha.h>
#include <net/sock.h>
#include <net/inet_common.h>
#include <net/inet_hashtables.h>
const struct sk_buff *skb)
{
struct mptcp_subflow_request_sock *subflow_req = mptcp_subflow_rsk(req);
- u8 hmac[MPTCPOPT_HMAC_LEN];
+ u8 hmac[SHA256_DIGEST_SIZE];
struct mptcp_sock *msk;
int local_id;
/* validate received truncated hmac and create hmac for third ACK */
static bool subflow_thmac_valid(struct mptcp_subflow_context *subflow)
{
- u8 hmac[MPTCPOPT_HMAC_LEN];
+ u8 hmac[SHA256_DIGEST_SIZE];
u64 thmac;
subflow_generate_hmac(subflow->remote_key, subflow->local_key,
subflow->ssn_offset = TCP_SKB_CB(skb)->seq;
}
} else if (subflow->mp_join) {
+ u8 hmac[SHA256_DIGEST_SIZE];
+
pr_debug("subflow=%p, thmac=%llu, remote_nonce=%u",
subflow, subflow->thmac,
subflow->remote_nonce);
subflow_generate_hmac(subflow->local_key, subflow->remote_key,
subflow->local_nonce,
subflow->remote_nonce,
- subflow->hmac);
+ hmac);
+
+ memcpy(subflow->hmac, hmac, MPTCPOPT_HMAC_LEN);
if (skb)
subflow->ssn_offset = TCP_SKB_CB(skb)->seq;
const struct mptcp_options_received *mp_opt)
{
const struct mptcp_subflow_request_sock *subflow_req;
- u8 hmac[MPTCPOPT_HMAC_LEN];
+ u8 hmac[SHA256_DIGEST_SIZE];
struct mptcp_sock *msk;
bool ret;
subflow_req->local_nonce, hmac);
ret = true;
- if (crypto_memneq(hmac, mp_opt->hmac, sizeof(hmac)))
+ if (crypto_memneq(hmac, mp_opt->hmac, MPTCPOPT_HMAC_LEN))
ret = false;
sock_put((struct sock *)msk);
if (err)
return err;
+ /* the newly created socket really belongs to the owning MPTCP master
+ * socket, even if for additional subflows the allocation is performed
+ * by a kernel workqueue. Adjust inode references, so that the
+ * procfs/diag interaces really show this one belonging to the correct
+ * user.
+ */
+ SOCK_INODE(sf)->i_ino = SOCK_INODE(sk->sk_socket)->i_ino;
+ SOCK_INODE(sf)->i_uid = SOCK_INODE(sk->sk_socket)->i_uid;
+ SOCK_INODE(sf)->i_gid = SOCK_INODE(sk->sk_socket)->i_gid;
+
subflow = mptcp_subflow_ctx(sf->sk);
pr_debug("subflow=%p", subflow);
/* Don't lookup sub-counters at all */
opt->cmdflags &= ~IPSET_FLAG_MATCH_COUNTERS;
if (opt->cmdflags & IPSET_FLAG_SKIP_SUBCOUNTER_UPDATE)
- opt->cmdflags &= ~IPSET_FLAG_SKIP_COUNTER_UPDATE;
+ opt->cmdflags |= IPSET_FLAG_SKIP_COUNTER_UPDATE;
list_for_each_entry_rcu(e, &map->members, list) {
ret = ip_set_test(e->id, skb, par, opt);
if (ret <= 0)
ct->status = 0;
ct->timeout = 0;
write_pnet(&ct->ct_net, net);
- memset(&ct->__nfct_init_offset[0], 0,
+ memset(&ct->__nfct_init_offset, 0,
offsetof(struct nf_conn, proto) -
- offsetof(struct nf_conn, __nfct_init_offset[0]));
+ offsetof(struct nf_conn, __nfct_init_offset));
nf_ct_zone_add(ct, zone);
nf_conntrack_get(skb_nfct(nskb));
}
-static int nf_conntrack_update(struct net *net, struct sk_buff *skb)
+static int __nf_conntrack_update(struct net *net, struct sk_buff *skb,
+ struct nf_conn *ct,
+ enum ip_conntrack_info ctinfo)
{
struct nf_conntrack_tuple_hash *h;
struct nf_conntrack_tuple tuple;
- enum ip_conntrack_info ctinfo;
struct nf_nat_hook *nat_hook;
unsigned int status;
- struct nf_conn *ct;
int dataoff;
u16 l3num;
u8 l4num;
- ct = nf_ct_get(skb, &ctinfo);
- if (!ct || nf_ct_is_confirmed(ct))
- return 0;
-
l3num = nf_ct_l3num(ct);
dataoff = get_l4proto(skb, skb_network_offset(skb), l3num, &l4num);
return 0;
}
+/* This packet is coming from userspace via nf_queue, complete the packet
+ * processing after the helper invocation in nf_confirm().
+ */
+static int nf_confirm_cthelper(struct sk_buff *skb, struct nf_conn *ct,
+ enum ip_conntrack_info ctinfo)
+{
+ const struct nf_conntrack_helper *helper;
+ const struct nf_conn_help *help;
+ int protoff;
+
+ help = nfct_help(ct);
+ if (!help)
+ return 0;
+
+ helper = rcu_dereference(help->helper);
+ if (!(helper->flags & NF_CT_HELPER_F_USERSPACE))
+ return 0;
+
+ switch (nf_ct_l3num(ct)) {
+ case NFPROTO_IPV4:
+ protoff = skb_network_offset(skb) + ip_hdrlen(skb);
+ break;
+#if IS_ENABLED(CONFIG_IPV6)
+ case NFPROTO_IPV6: {
+ __be16 frag_off;
+ u8 pnum;
+
+ pnum = ipv6_hdr(skb)->nexthdr;
+ protoff = ipv6_skip_exthdr(skb, sizeof(struct ipv6hdr), &pnum,
+ &frag_off);
+ if (protoff < 0 || (frag_off & htons(~0x7)) != 0)
+ return 0;
+ break;
+ }
+#endif
+ default:
+ return 0;
+ }
+
+ if (test_bit(IPS_SEQ_ADJUST_BIT, &ct->status) &&
+ !nf_is_loopback_packet(skb)) {
+ if (!nf_ct_seq_adjust(skb, ct, ctinfo, protoff)) {
+ NF_CT_STAT_INC_ATOMIC(nf_ct_net(ct), drop);
+ return -1;
+ }
+ }
+
+ /* We've seen it coming out the other side: confirm it */
+ return nf_conntrack_confirm(skb) == NF_DROP ? - 1 : 0;
+}
+
+static int nf_conntrack_update(struct net *net, struct sk_buff *skb)
+{
+ enum ip_conntrack_info ctinfo;
+ struct nf_conn *ct;
+ int err;
+
+ ct = nf_ct_get(skb, &ctinfo);
+ if (!ct)
+ return 0;
+
+ if (!nf_ct_is_confirmed(ct)) {
+ err = __nf_conntrack_update(net, skb, ct, ctinfo);
+ if (err < 0)
+ return err;
+ }
+
+ return nf_confirm_cthelper(skb, ct, ctinfo);
+}
+
static bool nf_conntrack_get_tuple_skb(struct nf_conntrack_tuple *dst_tuple,
const struct sk_buff *skb)
{
nf_conntrack_lock(lockp);
if (*bucket < nf_conntrack_htable_size) {
hlist_nulls_for_each_entry(h, n, &nf_conntrack_hash[*bucket], hnnode) {
- if (NF_CT_DIRECTION(h) != IP_CT_DIR_ORIGINAL)
+ if (NF_CT_DIRECTION(h) != IP_CT_DIR_REPLY)
continue;
+ /* All nf_conn objects are added to hash table twice, one
+ * for original direction tuple, once for the reply tuple.
+ *
+ * Exception: In the IPS_NAT_CLASH case, only the reply
+ * tuple is added (the original tuple already existed for
+ * a different object).
+ *
+ * We only need to call the iterator once for each
+ * conntrack, so we just use the 'reply' direction
+ * tuple while iterating.
+ */
ct = nf_ct_tuplehash_to_ctrack(h);
if (iter(ct, data))
goto found;
#if defined(DEBUG) || defined(CONFIG_DYNAMIC_DEBUG)
/* PptpControlMessageType names */
-const char *const pptp_msg_name[] = {
- "UNKNOWN_MESSAGE",
- "START_SESSION_REQUEST",
- "START_SESSION_REPLY",
- "STOP_SESSION_REQUEST",
- "STOP_SESSION_REPLY",
- "ECHO_REQUEST",
- "ECHO_REPLY",
- "OUT_CALL_REQUEST",
- "OUT_CALL_REPLY",
- "IN_CALL_REQUEST",
- "IN_CALL_REPLY",
- "IN_CALL_CONNECT",
- "CALL_CLEAR_REQUEST",
- "CALL_DISCONNECT_NOTIFY",
- "WAN_ERROR_NOTIFY",
- "SET_LINK_INFO"
+static const char *const pptp_msg_name_array[PPTP_MSG_MAX + 1] = {
+ [0] = "UNKNOWN_MESSAGE",
+ [PPTP_START_SESSION_REQUEST] = "START_SESSION_REQUEST",
+ [PPTP_START_SESSION_REPLY] = "START_SESSION_REPLY",
+ [PPTP_STOP_SESSION_REQUEST] = "STOP_SESSION_REQUEST",
+ [PPTP_STOP_SESSION_REPLY] = "STOP_SESSION_REPLY",
+ [PPTP_ECHO_REQUEST] = "ECHO_REQUEST",
+ [PPTP_ECHO_REPLY] = "ECHO_REPLY",
+ [PPTP_OUT_CALL_REQUEST] = "OUT_CALL_REQUEST",
+ [PPTP_OUT_CALL_REPLY] = "OUT_CALL_REPLY",
+ [PPTP_IN_CALL_REQUEST] = "IN_CALL_REQUEST",
+ [PPTP_IN_CALL_REPLY] = "IN_CALL_REPLY",
+ [PPTP_IN_CALL_CONNECT] = "IN_CALL_CONNECT",
+ [PPTP_CALL_CLEAR_REQUEST] = "CALL_CLEAR_REQUEST",
+ [PPTP_CALL_DISCONNECT_NOTIFY] = "CALL_DISCONNECT_NOTIFY",
+ [PPTP_WAN_ERROR_NOTIFY] = "WAN_ERROR_NOTIFY",
+ [PPTP_SET_LINK_INFO] = "SET_LINK_INFO"
};
+
+const char *pptp_msg_name(u_int16_t msg)
+{
+ if (msg > PPTP_MSG_MAX)
+ return pptp_msg_name_array[0];
+
+ return pptp_msg_name_array[msg];
+}
EXPORT_SYMBOL(pptp_msg_name);
#endif
typeof(nf_nat_pptp_hook_inbound) nf_nat_pptp_inbound;
msg = ntohs(ctlh->messageType);
- pr_debug("inbound control message %s\n", pptp_msg_name[msg]);
+ pr_debug("inbound control message %s\n", pptp_msg_name(msg));
switch (msg) {
case PPTP_START_SESSION_REPLY:
pcid = pptpReq->ocack.peersCallID;
if (info->pns_call_id != pcid)
goto invalid;
- pr_debug("%s, CID=%X, PCID=%X\n", pptp_msg_name[msg],
+ pr_debug("%s, CID=%X, PCID=%X\n", pptp_msg_name(msg),
ntohs(cid), ntohs(pcid));
if (pptpReq->ocack.resultCode == PPTP_OUTCALL_CONNECT) {
goto invalid;
cid = pptpReq->icreq.callID;
- pr_debug("%s, CID=%X\n", pptp_msg_name[msg], ntohs(cid));
+ pr_debug("%s, CID=%X\n", pptp_msg_name(msg), ntohs(cid));
info->cstate = PPTP_CALL_IN_REQ;
info->pac_call_id = cid;
break;
if (info->pns_call_id != pcid)
goto invalid;
- pr_debug("%s, PCID=%X\n", pptp_msg_name[msg], ntohs(pcid));
+ pr_debug("%s, PCID=%X\n", pptp_msg_name(msg), ntohs(pcid));
info->cstate = PPTP_CALL_IN_CONF;
/* we expect a GRE connection from PAC to PNS */
case PPTP_CALL_DISCONNECT_NOTIFY:
/* server confirms disconnect */
cid = pptpReq->disc.callID;
- pr_debug("%s, CID=%X\n", pptp_msg_name[msg], ntohs(cid));
+ pr_debug("%s, CID=%X\n", pptp_msg_name(msg), ntohs(cid));
info->cstate = PPTP_CALL_NONE;
/* untrack this call id, unexpect GRE packets */
invalid:
pr_debug("invalid %s: type=%d cid=%u pcid=%u "
"cstate=%d sstate=%d pns_cid=%u pac_cid=%u\n",
- msg <= PPTP_MSG_MAX ? pptp_msg_name[msg] : pptp_msg_name[0],
+ pptp_msg_name(msg),
msg, ntohs(cid), ntohs(pcid), info->cstate, info->sstate,
ntohs(info->pns_call_id), ntohs(info->pac_call_id));
return NF_ACCEPT;
typeof(nf_nat_pptp_hook_outbound) nf_nat_pptp_outbound;
msg = ntohs(ctlh->messageType);
- pr_debug("outbound control message %s\n", pptp_msg_name[msg]);
+ pr_debug("outbound control message %s\n", pptp_msg_name(msg));
switch (msg) {
case PPTP_START_SESSION_REQUEST:
info->cstate = PPTP_CALL_OUT_REQ;
/* track PNS call id */
cid = pptpReq->ocreq.callID;
- pr_debug("%s, CID=%X\n", pptp_msg_name[msg], ntohs(cid));
+ pr_debug("%s, CID=%X\n", pptp_msg_name(msg), ntohs(cid));
info->pns_call_id = cid;
break;
pcid = pptpReq->icack.peersCallID;
if (info->pac_call_id != pcid)
goto invalid;
- pr_debug("%s, CID=%X PCID=%X\n", pptp_msg_name[msg],
+ pr_debug("%s, CID=%X PCID=%X\n", pptp_msg_name(msg),
ntohs(cid), ntohs(pcid));
if (pptpReq->icack.resultCode == PPTP_INCALL_ACCEPT) {
invalid:
pr_debug("invalid %s: type=%d cid=%u pcid=%u "
"cstate=%d sstate=%d pns_cid=%u pac_cid=%u\n",
- msg <= PPTP_MSG_MAX ? pptp_msg_name[msg] : pptp_msg_name[0],
+ pptp_msg_name(msg),
msg, ntohs(cid), ntohs(pcid), info->cstate, info->sstate,
ntohs(info->pns_call_id), ntohs(info->pac_call_id));
return NF_ACCEPT;
if (nf_flow_has_expired(flow))
flow_offload_fixup_ct(flow->ct);
- else if (test_bit(NF_FLOW_TEARDOWN, &flow->flags))
+ else
flow_offload_fixup_ct_timeout(flow->ct);
flow_offload_free(flow);
{
struct nf_flowtable *flow_table = data;
- if (nf_flow_has_expired(flow) || nf_ct_is_dying(flow->ct) ||
- test_bit(NF_FLOW_TEARDOWN, &flow->flags)) {
+ if (nf_flow_has_expired(flow) || nf_ct_is_dying(flow->ct))
+ set_bit(NF_FLOW_TEARDOWN, &flow->flags);
+
+ if (test_bit(NF_FLOW_TEARDOWN, &flow->flags)) {
if (test_bit(NF_FLOW_HW, &flow->flags)) {
if (!test_bit(NF_FLOW_HW_DYING, &flow->flags))
nf_flow_offload_del(flow_table, flow);
WARN_ON_ONCE(1);
}
+ clear_bit(NF_FLOW_HW_PENDING, &offload->flow->flags);
kfree(offload);
}
{
struct flow_offload_work *offload;
+ if (test_and_set_bit(NF_FLOW_HW_PENDING, &flow->flags))
+ return NULL;
+
offload = kmalloc(sizeof(struct flow_offload_work), GFP_ATOMIC);
- if (!offload)
+ if (!offload) {
+ clear_bit(NF_FLOW_HW_PENDING, &flow->flags);
return NULL;
+ }
offload->cmd = cmd;
offload->flow = flow;
int nf_flow_table_offload_init(void)
{
nf_flow_offload_wq = alloc_workqueue("nf_flow_table_offload",
- WQ_UNBOUND | WQ_MEM_RECLAIM, 0);
+ WQ_UNBOUND, 0);
if (!nf_flow_offload_wq)
return -ENOMEM;
if (help->helper->data_len == 0)
return -EINVAL;
- nla_memcpy(help->data, nla_data(attr), sizeof(help->data));
+ nla_memcpy(help->data, attr, sizeof(help->data));
return 0;
}
ret = -ENOMEM;
goto err2;
}
+ helper->data_len = size;
helper->flags |= NF_CT_HELPER_F_USERSPACE;
memcpy(&helper->tuple, tuple, sizeof(struct nf_conntrack_tuple));
parent = rcu_dereference_raw(parent->rb_left);
continue;
}
+
+ if (nft_set_elem_expired(&rbe->ext))
+ return false;
+
if (nft_rbtree_interval_end(rbe)) {
if (nft_set_is_anonymous(set))
return false;
if (set->flags & NFT_SET_INTERVAL && interval != NULL &&
nft_set_elem_active(&interval->ext, genmask) &&
+ !nft_set_elem_expired(&interval->ext) &&
nft_rbtree_interval_start(interval)) {
*ext = &interval->ext;
return true;
continue;
}
+ if (nft_set_elem_expired(&rbe->ext))
+ return false;
+
if (!nft_set_ext_exists(&rbe->ext, NFT_SET_EXT_FLAGS) ||
(*nft_set_ext_flags(&rbe->ext) & NFT_SET_ELEM_INTERVAL_END) ==
(flags & NFT_SET_ELEM_INTERVAL_END)) {
if (set->flags & NFT_SET_INTERVAL && interval != NULL &&
nft_set_elem_active(&interval->ext, genmask) &&
+ !nft_set_elem_expired(&interval->ext) &&
((!nft_rbtree_interval_end(interval) &&
!(flags & NFT_SET_ELEM_INTERVAL_END)) ||
(nft_rbtree_interval_end(interval) &&
if (iter->count < iter->skip)
goto cont;
+ if (nft_set_elem_expired(&rbe->ext))
+ goto cont;
if (!nft_set_elem_active(&rbe->ext, iter->genmask))
goto cont;
if ((off & (BITS_PER_LONG - 1)) != 0)
return -EINVAL;
+ /* a null catmap is equivalent to an empty one */
+ if (!catmap) {
+ *offset = (u32)-1;
+ return 0;
+ }
+
if (off < catmap->startbit) {
off = catmap->startbit;
*offset = off;
goto err_sock;
}
+ qrtr_ns.workqueue = alloc_workqueue("qrtr_ns_handler", WQ_UNBOUND, 1);
+ if (!qrtr_ns.workqueue)
+ goto err_sock;
+
qrtr_ns.sock->sk->sk_data_ready = qrtr_ns_data_ready;
sq.sq_port = QRTR_PORT_CTRL;
ret = kernel_bind(qrtr_ns.sock, (struct sockaddr *)&sq, sizeof(sq));
if (ret < 0) {
pr_err("failed to bind to socket\n");
- goto err_sock;
+ goto err_wq;
}
qrtr_ns.bcast_sq.sq_family = AF_QIPCRTR;
qrtr_ns.bcast_sq.sq_node = QRTR_NODE_BCAST;
qrtr_ns.bcast_sq.sq_port = QRTR_PORT_CTRL;
- qrtr_ns.workqueue = alloc_workqueue("qrtr_ns_handler", WQ_UNBOUND, 1);
- if (!qrtr_ns.workqueue)
- goto err_sock;
-
ret = say_hello(&qrtr_ns.bcast_sq);
if (ret < 0)
goto err_wq;
}
mutex_unlock(&qrtr_node_lock);
- qrtr_local_enqueue(node, skb, type, from, to);
+ qrtr_local_enqueue(NULL, skb, type, from, to);
return 0;
}
peer_event.o \
peer_object.o \
recvmsg.o \
+ rtt.o \
security.o \
sendmsg.o \
skbuff.o \
#include <linux/atomic.h>
#include <linux/seqlock.h>
+#include <linux/win_minmax.h>
#include <net/net_namespace.h>
#include <net/netns/generic.h>
#include <net/sock.h>
#define RXRPC_RTT_CACHE_SIZE 32
spinlock_t rtt_input_lock; /* RTT lock for input routine */
ktime_t rtt_last_req; /* Time of last RTT request */
- u64 rtt; /* Current RTT estimate (in nS) */
- u64 rtt_sum; /* Sum of cache contents */
- u64 rtt_cache[RXRPC_RTT_CACHE_SIZE]; /* Determined RTT cache */
- u8 rtt_cursor; /* next entry at which to insert */
- u8 rtt_usage; /* amount of cache actually used */
+ unsigned int rtt_count; /* Number of samples we've got */
+
+ u32 srtt_us; /* smoothed round trip time << 3 in usecs */
+ u32 mdev_us; /* medium deviation */
+ u32 mdev_max_us; /* maximal mdev for the last rtt period */
+ u32 rttvar_us; /* smoothed mdev_max */
+ u32 rto_j; /* Retransmission timeout in jiffies */
+ u8 backoff; /* Backoff timeout */
u8 cong_cwnd; /* Congestion window size */
};
extern unsigned int rxrpc_rx_window_size;
extern unsigned int rxrpc_rx_mtu;
extern unsigned int rxrpc_rx_jumbo_max;
-extern unsigned long rxrpc_resend_timeout;
extern const s8 rxrpc_ack_priority[];
* peer_event.c
*/
void rxrpc_error_report(struct sock *);
-void rxrpc_peer_add_rtt(struct rxrpc_call *, enum rxrpc_rtt_rx_trace,
- rxrpc_serial_t, rxrpc_serial_t, ktime_t, ktime_t);
void rxrpc_peer_keepalive_worker(struct work_struct *);
/*
void rxrpc_notify_socket(struct rxrpc_call *);
int rxrpc_recvmsg(struct socket *, struct msghdr *, size_t, int);
+/*
+ * rtt.c
+ */
+void rxrpc_peer_add_rtt(struct rxrpc_call *, enum rxrpc_rtt_rx_trace,
+ rxrpc_serial_t, rxrpc_serial_t, ktime_t, ktime_t);
+unsigned long rxrpc_get_rto_backoff(struct rxrpc_peer *, bool);
+void rxrpc_peer_init_rtt(struct rxrpc_peer *);
+
/*
* rxkad.c
*/
struct rxrpc_skb_priv *sp = rxrpc_skb(skb);
ktime_t now = skb->tstamp;
- if (call->peer->rtt_usage < 3 ||
+ if (call->peer->rtt_count < 3 ||
ktime_before(ktime_add_ms(call->peer->rtt_last_req, 1000), now))
rxrpc_propose_ACK(call, RXRPC_ACK_PING, sp->hdr.serial,
true, true,
} else {
unsigned long now = jiffies, ack_at;
- if (call->peer->rtt_usage > 0)
- ack_at = nsecs_to_jiffies(call->peer->rtt);
+ if (call->peer->srtt_us != 0)
+ ack_at = usecs_to_jiffies(call->peer->srtt_us >> 3);
else
ack_at = expiry;
static void rxrpc_resend(struct rxrpc_call *call, unsigned long now_j)
{
struct sk_buff *skb;
- unsigned long resend_at;
+ unsigned long resend_at, rto_j;
rxrpc_seq_t cursor, seq, top;
- ktime_t now, max_age, oldest, ack_ts, timeout, min_timeo;
+ ktime_t now, max_age, oldest, ack_ts;
int ix;
u8 annotation, anno_type, retrans = 0, unacked = 0;
_enter("{%d,%d}", call->tx_hard_ack, call->tx_top);
- if (call->peer->rtt_usage > 1)
- timeout = ns_to_ktime(call->peer->rtt * 3 / 2);
- else
- timeout = ms_to_ktime(rxrpc_resend_timeout);
- min_timeo = ns_to_ktime((1000000000 / HZ) * 4);
- if (ktime_before(timeout, min_timeo))
- timeout = min_timeo;
+ rto_j = call->peer->rto_j;
now = ktime_get_real();
- max_age = ktime_sub(now, timeout);
+ max_age = ktime_sub(now, jiffies_to_usecs(rto_j));
spin_lock_bh(&call->lock);
}
resend_at = nsecs_to_jiffies(ktime_to_ns(ktime_sub(now, oldest)));
- resend_at += jiffies + rxrpc_resend_timeout;
+ resend_at += jiffies + rto_j;
WRITE_ONCE(call->resend_at, resend_at);
if (unacked)
rxrpc_timer_set_for_resend);
spin_unlock_bh(&call->lock);
ack_ts = ktime_sub(now, call->acks_latest_ts);
- if (ktime_to_ns(ack_ts) < call->peer->rtt)
+ if (ktime_to_us(ack_ts) < (call->peer->srtt_us >> 3))
goto out;
rxrpc_propose_ACK(call, RXRPC_ACK_PING, 0, true, false,
rxrpc_propose_ack_ping_for_lost_ack);
/* We analyse the number of packets that get ACK'd per RTT
* period and increase the window if we managed to fill it.
*/
- if (call->peer->rtt_usage == 0)
+ if (call->peer->rtt_count == 0)
goto out;
if (ktime_before(skb->tstamp,
- ktime_add_ns(call->cong_tstamp,
- call->peer->rtt)))
+ ktime_add_us(call->cong_tstamp,
+ call->peer->srtt_us >> 3)))
goto out_no_clear_ca;
change = rxrpc_cong_rtt_window_end;
call->cong_tstamp = skb->tstamp;
}
}
+/*
+ * Return true if the ACK is valid - ie. it doesn't appear to have regressed
+ * with respect to the ack state conveyed by preceding ACKs.
+ */
+static bool rxrpc_is_ack_valid(struct rxrpc_call *call,
+ rxrpc_seq_t first_pkt, rxrpc_seq_t prev_pkt)
+{
+ rxrpc_seq_t base = READ_ONCE(call->ackr_first_seq);
+
+ if (after(first_pkt, base))
+ return true; /* The window advanced */
+
+ if (before(first_pkt, base))
+ return false; /* firstPacket regressed */
+
+ if (after_eq(prev_pkt, call->ackr_prev_seq))
+ return true; /* previousPacket hasn't regressed. */
+
+ /* Some rx implementations put a serial number in previousPacket. */
+ if (after_eq(prev_pkt, base + call->tx_winsize))
+ return false;
+ return true;
+}
+
/*
* Process an ACK packet.
*
}
/* Discard any out-of-order or duplicate ACKs (outside lock). */
- if (before(first_soft_ack, call->ackr_first_seq) ||
- before(prev_pkt, call->ackr_prev_seq))
+ if (!rxrpc_is_ack_valid(call, first_soft_ack, prev_pkt)) {
+ trace_rxrpc_rx_discard_ack(call->debug_id, sp->hdr.serial,
+ first_soft_ack, call->ackr_first_seq,
+ prev_pkt, call->ackr_prev_seq);
return;
+ }
buf.info.rxMTU = 0;
ioffset = offset + nr_acks + 3;
spin_lock(&call->input_lock);
/* Discard any out-of-order or duplicate ACKs (inside lock). */
- if (before(first_soft_ack, call->ackr_first_seq) ||
- before(prev_pkt, call->ackr_prev_seq))
+ if (!rxrpc_is_ack_valid(call, first_soft_ack, prev_pkt)) {
+ trace_rxrpc_rx_discard_ack(call->debug_id, sp->hdr.serial,
+ first_soft_ack, call->ackr_first_seq,
+ prev_pkt, call->ackr_prev_seq);
goto out;
+ }
call->acks_latest_ts = skb->tstamp;
call->ackr_first_seq = first_soft_ack;
*/
unsigned int rxrpc_rx_jumbo_max = 4;
-/*
- * Time till packet resend (in milliseconds).
- */
-unsigned long rxrpc_resend_timeout = 4 * HZ;
-
const s8 rxrpc_ack_priority[] = {
[0] = 0,
[RXRPC_ACK_DELAY] = 1,
(test_and_clear_bit(RXRPC_CALL_EV_ACK_LOST, &call->events) ||
retrans ||
call->cong_mode == RXRPC_CALL_SLOW_START ||
- (call->peer->rtt_usage < 3 && sp->hdr.seq & 1) ||
+ (call->peer->rtt_count < 3 && sp->hdr.seq & 1) ||
ktime_before(ktime_add_ms(call->peer->rtt_last_req, 1000),
ktime_get_real())))
whdr.flags |= RXRPC_REQUEST_ACK;
if (whdr.flags & RXRPC_REQUEST_ACK) {
call->peer->rtt_last_req = skb->tstamp;
trace_rxrpc_rtt_tx(call, rxrpc_rtt_tx_data, serial);
- if (call->peer->rtt_usage > 1) {
+ if (call->peer->rtt_count > 1) {
unsigned long nowj = jiffies, ack_lost_at;
- ack_lost_at = nsecs_to_jiffies(2 * call->peer->rtt);
- if (ack_lost_at < 1)
- ack_lost_at = 1;
-
+ ack_lost_at = rxrpc_get_rto_backoff(call->peer, retrans);
ack_lost_at += nowj;
WRITE_ONCE(call->ack_lost_at, ack_lost_at);
rxrpc_reduce_call_timer(call, ack_lost_at, nowj,
}
}
-/*
- * Add RTT information to cache. This is called in softirq mode and has
- * exclusive access to the peer RTT data.
- */
-void rxrpc_peer_add_rtt(struct rxrpc_call *call, enum rxrpc_rtt_rx_trace why,
- rxrpc_serial_t send_serial, rxrpc_serial_t resp_serial,
- ktime_t send_time, ktime_t resp_time)
-{
- struct rxrpc_peer *peer = call->peer;
- s64 rtt;
- u64 sum = peer->rtt_sum, avg;
- u8 cursor = peer->rtt_cursor, usage = peer->rtt_usage;
-
- rtt = ktime_to_ns(ktime_sub(resp_time, send_time));
- if (rtt < 0)
- return;
-
- spin_lock(&peer->rtt_input_lock);
-
- /* Replace the oldest datum in the RTT buffer */
- sum -= peer->rtt_cache[cursor];
- sum += rtt;
- peer->rtt_cache[cursor] = rtt;
- peer->rtt_cursor = (cursor + 1) & (RXRPC_RTT_CACHE_SIZE - 1);
- peer->rtt_sum = sum;
- if (usage < RXRPC_RTT_CACHE_SIZE) {
- usage++;
- peer->rtt_usage = usage;
- }
-
- spin_unlock(&peer->rtt_input_lock);
-
- /* Now recalculate the average */
- if (usage == RXRPC_RTT_CACHE_SIZE) {
- avg = sum / RXRPC_RTT_CACHE_SIZE;
- } else {
- avg = sum;
- do_div(avg, usage);
- }
-
- /* Don't need to update this under lock */
- peer->rtt = avg;
- trace_rxrpc_rtt_rx(call, why, send_serial, resp_serial, rtt,
- usage, avg);
-}
-
/*
* Perform keep-alive pings.
*/
spin_lock_init(&peer->rtt_input_lock);
peer->debug_id = atomic_inc_return(&rxrpc_debug_id);
+ rxrpc_peer_init_rtt(peer);
+
if (RXRPC_TX_SMSS > 2190)
peer->cong_cwnd = 2;
else if (RXRPC_TX_SMSS > 1095)
EXPORT_SYMBOL(rxrpc_kernel_get_peer);
/**
- * rxrpc_kernel_get_rtt - Get a call's peer RTT
+ * rxrpc_kernel_get_srtt - Get a call's peer smoothed RTT
* @sock: The socket on which the call is in progress.
* @call: The call to query
*
- * Get the call's peer RTT.
+ * Get the call's peer smoothed RTT.
*/
-u64 rxrpc_kernel_get_rtt(struct socket *sock, struct rxrpc_call *call)
+u32 rxrpc_kernel_get_srtt(struct socket *sock, struct rxrpc_call *call)
{
- return call->peer->rtt;
+ return call->peer->srtt_us >> 3;
}
-EXPORT_SYMBOL(rxrpc_kernel_get_rtt);
+EXPORT_SYMBOL(rxrpc_kernel_get_srtt);
seq_puts(seq,
"Proto Local "
" Remote "
- " Use CW MTU LastUse RTT Rc\n"
+ " Use CW MTU LastUse RTT RTO\n"
);
return 0;
}
now = ktime_get_seconds();
seq_printf(seq,
"UDP %-47.47s %-47.47s %3u"
- " %3u %5u %6llus %12llu %2u\n",
+ " %3u %5u %6llus %8u %8u\n",
lbuff,
rbuff,
atomic_read(&peer->usage),
peer->cong_cwnd,
peer->mtu,
now - peer->last_tx_at,
- peer->rtt,
- peer->rtt_cursor);
+ peer->srtt_us >> 3,
+ jiffies_to_usecs(peer->rto_j));
return 0;
}
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0
+/* RTT/RTO calculation.
+ *
+ * Adapted from TCP for AF_RXRPC by David Howells (dhowells@redhat.com)
+ *
+ * https://tools.ietf.org/html/rfc6298
+ * https://tools.ietf.org/html/rfc1122#section-4.2.3.1
+ * http://ccr.sigcomm.org/archive/1995/jan95/ccr-9501-partridge87.pdf
+ */
+
+#include <linux/net.h>
+#include "ar-internal.h"
+
+#define RXRPC_RTO_MAX ((unsigned)(120 * HZ))
+#define RXRPC_TIMEOUT_INIT ((unsigned)(1*HZ)) /* RFC6298 2.1 initial RTO value */
+#define rxrpc_jiffies32 ((u32)jiffies) /* As rxrpc_jiffies32 */
+#define rxrpc_min_rtt_wlen 300 /* As sysctl_tcp_min_rtt_wlen */
+
+static u32 rxrpc_rto_min_us(struct rxrpc_peer *peer)
+{
+ return 200;
+}
+
+static u32 __rxrpc_set_rto(const struct rxrpc_peer *peer)
+{
+ return _usecs_to_jiffies((peer->srtt_us >> 3) + peer->rttvar_us);
+}
+
+static u32 rxrpc_bound_rto(u32 rto)
+{
+ return min(rto, RXRPC_RTO_MAX);
+}
+
+/*
+ * Called to compute a smoothed rtt estimate. The data fed to this
+ * routine either comes from timestamps, or from segments that were
+ * known _not_ to have been retransmitted [see Karn/Partridge
+ * Proceedings SIGCOMM 87]. The algorithm is from the SIGCOMM 88
+ * piece by Van Jacobson.
+ * NOTE: the next three routines used to be one big routine.
+ * To save cycles in the RFC 1323 implementation it was better to break
+ * it up into three procedures. -- erics
+ */
+static void rxrpc_rtt_estimator(struct rxrpc_peer *peer, long sample_rtt_us)
+{
+ long m = sample_rtt_us; /* RTT */
+ u32 srtt = peer->srtt_us;
+
+ /* The following amusing code comes from Jacobson's
+ * article in SIGCOMM '88. Note that rtt and mdev
+ * are scaled versions of rtt and mean deviation.
+ * This is designed to be as fast as possible
+ * m stands for "measurement".
+ *
+ * On a 1990 paper the rto value is changed to:
+ * RTO = rtt + 4 * mdev
+ *
+ * Funny. This algorithm seems to be very broken.
+ * These formulae increase RTO, when it should be decreased, increase
+ * too slowly, when it should be increased quickly, decrease too quickly
+ * etc. I guess in BSD RTO takes ONE value, so that it is absolutely
+ * does not matter how to _calculate_ it. Seems, it was trap
+ * that VJ failed to avoid. 8)
+ */
+ if (srtt != 0) {
+ m -= (srtt >> 3); /* m is now error in rtt est */
+ srtt += m; /* rtt = 7/8 rtt + 1/8 new */
+ if (m < 0) {
+ m = -m; /* m is now abs(error) */
+ m -= (peer->mdev_us >> 2); /* similar update on mdev */
+ /* This is similar to one of Eifel findings.
+ * Eifel blocks mdev updates when rtt decreases.
+ * This solution is a bit different: we use finer gain
+ * for mdev in this case (alpha*beta).
+ * Like Eifel it also prevents growth of rto,
+ * but also it limits too fast rto decreases,
+ * happening in pure Eifel.
+ */
+ if (m > 0)
+ m >>= 3;
+ } else {
+ m -= (peer->mdev_us >> 2); /* similar update on mdev */
+ }
+
+ peer->mdev_us += m; /* mdev = 3/4 mdev + 1/4 new */
+ if (peer->mdev_us > peer->mdev_max_us) {
+ peer->mdev_max_us = peer->mdev_us;
+ if (peer->mdev_max_us > peer->rttvar_us)
+ peer->rttvar_us = peer->mdev_max_us;
+ }
+ } else {
+ /* no previous measure. */
+ srtt = m << 3; /* take the measured time to be rtt */
+ peer->mdev_us = m << 1; /* make sure rto = 3*rtt */
+ peer->rttvar_us = max(peer->mdev_us, rxrpc_rto_min_us(peer));
+ peer->mdev_max_us = peer->rttvar_us;
+ }
+
+ peer->srtt_us = max(1U, srtt);
+}
+
+/*
+ * Calculate rto without backoff. This is the second half of Van Jacobson's
+ * routine referred to above.
+ */
+static void rxrpc_set_rto(struct rxrpc_peer *peer)
+{
+ u32 rto;
+
+ /* 1. If rtt variance happened to be less 50msec, it is hallucination.
+ * It cannot be less due to utterly erratic ACK generation made
+ * at least by solaris and freebsd. "Erratic ACKs" has _nothing_
+ * to do with delayed acks, because at cwnd>2 true delack timeout
+ * is invisible. Actually, Linux-2.4 also generates erratic
+ * ACKs in some circumstances.
+ */
+ rto = __rxrpc_set_rto(peer);
+
+ /* 2. Fixups made earlier cannot be right.
+ * If we do not estimate RTO correctly without them,
+ * all the algo is pure shit and should be replaced
+ * with correct one. It is exactly, which we pretend to do.
+ */
+
+ /* NOTE: clamping at RXRPC_RTO_MIN is not required, current algo
+ * guarantees that rto is higher.
+ */
+ peer->rto_j = rxrpc_bound_rto(rto);
+}
+
+static void rxrpc_ack_update_rtt(struct rxrpc_peer *peer, long rtt_us)
+{
+ if (rtt_us < 0)
+ return;
+
+ //rxrpc_update_rtt_min(peer, rtt_us);
+ rxrpc_rtt_estimator(peer, rtt_us);
+ rxrpc_set_rto(peer);
+
+ /* RFC6298: only reset backoff on valid RTT measurement. */
+ peer->backoff = 0;
+}
+
+/*
+ * Add RTT information to cache. This is called in softirq mode and has
+ * exclusive access to the peer RTT data.
+ */
+void rxrpc_peer_add_rtt(struct rxrpc_call *call, enum rxrpc_rtt_rx_trace why,
+ rxrpc_serial_t send_serial, rxrpc_serial_t resp_serial,
+ ktime_t send_time, ktime_t resp_time)
+{
+ struct rxrpc_peer *peer = call->peer;
+ s64 rtt_us;
+
+ rtt_us = ktime_to_us(ktime_sub(resp_time, send_time));
+ if (rtt_us < 0)
+ return;
+
+ spin_lock(&peer->rtt_input_lock);
+ rxrpc_ack_update_rtt(peer, rtt_us);
+ if (peer->rtt_count < 3)
+ peer->rtt_count++;
+ spin_unlock(&peer->rtt_input_lock);
+
+ trace_rxrpc_rtt_rx(call, why, send_serial, resp_serial,
+ peer->srtt_us >> 3, peer->rto_j);
+}
+
+/*
+ * Get the retransmission timeout to set in jiffies, backing it off each time
+ * we retransmit.
+ */
+unsigned long rxrpc_get_rto_backoff(struct rxrpc_peer *peer, bool retrans)
+{
+ u64 timo_j;
+ u8 backoff = READ_ONCE(peer->backoff);
+
+ timo_j = peer->rto_j;
+ timo_j <<= backoff;
+ if (retrans && timo_j * 2 <= RXRPC_RTO_MAX)
+ WRITE_ONCE(peer->backoff, backoff + 1);
+
+ if (timo_j < 1)
+ timo_j = 1;
+
+ return timo_j;
+}
+
+void rxrpc_peer_init_rtt(struct rxrpc_peer *peer)
+{
+ peer->rto_j = RXRPC_TIMEOUT_INIT;
+ peer->mdev_us = jiffies_to_usecs(RXRPC_TIMEOUT_INIT);
+ peer->backoff = 0;
+ //minmax_reset(&peer->rtt_min, rxrpc_jiffies32, ~0U);
+}
ret = rxkad_decrypt_ticket(conn, skb, ticket, ticket_len, &session_key,
&expiry, _abort_code);
if (ret < 0)
- goto temporary_error_free_resp;
+ goto temporary_error_free_ticket;
/* use the session key from inside the ticket to decrypt the
* response */
temporary_error_free_ticket:
kfree(ticket);
-temporary_error_free_resp:
kfree(response);
temporary_error:
/* Ignore the response packet if we got a temporary error such as
struct rxrpc_call *call)
{
rxrpc_seq_t tx_start, tx_win;
- signed long rtt2, timeout;
- u64 rtt;
+ signed long rtt, timeout;
- rtt = READ_ONCE(call->peer->rtt);
- rtt2 = nsecs_to_jiffies64(rtt) * 2;
- if (rtt2 < 2)
- rtt2 = 2;
+ rtt = READ_ONCE(call->peer->srtt_us) >> 3;
+ rtt = usecs_to_jiffies(rtt) * 2;
+ if (rtt < 2)
+ rtt = 2;
- timeout = rtt2;
+ timeout = rtt;
tx_start = READ_ONCE(call->tx_hard_ack);
for (;;) {
return -EINTR;
if (tx_win != tx_start) {
- timeout = rtt2;
+ timeout = rtt;
tx_start = tx_win;
}
_debug("need instant resend %d", ret);
rxrpc_instant_resend(call, ix);
} else {
- unsigned long now = jiffies, resend_at;
+ unsigned long now = jiffies;
+ unsigned long resend_at = now + call->peer->rto_j;
- if (call->peer->rtt_usage > 1)
- resend_at = nsecs_to_jiffies(call->peer->rtt * 3 / 2);
- else
- resend_at = rxrpc_resend_timeout;
- if (resend_at < 1)
- resend_at = 1;
-
- resend_at += now;
WRITE_ONCE(call->resend_at, resend_at);
rxrpc_reduce_call_timer(call, resend_at, now,
rxrpc_timer_set_for_send);
.extra1 = (void *)&one_jiffy,
.extra2 = (void *)&max_jiffies,
},
- {
- .procname = "resend_timeout",
- .data = &rxrpc_resend_timeout,
- .maxlen = sizeof(unsigned long),
- .mode = 0644,
- .proc_handler = proc_doulongvec_ms_jiffies_minmax,
- .extra1 = (void *)&one_jiffy,
- .extra2 = (void *)&max_jiffies,
- },
/* Non-time values */
{
const struct nf_conntrack_tuple *tuple = &ct->tuplehash[dir].tuple;
struct nf_conntrack_tuple target;
+ if (!(ct->status & IPS_NAT_MASK))
+ return 0;
+
nf_ct_invert_tuple(&target, &ct->tuplehash[!dir].tuple);
switch (tuple->src.l3num) {
goto flow_error;
}
q->flows_cnt = nla_get_u32(tb[TCA_FQ_PIE_FLOWS]);
- if (!q->flows_cnt || q->flows_cnt > 65536) {
+ if (!q->flows_cnt || q->flows_cnt >= 65536) {
NL_SET_ERR_MSG_MOD(extack,
- "Number of flows must be < 65536");
+ "Number of flows must range in [1..65535]");
goto flow_error;
}
}
homing at either or both ends of an association."
To compile this protocol support as a module, choose M here: the
- module will be called sctp. Debug messages are handeled by the
+ module will be called sctp. Debug messages are handled by the
kernel's dynamic debugging framework.
If in doubt, say N.
if (crypto_shash_setkey(tfm, &asoc_key->data[0], asoc_key->len))
goto free;
- {
- SHASH_DESC_ON_STACK(desc, tfm);
-
- desc->tfm = tfm;
- crypto_shash_digest(desc, (u8 *)auth,
- end - (unsigned char *)auth, digest);
- shash_desc_zero(desc);
- }
+ crypto_shash_tfm_digest(tfm, (u8 *)auth, end - (unsigned char *)auth,
+ digest);
free:
if (free_key)
ntohs(init_chunk->chunk_hdr->length), raw_addrs, addrs_len);
if (sctp_sk(ep->base.sk)->hmac) {
- SHASH_DESC_ON_STACK(desc, sctp_sk(ep->base.sk)->hmac);
+ struct crypto_shash *tfm = sctp_sk(ep->base.sk)->hmac;
int err;
/* Sign the message. */
- desc->tfm = sctp_sk(ep->base.sk)->hmac;
-
- err = crypto_shash_setkey(desc->tfm, ep->secret_key,
+ err = crypto_shash_setkey(tfm, ep->secret_key,
sizeof(ep->secret_key)) ?:
- crypto_shash_digest(desc, (u8 *)&cookie->c, bodysize,
- cookie->signature);
- shash_desc_zero(desc);
+ crypto_shash_tfm_digest(tfm, (u8 *)&cookie->c, bodysize,
+ cookie->signature);
if (err)
goto free_cookie;
}
/* Check the signature. */
{
- SHASH_DESC_ON_STACK(desc, sctp_sk(ep->base.sk)->hmac);
+ struct crypto_shash *tfm = sctp_sk(ep->base.sk)->hmac;
int err;
- desc->tfm = sctp_sk(ep->base.sk)->hmac;
-
- err = crypto_shash_setkey(desc->tfm, ep->secret_key,
+ err = crypto_shash_setkey(tfm, ep->secret_key,
sizeof(ep->secret_key)) ?:
- crypto_shash_digest(desc, (u8 *)bear_cookie, bodysize,
- digest);
- shash_desc_zero(desc);
-
+ crypto_shash_tfm_digest(tfm, (u8 *)bear_cookie, bodysize,
+ digest);
if (err) {
*error = -SCTP_IERROR_NOMEM;
goto fail;
timeout = asoc->timeouts[cmd->obj.to];
BUG_ON(!timeout);
- timer->expires = jiffies + timeout;
- sctp_association_hold(asoc);
- add_timer(timer);
+ /*
+ * SCTP has a hard time with timer starts. Because we process
+ * timer starts as side effects, it can be hard to tell if we
+ * have already started a timer or not, which leads to BUG
+ * halts when we call add_timer. So here, instead of just starting
+ * a timer, if the timer is already started, and just mod
+ * the timer with the shorter of the two expiration times
+ */
+ if (!timer_pending(timer))
+ sctp_association_hold(asoc);
+ timer_reduce(timer, jiffies + timeout);
break;
case SCTP_CMD_TIMER_RESTART:
/* Update the content of current association. */
sctp_add_cmd_sf(commands, SCTP_CMD_UPDATE_ASSOC, SCTP_ASOC(new_asoc));
sctp_add_cmd_sf(commands, SCTP_CMD_EVENT_ULP, SCTP_ULPEVENT(ev));
- if (sctp_state(asoc, SHUTDOWN_PENDING) &&
+ if ((sctp_state(asoc, SHUTDOWN_PENDING) ||
+ sctp_state(asoc, SHUTDOWN_SENT)) &&
(sctp_sstate(asoc->base.sk, CLOSING) ||
sock_flag(asoc->base.sk, SOCK_DEAD))) {
- /* if were currently in SHUTDOWN_PENDING, but the socket
- * has been closed by user, don't transition to ESTABLISHED.
- * Instead trigger SHUTDOWN bundled with COOKIE_ACK.
+ /* If the socket has been closed by user, don't
+ * transition to ESTABLISHED. Instead trigger SHUTDOWN
+ * bundled with COOKIE_ACK.
*/
sctp_add_cmd_sf(commands, SCTP_CMD_REPLY, SCTP_CHUNK(repl));
return sctp_sf_do_9_2_start_shutdown(net, ep, asoc,
struct sockaddr_storage addr;
struct sctp_ulpevent *event;
+ if (asoc->state < SCTP_STATE_ESTABLISHED)
+ return;
+
memset(&addr, 0, sizeof(struct sockaddr_storage));
memcpy(&addr, &transport->ipaddr, transport->af_specific->sockaddr_len);
struct xdr_buf *rcv_buf = &rqstp->rq_rcv_buf;
struct kvec *head = rqstp->rq_rcv_buf.head;
struct rpc_auth *auth = cred->cr_auth;
- unsigned int savedlen = rcv_buf->len;
u32 offset, opaque_len, maj_stat;
__be32 *p;
offset = (u8 *)(p) - (u8 *)head->iov_base;
if (offset + opaque_len > rcv_buf->len)
goto unwrap_failed;
- rcv_buf->len = offset + opaque_len;
- maj_stat = gss_unwrap(ctx->gc_gss_ctx, offset, rcv_buf);
+ maj_stat = gss_unwrap(ctx->gc_gss_ctx, offset,
+ offset + opaque_len, rcv_buf);
if (maj_stat == GSS_S_CONTEXT_EXPIRED)
clear_bit(RPCAUTH_CRED_UPTODATE, &cred->cr_flags);
if (maj_stat != GSS_S_COMPLETE)
*/
xdr_init_decode(xdr, rcv_buf, p, rqstp);
- auth->au_rslack = auth->au_verfsize + 2 +
- XDR_QUADLEN(savedlen - rcv_buf->len);
- auth->au_ralign = auth->au_verfsize + 2 +
- XDR_QUADLEN(savedlen - rcv_buf->len);
+ auth->au_rslack = auth->au_verfsize + 2 + ctx->gc_gss_ctx->slack;
+ auth->au_ralign = auth->au_verfsize + 2 + ctx->gc_gss_ctx->align;
+
return 0;
unwrap_failed:
trace_rpcgss_unwrap_failed(task);
}
u32
-gss_krb5_aes_decrypt(struct krb5_ctx *kctx, u32 offset, struct xdr_buf *buf,
- u32 *headskip, u32 *tailskip)
+gss_krb5_aes_decrypt(struct krb5_ctx *kctx, u32 offset, u32 len,
+ struct xdr_buf *buf, u32 *headskip, u32 *tailskip)
{
struct xdr_buf subbuf;
u32 ret = 0;
/* create a segment skipping the header and leaving out the checksum */
xdr_buf_subsegment(buf, &subbuf, offset + GSS_KRB5_TOK_HDR_LEN,
- (buf->len - offset - GSS_KRB5_TOK_HDR_LEN -
+ (len - offset - GSS_KRB5_TOK_HDR_LEN -
kctx->gk5e->cksumlength));
nblocks = (subbuf.len + blocksize - 1) / blocksize;
goto out_err;
/* Get the packet's hmac value */
- ret = read_bytes_from_xdr_buf(buf, buf->len - kctx->gk5e->cksumlength,
+ ret = read_bytes_from_xdr_buf(buf, len - kctx->gk5e->cksumlength,
pkt_hmac, kctx->gk5e->cksumlength);
if (ret)
goto out_err;
}
static u32
-gss_unwrap_kerberos_v1(struct krb5_ctx *kctx, int offset, struct xdr_buf *buf)
+gss_unwrap_kerberos_v1(struct krb5_ctx *kctx, int offset, int len,
+ struct xdr_buf *buf, unsigned int *slack,
+ unsigned int *align)
{
int signalg;
int sealalg;
u32 conflen = kctx->gk5e->conflen;
int crypt_offset;
u8 *cksumkey;
+ unsigned int saved_len = buf->len;
dprintk("RPC: gss_unwrap_kerberos\n");
ptr = (u8 *)buf->head[0].iov_base + offset;
if (g_verify_token_header(&kctx->mech_used, &bodysize, &ptr,
- buf->len - offset))
+ len - offset))
return GSS_S_DEFECTIVE_TOKEN;
if ((ptr[0] != ((KG_TOK_WRAP_MSG >> 8) & 0xff)) ||
(!kctx->initiate && direction != 0))
return GSS_S_BAD_SIG;
+ buf->len = len;
if (kctx->enctype == ENCTYPE_ARCFOUR_HMAC) {
struct crypto_sync_skcipher *cipher;
int err;
data_len = (buf->head[0].iov_base + buf->head[0].iov_len) - data_start;
memmove(orig_start, data_start, data_len);
buf->head[0].iov_len -= (data_start - orig_start);
- buf->len -= (data_start - orig_start);
+ buf->len = len - (data_start - orig_start);
if (gss_krb5_remove_padding(buf, blocksize))
return GSS_S_DEFECTIVE_TOKEN;
+ /* slack must include room for krb5 padding */
+ *slack = XDR_QUADLEN(saved_len - buf->len);
+ /* The GSS blob always precedes the RPC message payload */
+ *align = *slack;
return GSS_S_COMPLETE;
}
}
static u32
-gss_unwrap_kerberos_v2(struct krb5_ctx *kctx, int offset, struct xdr_buf *buf)
+gss_unwrap_kerberos_v2(struct krb5_ctx *kctx, int offset, int len,
+ struct xdr_buf *buf, unsigned int *slack,
+ unsigned int *align)
{
time64_t now;
u8 *ptr;
if (rrc != 0)
rotate_left(offset + 16, buf, rrc);
- err = (*kctx->gk5e->decrypt_v2)(kctx, offset, buf,
+ err = (*kctx->gk5e->decrypt_v2)(kctx, offset, len, buf,
&headskip, &tailskip);
if (err)
return GSS_S_FAILURE;
* it against the original
*/
err = read_bytes_from_xdr_buf(buf,
- buf->len - GSS_KRB5_TOK_HDR_LEN - tailskip,
+ len - GSS_KRB5_TOK_HDR_LEN - tailskip,
decrypted_hdr, GSS_KRB5_TOK_HDR_LEN);
if (err) {
dprintk("%s: error %u getting decrypted_hdr\n", __func__, err);
* Note that buf->head[0].iov_len may indicate the available
* head buffer space rather than that actually occupied.
*/
- movelen = min_t(unsigned int, buf->head[0].iov_len, buf->len);
+ movelen = min_t(unsigned int, buf->head[0].iov_len, len);
movelen -= offset + GSS_KRB5_TOK_HDR_LEN + headskip;
- if (offset + GSS_KRB5_TOK_HDR_LEN + headskip + movelen >
- buf->head[0].iov_len)
- return GSS_S_FAILURE;
+ BUG_ON(offset + GSS_KRB5_TOK_HDR_LEN + headskip + movelen >
+ buf->head[0].iov_len);
memmove(ptr, ptr + GSS_KRB5_TOK_HDR_LEN + headskip, movelen);
buf->head[0].iov_len -= GSS_KRB5_TOK_HDR_LEN + headskip;
- buf->len -= GSS_KRB5_TOK_HDR_LEN + headskip;
+ buf->len = len - GSS_KRB5_TOK_HDR_LEN + headskip;
/* Trim off the trailing "extra count" and checksum blob */
- buf->len -= ec + GSS_KRB5_TOK_HDR_LEN + tailskip;
+ xdr_buf_trim(buf, ec + GSS_KRB5_TOK_HDR_LEN + tailskip);
+ *align = XDR_QUADLEN(GSS_KRB5_TOK_HDR_LEN + headskip);
+ *slack = *align + XDR_QUADLEN(ec + GSS_KRB5_TOK_HDR_LEN + tailskip);
return GSS_S_COMPLETE;
}
}
u32
-gss_unwrap_kerberos(struct gss_ctx *gctx, int offset, struct xdr_buf *buf)
+gss_unwrap_kerberos(struct gss_ctx *gctx, int offset,
+ int len, struct xdr_buf *buf)
{
struct krb5_ctx *kctx = gctx->internal_ctx_id;
case ENCTYPE_DES_CBC_RAW:
case ENCTYPE_DES3_CBC_RAW:
case ENCTYPE_ARCFOUR_HMAC:
- return gss_unwrap_kerberos_v1(kctx, offset, buf);
+ return gss_unwrap_kerberos_v1(kctx, offset, len, buf,
+ &gctx->slack, &gctx->align);
case ENCTYPE_AES128_CTS_HMAC_SHA1_96:
case ENCTYPE_AES256_CTS_HMAC_SHA1_96:
- return gss_unwrap_kerberos_v2(kctx, offset, buf);
+ return gss_unwrap_kerberos_v2(kctx, offset, len, buf,
+ &gctx->slack, &gctx->align);
}
}
u32
gss_unwrap(struct gss_ctx *ctx_id,
int offset,
+ int len,
struct xdr_buf *buf)
{
return ctx_id->mech_type->gm_ops
- ->gss_unwrap(ctx_id, offset, buf);
+ ->gss_unwrap(ctx_id, offset, len, buf);
}
if (svc_getnl(&buf->head[0]) != seq)
goto out;
/* trim off the mic and padding at the end before returning */
- buf->len -= 4 + round_up_to_quad(mic.len);
+ xdr_buf_trim(buf, round_up_to_quad(mic.len) + 4);
stat = 0;
out:
kfree(mic.data);
unwrap_priv_data(struct svc_rqst *rqstp, struct xdr_buf *buf, u32 seq, struct gss_ctx *ctx)
{
u32 priv_len, maj_stat;
- int pad, saved_len, remaining_len, offset;
+ int pad, remaining_len, offset;
clear_bit(RQ_SPLICE_OK, &rqstp->rq_flags);
buf->len -= pad;
fix_priv_head(buf, pad);
- /* Maybe it would be better to give gss_unwrap a length parameter: */
- saved_len = buf->len;
- buf->len = priv_len;
- maj_stat = gss_unwrap(ctx, 0, buf);
+ maj_stat = gss_unwrap(ctx, 0, priv_len, buf);
pad = priv_len - buf->len;
- buf->len = saved_len;
buf->len -= pad;
/* The upper layers assume the buffer is aligned on 4-byte boundaries.
* In the krb5p case, at least, the data ends up offset, so we need to
* here.
*/
rpc_clnt_debugfs_unregister(clnt);
+ rpc_free_clid(clnt);
rpc_clnt_remove_pipedir(clnt);
+ xprt_put(rcu_dereference_raw(clnt->cl_xprt));
kfree(clnt);
rpciod_down();
rpc_unregister_client(clnt);
rpc_free_iostats(clnt->cl_metrics);
clnt->cl_metrics = NULL;
- xprt_put(rcu_dereference_raw(clnt->cl_xprt));
xprt_iter_destroy(&clnt->cl_xpi);
put_cred(clnt->cl_cred);
- rpc_free_clid(clnt);
INIT_WORK(&clnt->cl_work, rpc_free_client_work);
schedule_work(&clnt->cl_work);
{
struct rpc_clnt *clnt = task->tk_client;
+ if (RPC_SIGNALLED(task)) {
+ rpc_call_rpcerror(task, -ERESTARTSYS);
+ return;
+ }
+
if (xprt_adjust_timeout(task->tk_rqstp) == 0)
return;
}
EXPORT_SYMBOL_GPL(xdr_buf_subsegment);
+/**
+ * xdr_buf_trim - lop at most "len" bytes off the end of "buf"
+ * @buf: buf to be trimmed
+ * @len: number of bytes to reduce "buf" by
+ *
+ * Trim an xdr_buf by the given number of bytes by fixing up the lengths. Note
+ * that it's possible that we'll trim less than that amount if the xdr_buf is
+ * too small, or if (for instance) it's all in the head and the parser has
+ * already read too far into it.
+ */
+void xdr_buf_trim(struct xdr_buf *buf, unsigned int len)
+{
+ size_t cur;
+ unsigned int trim = len;
+
+ if (buf->tail[0].iov_len) {
+ cur = min_t(size_t, buf->tail[0].iov_len, trim);
+ buf->tail[0].iov_len -= cur;
+ trim -= cur;
+ if (!trim)
+ goto fix_len;
+ }
+
+ if (buf->page_len) {
+ cur = min_t(unsigned int, buf->page_len, trim);
+ buf->page_len -= cur;
+ trim -= cur;
+ if (!trim)
+ goto fix_len;
+ }
+
+ if (buf->head[0].iov_len) {
+ cur = min_t(size_t, buf->head[0].iov_len, trim);
+ buf->head[0].iov_len -= cur;
+ trim -= cur;
+ }
+fix_len:
+ buf->len -= (len - trim);
+}
+EXPORT_SYMBOL_GPL(xdr_buf_trim);
+
static void __read_bytes_from_xdr_buf(struct xdr_buf *subbuf, void *obj, unsigned int len)
{
unsigned int this_len;
return 0;
}
-static void tipc_sk_send_ack(struct tipc_sock *tsk)
+static struct sk_buff *tipc_sk_build_ack(struct tipc_sock *tsk)
{
struct sock *sk = &tsk->sk;
- struct net *net = sock_net(sk);
struct sk_buff *skb = NULL;
struct tipc_msg *msg;
u32 peer_port = tsk_peer_port(tsk);
u32 dnode = tsk_peer_node(tsk);
if (!tipc_sk_connected(sk))
- return;
+ return NULL;
skb = tipc_msg_create(CONN_MANAGER, CONN_ACK, INT_H_SIZE, 0,
dnode, tsk_own_node(tsk), peer_port,
tsk->portid, TIPC_OK);
if (!skb)
- return;
+ return NULL;
msg = buf_msg(skb);
msg_set_conn_ack(msg, tsk->rcv_unacked);
tsk->rcv_unacked = 0;
tsk->rcv_win = tsk_adv_blocks(tsk->sk.sk_rcvbuf);
msg_set_adv_win(msg, tsk->rcv_win);
}
- tipc_node_xmit_skb(net, skb, dnode, msg_link_selector(msg));
+ return skb;
+}
+
+static void tipc_sk_send_ack(struct tipc_sock *tsk)
+{
+ struct sk_buff *skb;
+
+ skb = tipc_sk_build_ack(tsk);
+ if (!skb)
+ return;
+
+ tipc_node_xmit_skb(sock_net(&tsk->sk), skb, tsk_peer_node(tsk),
+ msg_link_selector(buf_msg(skb)));
}
static int tipc_wait_for_rcvmsg(struct socket *sock, long *timeop)
bool peek = flags & MSG_PEEK;
int offset, required, copy, copied = 0;
int hlen, dlen, err, rc;
- bool ack = false;
long timeout;
/* Catch invalid receive attempts */
/* Copy data if msg ok, otherwise return error/partial data */
if (likely(!err)) {
- ack = msg_ack_required(hdr);
offset = skb_cb->bytes_read;
copy = min_t(int, dlen - offset, buflen - copied);
rc = skb_copy_datagram_msg(skb, hlen + offset, m, copy);
/* Send connection flow control advertisement when applicable */
tsk->rcv_unacked += tsk_inc(tsk, hlen + dlen);
- if (ack || tsk->rcv_unacked >= tsk->rcv_win / TIPC_ACK_RATE)
+ if (tsk->rcv_unacked >= tsk->rcv_win / TIPC_ACK_RATE)
tipc_sk_send_ack(tsk);
/* Exit if all requested data or FIN/error received */
* tipc_sk_filter_connect - check incoming message for a connection-based socket
* @tsk: TIPC socket
* @skb: pointer to message buffer.
+ * @xmitq: for Nagle ACK if any
* Returns true if message should be added to receive queue, false otherwise
*/
-static bool tipc_sk_filter_connect(struct tipc_sock *tsk, struct sk_buff *skb)
+static bool tipc_sk_filter_connect(struct tipc_sock *tsk, struct sk_buff *skb,
+ struct sk_buff_head *xmitq)
{
struct sock *sk = &tsk->sk;
struct net *net = sock_net(sk);
if (!skb_queue_empty(&sk->sk_write_queue))
tipc_sk_push_backlog(tsk);
/* Accept only connection-based messages sent by peer */
- if (likely(con_msg && !err && pport == oport && pnode == onode))
+ if (likely(con_msg && !err && pport == oport &&
+ pnode == onode)) {
+ if (msg_ack_required(hdr)) {
+ struct sk_buff *skb;
+
+ skb = tipc_sk_build_ack(tsk);
+ if (skb)
+ __skb_queue_tail(xmitq, skb);
+ }
return true;
+ }
if (!tsk_peer_msg(tsk, hdr))
return false;
if (!err)
while ((skb = __skb_dequeue(&inputq))) {
hdr = buf_msg(skb);
limit = rcvbuf_limit(sk, skb);
- if ((sk_conn && !tipc_sk_filter_connect(tsk, skb)) ||
+ if ((sk_conn && !tipc_sk_filter_connect(tsk, skb, xmitq)) ||
(!sk_conn && msg_connected(hdr)) ||
(!grp && msg_in_group(hdr)))
err = TIPC_ERR_NO_PORT;
(swap_ ? swab32(val__) : val__); \
})
+/* tipc_sub_write - write val_ to field_ of struct sub_ in user endian format
+ */
+#define tipc_sub_write(sub_, field_, val_) \
+ ({ \
+ struct tipc_subscr *sub__ = sub_; \
+ u32 val__ = val_; \
+ int swap_ = !((sub__)->filter & TIPC_FILTER_MASK); \
+ (sub__)->field_ = swap_ ? swab32(val__) : val__; \
+ })
+
/* tipc_evt_write - write val_ to field_ of struct evt_ in user endian format
*/
#define tipc_evt_write(evt_, field_, val_) \
if (!s || !memcmp(s, &sub->evt.s, sizeof(*s))) {
tipc_sub_unsubscribe(sub);
atomic_dec(&tn->subscription_count);
- } else if (s) {
- break;
+ if (s)
+ break;
}
}
spin_unlock_bh(&con->sub_lock);
{
struct tipc_net *tn = tipc_net(srv->net);
struct tipc_subscription *sub;
+ u32 s_filter = tipc_sub_read(s, filter);
- if (tipc_sub_read(s, filter) & TIPC_SUB_CANCEL) {
- s->filter &= __constant_ntohl(~TIPC_SUB_CANCEL);
+ if (s_filter & TIPC_SUB_CANCEL) {
+ tipc_sub_write(s, filter, s_filter & ~TIPC_SUB_CANCEL);
tipc_conn_delete_sub(con, s);
return 0;
}
return -EWOULDBLOCK;
if (ret == sizeof(s)) {
read_lock_bh(&sk->sk_callback_lock);
- ret = tipc_conn_rcv_sub(srv, con, &s);
+ /* RACE: the connection can be closed in the meantime */
+ if (likely(connected(con)))
+ ret = tipc_conn_rcv_sub(srv, con, &s);
read_unlock_bh(&sk->sk_callback_lock);
if (!ret)
return 0;
struct udp_bearer *ub, struct udp_media_addr *src,
struct udp_media_addr *dst, struct dst_cache *cache)
{
- struct dst_entry *ndst = dst_cache_get(cache);
+ struct dst_entry *ndst;
int ttl, err = 0;
+ local_bh_disable();
+ ndst = dst_cache_get(cache);
if (dst->proto == htons(ETH_P_IP)) {
struct rtable *rt = (struct rtable *)ndst;
src->port, dst->port, false);
#endif
}
+ local_bh_enable();
return err;
tx_error:
+ local_bh_enable();
kfree_skb(skb);
return err;
}
kfree(aead_req);
+ spin_lock_bh(&ctx->decrypt_compl_lock);
pending = atomic_dec_return(&ctx->decrypt_pending);
- if (!pending && READ_ONCE(ctx->async_notify))
+ if (!pending && ctx->async_notify)
complete(&ctx->async_wait.completion);
+ spin_unlock_bh(&ctx->decrypt_compl_lock);
}
static int tls_do_decryption(struct sock *sk,
ready = true;
}
+ spin_lock_bh(&ctx->encrypt_compl_lock);
pending = atomic_dec_return(&ctx->encrypt_pending);
- if (!pending && READ_ONCE(ctx->async_notify))
+ if (!pending && ctx->async_notify)
complete(&ctx->async_wait.completion);
+ spin_unlock_bh(&ctx->encrypt_compl_lock);
if (!ready)
return;
static int bpf_exec_tx_verdict(struct sk_msg *msg, struct sock *sk,
bool full_record, u8 record_type,
- size_t *copied, int flags)
+ ssize_t *copied, int flags)
{
struct tls_context *tls_ctx = tls_get_ctx(sk);
struct tls_sw_context_tx *ctx = tls_sw_ctx_tx(tls_ctx);
psock = sk_psock_get(sk);
if (!psock || !policy) {
err = tls_push_record(sk, flags, record_type);
- if (err && err != -EINPROGRESS) {
+ if (err && sk->sk_err == EBADMSG) {
*copied -= sk_msg_free(sk, msg);
tls_free_open_rec(sk);
+ err = -sk->sk_err;
}
if (psock)
sk_psock_put(sk, psock);
switch (psock->eval) {
case __SK_PASS:
err = tls_push_record(sk, flags, record_type);
- if (err && err != -EINPROGRESS) {
+ if (err && sk->sk_err == EBADMSG) {
*copied -= sk_msg_free(sk, msg);
tls_free_open_rec(sk);
+ err = -sk->sk_err;
goto out_err;
}
break;
unsigned char record_type = TLS_RECORD_TYPE_DATA;
bool is_kvec = iov_iter_is_kvec(&msg->msg_iter);
bool eor = !(msg->msg_flags & MSG_MORE);
- size_t try_to_copy, copied = 0;
+ size_t try_to_copy;
+ ssize_t copied = 0;
struct sk_msg *msg_pl, *msg_en;
struct tls_rec *rec;
int required_size;
int num_zc = 0;
int orig_size;
int ret = 0;
+ int pending;
if (msg->msg_flags & ~(MSG_MORE | MSG_DONTWAIT | MSG_NOSIGNAL))
return -EOPNOTSUPP;
goto send_end;
} else if (num_zc) {
/* Wait for pending encryptions to get completed */
- smp_store_mb(ctx->async_notify, true);
+ spin_lock_bh(&ctx->encrypt_compl_lock);
+ ctx->async_notify = true;
- if (atomic_read(&ctx->encrypt_pending))
+ pending = atomic_read(&ctx->encrypt_pending);
+ spin_unlock_bh(&ctx->encrypt_compl_lock);
+ if (pending)
crypto_wait_req(-EINPROGRESS, &ctx->async_wait);
else
reinit_completion(&ctx->async_wait.completion);
+ /* There can be no concurrent accesses, since we have no
+ * pending encrypt operations
+ */
WRITE_ONCE(ctx->async_notify, false);
if (ctx->async_wait.err) {
release_sock(sk);
mutex_unlock(&tls_ctx->tx_lock);
- return copied ? copied : ret;
+ return copied > 0 ? copied : ret;
}
static int tls_sw_do_sendpage(struct sock *sk, struct page *page,
struct sk_msg *msg_pl;
struct tls_rec *rec;
int num_async = 0;
- size_t copied = 0;
+ ssize_t copied = 0;
bool full_record;
int record_room;
int ret = 0;
}
sendpage_end:
ret = sk_stream_error(sk, flags, ret);
- return copied ? copied : ret;
+ return copied > 0 ? copied : ret;
}
int tls_sw_sendpage_locked(struct sock *sk, struct page *page,
bool is_kvec = iov_iter_is_kvec(&msg->msg_iter);
bool is_peek = flags & MSG_PEEK;
int num_async = 0;
+ int pending;
flags |= nonblock;
recv_end:
if (num_async) {
/* Wait for all previously submitted records to be decrypted */
- smp_store_mb(ctx->async_notify, true);
- if (atomic_read(&ctx->decrypt_pending)) {
+ spin_lock_bh(&ctx->decrypt_compl_lock);
+ ctx->async_notify = true;
+ pending = atomic_read(&ctx->decrypt_pending);
+ spin_unlock_bh(&ctx->decrypt_compl_lock);
+ if (pending) {
err = crypto_wait_req(-EINPROGRESS, &ctx->async_wait);
if (err) {
/* one of async decrypt failed */
} else {
reinit_completion(&ctx->async_wait.completion);
}
+
+ /* There can be no concurrent accesses, since we have no
+ * pending decrypt operations
+ */
WRITE_ONCE(ctx->async_notify, false);
/* Drain records from the rx_list & copy if required */
if (tx) {
crypto_init_wait(&sw_ctx_tx->async_wait);
+ spin_lock_init(&sw_ctx_tx->encrypt_compl_lock);
crypto_info = &ctx->crypto_send.info;
cctx = &ctx->tx;
aead = &sw_ctx_tx->aead_send;
sw_ctx_tx->tx_work.sk = sk;
} else {
crypto_init_wait(&sw_ctx_rx->async_wait);
+ spin_lock_init(&sw_ctx_rx->decrypt_compl_lock);
crypto_info = &ctx->crypto_recv.info;
cctx = &ctx->rx;
skb_queue_head_init(&sw_ctx_rx->rx_list);
/* Wait for children sockets to appear; these are the new sockets
* created upon connection establishment.
*/
- timeout = sock_sndtimeo(listener, flags & O_NONBLOCK);
+ timeout = sock_rcvtimeo(listener, flags & O_NONBLOCK);
prepare_to_wait(sk_sleep(listener), &wait, TASK_INTERRUPTIBLE);
while ((connected = vsock_dequeue_accept(listener)) == NULL &&
lock_sock(sk);
+ /* Check if sk has been released before lock_sock */
+ if (sk->sk_shutdown == SHUTDOWN_MASK) {
+ (void)virtio_transport_reset_no_sock(t, pkt);
+ release_sock(sk);
+ sock_put(sk);
+ goto free_pkt;
+ }
+
/* Update CID in case it has changed after a transport reset event */
vsk->local_addr.svm_cid = dst.svm_cid;
if (result)
return result;
- if (rdev->wiphy.debugfsdir)
+ if (!IS_ERR_OR_NULL(rdev->wiphy.debugfsdir))
debugfs_rename(rdev->wiphy.debugfsdir->d_parent,
rdev->wiphy.debugfsdir,
rdev->wiphy.debugfsdir->d_parent, newname);
{
bool unaligned_chunks = mr->flags & XDP_UMEM_UNALIGNED_CHUNK_FLAG;
u32 chunk_size = mr->chunk_size, headroom = mr->headroom;
+ u64 npgs, addr = mr->addr, size = mr->len;
unsigned int chunks, chunks_per_page;
- u64 addr = mr->addr, size = mr->len;
int err;
if (chunk_size < XDP_UMEM_MIN_CHUNK_SIZE || chunk_size > PAGE_SIZE) {
if ((addr + size) < addr)
return -EINVAL;
+ npgs = div_u64(size, PAGE_SIZE);
+ if (npgs > U32_MAX)
+ return -EINVAL;
+
chunks = (unsigned int)div_u64(size, chunk_size);
if (chunks == 0)
return -EINVAL;
umem->size = size;
umem->headroom = headroom;
umem->chunk_size_nohr = chunk_size - headroom;
- umem->npgs = size / PAGE_SIZE;
+ umem->npgs = (u32)npgs;
umem->pgs = NULL;
umem->user = NULL;
umem->flags = mr->flags;
{
struct espintcp_ctx *ctx = espintcp_getctx(sk);
+ ctx->saved_destruct(sk);
kfree(ctx);
}
sk->sk_socket->ops = &espintcp_ops;
ctx->saved_data_ready = sk->sk_data_ready;
ctx->saved_write_space = sk->sk_write_space;
+ ctx->saved_destruct = sk->sk_destruct;
sk->sk_data_ready = espintcp_data_ready;
sk->sk_write_space = espintcp_write_space;
sk->sk_destruct = espintcp_destruct;
struct xfrm_offload *xo = xfrm_offload(skb);
skb_reset_mac_len(skb);
- pskb_pull(skb, skb->mac_len + hsize + x->props.header_len);
-
- if (xo->flags & XFRM_GSO_SEGMENT) {
- skb_reset_transport_header(skb);
+ if (xo->flags & XFRM_GSO_SEGMENT)
skb->transport_header -= x->props.header_len;
- }
+
+ pskb_pull(skb, skb_transport_offset(skb) + x->props.header_len);
}
static void __xfrm_mode_tunnel_prep(struct xfrm_state *x, struct sk_buff *skb,
dev_put(skb->dev);
spin_lock(&x->lock);
- if (nexthdr <= 0) {
+ if (nexthdr < 0) {
if (nexthdr == -EBADMSG) {
xfrm_audit_state_icvfail(x, skb,
x->type->proto);
.get_link_net = xfrmi_get_link_net,
};
+static void __net_exit xfrmi_exit_batch_net(struct list_head *net_exit_list)
+{
+ struct net *net;
+ LIST_HEAD(list);
+
+ rtnl_lock();
+ list_for_each_entry(net, net_exit_list, exit_list) {
+ struct xfrmi_net *xfrmn = net_generic(net, xfrmi_net_id);
+ struct xfrm_if __rcu **xip;
+ struct xfrm_if *xi;
+
+ for (xip = &xfrmn->xfrmi[0];
+ (xi = rtnl_dereference(*xip)) != NULL;
+ xip = &xi->next)
+ unregister_netdevice_queue(xi->dev, &list);
+ }
+ unregister_netdevice_many(&list);
+ rtnl_unlock();
+}
+
static struct pernet_operations xfrmi_net_ops = {
+ .exit_batch = xfrmi_exit_batch_net,
.id = &xfrmi_net_id,
.size = sizeof(struct xfrmi_net),
};
xfrm_state_hold(x);
if (skb_is_gso(skb)) {
- skb_shinfo(skb)->gso_type |= SKB_GSO_ESP;
+ if (skb->inner_protocol)
+ return xfrm_output_gso(net, sk, skb);
- return xfrm_output2(net, sk, skb);
+ skb_shinfo(skb)->gso_type |= SKB_GSO_ESP;
+ goto out;
}
if (x->xso.dev && x->xso.dev->features & NETIF_F_HW_ESP_TX_CSUM)
goto out;
+ } else {
+ if (skb_is_gso(skb))
+ return xfrm_output_gso(net, sk, skb);
}
- if (skb_is_gso(skb))
- return xfrm_output_gso(net, sk, skb);
-
if (skb->ip_summed == CHECKSUM_PARTIAL) {
err = skb_checksum_help(skb);
if (err) {
if (skb->protocol == htons(ETH_P_IP))
proto = AF_INET;
- else if (skb->protocol == htons(ETH_P_IPV6))
+ else if (skb->protocol == htons(ETH_P_IPV6) &&
+ skb->sk->sk_family == AF_INET6)
proto = AF_INET6;
else
return;
static bool xfrm_policy_mark_match(struct xfrm_policy *policy,
struct xfrm_policy *pol)
{
- u32 mark = policy->mark.v & policy->mark.m;
-
- if (policy->mark.v == pol->mark.v && policy->mark.m == pol->mark.m)
- return true;
-
- if ((mark & pol->mark.m) == pol->mark.v &&
+ if (policy->mark.v == pol->mark.v &&
policy->priority == pol->priority)
return true;
#define MAX_INDEX 64
#define MAX_STARS 38
-char bpf_log_buf[BPF_LOG_BUF_SIZE];
-
static void stars(char *str, long val, long max, int width)
{
int i;
asm (
" .pushsection .text, \"ax\", @progbits\n"
+" .type my_tramp1, @function\n"
" my_tramp1:"
" pushq %rbp\n"
" movq %rsp, %rbp\n"
" call my_direct_func1\n"
" leave\n"
+" .size my_tramp1, .-my_tramp1\n"
" ret\n"
+" .type my_tramp2, @function\n"
" my_tramp2:"
" pushq %rbp\n"
" movq %rsp, %rbp\n"
" call my_direct_func2\n"
" leave\n"
" ret\n"
+" .size my_tramp2, .-my_tramp2\n"
" .popsection\n"
);
asm (
" .pushsection .text, \"ax\", @progbits\n"
+" .type my_tramp, @function\n"
" my_tramp:"
" pushq %rbp\n"
" movq %rsp, %rbp\n"
" popq %rdi\n"
" leave\n"
" ret\n"
+" .size my_tramp, .-my_tramp\n"
" .popsection\n"
);
asm (
" .pushsection .text, \"ax\", @progbits\n"
+" .type my_tramp, @function\n"
" my_tramp:"
" pushq %rbp\n"
" movq %rsp, %rbp\n"
" popq %rdi\n"
" leave\n"
" ret\n"
+" .size my_tramp, .-my_tramp\n"
" .popsection\n"
);
};
/* kprobe pre_handler: called just before the probed instruction is executed */
-static int handler_pre(struct kprobe *p, struct pt_regs *regs)
+static int __kprobes handler_pre(struct kprobe *p, struct pt_regs *regs)
{
#ifdef CONFIG_X86
pr_info("<%s> pre_handler: p->addr = 0x%p, ip = %lx, flags = 0x%lx\n",
}
/* kprobe post_handler: called after the probed instruction is executed */
-static void handler_post(struct kprobe *p, struct pt_regs *regs,
+static void __kprobes handler_post(struct kprobe *p, struct pt_regs *regs,
unsigned long flags)
{
#ifdef CONFIG_X86
/* Return 0 because we don't handle the fault. */
return 0;
}
+/* NOKPROBE_SYMBOL() is also available */
+NOKPROBE_SYMBOL(handler_fault);
static int __init kprobe_init(void)
{
data->entry_stamp = ktime_get();
return 0;
}
+NOKPROBE_SYMBOL(entry_handler);
/*
* Return-probe handler: Log the return value and duration. Duration may turn
func_name, retval, (long long)delta);
return 0;
}
+NOKPROBE_SYMBOL(ret_handler);
static struct kretprobe my_kretprobe = {
.handler = ret_handler,
my @ignore = ();
my $help = 0;
my $configuration_file = ".checkpatch.conf";
-my $max_line_length = 80;
+my $max_line_length = 100;
my $ignore_perl_version = 0;
my $minimum_perl_version = 5.10.0;
my $min_conf_desc_length = 4;
--types TYPE(,TYPE2...) show only these comma separated message types
--ignore TYPE(,TYPE2...) ignore various comma separated message types
--show-types show the specific message type in the output
- --max-line-length=n set the maximum line length, if exceeded, warn
+ --max-line-length=n set the maximum line length, (default $max_line_length)
+ if exceeded, warn on patches
+ requires --strict for use with --file
--min-conf-desc-length=n set the min description length, if shorter, warn
- --tab-size=n set the number of spaces for tab (default 8)
+ --tab-size=n set the number of spaces for tab (default $tabsize)
--root=PATH PATH to the kernel tree root
--no-summary suppress the per-file summary
--mailback only produce a report in case of warnings/errors
if ($msg_type ne "" &&
(show_type("LONG_LINE") || show_type($msg_type))) {
- WARN($msg_type,
- "line over $max_line_length characters\n" . $herecurr);
+ my $msg_level = \&WARN;
+ $msg_level = \&CHK if ($file);
+ &{$msg_level}($msg_type,
+ "line length of $length exceeds $max_line_length columns\n" . $herecurr);
}
}
${LD} ${KBUILD_LDFLAGS} -r -o ${1} ${objects}
}
+objtool_link()
+{
+ local objtoolopt;
+
+ if [ -n "${CONFIG_VMLINUX_VALIDATION}" ]; then
+ objtoolopt="check"
+ if [ -z "${CONFIG_FRAME_POINTER}" ]; then
+ objtoolopt="${objtoolopt} --no-fp"
+ fi
+ if [ -n "${CONFIG_GCOV_KERNEL}" ]; then
+ objtoolopt="${objtoolopt} --no-unreachable"
+ fi
+ if [ -n "${CONFIG_RETPOLINE}" ]; then
+ objtoolopt="${objtoolopt} --retpoline"
+ fi
+ if [ -n "${CONFIG_X86_SMAP}" ]; then
+ objtoolopt="${objtoolopt} --uaccess"
+ fi
+ info OBJTOOL ${1}
+ tools/objtool/objtool ${objtoolopt} ${1}
+ fi
+}
+
# Link of vmlinux
# ${1} - output file
# ${2}, ${3}, ... - optional extra .o files
#link vmlinux.o
info LD vmlinux.o
modpost_link vmlinux.o
+objtool_link vmlinux.o
# modpost vmlinux.o to check for section mismatches
${MAKE} -f "${srctree}/scripts/Makefile.modpost" MODPOST_VMLINUX=1
#define DATA_SECTIONS ".data", ".data.rel"
#define TEXT_SECTIONS ".text", ".text.unlikely", ".sched.text", \
- ".kprobes.text", ".cpuidle.text"
+ ".kprobes.text", ".cpuidle.text", ".noinstr.text"
#define OTHER_TEXT_SECTIONS ".ref.text", ".head.text", ".spinlock.text", \
".fixup", ".entry.text", ".exception.text", ".text.*", \
".coldtext"
obj-$(CONFIG_SECURITY_LOADPIN) += loadpin/
obj-$(CONFIG_SECURITY_SAFESETID) += safesetid/
obj-$(CONFIG_SECURITY_LOCKDOWN_LSM) += lockdown/
-obj-$(CONFIG_CGROUP_DEVICE) += device_cgroup.o
+obj-$(CONFIG_CGROUPS) += device_cgroup.o
obj-$(CONFIG_BPF_LSM) += bpf/
# Object integrity file lists
*/
error = aa_may_manage_policy(label, ns, mask);
if (error)
- return error;
+ goto end_section;
data = aa_simple_write_to_buffer(buf, size, size, pos);
error = PTR_ERR(data);
error = aa_replace_profiles(ns, label, mask, data);
aa_put_loaddata(data);
}
+end_section:
end_current_label_crit_section(label);
return error;
rule->label = aa_label_parse(&root_ns->unconfined->label, rulestr,
GFP_KERNEL, true, false);
if (IS_ERR(rule->label)) {
+ int err = PTR_ERR(rule->label);
aa_audit_rule_free(rule);
- return PTR_ERR(rule->label);
+ return err;
}
*vrule = rule;
ctx->nnp = aa_get_label(label);
if (!fqname || !*fqname) {
+ aa_put_label(label);
AA_DEBUG("no profile name");
return -EINVAL;
}
op = OP_CHANGE_PROFILE;
}
- label = aa_get_current_label();
-
if (*fqname == '&') {
stack = true;
/* don't have label_parse() do stacking */
int ret;
kuid_t root_uid;
+ new->cap_ambient = old->cap_ambient;
if (WARN_ON(!cap_ambient_invariant_ok(old)))
return -EPERM;
#include <linux/rcupdate.h>
#include <linux/mutex.h>
+#ifdef CONFIG_CGROUP_DEVICE
+
static DEFINE_MUTEX(devcgroup_mutex);
enum devcg_behavior {
};
/**
- * __devcgroup_check_permission - checks if an inode operation is permitted
+ * devcgroup_legacy_check_permission - checks if an inode operation is permitted
* @dev_cgroup: the dev cgroup to be tested against
* @type: device type
* @major: device major number
*
* returns 0 on success, -EPERM case the operation is not permitted
*/
-static int __devcgroup_check_permission(short type, u32 major, u32 minor,
+static int devcgroup_legacy_check_permission(short type, u32 major, u32 minor,
short access)
{
struct dev_cgroup *dev_cgroup;
return 0;
}
+#endif /* CONFIG_CGROUP_DEVICE */
+
+#if defined(CONFIG_CGROUP_DEVICE) || defined(CONFIG_CGROUP_BPF)
+
int devcgroup_check_permission(short type, u32 major, u32 minor, short access)
{
int rc = BPF_CGROUP_RUN_PROG_DEVICE_CGROUP(type, major, minor, access);
if (rc)
return -EPERM;
- return __devcgroup_check_permission(type, major, minor, access);
+ #ifdef CONFIG_CGROUP_DEVICE
+ return devcgroup_legacy_check_permission(type, major, minor, access);
+
+ #else /* CONFIG_CGROUP_DEVICE */
+ return 0;
+
+ #endif /* CONFIG_CGROUP_DEVICE */
}
EXPORT_SYMBOL(devcgroup_check_permission);
+#endif /* defined(CONFIG_CGROUP_DEVICE) || defined(CONFIG_CGROUP_BPF) */
{
long rc;
const char *algo;
- struct crypto_shash **tfm;
+ struct crypto_shash **tfm, *tmp_tfm;
struct shash_desc *desc;
if (type == EVM_XATTR_HMAC) {
algo = hash_algo_name[hash_algo];
}
- if (*tfm == NULL) {
- mutex_lock(&mutex);
- if (*tfm)
- goto out;
- *tfm = crypto_alloc_shash(algo, 0, CRYPTO_NOLOAD);
- if (IS_ERR(*tfm)) {
- rc = PTR_ERR(*tfm);
- pr_err("Can not allocate %s (reason: %ld)\n", algo, rc);
- *tfm = NULL;
+ if (*tfm)
+ goto alloc;
+ mutex_lock(&mutex);
+ if (*tfm)
+ goto unlock;
+
+ tmp_tfm = crypto_alloc_shash(algo, 0, CRYPTO_NOLOAD);
+ if (IS_ERR(tmp_tfm)) {
+ pr_err("Can not allocate %s (reason: %ld)\n", algo,
+ PTR_ERR(tmp_tfm));
+ mutex_unlock(&mutex);
+ return ERR_CAST(tmp_tfm);
+ }
+ if (type == EVM_XATTR_HMAC) {
+ rc = crypto_shash_setkey(tmp_tfm, evmkey, evmkey_len);
+ if (rc) {
+ crypto_free_shash(tmp_tfm);
mutex_unlock(&mutex);
return ERR_PTR(rc);
}
- if (type == EVM_XATTR_HMAC) {
- rc = crypto_shash_setkey(*tfm, evmkey, evmkey_len);
- if (rc) {
- crypto_free_shash(*tfm);
- *tfm = NULL;
- mutex_unlock(&mutex);
- return ERR_PTR(rc);
- }
- }
-out:
- mutex_unlock(&mutex);
}
-
+ *tfm = tmp_tfm;
+unlock:
+ mutex_unlock(&mutex);
+alloc:
desc = kmalloc(sizeof(*desc) + crypto_shash_descsize(*tfm),
GFP_KERNEL);
if (!desc)
data->hdr.length = crypto_shash_digestsize(desc->tfm);
error = -ENODATA;
- list_for_each_entry_rcu(xattr, &evm_config_xattrnames, list) {
+ list_for_each_entry_lockless(xattr, &evm_config_xattrnames, list) {
bool is_ima = false;
if (strcmp(xattr->name, XATTR_NAME_IMA) == 0)
if (!(inode->i_opflags & IOP_XATTR))
return -EOPNOTSUPP;
- list_for_each_entry_rcu(xattr, &evm_config_xattrnames, list) {
+ list_for_each_entry_lockless(xattr, &evm_config_xattrnames, list) {
error = __vfs_getxattr(dentry, inode, xattr->name, NULL, 0);
if (error < 0) {
if (error == -ENODATA)
struct xattr_list *xattr;
namelen = strlen(req_xattr_name);
- list_for_each_entry_rcu(xattr, &evm_config_xattrnames, list) {
+ list_for_each_entry_lockless(xattr, &evm_config_xattrnames, list) {
if ((strlen(xattr->name) == namelen)
&& (strncmp(req_xattr_name, xattr->name, namelen) == 0)) {
found = 1;
goto out;
}
- /* Guard against races in evm_read_xattrs */
+ /*
+ * xattr_list_mutex guards against races in evm_read_xattrs().
+ * Entries are only added to the evm_config_xattrnames list
+ * and never deleted. Therefore, the list is traversed
+ * using list_for_each_entry_lockless() without holding
+ * the mutex in evm_calc_hmac_or_hash(), evm_find_protected_xattrs()
+ * and evm_protected_xattr().
+ */
mutex_lock(&xattr_list_mutex);
list_for_each_entry(tmp, &evm_config_xattrnames, list) {
if (strcmp(xattr->name, tmp->name) == 0) {
loff_t i_size;
int rc;
struct file *f = file;
- bool new_file_instance = false, modified_flags = false;
+ bool new_file_instance = false, modified_mode = false;
/*
* For consistency, fail file's opened with the O_DIRECT flag on
f = dentry_open(&file->f_path, flags, file->f_cred);
if (IS_ERR(f)) {
/*
- * Cannot open the file again, lets modify f_flags
+ * Cannot open the file again, lets modify f_mode
* of original and continue
*/
pr_info_ratelimited("Unable to reopen file for reading.\n");
f = file;
- f->f_flags |= FMODE_READ;
- modified_flags = true;
+ f->f_mode |= FMODE_READ;
+ modified_mode = true;
} else {
new_file_instance = true;
}
out:
if (new_file_instance)
fput(f);
- else if (modified_flags)
- f->f_flags &= ~FMODE_READ;
+ else if (modified_mode)
+ f->f_mode &= ~FMODE_READ;
return rc;
}
integrity_audit_msg(AUDIT_INTEGRITY_STATUS, NULL, NULL,
"policy_update", "signed policy required",
1, 0);
- if (ima_appraise & IMA_APPRAISE_ENFORCE)
- result = -EACCES;
+ result = -EACCES;
} else {
result = ima_parse_add_rule(data);
}
return ukey;
}
-static int calc_hash(struct crypto_shash *tfm, u8 *digest,
- const u8 *buf, unsigned int buflen)
-{
- SHASH_DESC_ON_STACK(desc, tfm);
- int err;
-
- desc->tfm = tfm;
-
- err = crypto_shash_digest(desc, buf, buflen, digest);
- shash_desc_zero(desc);
- return err;
-}
-
static int calc_hmac(u8 *digest, const u8 *key, unsigned int keylen,
const u8 *buf, unsigned int buflen)
{
err = crypto_shash_setkey(tfm, key, keylen);
if (!err)
- err = calc_hash(tfm, digest, buf, buflen);
+ err = crypto_shash_tfm_digest(tfm, buf, buflen, digest);
crypto_free_shash(tfm);
return err;
}
memcpy(derived_buf + strlen(derived_buf) + 1, master_key,
master_keylen);
- ret = calc_hash(hash_tfm, derived_key, derived_buf, derived_buf_len);
+ ret = crypto_shash_tfm_digest(hash_tfm, derived_buf, derived_buf_len,
+ derived_key);
kzfree(derived_buf);
return ret;
}
int security_secid_to_secctx(u32 secid, char **secdata, u32 *seclen)
{
- return call_int_hook(secid_to_secctx, -EOPNOTSUPP, secid, secdata,
- seclen);
+ struct security_hook_list *hp;
+ int rc;
+
+ /*
+ * Currently, only one LSM can implement secid_to_secctx (i.e this
+ * LSM hook is not "stackable").
+ */
+ hlist_for_each_entry(hp, &security_hook_heads.secid_to_secctx, list) {
+ rc = hp->hook.secid_to_secctx(secid, secdata, seclen);
+ if (rc != LSM_RET_DEFAULT(secid_to_secctx))
+ return rc;
+ }
+
+ return LSM_RET_DEFAULT(secid_to_secctx);
}
EXPORT_SYMBOL(security_secid_to_secctx);
"audit_control", "setfcap"
#define COMMON_CAP2_PERMS "mac_override", "mac_admin", "syslog", \
- "wake_alarm", "block_suspend", "audit_read"
+ "wake_alarm", "block_suspend", "audit_read", "perfmon"
-#if CAP_LAST_CAP > CAP_AUDIT_READ
+#if CAP_LAST_CAP > CAP_PERFMON
#error New capability defined, please update COMMON_CAP2_PERMS.
#endif
if (info.index >= 32)
return -EINVAL;
/* check whether the dsp was already loaded */
- if (hw->dsp_loaded & (1 << info.index))
+ if (hw->dsp_loaded & (1u << info.index))
return -EBUSY;
err = hw->ops.dsp_load(hw, &info);
if (err < 0)
return err;
- hw->dsp_loaded |= (1 << info.index);
+ hw->dsp_loaded |= (1u << info.index);
return 0;
}
no_delta_check:
if (runtime->status->hw_ptr == new_hw_ptr) {
+ runtime->hw_ptr_jiffies = curr_jiffies;
update_audio_tstamp(substream, &curr_tstamp, &audio_tstamp);
return 0;
}
runtime->event(runtime->substream);
}
+/* buffer refcount management: call with runtime->lock held */
+static inline void snd_rawmidi_buffer_ref(struct snd_rawmidi_runtime *runtime)
+{
+ runtime->buffer_ref++;
+}
+
+static inline void snd_rawmidi_buffer_unref(struct snd_rawmidi_runtime *runtime)
+{
+ runtime->buffer_ref--;
+}
+
static int snd_rawmidi_runtime_create(struct snd_rawmidi_substream *substream)
{
struct snd_rawmidi_runtime *runtime;
if (!newbuf)
return -ENOMEM;
spin_lock_irq(&runtime->lock);
+ if (runtime->buffer_ref) {
+ spin_unlock_irq(&runtime->lock);
+ kvfree(newbuf);
+ return -EBUSY;
+ }
oldbuf = runtime->buffer;
runtime->buffer = newbuf;
runtime->buffer_size = params->buffer_size;
long result = 0, count1;
struct snd_rawmidi_runtime *runtime = substream->runtime;
unsigned long appl_ptr;
+ int err = 0;
spin_lock_irqsave(&runtime->lock, flags);
+ snd_rawmidi_buffer_ref(runtime);
while (count > 0 && runtime->avail) {
count1 = runtime->buffer_size - runtime->appl_ptr;
if (count1 > count)
if (userbuf) {
spin_unlock_irqrestore(&runtime->lock, flags);
if (copy_to_user(userbuf + result,
- runtime->buffer + appl_ptr, count1)) {
- return result > 0 ? result : -EFAULT;
- }
+ runtime->buffer + appl_ptr, count1))
+ err = -EFAULT;
spin_lock_irqsave(&runtime->lock, flags);
+ if (err)
+ goto out;
}
result += count1;
count -= count1;
}
+ out:
+ snd_rawmidi_buffer_unref(runtime);
spin_unlock_irqrestore(&runtime->lock, flags);
- return result;
+ return result > 0 ? result : err;
}
long snd_rawmidi_kernel_read(struct snd_rawmidi_substream *substream,
return -EAGAIN;
}
}
+ snd_rawmidi_buffer_ref(runtime);
while (count > 0 && runtime->avail > 0) {
count1 = runtime->buffer_size - runtime->appl_ptr;
if (count1 > count)
}
__end:
count1 = runtime->avail < runtime->buffer_size;
+ snd_rawmidi_buffer_unref(runtime);
spin_unlock_irqrestore(&runtime->lock, flags);
if (count1)
snd_rawmidi_output_trigger(substream, 1);
__entry->irq,
__entry->index,
__print_array(__get_dynamic_array(cip_header),
- __get_dynamic_array_len(cip_header),
- sizeof(u8)))
+ __get_dynamic_array_len(cip_header), 1))
);
#endif
case 0x10ec0282:
case 0x10ec0283:
case 0x10ec0286:
+ case 0x10ec0287:
case 0x10ec0288:
case 0x10ec0285:
case 0x10ec0298:
SND_PCI_QUIRK(0x1458, 0xa002, "Gigabyte EP45-DS3/Z87X-UD3H", ALC889_FIXUP_FRONT_HP_NO_PRESENCE),
SND_PCI_QUIRK(0x1458, 0xa0b8, "Gigabyte AZ370-Gaming", ALC1220_FIXUP_GB_DUAL_CODECS),
SND_PCI_QUIRK(0x1458, 0xa0cd, "Gigabyte X570 Aorus Master", ALC1220_FIXUP_CLEVO_P950),
+ SND_PCI_QUIRK(0x1458, 0xa0ce, "Gigabyte X570 Aorus Xtreme", ALC1220_FIXUP_CLEVO_P950),
SND_PCI_QUIRK(0x1462, 0x1228, "MSI-GP63", ALC1220_FIXUP_CLEVO_P950),
SND_PCI_QUIRK(0x1462, 0x1275, "MSI-GL63", ALC1220_FIXUP_CLEVO_P950),
SND_PCI_QUIRK(0x1462, 0x1276, "MSI-GL73", ALC1220_FIXUP_CLEVO_P950),
SND_PCI_QUIRK(0x1558, 0x97e1, "Clevo P970[ER][CDFN]", ALC1220_FIXUP_CLEVO_P950),
SND_PCI_QUIRK(0x1558, 0x65d1, "Clevo PB51[ER][CDF]", ALC1220_FIXUP_CLEVO_PB51ED_PINS),
SND_PCI_QUIRK(0x1558, 0x67d1, "Clevo PB71[ER][CDF]", ALC1220_FIXUP_CLEVO_PB51ED_PINS),
+ SND_PCI_QUIRK(0x1558, 0x50d3, "Clevo PC50[ER][CDF]", ALC1220_FIXUP_CLEVO_PB51ED_PINS),
+ SND_PCI_QUIRK(0x1558, 0x70d1, "Clevo PC70[ER][CDF]", ALC1220_FIXUP_CLEVO_PB51ED_PINS),
+ SND_PCI_QUIRK(0x1558, 0x7714, "Clevo X170", ALC1220_FIXUP_CLEVO_PB51ED_PINS),
SND_PCI_QUIRK_VENDOR(0x1558, "Clevo laptop", ALC882_FIXUP_EAPD),
SND_PCI_QUIRK(0x161f, 0x2054, "Medion laptop", ALC883_FIXUP_EAPD),
SND_PCI_QUIRK(0x17aa, 0x3a0d, "Lenovo Y530", ALC882_FIXUP_LENOVO_Y530),
{ 0x19, 0x21a11010 }, /* dock mic */
{ }
};
- /* Assure the speaker pin to be coupled with DAC NID 0x03; otherwise
- * the speaker output becomes too low by some reason on Thinkpads with
- * ALC298 codec
- */
- static const hda_nid_t preferred_pairs[] = {
- 0x14, 0x03, 0x17, 0x02, 0x21, 0x02,
- 0
- };
struct alc_spec *spec = codec->spec;
if (action == HDA_FIXUP_ACT_PRE_PROBE) {
- spec->gen.preferred_dacs = preferred_pairs;
spec->parse_flags = HDA_PINCFG_NO_HP_FIXUP;
snd_hda_apply_pincfgs(codec, pincfgs);
} else if (action == HDA_FIXUP_ACT_INIT) {
}
}
+static void alc_fixup_tpt470_dacs(struct hda_codec *codec,
+ const struct hda_fixup *fix, int action)
+{
+ /* Assure the speaker pin to be coupled with DAC NID 0x03; otherwise
+ * the speaker output becomes too low by some reason on Thinkpads with
+ * ALC298 codec
+ */
+ static const hda_nid_t preferred_pairs[] = {
+ 0x14, 0x03, 0x17, 0x02, 0x21, 0x02,
+ 0
+ };
+ struct alc_spec *spec = codec->spec;
+
+ if (action == HDA_FIXUP_ACT_PRE_PROBE)
+ spec->gen.preferred_dacs = preferred_pairs;
+}
+
static void alc_shutup_dell_xps13(struct hda_codec *codec)
{
struct alc_spec *spec = codec->spec;
}
}
+static void alc225_fixup_s3_pop_noise(struct hda_codec *codec,
+ const struct hda_fixup *fix, int action)
+{
+ if (action != HDA_FIXUP_ACT_PRE_PROBE)
+ return;
+
+ codec->power_save_node = 1;
+}
+
/* Forcibly assign NID 0x03 to HP/LO while NID 0x02 to SPK for EQ */
static void alc274_fixup_bind_dacs(struct hda_codec *codec,
const struct hda_fixup *fix, int action)
ALC269_FIXUP_HP_LINE1_MIC1_LED,
ALC269_FIXUP_INV_DMIC,
ALC269_FIXUP_LENOVO_DOCK,
+ ALC269_FIXUP_LENOVO_DOCK_LIMIT_BOOST,
ALC269_FIXUP_NO_SHUTUP,
ALC286_FIXUP_SONY_MIC_NO_PRESENCE,
ALC269_FIXUP_PINCFG_NO_HP_TO_LINEOUT,
ALC233_FIXUP_ACER_HEADSET_MIC,
ALC294_FIXUP_LENOVO_MIC_LOCATION,
ALC225_FIXUP_DELL_WYSE_MIC_NO_PRESENCE,
+ ALC225_FIXUP_S3_POP_NOISE,
ALC700_FIXUP_INTEL_REFERENCE,
ALC274_FIXUP_DELL_BIND_DACS,
ALC274_FIXUP_DELL_AIO_LINEOUT_VERB,
+ ALC298_FIXUP_TPT470_DOCK_FIX,
ALC298_FIXUP_TPT470_DOCK,
ALC255_FIXUP_DUMMY_LINEOUT_VERB,
ALC255_FIXUP_DELL_HEADSET_MIC,
ALC294_FIXUP_ASUS_DUAL_SPK,
ALC285_FIXUP_THINKPAD_HEADSET_JACK,
ALC294_FIXUP_ASUS_HPE,
+ ALC294_FIXUP_ASUS_COEF_1B,
ALC285_FIXUP_HP_GPIO_LED,
ALC285_FIXUP_HP_MUTE_LED,
ALC236_FIXUP_HP_MUTE_LED,
+ ALC298_FIXUP_SAMSUNG_HEADPHONE_VERY_QUIET,
+ ALC295_FIXUP_ASUS_MIC_NO_PRESENCE,
};
static const struct hda_fixup alc269_fixups[] = {
.chained = true,
.chain_id = ALC269_FIXUP_PINCFG_NO_HP_TO_LINEOUT
},
+ [ALC269_FIXUP_LENOVO_DOCK_LIMIT_BOOST] = {
+ .type = HDA_FIXUP_FUNC,
+ .v.func = alc269_fixup_limit_int_mic_boost,
+ .chained = true,
+ .chain_id = ALC269_FIXUP_LENOVO_DOCK,
+ },
[ALC269_FIXUP_PINCFG_NO_HP_TO_LINEOUT] = {
.type = HDA_FIXUP_FUNC,
.v.func = alc269_fixup_pincfg_no_hp_to_lineout,
{ }
},
.chained = true,
+ .chain_id = ALC225_FIXUP_S3_POP_NOISE
+ },
+ [ALC225_FIXUP_S3_POP_NOISE] = {
+ .type = HDA_FIXUP_FUNC,
+ .v.func = alc225_fixup_s3_pop_noise,
+ .chained = true,
.chain_id = ALC269_FIXUP_HEADSET_MODE_NO_HP_MIC
},
[ALC700_FIXUP_INTEL_REFERENCE] = {
.chained = true,
.chain_id = ALC274_FIXUP_DELL_BIND_DACS
},
- [ALC298_FIXUP_TPT470_DOCK] = {
+ [ALC298_FIXUP_TPT470_DOCK_FIX] = {
.type = HDA_FIXUP_FUNC,
.v.func = alc_fixup_tpt470_dock,
.chained = true,
.chain_id = ALC293_FIXUP_LENOVO_SPK_NOISE
},
+ [ALC298_FIXUP_TPT470_DOCK] = {
+ .type = HDA_FIXUP_FUNC,
+ .v.func = alc_fixup_tpt470_dacs,
+ .chained = true,
+ .chain_id = ALC298_FIXUP_TPT470_DOCK_FIX
+ },
[ALC255_FIXUP_DUMMY_LINEOUT_VERB] = {
.type = HDA_FIXUP_PINS,
.v.pins = (const struct hda_pintbl[]) {
.chained = true,
.chain_id = ALC294_FIXUP_ASUS_HEADSET_MIC
},
+ [ALC294_FIXUP_ASUS_COEF_1B] = {
+ .type = HDA_FIXUP_VERBS,
+ .v.verbs = (const struct hda_verb[]) {
+ /* Set bit 10 to correct noisy output after reboot from
+ * Windows 10 (due to pop noise reduction?)
+ */
+ { 0x20, AC_VERB_SET_COEF_INDEX, 0x1b },
+ { 0x20, AC_VERB_SET_PROC_COEF, 0x4e4b },
+ { }
+ },
+ },
[ALC285_FIXUP_HP_GPIO_LED] = {
.type = HDA_FIXUP_FUNC,
.v.func = alc285_fixup_hp_gpio_led,
.type = HDA_FIXUP_FUNC,
.v.func = alc236_fixup_hp_mute_led,
},
+ [ALC298_FIXUP_SAMSUNG_HEADPHONE_VERY_QUIET] = {
+ .type = HDA_FIXUP_VERBS,
+ .v.verbs = (const struct hda_verb[]) {
+ { 0x1a, AC_VERB_SET_PIN_WIDGET_CONTROL, 0xc5 },
+ { }
+ },
+ },
+ [ALC295_FIXUP_ASUS_MIC_NO_PRESENCE] = {
+ .type = HDA_FIXUP_PINS,
+ .v.pins = (const struct hda_pintbl[]) {
+ { 0x19, 0x01a1913c }, /* use as headset mic, without its own jack detect */
+ { }
+ },
+ .chained = true,
+ .chain_id = ALC269_FIXUP_HEADSET_MODE
+ },
};
static const struct snd_pci_quirk alc269_fixup_tbl[] = {
SND_PCI_QUIRK(0x1043, 0x18b1, "Asus MJ401TA", ALC256_FIXUP_ASUS_HEADSET_MIC),
SND_PCI_QUIRK(0x1043, 0x18f1, "Asus FX505DT", ALC256_FIXUP_ASUS_HEADSET_MIC),
SND_PCI_QUIRK(0x1043, 0x19ce, "ASUS B9450FA", ALC294_FIXUP_ASUS_HPE),
+ SND_PCI_QUIRK(0x1043, 0x19e1, "ASUS UX581LV", ALC295_FIXUP_ASUS_MIC_NO_PRESENCE),
SND_PCI_QUIRK(0x1043, 0x1a13, "Asus G73Jw", ALC269_FIXUP_ASUS_G73JW),
SND_PCI_QUIRK(0x1043, 0x1a30, "ASUS X705UD", ALC256_FIXUP_ASUS_MIC),
+ SND_PCI_QUIRK(0x1043, 0x1b11, "ASUS UX431DA", ALC294_FIXUP_ASUS_COEF_1B),
SND_PCI_QUIRK(0x1043, 0x1b13, "Asus U41SV", ALC269_FIXUP_INV_DMIC),
SND_PCI_QUIRK(0x1043, 0x1bbd, "ASUS Z550MA", ALC255_FIXUP_ASUS_MIC_NO_PRESENCE),
SND_PCI_QUIRK(0x1043, 0x1c23, "Asus X55U", ALC269_FIXUP_LIMIT_INT_MIC_BOOST),
SND_PCI_QUIRK(0x10ec, 0x10f2, "Intel Reference board", ALC700_FIXUP_INTEL_REFERENCE),
SND_PCI_QUIRK(0x10f7, 0x8338, "Panasonic CF-SZ6", ALC269_FIXUP_HEADSET_MODE),
SND_PCI_QUIRK(0x144d, 0xc109, "Samsung Ativ book 9 (NP900X3G)", ALC269_FIXUP_INV_DMIC),
+ SND_PCI_QUIRK(0x144d, 0xc169, "Samsung Notebook 9 Pen (NP930SBE-K01US)", ALC298_FIXUP_SAMSUNG_HEADPHONE_VERY_QUIET),
+ SND_PCI_QUIRK(0x144d, 0xc176, "Samsung Notebook 9 Pro (NP930MBE-K04US)", ALC298_FIXUP_SAMSUNG_HEADPHONE_VERY_QUIET),
SND_PCI_QUIRK(0x144d, 0xc740, "Samsung Ativ book 8 (NP870Z5G)", ALC269_FIXUP_ATIV_BOOK_8),
SND_PCI_QUIRK(0x1458, 0xfa53, "Gigabyte BXBT-2807", ALC283_FIXUP_HEADSET_MIC),
SND_PCI_QUIRK(0x1462, 0xb120, "MSI Cubi MS-B120", ALC283_FIXUP_HEADSET_MIC),
SND_PCI_QUIRK(0x17aa, 0x21b8, "Thinkpad Edge 14", ALC269_FIXUP_SKU_IGNORE),
SND_PCI_QUIRK(0x17aa, 0x21ca, "Thinkpad L412", ALC269_FIXUP_SKU_IGNORE),
SND_PCI_QUIRK(0x17aa, 0x21e9, "Thinkpad Edge 15", ALC269_FIXUP_SKU_IGNORE),
- SND_PCI_QUIRK(0x17aa, 0x21f6, "Thinkpad T530", ALC269_FIXUP_LENOVO_DOCK),
+ SND_PCI_QUIRK(0x17aa, 0x21f6, "Thinkpad T530", ALC269_FIXUP_LENOVO_DOCK_LIMIT_BOOST),
SND_PCI_QUIRK(0x17aa, 0x21fa, "Thinkpad X230", ALC269_FIXUP_LENOVO_DOCK),
SND_PCI_QUIRK(0x17aa, 0x21f3, "Thinkpad T430", ALC269_FIXUP_LENOVO_DOCK),
SND_PCI_QUIRK(0x17aa, 0x21fb, "Thinkpad T430s", ALC269_FIXUP_LENOVO_DOCK),
{.id = ALC269_FIXUP_HEADSET_MODE, .name = "headset-mode"},
{.id = ALC269_FIXUP_HEADSET_MODE_NO_HP_MIC, .name = "headset-mode-no-hp-mic"},
{.id = ALC269_FIXUP_LENOVO_DOCK, .name = "lenovo-dock"},
+ {.id = ALC269_FIXUP_LENOVO_DOCK_LIMIT_BOOST, .name = "lenovo-dock-limit-boost"},
{.id = ALC269_FIXUP_HP_GPIO_LED, .name = "hp-gpio-led"},
{.id = ALC269_FIXUP_HP_DOCK_GPIO_MIC1_LED, .name = "hp-dock-gpio-mic1-led"},
{.id = ALC269_FIXUP_DELL1_MIC_NO_PRESENCE, .name = "dell-headset-multi"},
{.id = ALC292_FIXUP_TPT440_DOCK, .name = "tpt440-dock"},
{.id = ALC292_FIXUP_TPT440, .name = "tpt440"},
{.id = ALC292_FIXUP_TPT460, .name = "tpt460"},
+ {.id = ALC298_FIXUP_TPT470_DOCK_FIX, .name = "tpt470-dock-fix"},
{.id = ALC298_FIXUP_TPT470_DOCK, .name = "tpt470-dock"},
{.id = ALC233_FIXUP_LENOVO_MULTI_CODECS, .name = "dual-codecs"},
{.id = ALC700_FIXUP_INTEL_REFERENCE, .name = "alc700-ref"},
{0x12, 0x90a60130},
{0x17, 0x90170110},
{0x21, 0x03211020}),
+ SND_HDA_PIN_QUIRK(0x10ec0295, 0x1043, "ASUS", ALC295_FIXUP_ASUS_MIC_NO_PRESENCE,
+ {0x12, 0x90a60120},
+ {0x17, 0x90170110},
+ {0x21, 0x04211030}),
+ SND_HDA_PIN_QUIRK(0x10ec0295, 0x1043, "ASUS", ALC295_FIXUP_ASUS_MIC_NO_PRESENCE,
+ {0x12, 0x90a60130},
+ {0x17, 0x90170110},
+ {0x21, 0x03211020}),
+ SND_HDA_PIN_QUIRK(0x10ec0295, 0x1043, "ASUS", ALC295_FIXUP_ASUS_MIC_NO_PRESENCE,
+ {0x12, 0x90a60130},
+ {0x17, 0x90170110},
+ {0x21, 0x03211020}),
SND_HDA_PIN_QUIRK(0x10ec0295, 0x1028, "Dell", ALC269_FIXUP_DELL4_MIC_NO_PRESENCE,
{0x14, 0x90170110},
{0x21, 0x04211020}),
case 0x10ec0215:
case 0x10ec0245:
case 0x10ec0285:
+ case 0x10ec0287:
case 0x10ec0289:
spec->codec_variant = ALC269_TYPE_ALC215;
spec->shutup = alc225_shutup;
spec->gen.mixer_nid = 0;
break;
case 0x10ec0225:
- codec->power_save_node = 1;
- /* fall through */
case 0x10ec0295:
case 0x10ec0299:
spec->codec_variant = ALC269_TYPE_ALC225;
HDA_CODEC_ENTRY(0x10ec0284, "ALC284", patch_alc269),
HDA_CODEC_ENTRY(0x10ec0285, "ALC285", patch_alc269),
HDA_CODEC_ENTRY(0x10ec0286, "ALC286", patch_alc269),
+ HDA_CODEC_ENTRY(0x10ec0287, "ALC287", patch_alc269),
HDA_CODEC_ENTRY(0x10ec0288, "ALC288", patch_alc269),
HDA_CODEC_ENTRY(0x10ec0289, "ALC289", patch_alc269),
HDA_CODEC_ENTRY(0x10ec0290, "ALC290", patch_alc269),
pci_write_config_byte(ice->pci, 0x61, ice->eeprom.data[ICE_EEP1_ACLINK]);
pci_write_config_byte(ice->pci, 0x62, ice->eeprom.data[ICE_EEP1_I2SID]);
pci_write_config_byte(ice->pci, 0x63, ice->eeprom.data[ICE_EEP1_SPDIF]);
- if (ice->eeprom.subvendor != ICE1712_SUBDEVICE_STDSP24) {
+ if (ice->eeprom.subvendor != ICE1712_SUBDEVICE_STDSP24 &&
+ ice->eeprom.subvendor != ICE1712_SUBDEVICE_STAUDIO_ADCIII) {
ice->gpio.write_mask = ice->eeprom.gpiomask;
ice->gpio.direction = ice->eeprom.gpiodir;
snd_ice1712_write(ice, ICE1712_IREG_GPIO_WRITE_MASK,
cval->res = 384;
}
break;
+ case USB_ID(0x0495, 0x3042): /* ESS Technology Asus USB DAC */
+ if ((strstr(kctl->id.name, "Playback Volume") != NULL) ||
+ strstr(kctl->id.name, "Capture Volume") != NULL) {
+ cval->min >>= 8;
+ cval->max = 0;
+ cval->res = 1;
+ }
+ break;
}
}
{}
};
+/* Rear panel + front mic on Gigabyte TRX40 Aorus Master with ALC1220-VB */
+static const struct usbmix_name_map aorus_master_alc1220vb_map[] = {
+ { 17, NULL }, /* OT, IEC958?, disabled */
+ { 19, NULL, 12 }, /* FU, Input Gain Pad - broken response, disabled */
+ { 16, "Line Out" }, /* OT */
+ { 22, "Line Out Playback" }, /* FU */
+ { 7, "Line" }, /* IT */
+ { 19, "Line Capture" }, /* FU */
+ { 8, "Mic" }, /* IT */
+ { 20, "Mic Capture" }, /* FU */
+ { 9, "Front Mic" }, /* IT */
+ { 21, "Front Mic Capture" }, /* FU */
+ {}
+};
+
/*
* Control map entries
*/
.id = USB_ID(0x1b1c, 0x0a42),
.map = corsair_virtuoso_map,
},
+ { /* Gigabyte TRX40 Aorus Master (rear panel + front mic) */
+ .id = USB_ID(0x0414, 0xa001),
+ .map = aorus_master_alc1220vb_map,
+ },
{ /* Gigabyte TRX40 Aorus Pro WiFi */
.id = USB_ID(0x0414, 0xa002),
.map = trx40_mobo_map,
.map = trx40_mobo_map,
.connector_map = trx40_mobo_connector_map,
},
+ { /* Asrock TRX40 Creator */
+ .id = USB_ID(0x26ce, 0x0a01),
+ .map = trx40_mobo_map,
+ .connector_map = trx40_mobo_connector_map,
+ },
{ 0 } /* terminator */
};
ALC1220_VB_DESKTOP(0x0414, 0xa002), /* Gigabyte TRX40 Aorus Pro WiFi */
ALC1220_VB_DESKTOP(0x0db0, 0x0d64), /* MSI TRX40 Creator */
ALC1220_VB_DESKTOP(0x0db0, 0x543d), /* MSI TRX40 */
+ALC1220_VB_DESKTOP(0x26ce, 0x0a01), /* Asrock TRX40 Creator */
#undef ALC1220_VB_DESKTOP
+/* Two entries for Gigabyte TRX40 Aorus Master:
+ * TRX40 Aorus Master has two USB-audio devices, one for the front headphone
+ * with ESS SABRE9218 DAC chip, while another for the rest I/O (the rear
+ * panel and the front mic) with Realtek ALC1220-VB.
+ * Here we provide two distinct names for making UCM profiles easier.
+ */
+{
+ USB_DEVICE(0x0414, 0xa000),
+ .driver_info = (unsigned long) & (const struct snd_usb_audio_quirk) {
+ .vendor_name = "Gigabyte",
+ .product_name = "Aorus Master Front Headphone",
+ .profile_name = "Gigabyte-Aorus-Master-Front-Headphone",
+ .ifnum = QUIRK_NO_INTERFACE
+ }
+},
+{
+ USB_DEVICE(0x0414, 0xa001),
+ .driver_info = (unsigned long) & (const struct snd_usb_audio_quirk) {
+ .vendor_name = "Gigabyte",
+ .product_name = "Aorus Master Main Audio",
+ .profile_name = "Gigabyte-Aorus-Master-Main-Audio",
+ .ifnum = QUIRK_NO_INTERFACE
+ }
+},
+
#undef USB_DEVICE_VENDOR_SPEC
&& (requesttype & USB_TYPE_MASK) == USB_TYPE_CLASS)
msleep(20);
- /* Zoom R16/24, Logitech H650e, Jabra 550a needs a tiny delay here,
- * otherwise requests like get/set frequency return as failed despite
- * actually succeeding.
+ /* Zoom R16/24, Logitech H650e, Jabra 550a, Kingston HyperX needs a tiny
+ * delay here, otherwise requests like get/set frequency return as
+ * failed despite actually succeeding.
*/
if ((chip->usb_id == USB_ID(0x1686, 0x00dd) ||
chip->usb_id == USB_ID(0x046d, 0x0a46) ||
- chip->usb_id == USB_ID(0x0b0e, 0x0349)) &&
+ chip->usb_id == USB_ID(0x0b0e, 0x0349) ||
+ chip->usb_id == USB_ID(0x0951, 0x16ad)) &&
(requesttype & USB_TYPE_MASK) == USB_TYPE_CLASS)
usleep_range(1000, 2000);
}
#define ORC_TYPE_CALL 0
#define ORC_TYPE_REGS 1
#define ORC_TYPE_REGS_IRET 2
-#define UNWIND_HINT_TYPE_SAVE 3
-#define UNWIND_HINT_TYPE_RESTORE 4
+#define UNWIND_HINT_TYPE_RET_OFFSET 3
#ifndef __ASSEMBLY__
/*
#define _UAPI_ASM_X86_UNISTD_H
/* x32 syscall flag bit */
-#define __X32_SYSCALL_BIT 0x40000000UL
+#define __X32_SYSCALL_BIT 0x40000000
#ifndef __KERNEL__
# ifdef __i386__
pr_err("Failed to apply a boot config magic: %d\n", ret);
goto out;
}
+ ret = 0;
out:
close(fd);
free(data);
llvm \
llvm-version \
clang \
- libbpf
+ libbpf \
+ libpfm4
FEATURE_TESTS ?= $(FEATURE_TESTS_BASIC)
test-libaio.bin \
test-libzstd.bin \
test-clang-bpf-global-var.bin \
- test-file-handle.bin
+ test-file-handle.bin \
+ test-libpfm4.bin
FILES := $(addprefix $(OUTPUT),$(FILES))
$(OUTPUT)test-file-handle.bin:
$(BUILD)
+$(OUTPUT)test-libpfm4.bin:
+ $(BUILD) -lpfm
+
###############################
clean:
/*
* Check OpenCSD library version is sufficient to provide required features
*/
-#define OCSD_MIN_VER ((0 << 16) | (11 << 8) | (0))
+#define OCSD_MIN_VER ((0 << 16) | (14 << 8) | (0))
#if !defined(OCSD_VER_NUM) || (OCSD_VER_NUM < OCSD_MIN_VER)
-#error "OpenCSD >= 0.11.0 is required"
+#error "OpenCSD >= 0.14.0 is required"
#endif
int main(void)
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0
+#include <sys/types.h>
+#include <perfmon/pfmlib.h>
+
+int main(void)
+{
+ pfm_initialize();
+ return 0;
+}
const char * const *mounts;
char path[PATH_MAX];
bool found;
+ bool checked;
long magic;
};
.name = "sysfs",
.mounts = sysfs__fs_known_mountpoints,
.magic = SYSFS_MAGIC,
+ .checked = false,
},
[FS__PROCFS] = {
.name = "proc",
.mounts = procfs__known_mountpoints,
.magic = PROC_SUPER_MAGIC,
+ .checked = false,
},
[FS__DEBUGFS] = {
.name = "debugfs",
.mounts = debugfs__known_mountpoints,
.magic = DEBUGFS_MAGIC,
+ .checked = false,
},
[FS__TRACEFS] = {
.name = "tracefs",
.mounts = tracefs__known_mountpoints,
.magic = TRACEFS_MAGIC,
+ .checked = false,
},
[FS__HUGETLBFS] = {
.name = "hugetlbfs",
.mounts = hugetlbfs__known_mountpoints,
.magic = HUGETLBFS_MAGIC,
+ .checked = false,
},
[FS__BPF_FS] = {
.name = "bpf",
.mounts = bpf_fs__known_mountpoints,
.magic = BPF_FS_MAGIC,
+ .checked = false,
},
};
}
fclose(fp);
+ fs->checked = true;
return fs->found = found;
}
return false;
fs->found = true;
+ fs->checked = true;
strncpy(fs->path, override_path, sizeof(fs->path) - 1);
fs->path[sizeof(fs->path) - 1] = '\0';
return true;
if (fs->found)
return (const char *)fs->path;
+ /* the mount point was already checked for the mount point
+ * but and did not exist, so return NULL to avoid scanning again.
+ * This makes the found and not found paths cost equivalent
+ * in case of multiple calls.
+ */
+ if (fs->checked)
+ return NULL;
+
return fs__get_mountpoint(fs);
}
const char *name##__mount(void); \
bool name##__configured(void); \
+/*
+ * The xxxx__mountpoint() entry points find the first match mount point for each
+ * filesystems listed below, where xxxx is the filesystem type.
+ *
+ * The interface is as follows:
+ *
+ * - If a mount point is found on first call, it is cached and used for all
+ * subsequent calls.
+ *
+ * - If a mount point is not found, NULL is returned on first call and all
+ * subsequent calls.
+ */
FS(sysfs)
FS(procfs)
FS(debugfs)
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0 */
+/*
+ * Lightweight buffered reading library.
+ *
+ * Copyright 2019 Google LLC.
+ */
+#ifndef __API_IO__
+#define __API_IO__
+
+#include <stdlib.h>
+#include <unistd.h>
+
+struct io {
+ /* File descriptor being read/ */
+ int fd;
+ /* Size of the read buffer. */
+ unsigned int buf_len;
+ /* Pointer to storage for buffering read. */
+ char *buf;
+ /* End of the storage. */
+ char *end;
+ /* Currently accessed data pointer. */
+ char *data;
+ /* Set true on when the end of file on read error. */
+ bool eof;
+};
+
+static inline void io__init(struct io *io, int fd,
+ char *buf, unsigned int buf_len)
+{
+ io->fd = fd;
+ io->buf_len = buf_len;
+ io->buf = buf;
+ io->end = buf;
+ io->data = buf;
+ io->eof = false;
+}
+
+/* Reads one character from the "io" file with similar semantics to fgetc. */
+static inline int io__get_char(struct io *io)
+{
+ char *ptr = io->data;
+
+ if (io->eof)
+ return -1;
+
+ if (ptr == io->end) {
+ ssize_t n = read(io->fd, io->buf, io->buf_len);
+
+ if (n <= 0) {
+ io->eof = true;
+ return -1;
+ }
+ ptr = &io->buf[0];
+ io->end = &io->buf[n];
+ }
+ io->data = ptr + 1;
+ return *ptr;
+}
+
+/* Read a hexadecimal value with no 0x prefix into the out argument hex. If the
+ * first character isn't hexadecimal returns -2, io->eof returns -1, otherwise
+ * returns the character after the hexadecimal value which may be -1 for eof.
+ * If the read value is larger than a u64 the high-order bits will be dropped.
+ */
+static inline int io__get_hex(struct io *io, __u64 *hex)
+{
+ bool first_read = true;
+
+ *hex = 0;
+ while (true) {
+ int ch = io__get_char(io);
+
+ if (ch < 0)
+ return ch;
+ if (ch >= '0' && ch <= '9')
+ *hex = (*hex << 4) | (ch - '0');
+ else if (ch >= 'a' && ch <= 'f')
+ *hex = (*hex << 4) | (ch - 'a' + 10);
+ else if (ch >= 'A' && ch <= 'F')
+ *hex = (*hex << 4) | (ch - 'A' + 10);
+ else if (first_read)
+ return -2;
+ else
+ return ch;
+ first_read = false;
+ }
+}
+
+/* Read a positive decimal value with out argument dec. If the first character
+ * isn't a decimal returns -2, io->eof returns -1, otherwise returns the
+ * character after the decimal value which may be -1 for eof. If the read value
+ * is larger than a u64 the high-order bits will be dropped.
+ */
+static inline int io__get_dec(struct io *io, __u64 *dec)
+{
+ bool first_read = true;
+
+ *dec = 0;
+ while (true) {
+ int ch = io__get_char(io);
+
+ if (ch < 0)
+ return ch;
+ if (ch >= '0' && ch <= '9')
+ *dec = (*dec * 10) + ch - '0';
+ else if (first_read)
+ return -2;
+ else
+ return ch;
+ first_read = false;
+ }
+}
+
+#endif /* __API_IO__ */
#define PT_REGS_PARM3_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[4])
#define PT_REGS_PARM4_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[5])
#define PT_REGS_PARM5_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[6])
-#define PT_REGS_RET_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), grps[14])
+#define PT_REGS_RET_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[14])
#define PT_REGS_FP_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[11])
#define PT_REGS_RC_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[2])
#define PT_REGS_SP_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), gprs[15])
-#define PT_REGS_IP_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), pdw.addr)
+#define PT_REGS_IP_CORE(x) BPF_CORE_READ((PT_REGS_S390 *)(x), psw.addr)
#elif defined(bpf_target_arm)
int perf_cpu_map__cpu(const struct perf_cpu_map *cpus, int idx)
{
- if (idx < cpus->nr)
+ if (cpus && idx < cpus->nr)
return cpus->map[idx];
return -1;
#include <internal/mmap.h>
#include <internal/cpumap.h>
#include <internal/threadmap.h>
-#include <internal/xyarray.h>
#include <internal/lib.h>
#include <linux/zalloc.h>
-#include <sys/ioctl.h>
#include <stdlib.h>
#include <errno.h>
#include <unistd.h>
void perf_evlist__exit(struct perf_evlist *evlist)
{
perf_cpu_map__put(evlist->cpus);
+ perf_cpu_map__put(evlist->all_cpus);
perf_thread_map__put(evlist->threads);
evlist->cpus = NULL;
+ evlist->all_cpus = NULL;
evlist->threads = NULL;
fdarray__exit(&evlist->pollfd);
}
{ .type = OPTION_CALLBACK, .short_name = (s), .long_name = (l), .value = (v), .argh = "time", .help = (h), .callback = parse_opt_approxidate_cb }
#define OPT_CALLBACK(s, l, v, a, h, f) \
{ .type = OPTION_CALLBACK, .short_name = (s), .long_name = (l), .value = (v), .argh = (a), .help = (h), .callback = (f) }
+#define OPT_CALLBACK_SET(s, l, v, os, a, h, f) \
+ { .type = OPTION_CALLBACK, .short_name = (s), .long_name = (l), .value = (v), .argh = (a), .help = (h), .callback = (f), .set = check_vtype(os, bool *)}
#define OPT_CALLBACK_NOOPT(s, l, v, a, h, f) \
{ .type = OPTION_CALLBACK, .short_name = (s), .long_name = (l), .value = (v), .argh = (a), .help = (h), .callback = (f), .flags = PARSE_OPT_NOARG }
#define OPT_CALLBACK_DEFAULT(s, l, v, a, h, f, d) \
// SPDX-License-Identifier: GPL-2.0
#include "symbol/kallsyms.h"
+#include "api/io.h"
#include <stdio.h>
-#include <stdlib.h>
+#include <sys/stat.h>
+#include <fcntl.h>
u8 kallsyms2elf_type(char type)
{
return symbol_type == 'T' || symbol_type == 'W';
}
-/*
- * While we find nice hex chars, build a long_val.
- * Return number of chars processed.
- */
-int hex2u64(const char *ptr, u64 *long_val)
+static void read_to_eol(struct io *io)
{
- char *p;
+ int ch;
- *long_val = strtoull(ptr, &p, 16);
-
- return p - ptr;
+ for (;;) {
+ ch = io__get_char(io);
+ if (ch < 0 || ch == '\n')
+ return;
+ }
}
int kallsyms__parse(const char *filename, void *arg,
int (*process_symbol)(void *arg, const char *name,
char type, u64 start))
{
- char *line = NULL;
- size_t n;
- int err = -1;
- FILE *file = fopen(filename, "r");
-
- if (file == NULL)
- goto out_failure;
-
- err = 0;
+ struct io io;
+ char bf[BUFSIZ];
+ int err;
- while (!feof(file)) {
- u64 start;
- int line_len, len;
- char symbol_type;
- char *symbol_name;
+ io.fd = open(filename, O_RDONLY, 0);
- line_len = getline(&line, &n, file);
- if (line_len < 0 || !line)
- break;
+ if (io.fd < 0)
+ return -1;
- line[--line_len] = '\0'; /* \n */
+ io__init(&io, io.fd, bf, sizeof(bf));
- len = hex2u64(line, &start);
+ err = 0;
+ while (!io.eof) {
+ __u64 start;
+ int ch;
+ size_t i;
+ char symbol_type;
+ char symbol_name[KSYM_NAME_LEN + 1];
- /* Skip the line if we failed to parse the address. */
- if (!len)
+ if (io__get_hex(&io, &start) != ' ') {
+ read_to_eol(&io);
continue;
-
- len++;
- if (len + 2 >= line_len)
+ }
+ symbol_type = io__get_char(&io);
+ if (io__get_char(&io) != ' ') {
+ read_to_eol(&io);
continue;
-
- symbol_type = line[len];
- len += 2;
- symbol_name = line + len;
- len = line_len - len;
-
- if (len >= KSYM_NAME_LEN) {
- err = -1;
- break;
}
+ for (i = 0; i < sizeof(symbol_name); i++) {
+ ch = io__get_char(&io);
+ if (ch < 0 || ch == '\n')
+ break;
+ symbol_name[i] = ch;
+ }
+ symbol_name[i] = '\0';
err = process_symbol(arg, symbol_name, symbol_type, start);
if (err)
break;
}
- free(line);
- fclose(file);
+ close(io.fd);
return err;
-
-out_failure:
- return -1;
}
return isupper(type) ? STB_GLOBAL : STB_LOCAL;
}
-int hex2u64(const char *ptr, u64 *long_val);
-
u8 kallsyms2elf_type(char type);
bool kallsyms__is_function(char symbol_type);
case KBUFFER_TYPE_TIME_EXTEND:
case KBUFFER_TYPE_TIME_STAMP:
return NULL;
- };
+ }
*size = length;
default:
break;
}
- asprintf(&str, val ? "TRUE" : "FALSE");
+ if (asprintf(&str, val ? "TRUE" : "FALSE") < 0)
+ str = NULL;
break;
}
}
break;
}
- asprintf(&str, "(%s) %s (%s)", left, op, right);
+ if (asprintf(&str, "(%s) %s (%s)", left, op, right) < 0)
+ str = NULL;
break;
case TEP_FILTER_OP_NOT:
right_val = 0;
if (right_val >= 0) {
/* just return the opposite */
- asprintf(&str, right_val ? "FALSE" : "TRUE");
+ if (asprintf(&str, right_val ? "FALSE" : "TRUE") < 0)
+ str = NULL;
break;
}
- asprintf(&str, "%s(%s)", op, right);
+ if (asprintf(&str, "%s(%s)", op, right) < 0)
+ str = NULL;
break;
default:
{
char *str = NULL;
- asprintf(&str, "%lld", arg->value.val);
+ if (asprintf(&str, "%lld", arg->value.val) < 0)
+ str = NULL;
return str;
}
break;
}
- asprintf(&str, "%s %s %s", lstr, op, rstr);
+ if (asprintf(&str, "%s %s %s", lstr, op, rstr) < 0)
+ str = NULL;
out:
free(lstr);
free(rstr);
if (!op)
op = "<=";
- asprintf(&str, "%s %s %s", lstr, op, rstr);
+ if (asprintf(&str, "%s %s %s", lstr, op, rstr) < 0)
+ str = NULL;
break;
default:
if (!op)
op = "!~";
- asprintf(&str, "%s %s \"%s\"",
- arg->str.field->name, op, arg->str.val);
+ if (asprintf(&str, "%s %s \"%s\"",
+ arg->str.field->name, op, arg->str.val) < 0)
+ str = NULL;
break;
default:
switch (arg->type) {
case TEP_FILTER_ARG_BOOLEAN:
- asprintf(&str, arg->boolean.value ? "TRUE" : "FALSE");
+ if (asprintf(&str, arg->boolean.value ? "TRUE" : "FALSE") < 0)
+ str = NULL;
return str;
case TEP_FILTER_ARG_OP:
objtool-y += arch/$(SRCARCH)/
+
+objtool-y += weak.o
+
+objtool-$(SUBCMD_CHECK) += check.o
+objtool-$(SUBCMD_CHECK) += special.o
+objtool-$(SUBCMD_ORC) += check.o
+objtool-$(SUBCMD_ORC) += orc_gen.o
+objtool-$(SUBCMD_ORC) += orc_dump.o
+
objtool-y += builtin-check.o
objtool-y += builtin-orc.o
-objtool-y += check.o
-objtool-y += orc_gen.o
-objtool-y += orc_dump.o
objtool-y += elf.o
-objtool-y += special.o
objtool-y += objtool.o
objtool-y += libstring.o
might be corrupt due to a gcc bug. For more details, see:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=70646
+9. file.o: warning: objtool: funcA() call to funcB() with UACCESS enabled
+
+ This means that an unexpected call to a non-whitelisted function exists
+ outside of arch-specific guards.
+ X86: SMAP (stac/clac): __uaccess_begin()/__uaccess_end()
+ ARM: PAN: uaccess_enable()/uaccess_disable()
+
+ These functions should be called to denote a minimal critical section around
+ access to __user variables. See also: https://lwn.net/Articles/517475/
+
+ The intention of the warning is to prevent calls to funcB() from eventually
+ calling schedule(), potentially leaking the AC flags state, and not
+ restoring them correctly.
+
+ It also helps verify that there are no unexpected calls to funcB() which may
+ access user space pages with protections against doing so disabled.
+
+ To fix, either:
+ 1) remove explicit calls to funcB() from funcA().
+ 2) add the correct guards before and after calls to low level functions like
+ __get_user_size()/__put_user_size().
+ 3) add funcB to uaccess_safe_builtin whitelist in tools/objtool/check.c, if
+ funcB obviously does not call schedule(), and is marked notrace (since
+ function tracing inserts additional calls, which is not obvious from the
+ sources).
+
+10. file.o: warning: func()+0x5c: alternative modifies stack
+
+ This means that an alternative includes instructions that modify the
+ stack. The problem is that there is only one ORC unwind table, this means
+ that the ORC unwind entries must be valid for each of the alternatives.
+ The easiest way to enforce this is to ensure alternatives do not contain
+ any ORC entries, which in turn implies the above constraint.
+
+11. file.o: warning: unannotated intra-function call
+
+ This warning means that a direct call is done to a destination which
+ is not at the beginning of a function. If this is a legit call, you
+ can remove this warning by putting the ANNOTATE_INTRA_FUNCTION_CALL
+ directive right before the call.
+
If the error doesn't seem to make sense, it could be a bug in objtool.
Feel free to ask the objtool maintainer for help.
INCLUDES := -I$(srctree)/tools/include \
-I$(srctree)/tools/arch/$(HOSTARCH)/include/uapi \
- -I$(srctree)/tools/arch/$(SRCARCH)/include
+ -I$(srctree)/tools/arch/$(SRCARCH)/include \
+ -I$(srctree)/tools/objtool/arch/$(SRCARCH)/include
WARNINGS := $(EXTRA_WARNINGS) -Wno-switch-default -Wno-switch-enum -Wno-packed
CFLAGS := -Werror $(WARNINGS) $(KBUILD_HOSTCFLAGS) -g $(INCLUDES) $(LIBELF_FLAGS)
LDFLAGS += $(LIBELF_LIBS) $(LIBSUBCMD) $(KBUILD_HOSTLDFLAGS)
CFLAGS += $(if $(elfshdr),,-DLIBELF_USE_DEPRECATED)
AWK = awk
+
+SUBCMD_CHECK := n
+SUBCMD_ORC := n
+
+ifeq ($(SRCARCH),x86)
+ SUBCMD_CHECK := y
+ SUBCMD_ORC := y
+endif
+
+export SUBCMD_CHECK SUBCMD_ORC
export srctree OUTPUT CFLAGS SRCARCH AWK
include $(srctree)/tools/build/Makefile.include
$(OBJTOOL_IN): fixdep FORCE
+ @$(CONFIG_SHELL) ./sync-check.sh
@$(MAKE) $(build)=objtool
$(OBJTOOL): $(LIBSUBCMD) $(OBJTOOL_IN)
- @$(CONFIG_SHELL) ./sync-check.sh
$(QUIET_LINK)$(CC) $(OBJTOOL_IN) $(LDFLAGS) -o $@
#include <stdbool.h>
#include <linux/list.h>
-#include "elf.h"
+#include "objtool.h"
#include "cfi.h"
+#include <asm/orc_types.h>
+
enum insn_type {
INSN_JUMP_CONDITIONAL,
INSN_JUMP_UNCONDITIONAL,
INSN_CALL_DYNAMIC,
INSN_RETURN,
INSN_CONTEXT_SWITCH,
- INSN_STACK,
INSN_BUG,
INSN_NOP,
INSN_STAC,
struct stack_op {
struct op_dest dest;
struct op_src src;
+ struct list_head list;
};
-void arch_initial_func_cfi_state(struct cfi_state *state);
+struct instruction;
+
+void arch_initial_func_cfi_state(struct cfi_init_state *state);
-int arch_decode_instruction(struct elf *elf, struct section *sec,
+int arch_decode_instruction(const struct elf *elf, const struct section *sec,
unsigned long offset, unsigned int maxlen,
unsigned int *len, enum insn_type *type,
- unsigned long *immediate, struct stack_op *op);
+ unsigned long *immediate,
+ struct list_head *ops_list);
bool arch_callee_saved_reg(unsigned char reg);
+unsigned long arch_jump_destination(struct instruction *insn);
+
+unsigned long arch_dest_rela_offset(int addend);
+
#endif /* _ARCH_H */
#include "../../../arch/x86/lib/inat.c"
#include "../../../arch/x86/lib/insn.c"
+#include "../../check.h"
#include "../../elf.h"
#include "../../arch.h"
#include "../../warn.h"
{CFI_DI, CFI_R15},
};
-static int is_x86_64(struct elf *elf)
+static int is_x86_64(const struct elf *elf)
{
switch (elf->ehdr.e_machine) {
case EM_X86_64:
}
}
-int arch_decode_instruction(struct elf *elf, struct section *sec,
+unsigned long arch_dest_rela_offset(int addend)
+{
+ return addend + 4;
+}
+
+unsigned long arch_jump_destination(struct instruction *insn)
+{
+ return insn->offset + insn->len + insn->immediate;
+}
+
+#define ADD_OP(op) \
+ if (!(op = calloc(1, sizeof(*op)))) \
+ return -1; \
+ else for (list_add_tail(&op->list, ops_list); op; op = NULL)
+
+int arch_decode_instruction(const struct elf *elf, const struct section *sec,
unsigned long offset, unsigned int maxlen,
unsigned int *len, enum insn_type *type,
- unsigned long *immediate, struct stack_op *op)
+ unsigned long *immediate,
+ struct list_head *ops_list)
{
struct insn insn;
int x86_64, sign;
unsigned char op1, op2, rex = 0, rex_b = 0, rex_r = 0, rex_w = 0,
rex_x = 0, modrm = 0, modrm_mod = 0, modrm_rm = 0,
modrm_reg = 0, sib = 0;
+ struct stack_op *op = NULL;
+ struct symbol *sym;
x86_64 = is_x86_64(elf);
if (x86_64 == -1)
insn_get_length(&insn);
if (!insn_complete(&insn)) {
- WARN_FUNC("can't decode instruction", sec, offset);
+ WARN("can't decode instruction at %s:0x%lx", sec->name, offset);
return -1;
}
if (rex_w && !rex_b && modrm_mod == 3 && modrm_rm == 4) {
/* add/sub reg, %rsp */
- *type = INSN_STACK;
- op->src.type = OP_SRC_ADD;
- op->src.reg = op_to_cfi_reg[modrm_reg][rex_r];
- op->dest.type = OP_DEST_REG;
- op->dest.reg = CFI_SP;
+ ADD_OP(op) {
+ op->src.type = OP_SRC_ADD;
+ op->src.reg = op_to_cfi_reg[modrm_reg][rex_r];
+ op->dest.type = OP_DEST_REG;
+ op->dest.reg = CFI_SP;
+ }
}
break;
case 0x50 ... 0x57:
/* push reg */
- *type = INSN_STACK;
- op->src.type = OP_SRC_REG;
- op->src.reg = op_to_cfi_reg[op1 & 0x7][rex_b];
- op->dest.type = OP_DEST_PUSH;
+ ADD_OP(op) {
+ op->src.type = OP_SRC_REG;
+ op->src.reg = op_to_cfi_reg[op1 & 0x7][rex_b];
+ op->dest.type = OP_DEST_PUSH;
+ }
break;
case 0x58 ... 0x5f:
/* pop reg */
- *type = INSN_STACK;
- op->src.type = OP_SRC_POP;
- op->dest.type = OP_DEST_REG;
- op->dest.reg = op_to_cfi_reg[op1 & 0x7][rex_b];
+ ADD_OP(op) {
+ op->src.type = OP_SRC_POP;
+ op->dest.type = OP_DEST_REG;
+ op->dest.reg = op_to_cfi_reg[op1 & 0x7][rex_b];
+ }
break;
case 0x68:
case 0x6a:
/* push immediate */
- *type = INSN_STACK;
- op->src.type = OP_SRC_CONST;
- op->dest.type = OP_DEST_PUSH;
+ ADD_OP(op) {
+ op->src.type = OP_SRC_CONST;
+ op->dest.type = OP_DEST_PUSH;
+ }
break;
case 0x70 ... 0x7f:
if (modrm == 0xe4) {
/* and imm, %rsp */
- *type = INSN_STACK;
- op->src.type = OP_SRC_AND;
- op->src.reg = CFI_SP;
- op->src.offset = insn.immediate.value;
- op->dest.type = OP_DEST_REG;
- op->dest.reg = CFI_SP;
+ ADD_OP(op) {
+ op->src.type = OP_SRC_AND;
+ op->src.reg = CFI_SP;
+ op->src.offset = insn.immediate.value;
+ op->dest.type = OP_DEST_REG;
+ op->dest.reg = CFI_SP;
+ }
break;
}
break;
/* add/sub imm, %rsp */
- *type = INSN_STACK;
- op->src.type = OP_SRC_ADD;
- op->src.reg = CFI_SP;
- op->src.offset = insn.immediate.value * sign;
- op->dest.type = OP_DEST_REG;
- op->dest.reg = CFI_SP;
+ ADD_OP(op) {
+ op->src.type = OP_SRC_ADD;
+ op->src.reg = CFI_SP;
+ op->src.offset = insn.immediate.value * sign;
+ op->dest.type = OP_DEST_REG;
+ op->dest.reg = CFI_SP;
+ }
break;
case 0x89:
if (rex_w && !rex_r && modrm_mod == 3 && modrm_reg == 4) {
/* mov %rsp, reg */
- *type = INSN_STACK;
- op->src.type = OP_SRC_REG;
- op->src.reg = CFI_SP;
- op->dest.type = OP_DEST_REG;
- op->dest.reg = op_to_cfi_reg[modrm_rm][rex_b];
+ ADD_OP(op) {
+ op->src.type = OP_SRC_REG;
+ op->src.reg = CFI_SP;
+ op->dest.type = OP_DEST_REG;
+ op->dest.reg = op_to_cfi_reg[modrm_rm][rex_b];
+ }
break;
}
if (rex_w && !rex_b && modrm_mod == 3 && modrm_rm == 4) {
/* mov reg, %rsp */
- *type = INSN_STACK;
- op->src.type = OP_SRC_REG;
- op->src.reg = op_to_cfi_reg[modrm_reg][rex_r];
- op->dest.type = OP_DEST_REG;
- op->dest.reg = CFI_SP;
+ ADD_OP(op) {
+ op->src.type = OP_SRC_REG;
+ op->src.reg = op_to_cfi_reg[modrm_reg][rex_r];
+ op->dest.type = OP_DEST_REG;
+ op->dest.reg = CFI_SP;
+ }
break;
}
(modrm_mod == 1 || modrm_mod == 2) && modrm_rm == 5) {
/* mov reg, disp(%rbp) */
- *type = INSN_STACK;
- op->src.type = OP_SRC_REG;
- op->src.reg = op_to_cfi_reg[modrm_reg][rex_r];
- op->dest.type = OP_DEST_REG_INDIRECT;
- op->dest.reg = CFI_BP;
- op->dest.offset = insn.displacement.value;
+ ADD_OP(op) {
+ op->src.type = OP_SRC_REG;
+ op->src.reg = op_to_cfi_reg[modrm_reg][rex_r];
+ op->dest.type = OP_DEST_REG_INDIRECT;
+ op->dest.reg = CFI_BP;
+ op->dest.offset = insn.displacement.value;
+ }
} else if (rex_w && !rex_b && modrm_rm == 4 && sib == 0x24) {
/* mov reg, disp(%rsp) */
- *type = INSN_STACK;
- op->src.type = OP_SRC_REG;
- op->src.reg = op_to_cfi_reg[modrm_reg][rex_r];
- op->dest.type = OP_DEST_REG_INDIRECT;
- op->dest.reg = CFI_SP;
- op->dest.offset = insn.displacement.value;
+ ADD_OP(op) {
+ op->src.type = OP_SRC_REG;
+ op->src.reg = op_to_cfi_reg[modrm_reg][rex_r];
+ op->dest.type = OP_DEST_REG_INDIRECT;
+ op->dest.reg = CFI_SP;
+ op->dest.offset = insn.displacement.value;
+ }
}
break;
if (rex_w && !rex_b && modrm_mod == 1 && modrm_rm == 5) {
/* mov disp(%rbp), reg */
- *type = INSN_STACK;
- op->src.type = OP_SRC_REG_INDIRECT;
- op->src.reg = CFI_BP;
- op->src.offset = insn.displacement.value;
- op->dest.type = OP_DEST_REG;
- op->dest.reg = op_to_cfi_reg[modrm_reg][rex_r];
+ ADD_OP(op) {
+ op->src.type = OP_SRC_REG_INDIRECT;
+ op->src.reg = CFI_BP;
+ op->src.offset = insn.displacement.value;
+ op->dest.type = OP_DEST_REG;
+ op->dest.reg = op_to_cfi_reg[modrm_reg][rex_r];
+ }
} else if (rex_w && !rex_b && sib == 0x24 &&
modrm_mod != 3 && modrm_rm == 4) {
/* mov disp(%rsp), reg */
- *type = INSN_STACK;
- op->src.type = OP_SRC_REG_INDIRECT;
- op->src.reg = CFI_SP;
- op->src.offset = insn.displacement.value;
- op->dest.type = OP_DEST_REG;
- op->dest.reg = op_to_cfi_reg[modrm_reg][rex_r];
+ ADD_OP(op) {
+ op->src.type = OP_SRC_REG_INDIRECT;
+ op->src.reg = CFI_SP;
+ op->src.offset = insn.displacement.value;
+ op->dest.type = OP_DEST_REG;
+ op->dest.reg = op_to_cfi_reg[modrm_reg][rex_r];
+ }
}
break;
case 0x8d:
if (sib == 0x24 && rex_w && !rex_b && !rex_x) {
- *type = INSN_STACK;
- if (!insn.displacement.value) {
- /* lea (%rsp), reg */
- op->src.type = OP_SRC_REG;
- } else {
- /* lea disp(%rsp), reg */
- op->src.type = OP_SRC_ADD;
- op->src.offset = insn.displacement.value;
+ ADD_OP(op) {
+ if (!insn.displacement.value) {
+ /* lea (%rsp), reg */
+ op->src.type = OP_SRC_REG;
+ } else {
+ /* lea disp(%rsp), reg */
+ op->src.type = OP_SRC_ADD;
+ op->src.offset = insn.displacement.value;
+ }
+ op->src.reg = CFI_SP;
+ op->dest.type = OP_DEST_REG;
+ op->dest.reg = op_to_cfi_reg[modrm_reg][rex_r];
}
- op->src.reg = CFI_SP;
- op->dest.type = OP_DEST_REG;
- op->dest.reg = op_to_cfi_reg[modrm_reg][rex_r];
} else if (rex == 0x48 && modrm == 0x65) {
/* lea disp(%rbp), %rsp */
- *type = INSN_STACK;
- op->src.type = OP_SRC_ADD;
- op->src.reg = CFI_BP;
- op->src.offset = insn.displacement.value;
- op->dest.type = OP_DEST_REG;
- op->dest.reg = CFI_SP;
+ ADD_OP(op) {
+ op->src.type = OP_SRC_ADD;
+ op->src.reg = CFI_BP;
+ op->src.offset = insn.displacement.value;
+ op->dest.type = OP_DEST_REG;
+ op->dest.reg = CFI_SP;
+ }
} else if (rex == 0x49 && modrm == 0x62 &&
insn.displacement.value == -8) {
* Restoring rsp back to its original value after a
* stack realignment.
*/
- *type = INSN_STACK;
- op->src.type = OP_SRC_ADD;
- op->src.reg = CFI_R10;
- op->src.offset = -8;
- op->dest.type = OP_DEST_REG;
- op->dest.reg = CFI_SP;
+ ADD_OP(op) {
+ op->src.type = OP_SRC_ADD;
+ op->src.reg = CFI_R10;
+ op->src.offset = -8;
+ op->dest.type = OP_DEST_REG;
+ op->dest.reg = CFI_SP;
+ }
} else if (rex == 0x49 && modrm == 0x65 &&
insn.displacement.value == -16) {
* Restoring rsp back to its original value after a
* stack realignment.
*/
- *type = INSN_STACK;
- op->src.type = OP_SRC_ADD;
- op->src.reg = CFI_R13;
- op->src.offset = -16;
- op->dest.type = OP_DEST_REG;
- op->dest.reg = CFI_SP;
+ ADD_OP(op) {
+ op->src.type = OP_SRC_ADD;
+ op->src.reg = CFI_R13;
+ op->src.offset = -16;
+ op->dest.type = OP_DEST_REG;
+ op->dest.reg = CFI_SP;
+ }
}
break;
case 0x8f:
/* pop to mem */
- *type = INSN_STACK;
- op->src.type = OP_SRC_POP;
- op->dest.type = OP_DEST_MEM;
+ ADD_OP(op) {
+ op->src.type = OP_SRC_POP;
+ op->dest.type = OP_DEST_MEM;
+ }
break;
case 0x90:
case 0x9c:
/* pushf */
- *type = INSN_STACK;
- op->src.type = OP_SRC_CONST;
- op->dest.type = OP_DEST_PUSHF;
+ ADD_OP(op) {
+ op->src.type = OP_SRC_CONST;
+ op->dest.type = OP_DEST_PUSHF;
+ }
break;
case 0x9d:
/* popf */
- *type = INSN_STACK;
- op->src.type = OP_SRC_POPF;
- op->dest.type = OP_DEST_MEM;
+ ADD_OP(op) {
+ op->src.type = OP_SRC_POPF;
+ op->dest.type = OP_DEST_MEM;
+ }
break;
case 0x0f:
} else if (op2 == 0xa0 || op2 == 0xa8) {
/* push fs/gs */
- *type = INSN_STACK;
- op->src.type = OP_SRC_CONST;
- op->dest.type = OP_DEST_PUSH;
+ ADD_OP(op) {
+ op->src.type = OP_SRC_CONST;
+ op->dest.type = OP_DEST_PUSH;
+ }
} else if (op2 == 0xa1 || op2 == 0xa9) {
/* pop fs/gs */
- *type = INSN_STACK;
- op->src.type = OP_SRC_POP;
- op->dest.type = OP_DEST_MEM;
+ ADD_OP(op) {
+ op->src.type = OP_SRC_POP;
+ op->dest.type = OP_DEST_MEM;
+ }
}
break;
* mov bp, sp
* pop bp
*/
- *type = INSN_STACK;
- op->dest.type = OP_DEST_LEAVE;
+ ADD_OP(op)
+ op->dest.type = OP_DEST_LEAVE;
break;
*type = INSN_RETURN;
break;
+ case 0xcf: /* iret */
+ /*
+ * Handle sync_core(), which has an IRET to self.
+ * All other IRET are in STT_NONE entry code.
+ */
+ sym = find_symbol_containing(sec, offset);
+ if (sym && sym->type == STT_FUNC) {
+ ADD_OP(op) {
+ /* add $40, %rsp */
+ op->src.type = OP_SRC_ADD;
+ op->src.reg = CFI_SP;
+ op->src.offset = 5*8;
+ op->dest.type = OP_DEST_REG;
+ op->dest.reg = CFI_SP;
+ }
+ break;
+ }
+
+ /* fallthrough */
+
case 0xca: /* retf */
case 0xcb: /* retf */
- case 0xcf: /* iret */
*type = INSN_CONTEXT_SWITCH;
break;
case 0xe8:
*type = INSN_CALL;
+ /*
+ * For the impact on the stack, a CALL behaves like
+ * a PUSH of an immediate value (the return address).
+ */
+ ADD_OP(op) {
+ op->src.type = OP_SRC_CONST;
+ op->dest.type = OP_DEST_PUSH;
+ }
break;
case 0xfc:
else if (modrm_reg == 6) {
/* push from mem */
- *type = INSN_STACK;
- op->src.type = OP_SRC_CONST;
- op->dest.type = OP_DEST_PUSH;
+ ADD_OP(op) {
+ op->src.type = OP_SRC_CONST;
+ op->dest.type = OP_DEST_PUSH;
+ }
}
break;
return 0;
}
-void arch_initial_func_cfi_state(struct cfi_state *state)
+void arch_initial_func_cfi_state(struct cfi_init_state *state)
{
int i;
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+
+#ifndef _OBJTOOL_CFI_REGS_H
+#define _OBJTOOL_CFI_REGS_H
+
+#define CFI_AX 0
+#define CFI_DX 1
+#define CFI_CX 2
+#define CFI_BX 3
+#define CFI_SI 4
+#define CFI_DI 5
+#define CFI_BP 6
+#define CFI_SP 7
+#define CFI_R8 8
+#define CFI_R9 9
+#define CFI_R10 10
+#define CFI_R11 11
+#define CFI_R12 12
+#define CFI_R13 13
+#define CFI_R14 14
+#define CFI_R15 15
+#define CFI_RA 16
+#define CFI_NUM_REGS 17
+
+#endif /* _OBJTOOL_CFI_REGS_H */
*/
#include <subcmd/parse-options.h>
+#include <string.h>
#include "builtin.h"
-#include "check.h"
+#include "objtool.h"
-bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats;
+bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats, validate_dup, vmlinux;
static const char * const check_usage[] = {
"objtool check [<options>] file.o",
OPT_BOOLEAN('b', "backtrace", &backtrace, "unwind on error"),
OPT_BOOLEAN('a', "uaccess", &uaccess, "enable uaccess checking"),
OPT_BOOLEAN('s', "stats", &stats, "print statistics"),
+ OPT_BOOLEAN('d', "duplicate", &validate_dup, "duplicate validation for vmlinux.o"),
+ OPT_BOOLEAN('l', "vmlinux", &vmlinux, "vmlinux.o validation"),
OPT_END(),
};
int cmd_check(int argc, const char **argv)
{
- const char *objname;
+ const char *objname, *s;
argc = parse_options(argc, argv, check_options, check_usage, 0);
objname = argv[0];
+ s = strstr(objname, "vmlinux.o");
+ if (s && !s[9])
+ vmlinux = true;
+
return check(objname, false);
}
#include <string.h>
#include "builtin.h"
-#include "check.h"
-
+#include "objtool.h"
static const char *orc_usage[] = {
"objtool orc generate [<options>] file.o",
#include <subcmd/parse-options.h>
extern const struct option check_options[];
-extern bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats;
+extern bool no_fp, no_unreachable, retpoline, module, backtrace, uaccess, stats, validate_dup, vmlinux;
extern int cmd_check(int argc, const char **argv);
extern int cmd_orc(int argc, const char **argv);
#ifndef _OBJTOOL_CFI_H
#define _OBJTOOL_CFI_H
+#include "cfi_regs.h"
+
#define CFI_UNDEFINED -1
#define CFI_CFA -2
#define CFI_SP_INDIRECT -3
#define CFI_BP_INDIRECT -4
-#define CFI_AX 0
-#define CFI_DX 1
-#define CFI_CX 2
-#define CFI_BX 3
-#define CFI_SI 4
-#define CFI_DI 5
-#define CFI_BP 6
-#define CFI_SP 7
-#define CFI_R8 8
-#define CFI_R9 9
-#define CFI_R10 10
-#define CFI_R11 11
-#define CFI_R12 12
-#define CFI_R13 13
-#define CFI_R14 14
-#define CFI_R15 15
-#define CFI_RA 16
-#define CFI_NUM_REGS 17
-
struct cfi_reg {
int base;
int offset;
};
-struct cfi_state {
+struct cfi_init_state {
+ struct cfi_reg regs[CFI_NUM_REGS];
struct cfi_reg cfa;
+};
+
+struct cfi_state {
struct cfi_reg regs[CFI_NUM_REGS];
+ struct cfi_reg vals[CFI_NUM_REGS];
+ struct cfi_reg cfa;
+ int stack_size;
+ int drap_reg, drap_offset;
+ unsigned char type;
+ bool bp_scratch;
+ bool drap;
+ bool end;
};
#endif /* _OBJTOOL_CFI_H */
#include <stdlib.h>
#include "builtin.h"
+#include "cfi.h"
+#include "arch.h"
#include "check.h"
-#include "elf.h"
#include "special.h"
-#include "arch.h"
#include "warn.h"
#include <linux/hashtable.h>
};
const char *objname;
-struct cfi_state initial_func_cfi;
+struct cfi_init_state initial_func_cfi;
struct instruction *find_insn(struct objtool_file *file,
struct section *sec, unsigned long offset)
{
struct instruction *insn;
- hash_for_each_possible(file->insn_hash, insn, hash, offset)
+ hash_for_each_possible(file->insn_hash, insn, hash, sec_offset_hash(sec, offset)) {
if (insn->sec == sec && insn->offset == offset)
return insn;
+ }
return NULL;
}
return __dead_end_function(file, func, 0);
}
-static void clear_insn_state(struct insn_state *state)
+static void init_cfi_state(struct cfi_state *cfi)
{
int i;
- memset(state, 0, sizeof(*state));
- state->cfa.base = CFI_UNDEFINED;
for (i = 0; i < CFI_NUM_REGS; i++) {
- state->regs[i].base = CFI_UNDEFINED;
- state->vals[i].base = CFI_UNDEFINED;
+ cfi->regs[i].base = CFI_UNDEFINED;
+ cfi->vals[i].base = CFI_UNDEFINED;
}
- state->drap_reg = CFI_UNDEFINED;
- state->drap_offset = -1;
+ cfi->cfa.base = CFI_UNDEFINED;
+ cfi->drap_reg = CFI_UNDEFINED;
+ cfi->drap_offset = -1;
+}
+
+static void init_insn_state(struct insn_state *state, struct section *sec)
+{
+ memset(state, 0, sizeof(*state));
+ init_cfi_state(&state->cfi);
+
+ /*
+ * We need the full vmlinux for noinstr validation, otherwise we can
+ * not correctly determine insn->call_dest->sec (external symbols do
+ * not have a section).
+ */
+ if (vmlinux && sec)
+ state->noinstr = sec->noinstr;
}
/*
strncmp(sec->name, ".discard.", 9))
sec->text = true;
+ if (!strcmp(sec->name, ".noinstr.text") ||
+ !strcmp(sec->name, ".entry.text"))
+ sec->noinstr = true;
+
for (offset = 0; offset < sec->len; offset += insn->len) {
insn = malloc(sizeof(*insn));
if (!insn) {
}
memset(insn, 0, sizeof(*insn));
INIT_LIST_HEAD(&insn->alts);
- clear_insn_state(&insn->state);
+ INIT_LIST_HEAD(&insn->stack_ops);
+ init_cfi_state(&insn->cfi);
insn->sec = sec;
insn->offset = offset;
sec->len - offset,
&insn->len, &insn->type,
&insn->immediate,
- &insn->stack_op);
+ &insn->stack_ops);
if (ret)
goto err;
- hash_add(file->insn_hash, &insn->hash, insn->offset);
+ hash_add(file->insn_hash, &insn->hash, sec_offset_hash(sec, insn->offset));
list_add_tail(&insn->list, &file->insn_list);
nr_insns++;
}
return ret;
}
+static struct instruction *find_last_insn(struct objtool_file *file,
+ struct section *sec)
+{
+ struct instruction *insn = NULL;
+ unsigned int offset;
+ unsigned int end = (sec->len > 10) ? sec->len - 10 : 0;
+
+ for (offset = sec->len - 1; offset >= end && !insn; offset--)
+ insn = find_insn(file, sec, offset);
+
+ return insn;
+}
+
/*
* Mark "ud2" instructions and manually annotated dead ends.
*/
struct section *sec;
struct rela *rela;
struct instruction *insn;
- bool found;
/*
* By default, "ud2" is a dead end unless otherwise annotated, because
if (insn)
insn = list_prev_entry(insn, list);
else if (rela->addend == rela->sym->sec->len) {
- found = false;
- list_for_each_entry_reverse(insn, &file->insn_list, list) {
- if (insn->sec == rela->sym->sec) {
- found = true;
- break;
- }
- }
-
- if (!found) {
+ insn = find_last_insn(file, rela->sym->sec);
+ if (!insn) {
WARN("can't find unreachable insn at %s+0x%x",
rela->sym->sec->name, rela->addend);
return -1;
if (insn)
insn = list_prev_entry(insn, list);
else if (rela->addend == rela->sym->sec->len) {
- found = false;
- list_for_each_entry_reverse(insn, &file->insn_list, list) {
- if (insn->sec == rela->sym->sec) {
- found = true;
- break;
- }
- }
-
- if (!found) {
+ insn = find_last_insn(file, rela->sym->sec);
+ if (!insn) {
WARN("can't find reachable insn at %s+0x%x",
rela->sym->sec->name, rela->addend);
return -1;
"__asan_report_store16_noabort",
/* KCOV */
"write_comp_data",
+ "check_kcov_mode",
"__sanitizer_cov_trace_pc",
"__sanitizer_cov_trace_const_cmp1",
"__sanitizer_cov_trace_const_cmp2",
insn->offset, insn->len);
if (!rela) {
dest_sec = insn->sec;
- dest_off = insn->offset + insn->len + insn->immediate;
+ dest_off = arch_jump_destination(insn);
} else if (rela->sym->type == STT_SECTION) {
dest_sec = rela->sym->sec;
- dest_off = rela->addend + 4;
+ dest_off = arch_dest_rela_offset(rela->addend);
} else if (rela->sym->sec->idx) {
dest_sec = rela->sym->sec;
- dest_off = rela->sym->sym.st_value + rela->addend + 4;
+ dest_off = rela->sym->sym.st_value +
+ arch_dest_rela_offset(rela->addend);
} else if (strstr(rela->sym->name, "_indirect_thunk_")) {
/*
* Retpoline jumps are really dynamic jumps in
return 0;
}
+static void remove_insn_ops(struct instruction *insn)
+{
+ struct stack_op *op, *tmp;
+
+ list_for_each_entry_safe(op, tmp, &insn->stack_ops, list) {
+ list_del(&op->list);
+ free(op);
+ }
+}
+
/*
* Find the destination instructions for all calls.
*/
rela = find_rela_by_dest_range(file->elf, insn->sec,
insn->offset, insn->len);
if (!rela) {
- dest_off = insn->offset + insn->len + insn->immediate;
+ dest_off = arch_jump_destination(insn);
insn->call_dest = find_func_by_offset(insn->sec, dest_off);
if (!insn->call_dest)
insn->call_dest = find_symbol_by_offset(insn->sec, dest_off);
continue;
if (!insn->call_dest) {
- WARN_FUNC("unsupported intra-function call",
- insn->sec, insn->offset);
- if (retpoline)
- WARN("If this is a retpoline, please patch it in with alternatives and annotate it with ANNOTATE_NOSPEC_ALTERNATIVE.");
+ WARN_FUNC("unannotated intra-function call", insn->sec, insn->offset);
return -1;
}
}
} else if (rela->sym->type == STT_SECTION) {
+ dest_off = arch_dest_rela_offset(rela->addend);
insn->call_dest = find_func_by_offset(rela->sym->sec,
- rela->addend+4);
+ dest_off);
if (!insn->call_dest) {
- WARN_FUNC("can't find call dest symbol at %s+0x%x",
+ WARN_FUNC("can't find call dest symbol at %s+0x%lx",
insn->sec, insn->offset,
rela->sym->sec->name,
- rela->addend + 4);
+ dest_off);
return -1;
}
} else
insn->call_dest = rela->sym;
+
+ /*
+ * Whatever stack impact regular CALLs have, should be undone
+ * by the RETURN of the called function.
+ *
+ * Annotated intra-function calls retain the stack_ops but
+ * are converted to JUMP, see read_intra_function_calls().
+ */
+ remove_insn_ops(insn);
}
return 0;
struct instruction *orig_insn,
struct instruction **new_insn)
{
+ static unsigned int alt_group_next_index = 1;
struct instruction *last_orig_insn, *last_new_insn, *insn, *fake_jump = NULL;
+ unsigned int alt_group = alt_group_next_index++;
unsigned long dest_off;
last_orig_insn = NULL;
if (insn->offset >= special_alt->orig_off + special_alt->orig_len)
break;
- insn->alt_group = true;
+ insn->alt_group = alt_group;
last_orig_insn = insn;
}
}
memset(fake_jump, 0, sizeof(*fake_jump));
INIT_LIST_HEAD(&fake_jump->alts);
- clear_insn_state(&fake_jump->state);
+ INIT_LIST_HEAD(&fake_jump->stack_ops);
+ init_cfi_state(&fake_jump->cfi);
fake_jump->sec = special_alt->new_sec;
fake_jump->offset = FAKE_JUMP_OFFSET;
}
last_new_insn = NULL;
+ alt_group = alt_group_next_index++;
insn = *new_insn;
sec_for_each_insn_from(file, insn) {
if (insn->offset >= special_alt->new_off + special_alt->new_len)
insn->ignore = orig_insn->ignore_alts;
insn->func = orig_insn->func;
+ insn->alt_group = alt_group;
/*
* Since alternative replacement code is copy/pasted by the
if (!insn->immediate)
continue;
- dest_off = insn->offset + insn->len + insn->immediate;
+ dest_off = arch_jump_destination(insn);
if (dest_off == special_alt->new_off + special_alt->new_len) {
if (!fake_jump) {
WARN("%s: alternative jump to end of section",
}
if (special_alt->group) {
+ if (!special_alt->orig_len) {
+ WARN_FUNC("empty alternative entry",
+ orig_insn->sec, orig_insn->offset);
+ continue;
+ }
+
ret = handle_group_alt(file, special_alt, orig_insn,
&new_insn);
if (ret)
return -1;
}
- cfa = &insn->state.cfa;
-
- if (hint->type == UNWIND_HINT_TYPE_SAVE) {
- insn->save = true;
- continue;
+ cfa = &insn->cfi.cfa;
- } else if (hint->type == UNWIND_HINT_TYPE_RESTORE) {
- insn->restore = true;
- insn->hint = true;
+ if (hint->type == UNWIND_HINT_TYPE_RET_OFFSET) {
+ insn->ret_offset = hint->sp_offset;
continue;
}
}
cfa->offset = hint->sp_offset;
- insn->state.type = hint->type;
- insn->state.end = hint->end;
+ insn->cfi.type = hint->type;
+ insn->cfi.end = hint->end;
}
return 0;
return 0;
}
+static int read_instr_hints(struct objtool_file *file)
+{
+ struct section *sec;
+ struct instruction *insn;
+ struct rela *rela;
+
+ sec = find_section_by_name(file->elf, ".rela.discard.instr_end");
+ if (!sec)
+ return 0;
+
+ list_for_each_entry(rela, &sec->rela_list, list) {
+ if (rela->sym->type != STT_SECTION) {
+ WARN("unexpected relocation symbol type in %s", sec->name);
+ return -1;
+ }
+
+ insn = find_insn(file, rela->sym->sec, rela->addend);
+ if (!insn) {
+ WARN("bad .discard.instr_end entry");
+ return -1;
+ }
+
+ insn->instr--;
+ }
+
+ sec = find_section_by_name(file->elf, ".rela.discard.instr_begin");
+ if (!sec)
+ return 0;
+
+ list_for_each_entry(rela, &sec->rela_list, list) {
+ if (rela->sym->type != STT_SECTION) {
+ WARN("unexpected relocation symbol type in %s", sec->name);
+ return -1;
+ }
+
+ insn = find_insn(file, rela->sym->sec, rela->addend);
+ if (!insn) {
+ WARN("bad .discard.instr_begin entry");
+ return -1;
+ }
+
+ insn->instr++;
+ }
+
+ return 0;
+}
+
+static int read_intra_function_calls(struct objtool_file *file)
+{
+ struct instruction *insn;
+ struct section *sec;
+ struct rela *rela;
+
+ sec = find_section_by_name(file->elf, ".rela.discard.intra_function_calls");
+ if (!sec)
+ return 0;
+
+ list_for_each_entry(rela, &sec->rela_list, list) {
+ unsigned long dest_off;
+
+ if (rela->sym->type != STT_SECTION) {
+ WARN("unexpected relocation symbol type in %s",
+ sec->name);
+ return -1;
+ }
+
+ insn = find_insn(file, rela->sym->sec, rela->addend);
+ if (!insn) {
+ WARN("bad .discard.intra_function_call entry");
+ return -1;
+ }
+
+ if (insn->type != INSN_CALL) {
+ WARN_FUNC("intra_function_call not a direct call",
+ insn->sec, insn->offset);
+ return -1;
+ }
+
+ /*
+ * Treat intra-function CALLs as JMPs, but with a stack_op.
+ * See add_call_destinations(), which strips stack_ops from
+ * normal CALLs.
+ */
+ insn->type = INSN_JUMP_UNCONDITIONAL;
+
+ dest_off = insn->offset + insn->len + insn->immediate;
+ insn->jump_dest = find_insn(file, insn->sec, dest_off);
+ if (!insn->jump_dest) {
+ WARN_FUNC("can't find call dest at %s+0x%lx",
+ insn->sec, insn->offset,
+ insn->sec->name, dest_off);
+ return -1;
+ }
+ }
+
+ return 0;
+}
+
static void mark_rodata(struct objtool_file *file)
{
struct section *sec;
* .rodata.str1.* sections are ignored; they don't contain jump tables.
*/
for_each_sec(file, sec) {
- if ((!strncmp(sec->name, ".rodata", 7) && !strstr(sec->name, ".str1.")) ||
- !strcmp(sec->name, C_JUMP_TABLE_SECTION)) {
+ if (!strncmp(sec->name, ".rodata", 7) &&
+ !strstr(sec->name, ".str1.")) {
sec->rodata = true;
found = true;
}
if (ret)
return ret;
+ ret = read_intra_function_calls(file);
+ if (ret)
+ return ret;
+
ret = add_call_destinations(file);
if (ret)
return ret;
if (ret)
return ret;
+ ret = read_instr_hints(file);
+ if (ret)
+ return ret;
+
return 0;
}
static bool is_fentry_call(struct instruction *insn)
{
- if (insn->type == INSN_CALL &&
+ if (insn->type == INSN_CALL && insn->call_dest &&
insn->call_dest->type == STT_NOTYPE &&
!strcmp(insn->call_dest->name, "__fentry__"))
return true;
return false;
}
-static bool has_modified_stack_frame(struct insn_state *state)
+static bool has_modified_stack_frame(struct instruction *insn, struct insn_state *state)
{
+ u8 ret_offset = insn->ret_offset;
+ struct cfi_state *cfi = &state->cfi;
int i;
- if (state->cfa.base != initial_func_cfi.cfa.base ||
- state->cfa.offset != initial_func_cfi.cfa.offset ||
- state->stack_size != initial_func_cfi.cfa.offset ||
- state->drap)
+ if (cfi->cfa.base != initial_func_cfi.cfa.base || cfi->drap)
+ return true;
+
+ if (cfi->cfa.offset != initial_func_cfi.cfa.offset + ret_offset)
+ return true;
+
+ if (cfi->stack_size != initial_func_cfi.cfa.offset + ret_offset)
return true;
- for (i = 0; i < CFI_NUM_REGS; i++)
- if (state->regs[i].base != initial_func_cfi.regs[i].base ||
- state->regs[i].offset != initial_func_cfi.regs[i].offset)
+ /*
+ * If there is a ret offset hint then don't check registers
+ * because a callee-saved register might have been pushed on
+ * the stack.
+ */
+ if (ret_offset)
+ return false;
+
+ for (i = 0; i < CFI_NUM_REGS; i++) {
+ if (cfi->regs[i].base != initial_func_cfi.regs[i].base ||
+ cfi->regs[i].offset != initial_func_cfi.regs[i].offset)
return true;
+ }
return false;
}
static bool has_valid_stack_frame(struct insn_state *state)
{
- if (state->cfa.base == CFI_BP && state->regs[CFI_BP].base == CFI_CFA &&
- state->regs[CFI_BP].offset == -16)
+ struct cfi_state *cfi = &state->cfi;
+
+ if (cfi->cfa.base == CFI_BP && cfi->regs[CFI_BP].base == CFI_CFA &&
+ cfi->regs[CFI_BP].offset == -16)
return true;
- if (state->drap && state->regs[CFI_BP].base == CFI_BP)
+ if (cfi->drap && cfi->regs[CFI_BP].base == CFI_BP)
return true;
return false;
}
-static int update_insn_state_regs(struct instruction *insn, struct insn_state *state)
+static int update_cfi_state_regs(struct instruction *insn,
+ struct cfi_state *cfi,
+ struct stack_op *op)
{
- struct cfi_reg *cfa = &state->cfa;
- struct stack_op *op = &insn->stack_op;
+ struct cfi_reg *cfa = &cfi->cfa;
if (cfa->base != CFI_SP && cfa->base != CFI_SP_INDIRECT)
return 0;
return 0;
}
-static void save_reg(struct insn_state *state, unsigned char reg, int base,
- int offset)
+static void save_reg(struct cfi_state *cfi, unsigned char reg, int base, int offset)
{
if (arch_callee_saved_reg(reg) &&
- state->regs[reg].base == CFI_UNDEFINED) {
- state->regs[reg].base = base;
- state->regs[reg].offset = offset;
+ cfi->regs[reg].base == CFI_UNDEFINED) {
+ cfi->regs[reg].base = base;
+ cfi->regs[reg].offset = offset;
}
}
-static void restore_reg(struct insn_state *state, unsigned char reg)
+static void restore_reg(struct cfi_state *cfi, unsigned char reg)
{
- state->regs[reg].base = CFI_UNDEFINED;
- state->regs[reg].offset = 0;
+ cfi->regs[reg].base = initial_func_cfi.regs[reg].base;
+ cfi->regs[reg].offset = initial_func_cfi.regs[reg].offset;
}
/*
* 41 5d pop %r13
* c3 retq
*/
-static int update_insn_state(struct instruction *insn, struct insn_state *state)
+static int update_cfi_state(struct instruction *insn, struct cfi_state *cfi,
+ struct stack_op *op)
{
- struct stack_op *op = &insn->stack_op;
- struct cfi_reg *cfa = &state->cfa;
- struct cfi_reg *regs = state->regs;
+ struct cfi_reg *cfa = &cfi->cfa;
+ struct cfi_reg *regs = cfi->regs;
/* stack operations don't make sense with an undefined CFA */
if (cfa->base == CFI_UNDEFINED) {
return 0;
}
- if (state->type == ORC_TYPE_REGS || state->type == ORC_TYPE_REGS_IRET)
- return update_insn_state_regs(insn, state);
+ if (cfi->type == ORC_TYPE_REGS || cfi->type == ORC_TYPE_REGS_IRET)
+ return update_cfi_state_regs(insn, cfi, op);
switch (op->dest.type) {
/* mov %rsp, %rbp */
cfa->base = op->dest.reg;
- state->bp_scratch = false;
+ cfi->bp_scratch = false;
}
else if (op->src.reg == CFI_SP &&
- op->dest.reg == CFI_BP && state->drap) {
+ op->dest.reg == CFI_BP && cfi->drap) {
/* drap: mov %rsp, %rbp */
regs[CFI_BP].base = CFI_BP;
- regs[CFI_BP].offset = -state->stack_size;
- state->bp_scratch = false;
+ regs[CFI_BP].offset = -cfi->stack_size;
+ cfi->bp_scratch = false;
}
else if (op->src.reg == CFI_SP && cfa->base == CFI_SP) {
* ...
* mov %rax, %rsp
*/
- state->vals[op->dest.reg].base = CFI_CFA;
- state->vals[op->dest.reg].offset = -state->stack_size;
+ cfi->vals[op->dest.reg].base = CFI_CFA;
+ cfi->vals[op->dest.reg].offset = -cfi->stack_size;
}
else if (op->src.reg == CFI_BP && op->dest.reg == CFI_SP &&
*
* Restore the original stack pointer (Clang).
*/
- state->stack_size = -state->regs[CFI_BP].offset;
+ cfi->stack_size = -cfi->regs[CFI_BP].offset;
}
else if (op->dest.reg == cfa->base) {
/* mov %reg, %rsp */
if (cfa->base == CFI_SP &&
- state->vals[op->src.reg].base == CFI_CFA) {
+ cfi->vals[op->src.reg].base == CFI_CFA) {
/*
* This is needed for the rare case
* ...
* mov %rcx, %rsp
*/
- cfa->offset = -state->vals[op->src.reg].offset;
- state->stack_size = cfa->offset;
+ cfa->offset = -cfi->vals[op->src.reg].offset;
+ cfi->stack_size = cfa->offset;
} else {
cfa->base = CFI_UNDEFINED;
if (op->dest.reg == CFI_SP && op->src.reg == CFI_SP) {
/* add imm, %rsp */
- state->stack_size -= op->src.offset;
+ cfi->stack_size -= op->src.offset;
if (cfa->base == CFI_SP)
cfa->offset -= op->src.offset;
break;
if (op->dest.reg == CFI_SP && op->src.reg == CFI_BP) {
/* lea disp(%rbp), %rsp */
- state->stack_size = -(op->src.offset + regs[CFI_BP].offset);
+ cfi->stack_size = -(op->src.offset + regs[CFI_BP].offset);
break;
}
if (op->src.reg == CFI_SP && cfa->base == CFI_SP) {
/* drap: lea disp(%rsp), %drap */
- state->drap_reg = op->dest.reg;
+ cfi->drap_reg = op->dest.reg;
/*
* lea disp(%rsp), %reg
* ...
* mov %rcx, %rsp
*/
- state->vals[op->dest.reg].base = CFI_CFA;
- state->vals[op->dest.reg].offset = \
- -state->stack_size + op->src.offset;
+ cfi->vals[op->dest.reg].base = CFI_CFA;
+ cfi->vals[op->dest.reg].offset = \
+ -cfi->stack_size + op->src.offset;
break;
}
- if (state->drap && op->dest.reg == CFI_SP &&
- op->src.reg == state->drap_reg) {
+ if (cfi->drap && op->dest.reg == CFI_SP &&
+ op->src.reg == cfi->drap_reg) {
/* drap: lea disp(%drap), %rsp */
cfa->base = CFI_SP;
- cfa->offset = state->stack_size = -op->src.offset;
- state->drap_reg = CFI_UNDEFINED;
- state->drap = false;
+ cfa->offset = cfi->stack_size = -op->src.offset;
+ cfi->drap_reg = CFI_UNDEFINED;
+ cfi->drap = false;
break;
}
- if (op->dest.reg == state->cfa.base) {
+ if (op->dest.reg == cfi->cfa.base) {
WARN_FUNC("unsupported stack register modification",
insn->sec, insn->offset);
return -1;
case OP_SRC_AND:
if (op->dest.reg != CFI_SP ||
- (state->drap_reg != CFI_UNDEFINED && cfa->base != CFI_SP) ||
- (state->drap_reg == CFI_UNDEFINED && cfa->base != CFI_BP)) {
+ (cfi->drap_reg != CFI_UNDEFINED && cfa->base != CFI_SP) ||
+ (cfi->drap_reg == CFI_UNDEFINED && cfa->base != CFI_BP)) {
WARN_FUNC("unsupported stack pointer realignment",
insn->sec, insn->offset);
return -1;
}
- if (state->drap_reg != CFI_UNDEFINED) {
+ if (cfi->drap_reg != CFI_UNDEFINED) {
/* drap: and imm, %rsp */
- cfa->base = state->drap_reg;
- cfa->offset = state->stack_size = 0;
- state->drap = true;
+ cfa->base = cfi->drap_reg;
+ cfa->offset = cfi->stack_size = 0;
+ cfi->drap = true;
}
/*
case OP_SRC_POP:
case OP_SRC_POPF:
- if (!state->drap && op->dest.type == OP_DEST_REG &&
- op->dest.reg == cfa->base) {
+ if (!cfi->drap && op->dest.reg == cfa->base) {
/* pop %rbp */
cfa->base = CFI_SP;
}
- if (state->drap && cfa->base == CFI_BP_INDIRECT &&
- op->dest.type == OP_DEST_REG &&
- op->dest.reg == state->drap_reg &&
- state->drap_offset == -state->stack_size) {
+ if (cfi->drap && cfa->base == CFI_BP_INDIRECT &&
+ op->dest.reg == cfi->drap_reg &&
+ cfi->drap_offset == -cfi->stack_size) {
/* drap: pop %drap */
- cfa->base = state->drap_reg;
+ cfa->base = cfi->drap_reg;
cfa->offset = 0;
- state->drap_offset = -1;
+ cfi->drap_offset = -1;
- } else if (regs[op->dest.reg].offset == -state->stack_size) {
+ } else if (regs[op->dest.reg].offset == -cfi->stack_size) {
/* pop %reg */
- restore_reg(state, op->dest.reg);
+ restore_reg(cfi, op->dest.reg);
}
- state->stack_size -= 8;
+ cfi->stack_size -= 8;
if (cfa->base == CFI_SP)
cfa->offset -= 8;
break;
case OP_SRC_REG_INDIRECT:
- if (state->drap && op->src.reg == CFI_BP &&
- op->src.offset == state->drap_offset) {
+ if (cfi->drap && op->src.reg == CFI_BP &&
+ op->src.offset == cfi->drap_offset) {
/* drap: mov disp(%rbp), %drap */
- cfa->base = state->drap_reg;
+ cfa->base = cfi->drap_reg;
cfa->offset = 0;
- state->drap_offset = -1;
+ cfi->drap_offset = -1;
}
- if (state->drap && op->src.reg == CFI_BP &&
+ if (cfi->drap && op->src.reg == CFI_BP &&
op->src.offset == regs[op->dest.reg].offset) {
/* drap: mov disp(%rbp), %reg */
- restore_reg(state, op->dest.reg);
+ restore_reg(cfi, op->dest.reg);
} else if (op->src.reg == cfa->base &&
op->src.offset == regs[op->dest.reg].offset + cfa->offset) {
/* mov disp(%rbp), %reg */
/* mov disp(%rsp), %reg */
- restore_reg(state, op->dest.reg);
+ restore_reg(cfi, op->dest.reg);
}
break;
case OP_DEST_PUSH:
case OP_DEST_PUSHF:
- state->stack_size += 8;
+ cfi->stack_size += 8;
if (cfa->base == CFI_SP)
cfa->offset += 8;
if (op->src.type != OP_SRC_REG)
break;
- if (state->drap) {
- if (op->src.reg == cfa->base && op->src.reg == state->drap_reg) {
+ if (cfi->drap) {
+ if (op->src.reg == cfa->base && op->src.reg == cfi->drap_reg) {
/* drap: push %drap */
cfa->base = CFI_BP_INDIRECT;
- cfa->offset = -state->stack_size;
+ cfa->offset = -cfi->stack_size;
/* save drap so we know when to restore it */
- state->drap_offset = -state->stack_size;
+ cfi->drap_offset = -cfi->stack_size;
- } else if (op->src.reg == CFI_BP && cfa->base == state->drap_reg) {
+ } else if (op->src.reg == CFI_BP && cfa->base == cfi->drap_reg) {
/* drap: push %rbp */
- state->stack_size = 0;
+ cfi->stack_size = 0;
} else if (regs[op->src.reg].base == CFI_UNDEFINED) {
/* drap: push %reg */
- save_reg(state, op->src.reg, CFI_BP, -state->stack_size);
+ save_reg(cfi, op->src.reg, CFI_BP, -cfi->stack_size);
}
} else {
/* push %reg */
- save_reg(state, op->src.reg, CFI_CFA, -state->stack_size);
+ save_reg(cfi, op->src.reg, CFI_CFA, -cfi->stack_size);
}
/* detect when asm code uses rbp as a scratch register */
if (!no_fp && insn->func && op->src.reg == CFI_BP &&
cfa->base != CFI_BP)
- state->bp_scratch = true;
+ cfi->bp_scratch = true;
break;
case OP_DEST_REG_INDIRECT:
- if (state->drap) {
- if (op->src.reg == cfa->base && op->src.reg == state->drap_reg) {
+ if (cfi->drap) {
+ if (op->src.reg == cfa->base && op->src.reg == cfi->drap_reg) {
/* drap: mov %drap, disp(%rbp) */
cfa->base = CFI_BP_INDIRECT;
cfa->offset = op->dest.offset;
/* save drap offset so we know when to restore it */
- state->drap_offset = op->dest.offset;
+ cfi->drap_offset = op->dest.offset;
}
else if (regs[op->src.reg].base == CFI_UNDEFINED) {
/* drap: mov reg, disp(%rbp) */
- save_reg(state, op->src.reg, CFI_BP, op->dest.offset);
+ save_reg(cfi, op->src.reg, CFI_BP, op->dest.offset);
}
} else if (op->dest.reg == cfa->base) {
/* mov reg, disp(%rbp) */
/* mov reg, disp(%rsp) */
- save_reg(state, op->src.reg, CFI_CFA,
- op->dest.offset - state->cfa.offset);
+ save_reg(cfi, op->src.reg, CFI_CFA,
+ op->dest.offset - cfi->cfa.offset);
}
break;
case OP_DEST_LEAVE:
- if ((!state->drap && cfa->base != CFI_BP) ||
- (state->drap && cfa->base != state->drap_reg)) {
+ if ((!cfi->drap && cfa->base != CFI_BP) ||
+ (cfi->drap && cfa->base != cfi->drap_reg)) {
WARN_FUNC("leave instruction with modified stack frame",
insn->sec, insn->offset);
return -1;
/* leave (mov %rbp, %rsp; pop %rbp) */
- state->stack_size = -state->regs[CFI_BP].offset - 8;
- restore_reg(state, CFI_BP);
+ cfi->stack_size = -cfi->regs[CFI_BP].offset - 8;
+ restore_reg(cfi, CFI_BP);
- if (!state->drap) {
+ if (!cfi->drap) {
cfa->base = CFI_SP;
cfa->offset -= 8;
}
}
/* pop mem */
- state->stack_size -= 8;
+ cfi->stack_size -= 8;
if (cfa->base == CFI_SP)
cfa->offset -= 8;
return 0;
}
-static bool insn_state_match(struct instruction *insn, struct insn_state *state)
+static int handle_insn_ops(struct instruction *insn, struct insn_state *state)
+{
+ struct stack_op *op;
+
+ list_for_each_entry(op, &insn->stack_ops, list) {
+ struct cfi_state old_cfi = state->cfi;
+ int res;
+
+ res = update_cfi_state(insn, &state->cfi, op);
+ if (res)
+ return res;
+
+ if (insn->alt_group && memcmp(&state->cfi, &old_cfi, sizeof(struct cfi_state))) {
+ WARN_FUNC("alternative modifies stack", insn->sec, insn->offset);
+ return -1;
+ }
+
+ if (op->dest.type == OP_DEST_PUSHF) {
+ if (!state->uaccess_stack) {
+ state->uaccess_stack = 1;
+ } else if (state->uaccess_stack >> 31) {
+ WARN_FUNC("PUSHF stack exhausted",
+ insn->sec, insn->offset);
+ return 1;
+ }
+ state->uaccess_stack <<= 1;
+ state->uaccess_stack |= state->uaccess;
+ }
+
+ if (op->src.type == OP_SRC_POPF) {
+ if (state->uaccess_stack) {
+ state->uaccess = state->uaccess_stack & 1;
+ state->uaccess_stack >>= 1;
+ if (state->uaccess_stack == 1)
+ state->uaccess_stack = 0;
+ }
+ }
+ }
+
+ return 0;
+}
+
+static bool insn_cfi_match(struct instruction *insn, struct cfi_state *cfi2)
{
- struct insn_state *state1 = &insn->state, *state2 = state;
+ struct cfi_state *cfi1 = &insn->cfi;
int i;
- if (memcmp(&state1->cfa, &state2->cfa, sizeof(state1->cfa))) {
+ if (memcmp(&cfi1->cfa, &cfi2->cfa, sizeof(cfi1->cfa))) {
+
WARN_FUNC("stack state mismatch: cfa1=%d%+d cfa2=%d%+d",
insn->sec, insn->offset,
- state1->cfa.base, state1->cfa.offset,
- state2->cfa.base, state2->cfa.offset);
+ cfi1->cfa.base, cfi1->cfa.offset,
+ cfi2->cfa.base, cfi2->cfa.offset);
- } else if (memcmp(&state1->regs, &state2->regs, sizeof(state1->regs))) {
+ } else if (memcmp(&cfi1->regs, &cfi2->regs, sizeof(cfi1->regs))) {
for (i = 0; i < CFI_NUM_REGS; i++) {
- if (!memcmp(&state1->regs[i], &state2->regs[i],
+ if (!memcmp(&cfi1->regs[i], &cfi2->regs[i],
sizeof(struct cfi_reg)))
continue;
WARN_FUNC("stack state mismatch: reg1[%d]=%d%+d reg2[%d]=%d%+d",
insn->sec, insn->offset,
- i, state1->regs[i].base, state1->regs[i].offset,
- i, state2->regs[i].base, state2->regs[i].offset);
+ i, cfi1->regs[i].base, cfi1->regs[i].offset,
+ i, cfi2->regs[i].base, cfi2->regs[i].offset);
break;
}
- } else if (state1->type != state2->type) {
+ } else if (cfi1->type != cfi2->type) {
+
WARN_FUNC("stack state mismatch: type1=%d type2=%d",
- insn->sec, insn->offset, state1->type, state2->type);
+ insn->sec, insn->offset, cfi1->type, cfi2->type);
+
+ } else if (cfi1->drap != cfi2->drap ||
+ (cfi1->drap && cfi1->drap_reg != cfi2->drap_reg) ||
+ (cfi1->drap && cfi1->drap_offset != cfi2->drap_offset)) {
- } else if (state1->drap != state2->drap ||
- (state1->drap && state1->drap_reg != state2->drap_reg) ||
- (state1->drap && state1->drap_offset != state2->drap_offset)) {
WARN_FUNC("stack state mismatch: drap1=%d(%d,%d) drap2=%d(%d,%d)",
insn->sec, insn->offset,
- state1->drap, state1->drap_reg, state1->drap_offset,
- state2->drap, state2->drap_reg, state2->drap_offset);
+ cfi1->drap, cfi1->drap_reg, cfi1->drap_offset,
+ cfi2->drap, cfi2->drap_reg, cfi2->drap_offset);
} else
return true;
static int validate_call(struct instruction *insn, struct insn_state *state)
{
+ if (state->noinstr && state->instr <= 0 &&
+ (!insn->call_dest || !insn->call_dest->sec->noinstr)) {
+ WARN_FUNC("call to %s() leaves .noinstr.text section",
+ insn->sec, insn->offset, call_dest_name(insn));
+ return 1;
+ }
+
if (state->uaccess && !func_uaccess_safe(insn->call_dest)) {
WARN_FUNC("call to %s() with UACCESS enabled",
insn->sec, insn->offset, call_dest_name(insn));
static int validate_sibling_call(struct instruction *insn, struct insn_state *state)
{
- if (has_modified_stack_frame(state)) {
+ if (has_modified_stack_frame(insn, state)) {
WARN_FUNC("sibling call from callable instruction with modified stack frame",
insn->sec, insn->offset);
return 1;
static int validate_return(struct symbol *func, struct instruction *insn, struct insn_state *state)
{
+ if (state->noinstr && state->instr > 0) {
+ WARN_FUNC("return with instrumentation enabled",
+ insn->sec, insn->offset);
+ return 1;
+ }
+
if (state->uaccess && !func_uaccess_safe(func)) {
WARN_FUNC("return with UACCESS enabled",
insn->sec, insn->offset);
return 1;
}
- if (func && has_modified_stack_frame(state)) {
+ if (func && has_modified_stack_frame(insn, state)) {
WARN_FUNC("return with modified stack frame",
insn->sec, insn->offset);
return 1;
}
- if (state->bp_scratch) {
+ if (state->cfi.bp_scratch) {
WARN_FUNC("BP used as a scratch register",
insn->sec, insn->offset);
return 1;
return 0;
}
+/*
+ * Alternatives should not contain any ORC entries, this in turn means they
+ * should not contain any CFI ops, which implies all instructions should have
+ * the same same CFI state.
+ *
+ * It is possible to constuct alternatives that have unreachable holes that go
+ * unreported (because they're NOPs), such holes would result in CFI_UNDEFINED
+ * states which then results in ORC entries, which we just said we didn't want.
+ *
+ * Avoid them by copying the CFI entry of the first instruction into the whole
+ * alternative.
+ */
+static void fill_alternative_cfi(struct objtool_file *file, struct instruction *insn)
+{
+ struct instruction *first_insn = insn;
+ int alt_group = insn->alt_group;
+
+ sec_for_each_insn_continue(file, insn) {
+ if (insn->alt_group != alt_group)
+ break;
+ insn->cfi = first_insn->cfi;
+ }
+}
+
/*
* Follow the branch starting at the given instruction, and recursively follow
* any other branches (jumps). Meanwhile, track the frame pointer state at
* tools/objtool/Documentation/stack-validation.txt.
*/
static int validate_branch(struct objtool_file *file, struct symbol *func,
- struct instruction *first, struct insn_state state)
+ struct instruction *insn, struct insn_state state)
{
struct alternative *alt;
- struct instruction *insn, *next_insn;
+ struct instruction *next_insn;
struct section *sec;
u8 visited;
int ret;
- insn = first;
sec = insn->sec;
- if (insn->alt_group && list_empty(&insn->alts)) {
- WARN_FUNC("don't know how to handle branch to middle of alternative instruction group",
- sec, insn->offset);
- return 1;
- }
-
while (1) {
next_insn = next_insn_same_sec(file, insn);
visited = 1 << state.uaccess;
if (insn->visited) {
- if (!insn->hint && !insn_state_match(insn, &state))
+ if (!insn->hint && !insn_cfi_match(insn, &state.cfi))
return 1;
if (insn->visited & visited)
return 0;
}
- if (insn->hint) {
- if (insn->restore) {
- struct instruction *save_insn, *i;
-
- i = insn;
- save_insn = NULL;
- sym_for_each_insn_continue_reverse(file, func, i) {
- if (i->save) {
- save_insn = i;
- break;
- }
- }
-
- if (!save_insn) {
- WARN_FUNC("no corresponding CFI save for CFI restore",
- sec, insn->offset);
- return 1;
- }
-
- if (!save_insn->visited) {
- /*
- * Oops, no state to copy yet.
- * Hopefully we can reach this
- * instruction from another branch
- * after the save insn has been
- * visited.
- */
- if (insn == first)
- return 0;
-
- WARN_FUNC("objtool isn't smart enough to handle this CFI save/restore combo",
- sec, insn->offset);
- return 1;
- }
-
- insn->state = save_insn->state;
- }
-
- state = insn->state;
+ if (state.noinstr)
+ state.instr += insn->instr;
- } else
- insn->state = state;
+ if (insn->hint)
+ state.cfi = insn->cfi;
+ else
+ insn->cfi = state.cfi;
insn->visited |= visited;
- if (!insn->ignore_alts) {
+ if (!insn->ignore_alts && !list_empty(&insn->alts)) {
bool skip_orig = false;
list_for_each_entry(alt, &insn->alts, list) {
}
}
+ if (insn->alt_group)
+ fill_alternative_cfi(file, insn);
+
if (skip_orig)
return 0;
}
+ if (handle_insn_ops(insn, &state))
+ return 1;
+
switch (insn->type) {
case INSN_RETURN:
}
return 0;
- case INSN_STACK:
- if (update_insn_state(insn, &state))
- return 1;
-
- if (insn->stack_op.dest.type == OP_DEST_PUSHF) {
- if (!state.uaccess_stack) {
- state.uaccess_stack = 1;
- } else if (state.uaccess_stack >> 31) {
- WARN_FUNC("PUSHF stack exhausted", sec, insn->offset);
- return 1;
- }
- state.uaccess_stack <<= 1;
- state.uaccess_stack |= state.uaccess;
- }
-
- if (insn->stack_op.src.type == OP_SRC_POPF) {
- if (state.uaccess_stack) {
- state.uaccess = state.uaccess_stack & 1;
- state.uaccess_stack >>= 1;
- if (state.uaccess_stack == 1)
- state.uaccess_stack = 0;
- }
- }
-
- break;
-
case INSN_STAC:
if (state.uaccess) {
WARN_FUNC("recursive UACCESS enable", sec, insn->offset);
return 0;
if (!next_insn) {
- if (state.cfa.base == CFI_UNDEFINED)
+ if (state.cfi.cfa.base == CFI_UNDEFINED)
return 0;
WARN("%s: unexpected end of section", sec->name);
return 1;
return 0;
}
-static int validate_unwind_hints(struct objtool_file *file)
+static int validate_unwind_hints(struct objtool_file *file, struct section *sec)
{
struct instruction *insn;
- int ret, warnings = 0;
struct insn_state state;
+ int ret, warnings = 0;
if (!file->hints)
return 0;
- clear_insn_state(&state);
+ init_insn_state(&state, sec);
- for_each_insn(file, insn) {
+ if (sec) {
+ insn = find_insn(file, sec, 0);
+ if (!insn)
+ return 0;
+ } else {
+ insn = list_first_entry(&file->insn_list, typeof(*insn), list);
+ }
+
+ while (&insn->list != &file->insn_list && (!sec || insn->sec == sec)) {
if (insn->hint && !insn->visited) {
ret = validate_branch(file, insn->func, insn, state);
if (ret && backtrace)
BT_FUNC("<=== (hint)", insn);
warnings += ret;
}
+
+ insn = list_next_entry(insn, list);
}
return warnings;
return false;
}
-static int validate_section(struct objtool_file *file, struct section *sec)
+static int validate_symbol(struct objtool_file *file, struct section *sec,
+ struct symbol *sym, struct insn_state *state)
{
- struct symbol *func;
struct instruction *insn;
- struct insn_state state;
- int ret, warnings = 0;
+ int ret;
+
+ if (!sym->len) {
+ WARN("%s() is missing an ELF size annotation", sym->name);
+ return 1;
+ }
+
+ if (sym->pfunc != sym || sym->alias != sym)
+ return 0;
- clear_insn_state(&state);
+ insn = find_insn(file, sec, sym->offset);
+ if (!insn || insn->ignore || insn->visited)
+ return 0;
+
+ state->uaccess = sym->uaccess_safe;
+
+ ret = validate_branch(file, insn->func, insn, *state);
+ if (ret && backtrace)
+ BT_FUNC("<=== (sym)", insn);
+ return ret;
+}
- state.cfa = initial_func_cfi.cfa;
- memcpy(&state.regs, &initial_func_cfi.regs,
- CFI_NUM_REGS * sizeof(struct cfi_reg));
- state.stack_size = initial_func_cfi.cfa.offset;
+static int validate_section(struct objtool_file *file, struct section *sec)
+{
+ struct insn_state state;
+ struct symbol *func;
+ int warnings = 0;
list_for_each_entry(func, &sec->symbol_list, list) {
if (func->type != STT_FUNC)
continue;
- if (!func->len) {
- WARN("%s() is missing an ELF size annotation",
- func->name);
- warnings++;
- }
+ init_insn_state(&state, sec);
+ state.cfi.cfa = initial_func_cfi.cfa;
+ memcpy(&state.cfi.regs, &initial_func_cfi.regs,
+ CFI_NUM_REGS * sizeof(struct cfi_reg));
+ state.cfi.stack_size = initial_func_cfi.cfa.offset;
- if (func->pfunc != func || func->alias != func)
- continue;
+ warnings += validate_symbol(file, sec, func, &state);
+ }
- insn = find_insn(file, sec, func->offset);
- if (!insn || insn->ignore || insn->visited)
- continue;
+ return warnings;
+}
- state.uaccess = func->uaccess_safe;
+static int validate_vmlinux_functions(struct objtool_file *file)
+{
+ struct section *sec;
+ int warnings = 0;
- ret = validate_branch(file, func, insn, state);
- if (ret && backtrace)
- BT_FUNC("<=== (func)", insn);
- warnings += ret;
+ sec = find_section_by_name(file->elf, ".noinstr.text");
+ if (sec) {
+ warnings += validate_section(file, sec);
+ warnings += validate_unwind_hints(file, sec);
+ }
+
+ sec = find_section_by_name(file->elf, ".entry.text");
+ if (sec) {
+ warnings += validate_section(file, sec);
+ warnings += validate_unwind_hints(file, sec);
}
return warnings;
struct section *sec;
int warnings = 0;
- for_each_sec(file, sec)
+ for_each_sec(file, sec) {
+ if (!(sec->sh.sh_flags & SHF_EXECINSTR))
+ continue;
+
warnings += validate_section(file, sec);
+ }
return warnings;
}
objname = _objname;
- file.elf = elf_read(objname, orc ? O_RDWR : O_RDONLY);
+ file.elf = elf_open_read(objname, orc ? O_RDWR : O_RDONLY);
if (!file.elf)
return 1;
if (list_empty(&file.insn_list))
goto out;
+ if (vmlinux && !validate_dup) {
+ ret = validate_vmlinux_functions(&file);
+ if (ret < 0)
+ goto out;
+
+ warnings += ret;
+ goto out;
+ }
+
if (retpoline) {
ret = validate_retpoline(&file);
if (ret < 0)
goto out;
warnings += ret;
- ret = validate_unwind_hints(&file);
+ ret = validate_unwind_hints(&file, NULL);
if (ret < 0)
goto out;
warnings += ret;
#define _CHECK_H
#include <stdbool.h>
-#include "elf.h"
#include "cfi.h"
#include "arch.h"
-#include "orc.h"
-#include <linux/hashtable.h>
struct insn_state {
- struct cfi_reg cfa;
- struct cfi_reg regs[CFI_NUM_REGS];
- int stack_size;
- unsigned char type;
- bool bp_scratch;
- bool drap, end, uaccess, df;
+ struct cfi_state cfi;
unsigned int uaccess_stack;
- int drap_reg, drap_offset;
- struct cfi_reg vals[CFI_NUM_REGS];
+ bool uaccess;
+ bool df;
+ bool noinstr;
+ s8 instr;
};
struct instruction {
unsigned int len;
enum insn_type type;
unsigned long immediate;
- bool alt_group, dead_end, ignore, hint, save, restore, ignore_alts;
+ bool dead_end, ignore, ignore_alts;
+ bool hint;
bool retpoline_safe;
+ s8 instr;
u8 visited;
+ u8 ret_offset;
+ int alt_group;
struct symbol *call_dest;
struct instruction *jump_dest;
struct instruction *first_jump_src;
struct rela *jump_table;
struct list_head alts;
struct symbol *func;
- struct stack_op stack_op;
- struct insn_state state;
+ struct list_head stack_ops;
+ struct cfi_state cfi;
struct orc_entry orc;
};
-struct objtool_file {
- struct elf *elf;
- struct list_head insn_list;
- DECLARE_HASHTABLE(insn_hash, 20);
- bool ignore_unreachables, c_file, hints, rodata;
-};
-
-int check(const char *objname, bool orc);
-
struct instruction *find_insn(struct objtool_file *file,
struct section *sec, unsigned long offset);
return jhash(str, strlen(str), 0);
}
+static inline int elf_hash_bits(void)
+{
+ return vmlinux ? ELF_HASH_BITS : 16;
+}
+
+#define elf_hash_add(hashtable, node, key) \
+ hlist_add_head(node, &hashtable[hash_min(key, elf_hash_bits())])
+
+static void elf_hash_init(struct hlist_head *table)
+{
+ __hash_init(table, 1U << elf_hash_bits());
+}
+
+#define elf_hash_for_each_possible(name, obj, member, key) \
+ hlist_for_each_entry(obj, &name[hash_min(key, elf_hash_bits())], member)
+
static void rb_add(struct rb_root *tree, struct rb_node *node,
int (*cmp)(struct rb_node *, const struct rb_node *))
{
rb_insert_color(node, tree);
}
-static struct rb_node *rb_find_first(struct rb_root *tree, const void *key,
+static struct rb_node *rb_find_first(const struct rb_root *tree, const void *key,
int (*cmp)(const void *key, const struct rb_node *))
{
struct rb_node *node = tree->rb_node;
return 0;
}
-struct section *find_section_by_name(struct elf *elf, const char *name)
+struct section *find_section_by_name(const struct elf *elf, const char *name)
{
struct section *sec;
- hash_for_each_possible(elf->section_name_hash, sec, name_hash, str_hash(name))
+ elf_hash_for_each_possible(elf->section_name_hash, sec, name_hash, str_hash(name))
if (!strcmp(sec->name, name))
return sec;
{
struct section *sec;
- hash_for_each_possible(elf->section_hash, sec, hash, idx)
+ elf_hash_for_each_possible(elf->section_hash, sec, hash, idx)
if (sec->idx == idx)
return sec;
{
struct symbol *sym;
- hash_for_each_possible(elf->symbol_hash, sym, hash, idx)
+ elf_hash_for_each_possible(elf->symbol_hash, sym, hash, idx)
if (sym->idx == idx)
return sym;
return NULL;
}
-struct symbol *find_symbol_containing(struct section *sec, unsigned long offset)
+struct symbol *find_symbol_containing(const struct section *sec, unsigned long offset)
{
struct rb_node *node;
return NULL;
}
-struct symbol *find_symbol_by_name(struct elf *elf, const char *name)
+struct symbol *find_symbol_by_name(const struct elf *elf, const char *name)
{
struct symbol *sym;
- hash_for_each_possible(elf->symbol_name_hash, sym, name_hash, str_hash(name))
+ elf_hash_for_each_possible(elf->symbol_name_hash, sym, name_hash, str_hash(name))
if (!strcmp(sym->name, name))
return sym;
return NULL;
}
-struct rela *find_rela_by_dest_range(struct elf *elf, struct section *sec,
+struct rela *find_rela_by_dest_range(const struct elf *elf, struct section *sec,
unsigned long offset, unsigned int len)
{
struct rela *rela, *r = NULL;
sec = sec->rela;
for_offset_range(o, offset, offset + len) {
- hash_for_each_possible(elf->rela_hash, rela, hash,
+ elf_hash_for_each_possible(elf->rela_hash, rela, hash,
sec_offset_hash(sec, o)) {
if (rela->sec != sec)
continue;
return NULL;
}
-struct rela *find_rela_by_dest(struct elf *elf, struct section *sec, unsigned long offset)
+struct rela *find_rela_by_dest(const struct elf *elf, struct section *sec, unsigned long offset)
{
return find_rela_by_dest_range(elf, sec, offset, 1);
}
sec->len = sec->sh.sh_size;
list_add_tail(&sec->list, &elf->sections);
- hash_add(elf->section_hash, &sec->hash, sec->idx);
- hash_add(elf->section_name_hash, &sec->name_hash, str_hash(sec->name));
+ elf_hash_add(elf->section_hash, &sec->hash, sec->idx);
+ elf_hash_add(elf->section_name_hash, &sec->name_hash, str_hash(sec->name));
}
if (stats)
static int read_symbols(struct elf *elf)
{
- struct section *symtab, *sec;
+ struct section *symtab, *symtab_shndx, *sec;
struct symbol *sym, *pfunc;
struct list_head *entry;
struct rb_node *pnode;
int symbols_nr, i;
char *coldstr;
+ Elf_Data *shndx_data = NULL;
+ Elf32_Word shndx;
symtab = find_section_by_name(elf, ".symtab");
if (!symtab) {
return -1;
}
+ symtab_shndx = find_section_by_name(elf, ".symtab_shndx");
+ if (symtab_shndx)
+ shndx_data = symtab_shndx->data;
+
symbols_nr = symtab->sh.sh_size / symtab->sh.sh_entsize;
for (i = 0; i < symbols_nr; i++) {
sym->idx = i;
- if (!gelf_getsym(symtab->data, i, &sym->sym)) {
- WARN_ELF("gelf_getsym");
+ if (!gelf_getsymshndx(symtab->data, shndx_data, i, &sym->sym,
+ &shndx)) {
+ WARN_ELF("gelf_getsymshndx");
goto err;
}
sym->type = GELF_ST_TYPE(sym->sym.st_info);
sym->bind = GELF_ST_BIND(sym->sym.st_info);
- if (sym->sym.st_shndx > SHN_UNDEF &&
- sym->sym.st_shndx < SHN_LORESERVE) {
- sym->sec = find_section_by_index(elf,
- sym->sym.st_shndx);
+ if ((sym->sym.st_shndx > SHN_UNDEF &&
+ sym->sym.st_shndx < SHN_LORESERVE) ||
+ (shndx_data && sym->sym.st_shndx == SHN_XINDEX)) {
+ if (sym->sym.st_shndx != SHN_XINDEX)
+ shndx = sym->sym.st_shndx;
+
+ sym->sec = find_section_by_index(elf, shndx);
if (!sym->sec) {
WARN("couldn't find section for symbol %s",
sym->name);
else
entry = &sym->sec->symbol_list;
list_add(&sym->list, entry);
- hash_add(elf->symbol_hash, &sym->hash, sym->idx);
- hash_add(elf->symbol_name_hash, &sym->name_hash, str_hash(sym->name));
+ elf_hash_add(elf->symbol_hash, &sym->hash, sym->idx);
+ elf_hash_add(elf->symbol_name_hash, &sym->name_hash, str_hash(sym->name));
}
if (stats)
return -1;
}
+void elf_add_rela(struct elf *elf, struct rela *rela)
+{
+ struct section *sec = rela->sec;
+
+ list_add_tail(&rela->list, &sec->rela_list);
+ elf_hash_add(elf->rela_hash, &rela->hash, rela_hash(rela));
+}
+
static int read_relas(struct elf *elf)
{
struct section *sec;
return -1;
}
- list_add_tail(&rela->list, &sec->rela_list);
- hash_add(elf->rela_hash, &rela->hash, rela_hash(rela));
+ elf_add_rela(elf, rela);
nr_rela++;
}
max_rela = max(max_rela, nr_rela);
return 0;
}
-struct elf *elf_read(const char *name, int flags)
+struct elf *elf_open_read(const char *name, int flags)
{
struct elf *elf;
Elf_Cmd cmd;
perror("malloc");
return NULL;
}
- memset(elf, 0, sizeof(*elf));
+ memset(elf, 0, offsetof(struct elf, sections));
- hash_init(elf->symbol_hash);
- hash_init(elf->symbol_name_hash);
- hash_init(elf->section_hash);
- hash_init(elf->section_name_hash);
- hash_init(elf->rela_hash);
INIT_LIST_HEAD(&elf->sections);
+ elf_hash_init(elf->symbol_hash);
+ elf_hash_init(elf->symbol_name_hash);
+ elf_hash_init(elf->section_hash);
+ elf_hash_init(elf->section_name_hash);
+ elf_hash_init(elf->rela_hash);
+
elf->fd = open(name, flags);
if (elf->fd == -1) {
fprintf(stderr, "objtool: Can't open '%s': %s\n",
shstrtab->changed = true;
list_add_tail(&sec->list, &elf->sections);
- hash_add(elf->section_hash, &sec->hash, sec->idx);
- hash_add(elf->section_name_hash, &sec->name_hash, str_hash(sec->name));
+ elf_hash_add(elf->section_hash, &sec->hash, sec->idx);
+ elf_hash_add(elf->section_name_hash, &sec->name_hash, str_hash(sec->name));
return sec;
}
return 0;
}
-int elf_write(struct elf *elf)
+int elf_write(const struct elf *elf)
{
struct section *sec;
Elf_Scn *s;
char *name;
int idx;
unsigned int len;
- bool changed, text, rodata;
+ bool changed, text, rodata, noinstr;
};
struct symbol {
bool jump_table_start;
};
+#define ELF_HASH_BITS 20
+
struct elf {
Elf *elf;
GElf_Ehdr ehdr;
int fd;
char *name;
struct list_head sections;
- DECLARE_HASHTABLE(symbol_hash, 20);
- DECLARE_HASHTABLE(symbol_name_hash, 20);
- DECLARE_HASHTABLE(section_hash, 16);
- DECLARE_HASHTABLE(section_name_hash, 16);
- DECLARE_HASHTABLE(rela_hash, 20);
+ DECLARE_HASHTABLE(symbol_hash, ELF_HASH_BITS);
+ DECLARE_HASHTABLE(symbol_name_hash, ELF_HASH_BITS);
+ DECLARE_HASHTABLE(section_hash, ELF_HASH_BITS);
+ DECLARE_HASHTABLE(section_name_hash, ELF_HASH_BITS);
+ DECLARE_HASHTABLE(rela_hash, ELF_HASH_BITS);
};
#define OFFSET_STRIDE_BITS 4
return sec_offset_hash(rela->sec, rela->offset);
}
-struct elf *elf_read(const char *name, int flags);
-struct section *find_section_by_name(struct elf *elf, const char *name);
+struct elf *elf_open_read(const char *name, int flags);
+struct section *elf_create_section(struct elf *elf, const char *name, size_t entsize, int nr);
+struct section *elf_create_rela_section(struct elf *elf, struct section *base);
+void elf_add_rela(struct elf *elf, struct rela *rela);
+int elf_write(const struct elf *elf);
+void elf_close(struct elf *elf);
+
+struct section *find_section_by_name(const struct elf *elf, const char *name);
struct symbol *find_func_by_offset(struct section *sec, unsigned long offset);
struct symbol *find_symbol_by_offset(struct section *sec, unsigned long offset);
-struct symbol *find_symbol_by_name(struct elf *elf, const char *name);
-struct symbol *find_symbol_containing(struct section *sec, unsigned long offset);
-struct rela *find_rela_by_dest(struct elf *elf, struct section *sec, unsigned long offset);
-struct rela *find_rela_by_dest_range(struct elf *elf, struct section *sec,
+struct symbol *find_symbol_by_name(const struct elf *elf, const char *name);
+struct symbol *find_symbol_containing(const struct section *sec, unsigned long offset);
+struct rela *find_rela_by_dest(const struct elf *elf, struct section *sec, unsigned long offset);
+struct rela *find_rela_by_dest_range(const struct elf *elf, struct section *sec,
unsigned long offset, unsigned int len);
struct symbol *find_func_containing(struct section *sec, unsigned long offset);
-struct section *elf_create_section(struct elf *elf, const char *name, size_t
- entsize, int nr);
-struct section *elf_create_rela_section(struct elf *elf, struct section *base);
int elf_rebuild_rela_section(struct section *sec);
-int elf_write(struct elf *elf);
-void elf_close(struct elf *elf);
#define for_each_sec(file, sec) \
list_for_each_entry(sec, &file->elf->sections, list)
printf("\n");
- exit(129);
+ if (!help)
+ exit(129);
+ exit(0);
}
static void handle_options(int *argc, const char ***argv)
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Copyright (C) 2020 Matt Helsley <mhelsley@vmware.com>
+ */
+
+#ifndef _OBJTOOL_H
+#define _OBJTOOL_H
+
+#include <stdbool.h>
+#include <linux/list.h>
+#include <linux/hashtable.h>
+
+#include "elf.h"
+
+struct objtool_file {
+ struct elf *elf;
+ struct list_head insn_list;
+ DECLARE_HASHTABLE(insn_hash, 20);
+ bool ignore_unreachables, c_file, hints, rodata;
+};
+
+int check(const char *objname, bool orc);
+int orc_dump(const char *objname);
+int create_orc(struct objtool_file *file);
+int create_orc_sections(struct objtool_file *file);
+
+#endif /* _OBJTOOL_H */
+++ /dev/null
-/* SPDX-License-Identifier: GPL-2.0-or-later */
-/*
- * Copyright (C) 2017 Josh Poimboeuf <jpoimboe@redhat.com>
- */
-
-#ifndef _ORC_H
-#define _ORC_H
-
-#include <asm/orc_types.h>
-
-struct objtool_file;
-
-int create_orc(struct objtool_file *file);
-int create_orc_sections(struct objtool_file *file);
-
-int orc_dump(const char *objname);
-
-#endif /* _ORC_H */
*/
#include <unistd.h>
-#include "orc.h"
+#include <asm/orc_types.h>
+#include "objtool.h"
#include "warn.h"
static const char *reg_name(unsigned int reg)
#include <stdlib.h>
#include <string.h>
-#include "orc.h"
#include "check.h"
#include "warn.h"
for_each_insn(file, insn) {
struct orc_entry *orc = &insn->orc;
- struct cfi_reg *cfa = &insn->state.cfa;
- struct cfi_reg *bp = &insn->state.regs[CFI_BP];
+ struct cfi_reg *cfa = &insn->cfi.cfa;
+ struct cfi_reg *bp = &insn->cfi.regs[CFI_BP];
- orc->end = insn->state.end;
+ orc->end = insn->cfi.end;
if (cfa->base == CFI_UNDEFINED) {
orc->sp_reg = ORC_REG_UNDEFINED;
orc->sp_offset = cfa->offset;
orc->bp_offset = bp->offset;
- orc->type = insn->state.type;
+ orc->type = insn->cfi.type;
}
return 0;
rela->offset = idx * sizeof(int);
rela->sec = ip_relasec;
- list_add_tail(&rela->list, &ip_relasec->rela_list);
- hash_add(elf->rela_hash, &rela->hash, rela_hash(rela));
+ elf_add_rela(elf, rela);
return 0;
}
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0-or-later
+/*
+ * Copyright (C) 2020 Matt Helsley <mhelsley@vmware.com>
+ * Weak definitions necessary to compile objtool without
+ * some subcommands (e.g. check, orc).
+ */
+
+#include <stdbool.h>
+#include <errno.h>
+#include "objtool.h"
+
+#define __weak __attribute__((weak))
+
+#define UNSUPPORTED(name) \
+({ \
+ fprintf(stderr, "error: objtool: " name " not implemented\n"); \
+ return ENOSYS; \
+})
+
+const char __weak *objname;
+
+int __weak check(const char *_objname, bool orc)
+{
+ UNSUPPORTED("check subcommand");
+}
+
+int __weak orc_dump(const char *_objname)
+{
+ UNSUPPORTED("orc");
+}
+
+int __weak create_orc(struct objtool_file *file)
+{
+ UNSUPPORTED("orc");
+}
+
+int __weak create_orc_sections(struct objtool_file *file)
+{
+ UNSUPPORTED("orc");
+}
man7dir=$(mandir)/man7
ASCIIDOC=asciidoc
-ASCIIDOC_EXTRA = --unsafe -f asciidoc.conf
+ASCIIDOC_EXTRA += --unsafe -f asciidoc.conf
ASCIIDOC_HTML = xhtml11
MANPAGE_XSL = manpage-normal.xsl
XMLTO_EXTRA =
ifdef USE_ASCIIDOCTOR
ASCIIDOC = asciidoctor
-ASCIIDOC_EXTRA = -a compat-mode
+ASCIIDOC_EXTRA += -a compat-mode
ASCIIDOC_EXTRA += -I. -rasciidoctor-extensions
ASCIIDOC_EXTRA += -a mansource="perf" -a manmanual="perf Manual"
ASCIIDOC_HTML = xhtml5
e synthesize error events
d create a debug log
g synthesize a call chain (use with i or x)
+ G synthesize a call chain on existing event records
l synthesize last branch entries (use with i or x)
+ L synthesize last branch entries on existing event records
s skip initial number of events
The default is all events i.e. the same as --itrace=ibxwpe,
Also the number of last branch entries (default 64, max. 1024) for
instructions or transactions events can be specified.
+ Similar to options g and l, size may also be specified for options G and L.
+ On x86, note that G and L work poorly when data has been recorded with
+ large PEBS. Refer linkperf:perf-intel-pt[1] man page for details.
+
It is also possible to skip events generated (instructions, branches, transactions,
ptwrite, power) at the beginning. This is useful to ignore initialization code.
'epoll'::
Eventpoll (epoll) stressing benchmarks.
+'internals'::
+ Benchmark internal perf functionality.
+
'all'::
All benchmark subsystems.
*ctl*::
Suite for evaluating multiple epoll_ctl calls.
+SUITES FOR 'internals'
+~~~~~~~~~~~~~~~~~~~~~~
+*synthesize*::
+Suite for evaluating perf's event synthesis performance.
+
SEE ALSO
--------
linkperf:perf[1]
--display::
Switch to HITM type (rmt, lcl) to display and sort on. Total HITMs as default.
+--stitch-lbr::
+ Show callgraph with stitched LBRs, which may have more complete
+ callgraph. The perf.data file must have been obtained using
+ perf c2c record --call-graph lbr.
+ Disabled by default. In common cases with call stack overflows,
+ it can recreate better call stacks than the default lbr call stack
+ output. But this approach is not full proof. There can be cases
+ where it creates incorrect call stacks from incorrect matches.
+ The known limitations include exception handing such as
+ setjmp/longjmp will have calls/returns not match.
+
C2C RECORD
----------
The perf c2c record command setup options related to HITM cacheline analysis
To also trace kernel space presents a problem, namely kernel self-modifying
code. A fairly good kernel image is available in /proc/kcore but to get an
accurate image a copy of /proc/kcore needs to be made under the same conditions
-as the data capture. A script perf-with-kcore can do that, but beware that the
-script makes use of 'sudo' to copy /proc/kcore. If you have perf installed
-locally from the source tree you can do:
+as the data capture. 'perf record' can make a copy of /proc/kcore if the option
+--kcore is used, but access to /proc/kcore is restricted e.g.
- ~/libexec/perf-core/perf-with-kcore record pt_ls -e intel_pt// -- ls
+ sudo perf record -o pt_ls --kcore -e intel_pt// -- ls
-which will create a directory named 'pt_ls' and put the perf.data file and
-copies of /proc/kcore, /proc/kallsyms and /proc/modules into it. Then to use
-'perf report' becomes:
+which will create a directory named 'pt_ls' and put the perf.data file (named
+simply 'data') and copies of /proc/kcore, /proc/kallsyms and /proc/modules into
+it. The other tools understand the directory format, so to use 'perf report'
+becomes:
- ~/libexec/perf-core/perf-with-kcore report pt_ls
+ sudo perf report -i pt_ls
Because samples are synthesized after-the-fact, the sampling period can be
selected for reporting. e.g. sample every microsecond
- ~/libexec/perf-core/perf-with-kcore report pt_ls --itrace=i1usge
+ sudo perf report pt_ls --itrace=i1usge
See the sections below for more information about the --itrace option.
e synthesize tracing error events
d create a debug log
g synthesize a call chain (use with i or x)
+ G synthesize a call chain on existing event records
l synthesize last branch entries (use with i or x)
+ L synthesize last branch entries on existing event records
s skip initial number of events
"Instructions" events look like they were recorded by "perf record -e
Note that last branch entries are cleared for each sample, so there is no overlap
from one sample to the next.
+The G and L options are designed in particular for sample mode, and work much
+like g and l but add call chain and branch stack to the other selected events
+instead of synthesized events. For example, to record branch-misses events for
+'ls' and then add a call chain derived from the Intel PT trace:
+
+ perf record --aux-sample -e '{intel_pt//u,branch-misses:u}' -- ls
+ perf report --itrace=Ge
+
+Although in fact G is a default for perf report, so that is the same as just:
+
+ perf report
+
+One caveat with the G and L options is that they work poorly with "Large PEBS".
+Large PEBS means PEBS records will be accumulated by hardware and the written
+into the event buffer in one go. That reduces interrupts, but can give very
+late timestamps. Because the Intel PT trace is synchronized by timestamps,
+the PEBS events do not match the trace. Currently, Large PEBS is used only in
+certain circumstances:
+ - hardware supports it
+ - PEBS is used
+ - event period is specified, instead of frequency
+ - the sample type is limited to the following flags:
+ PERF_SAMPLE_IP | PERF_SAMPLE_TID | PERF_SAMPLE_ADDR |
+ PERF_SAMPLE_ID | PERF_SAMPLE_CPU | PERF_SAMPLE_STREAM_ID |
+ PERF_SAMPLE_DATA_SRC | PERF_SAMPLE_IDENTIFIER |
+ PERF_SAMPLE_TRANSACTION | PERF_SAMPLE_PHYS_ADDR |
+ PERF_SAMPLE_REGS_INTR | PERF_SAMPLE_REGS_USER |
+ PERF_SAMPLE_PERIOD (and sometimes) | PERF_SAMPLE_TIME
+Because Intel PT sample mode uses a different sample type to the list above,
+Large PEBS is not used with Intel PT sample mode. To avoid Large PEBS in other
+cases, avoid specifying the event period i.e. avoid the 'perf record' -c option,
+--count option, or 'period' config term.
+
To disable trace decoding entirely, use the option --no-itrace.
It is also possible to skip events generated (instructions, branches, transactions)
perf stat -e r1a8 -a sleep 1
perf record -e r1a8 ...
+It's also possible to use pmu syntax:
+
+ perf record -e r1a8 -a sleep 1
+ perf record -e cpu/r1a8/ ...
+
You should refer to the processor specific documentation for getting these
details. Some of them are referenced in the SEE ALSO section below.
the first event (the leader) samples, and it only reads the values of the
other events in the group.
+However, in the case AUX area events (e.g. Intel PT or CoreSight), the AUX
+area event must be the leader, so then the second event samples, not the first.
+
OPTIONS
-------
--switch-output --no-no-buildid --no-no-buildid-cache
+--switch-output-event::
+Events that will cause the switch of the perf.data file, auto-selecting
+--switch-output=signal, the results are similar as internally the side band
+thread will also send a SIGUSR2 to the main one.
+
+Uses the same syntax as --event, it will just not be recorded, serving only to
+switch the perf.data file as soon as the --switch-output event is processed by
+a separate sideband thread.
+
+This sideband thread is also used to other purposes, like processing the
+PERF_RECORD_BPF_EVENT records as they happen, asking the kernel for extra BPF
+information, etc.
+
--switch-max-files=N::
When rotating perf.data with --switch-output, only keep N files.
Limit the sample data max size, <size> is expected to be a number with
appended unit character - B/K/M/G
+--num-thread-synthesize::
+ The number of threads to run when synthesizing events for existing processes.
+ By default, the number of threads equals 1.
+
SEE ALSO
--------
linkperf:perf-stat[1], linkperf:perf-list[1], linkperf:perf-intel-pt[1]
This option extends the perf report to show reference callgraphs,
which collected by reference event, in no callgraph event.
+--stitch-lbr::
+ Show callgraph with stitched LBRs, which may have more complete
+ callgraph. The perf.data file must have been obtained using
+ perf record --call-graph lbr.
+ Disabled by default. In common cases with call stack overflows,
+ it can recreate better call stacks than the default lbr call stack
+ output. But this approach is not full proof. There can be cases
+ where it creates incorrect call stacks from incorrect matches.
+ The known limitations include exception handing such as
+ setjmp/longjmp will have calls/returns not match.
+
--socket-filter::
Only report the samples on the processor socket that match with this filter
--show-on-off-events::
Show the --switch-on/off events too.
+--stitch-lbr::
+ Show callgraph with stitched LBRs, which may have more complete
+ callgraph. The perf.data file must have been obtained using
+ perf record --call-graph lbr.
+ Disabled by default. In common cases with call stack overflows,
+ it can recreate better call stacks than the default lbr call stack
+ output. But this approach is not full proof. There can be cases
+ where it creates incorrect call stacks from incorrect matches.
+ The known limitations include exception handing such as
+ setjmp/longjmp will have calls/returns not match.
+
SEE ALSO
--------
linkperf:perf-record[1], linkperf:perf-script-perl[1],
The overhead percentage could be high in some cases, for instance with small, sub 100ms intervals. Use with caution.
example: 'perf stat -I 1000 -e cycles -a sleep 5'
+If the metric exists, it is calculated by the counts generated in this interval and the metric is printed after #.
+
--interval-count times::
Print count deltas for fixed number of times.
This option should be used together with "-I" option.
go straight to the histogram browser, just like 'perf top' with no events
explicitely specified does.
+--stitch-lbr::
+ Show callgraph with stitched LBRs, which may have more complete
+ callgraph. The option must be used with --call-graph lbr recording.
+ Disabled by default. In common cases with call stack overflows,
+ it can recreate better call stacks than the default lbr call stack
+ output. But this approach is not full proof. There can be cases
+ where it creates incorrect call stacks from incorrect matches.
+ The known limitations include exception handing such as
+ setjmp/longjmp will have calls/returns not match.
INTERACTIVE PROMPTING KEYS
--------------------------
Indicates that trace contains records of PERF_RECORD_COMPRESSED type
that have perf_events records in compressed form.
+ HEADER_CPU_PMU_CAPS = 28,
+
+ A list of cpu PMU capabilities. The format of data is as below.
+
+struct {
+ u32 nr_cpu_pmu_caps;
+ {
+ char name[];
+ char value[];
+ } [nr_cpu_pmu_caps]
+};
+
+
+Example:
+ cpu pmu capabilities: branches=32, max_precise=3, pmu_name=icelake
+
other bits are reserved and should ignored for now
HEADER_FEAT_BITS = 256,
# non-config cases
config := 1
-NON_CONFIG_TARGETS := clean python-clean TAGS tags cscope help install-doc install-man install-html install-info install-pdf doc man html info pdf
+NON_CONFIG_TARGETS := clean python-clean TAGS tags cscope help
ifdef MAKECMDGOALS
ifeq ($(filter-out $(NON_CONFIG_TARGETS),$(MAKECMDGOALS)),)
# 'make doc' should call 'make -C Documentation all'
$(DOC_TARGETS):
- $(Q)$(MAKE) -C $(DOC_DIR) O=$(OUTPUT) $(@:doc=all)
+ $(Q)$(MAKE) -C $(DOC_DIR) O=$(OUTPUT) $(@:doc=all) ASCIIDOC_EXTRA=$(ASCIIDOC_EXTRA)
TAG_FOLDERS= . ../lib ../include
TAG_FILES= ../../include/uapi/linux/perf_event.h
# 'make install-doc' should call 'make -C Documentation install'
$(INSTALL_DOC_TARGETS):
- $(Q)$(MAKE) -C $(DOC_DIR) O=$(OUTPUT) $(@:-doc=)
+ $(Q)$(MAKE) -C $(DOC_DIR) O=$(OUTPUT) $(@:-doc=) ASCIIDOC_EXTRA=$(ASCIIDOC_EXTRA)
### Cleaning rules
#include "../../util/event.h"
#include "../../util/evlist.h"
#include "../../util/evsel.h"
+#include "../../util/perf_api_probe.h"
#include "../../util/evsel_config.h"
#include "../../util/pmu.h"
#include "../../util/cs-etm.h"
ret = perf_pmu__scan_file(pmu, path, "%x", &hash);
if (ret != 1) {
pr_err("failed to set sink \"%s\" on event %s with %d (%s)\n",
- sink, perf_evsel__name(evsel), errno,
+ sink, evsel__name(evsel), errno,
str_error_r(errno, msg, sizeof(msg)));
return ret;
}
* when a context switch happened.
*/
if (!perf_cpu_map__empty(cpus)) {
- perf_evsel__set_sample_bit(cs_etm_evsel, CPU);
+ evsel__set_sample_bit(cs_etm_evsel, CPU);
err = cs_etm_set_option(itr, cs_etm_evsel,
ETM_OPT_CTXTID | ETM_OPT_TS);
/* In per-cpu case, always need the time of mmap events etc */
if (!perf_cpu_map__empty(cpus))
- perf_evsel__set_sample_bit(tracking_evsel, TIME);
+ evsel__set_sample_bit(tracking_evsel, TIME);
}
out:
*/
perf_evlist__to_front(evlist, arm_spe_evsel);
- perf_evsel__set_sample_bit(arm_spe_evsel, CPU);
- perf_evsel__set_sample_bit(arm_spe_evsel, TIME);
- perf_evsel__set_sample_bit(arm_spe_evsel, TID);
+ evsel__set_sample_bit(arm_spe_evsel, CPU);
+ evsel__set_sample_bit(arm_spe_evsel, TIME);
+ evsel__set_sample_bit(arm_spe_evsel, TID);
/* Add dummy event to keep tracking */
err = parse_events(evlist, "dummy:u", NULL);
tracking_evsel->core.attr.freq = 0;
tracking_evsel->core.attr.sample_period = 1;
- perf_evsel__set_sample_bit(tracking_evsel, TIME);
- perf_evsel__set_sample_bit(tracking_evsel, CPU);
- perf_evsel__reset_sample_bit(tracking_evsel, BRANCH_STACK);
+ evsel__set_sample_bit(tracking_evsel, TIME);
+ evsel__set_sample_bit(tracking_evsel, CPU);
+ evsel__reset_sample_bit(tracking_evsel, BRANCH_STACK);
return 0;
}
#include <string.h>
#include <linux/stringify.h>
#include "header.h"
+#include "metricgroup.h"
+#include <api/fs/fs.h>
#define mfspr(rn) ({unsigned long rval; \
asm volatile("mfspr %0," __stringify(rn) \
return bufp;
}
+
+int arch_get_runtimeparam(void)
+{
+ int count;
+ return sysfs__read_int("/devices/hv_24x7/interface/sockets", &count) < 0 ? 1 : count;
+}
struct event_key *key)
{
key->info = 0;
- key->key = perf_evsel__intval(evsel, sample, "req");
+ key->key = evsel__intval(evsel, sample, "req");
}
static const char *get_hcall_exit_reason(u64 exit_code)
{
unsigned long insn;
- insn = perf_evsel__intval(evsel, sample, "instruction");
+ insn = evsel__intval(evsel, sample, "instruction");
key->key = icpt_insn_decoder(insn);
key->exit_reasons = sie_icpt_insn_codes;
}
struct perf_sample *sample,
struct event_key *key)
{
- key->key = perf_evsel__intval(evsel, sample, "order_code");
+ key->key = evsel__intval(evsel, sample, "order_code");
key->exit_reasons = sie_sigp_order_codes;
}
struct perf_sample *sample,
struct event_key *key)
{
- key->key = perf_evsel__intval(evsel, sample, "code");
+ key->key = evsel__intval(evsel, sample, "code");
key->exit_reasons = sie_diagnose_codes;
}
struct perf_sample *sample,
struct event_key *key)
{
- key->key = perf_evsel__intval(evsel, sample, "code");
+ key->key = evsel__intval(evsel, sample, "code");
key->exit_reasons = sie_icpt_prog_codes;
}
goto next_event;
if (strcmp(event->comm.comm, comm1) == 0) {
- CHECK__(perf_evsel__parse_sample(evsel, event,
- &sample));
+ CHECK__(evsel__parse_sample(evsel, event, &sample));
comm1_time = sample.time;
}
if (strcmp(event->comm.comm, comm2) == 0) {
- CHECK__(perf_evsel__parse_sample(evsel, event,
- &sample));
+ CHECK__(evsel__parse_sample(evsel, event, &sample));
comm2_time = sample.time;
}
next_event:
* AUX event.
*/
if (!perf_cpu_map__empty(cpus))
- perf_evsel__set_sample_bit(intel_bts_evsel, CPU);
+ evsel__set_sample_bit(intel_bts_evsel, CPU);
}
/* Add dummy event to keep tracking */
#include "../../../util/pmu.h"
#include "../../../util/debug.h"
#include "../../../util/auxtrace.h"
+#include "../../../util/perf_api_probe.h"
#include "../../../util/record.h"
#include "../../../util/target.h"
#include "../../../util/tsc.h"
evsel = evlist__last(evlist);
- perf_evsel__set_sample_bit(evsel, CPU);
- perf_evsel__set_sample_bit(evsel, TIME);
+ evsel__set_sample_bit(evsel, CPU);
+ evsel__set_sample_bit(evsel, TIME);
evsel->core.system_wide = true;
evsel->no_aux_samples = true;
switch_evsel->no_aux_samples = true;
switch_evsel->immediate = true;
- perf_evsel__set_sample_bit(switch_evsel, TID);
- perf_evsel__set_sample_bit(switch_evsel, TIME);
- perf_evsel__set_sample_bit(switch_evsel, CPU);
- perf_evsel__reset_sample_bit(switch_evsel, BRANCH_STACK);
+ evsel__set_sample_bit(switch_evsel, TID);
+ evsel__set_sample_bit(switch_evsel, TIME);
+ evsel__set_sample_bit(switch_evsel, CPU);
+ evsel__reset_sample_bit(switch_evsel, BRANCH_STACK);
opts->record_switch_events = false;
ptr->have_sched_switch = 3;
* AUX event.
*/
if (!perf_cpu_map__empty(cpus))
- perf_evsel__set_sample_bit(intel_pt_evsel, CPU);
+ evsel__set_sample_bit(intel_pt_evsel, CPU);
}
/* Add dummy event to keep tracking */
/* In per-cpu case, always need the time of mmap events etc */
if (!perf_cpu_map__empty(cpus)) {
- perf_evsel__set_sample_bit(tracking_evsel, TIME);
+ evsel__set_sample_bit(tracking_evsel, TIME);
/* And the CPU for switch events */
- perf_evsel__set_sample_bit(tracking_evsel, CPU);
+ evsel__set_sample_bit(tracking_evsel, CPU);
}
- perf_evsel__reset_sample_bit(tracking_evsel, BRANCH_STACK);
+ evsel__reset_sample_bit(tracking_evsel, BRANCH_STACK);
}
/*
static void mmio_event_get_key(struct evsel *evsel, struct perf_sample *sample,
struct event_key *key)
{
- key->key = perf_evsel__intval(evsel, sample, "gpa");
- key->info = perf_evsel__intval(evsel, sample, "type");
+ key->key = evsel__intval(evsel, sample, "gpa");
+ key->info = evsel__intval(evsel, sample, "type");
}
#define KVM_TRACE_MMIO_READ_UNSATISFIED 0
/* MMIO write begin event in kernel. */
if (!strcmp(evsel->name, "kvm:kvm_mmio") &&
- perf_evsel__intval(evsel, sample, "type") == KVM_TRACE_MMIO_WRITE) {
+ evsel__intval(evsel, sample, "type") == KVM_TRACE_MMIO_WRITE) {
mmio_event_get_key(evsel, sample, key);
return true;
}
/* MMIO read end event in kernel.*/
if (!strcmp(evsel->name, "kvm:kvm_mmio") &&
- perf_evsel__intval(evsel, sample, "type") == KVM_TRACE_MMIO_READ) {
+ evsel__intval(evsel, sample, "type") == KVM_TRACE_MMIO_READ) {
mmio_event_get_key(evsel, sample, key);
return true;
}
struct perf_sample *sample,
struct event_key *key)
{
- key->key = perf_evsel__intval(evsel, sample, "port");
- key->info = perf_evsel__intval(evsel, sample, "rw");
+ key->key = evsel__intval(evsel, sample, "port");
+ key->info = evsel__intval(evsel, sample, "rw");
}
static bool ioport_event_begin(struct evsel *evsel,
perf-y += futex-wake-parallel.o
perf-y += futex-requeue.o
perf-y += futex-lock-pi.o
-
perf-y += epoll-wait.o
perf-y += epoll-ctl.o
+perf-y += synthesize.o
+perf-y += kallsyms-parse.o
perf-$(CONFIG_X86_64) += mem-memcpy-x86-64-lib.o
perf-$(CONFIG_X86_64) += mem-memcpy-x86-64-asm.o
int bench_futex_requeue(int argc, const char **argv);
/* pi futexes */
int bench_futex_lock_pi(int argc, const char **argv);
-
int bench_epoll_wait(int argc, const char **argv);
int bench_epoll_ctl(int argc, const char **argv);
+int bench_synthesize(int argc, const char **argv);
+int bench_kallsyms_parse(int argc, const char **argv);
#define BENCH_FORMAT_DEFAULT_STR "default"
#define BENCH_FORMAT_DEFAULT 0
qsort(worker, nthreads, sizeof(struct worker), cmpworker);
for (i = 0; i < nthreads; i++) {
- unsigned long t = worker[i].ops / bench__runtime.tv_sec;
+ unsigned long t = bench__runtime.tv_sec > 0 ?
+ worker[i].ops / bench__runtime.tv_sec : 0;
update_stats(&throughput_stats, t);
pthread_mutex_destroy(&thread_lock);
for (i = 0; i < nthreads; i++) {
- unsigned long t = worker[i].ops / bench__runtime.tv_sec;
+ unsigned long t = bench__runtime.tv_sec > 0 ?
+ worker[i].ops / bench__runtime.tv_sec : 0;
update_stats(&throughput_stats, t);
if (!silent) {
if (nfutexes == 1)
pthread_mutex_destroy(&thread_lock);
for (i = 0; i < nthreads; i++) {
- unsigned long t = worker[i].ops / bench__runtime.tv_sec;
+ unsigned long t = bench__runtime.tv_sec > 0 ?
+ worker[i].ops / bench__runtime.tv_sec : 0;
update_stats(&throughput_stats, t);
if (!silent)
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Benchmark of /proc/kallsyms parsing.
+ *
+ * Copyright 2020 Google LLC.
+ */
+#include <stdlib.h>
+#include "bench.h"
+#include "../util/stat.h"
+#include <linux/time64.h>
+#include <subcmd/parse-options.h>
+#include <symbol/kallsyms.h>
+
+static unsigned int iterations = 100;
+
+static const struct option options[] = {
+ OPT_UINTEGER('i', "iterations", &iterations,
+ "Number of iterations used to compute average"),
+ OPT_END()
+};
+
+static const char *const bench_usage[] = {
+ "perf bench internals kallsyms-parse <options>",
+ NULL
+};
+
+static int bench_process_symbol(void *arg __maybe_unused,
+ const char *name __maybe_unused,
+ char type __maybe_unused,
+ u64 start __maybe_unused)
+{
+ return 0;
+}
+
+static int do_kallsyms_parse(void)
+{
+ struct timeval start, end, diff;
+ u64 runtime_us;
+ unsigned int i;
+ double time_average, time_stddev;
+ int err;
+ struct stats time_stats;
+
+ init_stats(&time_stats);
+
+ for (i = 0; i < iterations; i++) {
+ gettimeofday(&start, NULL);
+ err = kallsyms__parse("/proc/kallsyms", NULL,
+ bench_process_symbol);
+ if (err)
+ return err;
+
+ gettimeofday(&end, NULL);
+ timersub(&end, &start, &diff);
+ runtime_us = diff.tv_sec * USEC_PER_SEC + diff.tv_usec;
+ update_stats(&time_stats, runtime_us);
+ }
+
+ time_average = avg_stats(&time_stats) / USEC_PER_MSEC;
+ time_stddev = stddev_stats(&time_stats) / USEC_PER_MSEC;
+ printf(" Average kallsyms__parse took: %.3f ms (+- %.3f ms)\n",
+ time_average, time_stddev);
+ return 0;
+}
+
+int bench_kallsyms_parse(int argc, const char **argv)
+{
+ argc = parse_options(argc, argv, options, bench_usage, 0);
+ if (argc) {
+ usage_with_options(bench_usage, options);
+ exit(EXIT_FAILURE);
+ }
+
+ return do_kallsyms_parse();
+}
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Benchmark synthesis of perf events such as at the start of a 'perf
+ * record'. Synthesis is done on the current process and the 'dummy' event
+ * handlers are invoked that support dump_trace but otherwise do nothing.
+ *
+ * Copyright 2019 Google LLC.
+ */
+#include <stdio.h>
+#include "bench.h"
+#include "../util/debug.h"
+#include "../util/session.h"
+#include "../util/stat.h"
+#include "../util/synthetic-events.h"
+#include "../util/target.h"
+#include "../util/thread_map.h"
+#include "../util/tool.h"
+#include "../util/util.h"
+#include <linux/atomic.h>
+#include <linux/err.h>
+#include <linux/time64.h>
+#include <subcmd/parse-options.h>
+
+static unsigned int min_threads = 1;
+static unsigned int max_threads = UINT_MAX;
+static unsigned int single_iterations = 10000;
+static unsigned int multi_iterations = 10;
+static bool run_st;
+static bool run_mt;
+
+static const struct option options[] = {
+ OPT_BOOLEAN('s', "st", &run_st, "Run single threaded benchmark"),
+ OPT_BOOLEAN('t', "mt", &run_mt, "Run multi-threaded benchmark"),
+ OPT_UINTEGER('m', "min-threads", &min_threads,
+ "Minimum number of threads in multithreaded bench"),
+ OPT_UINTEGER('M', "max-threads", &max_threads,
+ "Maximum number of threads in multithreaded bench"),
+ OPT_UINTEGER('i', "single-iterations", &single_iterations,
+ "Number of iterations used to compute single-threaded average"),
+ OPT_UINTEGER('I', "multi-iterations", &multi_iterations,
+ "Number of iterations used to compute multi-threaded average"),
+ OPT_END()
+};
+
+static const char *const bench_usage[] = {
+ "perf bench internals synthesize <options>",
+ NULL
+};
+
+static atomic_t event_count;
+
+static int process_synthesized_event(struct perf_tool *tool __maybe_unused,
+ union perf_event *event __maybe_unused,
+ struct perf_sample *sample __maybe_unused,
+ struct machine *machine __maybe_unused)
+{
+ atomic_inc(&event_count);
+ return 0;
+}
+
+static int do_run_single_threaded(struct perf_session *session,
+ struct perf_thread_map *threads,
+ struct target *target, bool data_mmap)
+{
+ const unsigned int nr_threads_synthesize = 1;
+ struct timeval start, end, diff;
+ u64 runtime_us;
+ unsigned int i;
+ double time_average, time_stddev, event_average, event_stddev;
+ int err;
+ struct stats time_stats, event_stats;
+
+ init_stats(&time_stats);
+ init_stats(&event_stats);
+
+ for (i = 0; i < single_iterations; i++) {
+ atomic_set(&event_count, 0);
+ gettimeofday(&start, NULL);
+ err = __machine__synthesize_threads(&session->machines.host,
+ NULL,
+ target, threads,
+ process_synthesized_event,
+ data_mmap,
+ nr_threads_synthesize);
+ if (err)
+ return err;
+
+ gettimeofday(&end, NULL);
+ timersub(&end, &start, &diff);
+ runtime_us = diff.tv_sec * USEC_PER_SEC + diff.tv_usec;
+ update_stats(&time_stats, runtime_us);
+ update_stats(&event_stats, atomic_read(&event_count));
+ }
+
+ time_average = avg_stats(&time_stats);
+ time_stddev = stddev_stats(&time_stats);
+ printf(" Average %ssynthesis took: %.3f usec (+- %.3f usec)\n",
+ data_mmap ? "data " : "", time_average, time_stddev);
+
+ event_average = avg_stats(&event_stats);
+ event_stddev = stddev_stats(&event_stats);
+ printf(" Average num. events: %.3f (+- %.3f)\n",
+ event_average, event_stddev);
+
+ printf(" Average time per event %.3f usec\n",
+ time_average / event_average);
+ return 0;
+}
+
+static int run_single_threaded(void)
+{
+ struct perf_session *session;
+ struct target target = {
+ .pid = "self",
+ };
+ struct perf_thread_map *threads;
+ int err;
+
+ perf_set_singlethreaded();
+ session = perf_session__new(NULL, false, NULL);
+ if (IS_ERR(session)) {
+ pr_err("Session creation failed.\n");
+ return PTR_ERR(session);
+ }
+ threads = thread_map__new_by_pid(getpid());
+ if (!threads) {
+ pr_err("Thread map creation failed.\n");
+ err = -ENOMEM;
+ goto err_out;
+ }
+
+ puts(
+"Computing performance of single threaded perf event synthesis by\n"
+"synthesizing events on the perf process itself:");
+
+ err = do_run_single_threaded(session, threads, &target, false);
+ if (err)
+ goto err_out;
+
+ err = do_run_single_threaded(session, threads, &target, true);
+
+err_out:
+ if (threads)
+ perf_thread_map__put(threads);
+
+ perf_session__delete(session);
+ return err;
+}
+
+static int do_run_multi_threaded(struct target *target,
+ unsigned int nr_threads_synthesize)
+{
+ struct timeval start, end, diff;
+ u64 runtime_us;
+ unsigned int i;
+ double time_average, time_stddev, event_average, event_stddev;
+ int err;
+ struct stats time_stats, event_stats;
+ struct perf_session *session;
+
+ init_stats(&time_stats);
+ init_stats(&event_stats);
+ for (i = 0; i < multi_iterations; i++) {
+ session = perf_session__new(NULL, false, NULL);
+ if (!session)
+ return -ENOMEM;
+
+ atomic_set(&event_count, 0);
+ gettimeofday(&start, NULL);
+ err = __machine__synthesize_threads(&session->machines.host,
+ NULL,
+ target, NULL,
+ process_synthesized_event,
+ false,
+ nr_threads_synthesize);
+ if (err) {
+ perf_session__delete(session);
+ return err;
+ }
+
+ gettimeofday(&end, NULL);
+ timersub(&end, &start, &diff);
+ runtime_us = diff.tv_sec * USEC_PER_SEC + diff.tv_usec;
+ update_stats(&time_stats, runtime_us);
+ update_stats(&event_stats, atomic_read(&event_count));
+ perf_session__delete(session);
+ }
+
+ time_average = avg_stats(&time_stats);
+ time_stddev = stddev_stats(&time_stats);
+ printf(" Average synthesis took: %.3f usec (+- %.3f usec)\n",
+ time_average, time_stddev);
+
+ event_average = avg_stats(&event_stats);
+ event_stddev = stddev_stats(&event_stats);
+ printf(" Average num. events: %.3f (+- %.3f)\n",
+ event_average, event_stddev);
+
+ printf(" Average time per event %.3f usec\n",
+ time_average / event_average);
+ return 0;
+}
+
+static int run_multi_threaded(void)
+{
+ struct target target = {
+ .cpu_list = "0"
+ };
+ unsigned int nr_threads_synthesize;
+ int err;
+
+ if (max_threads == UINT_MAX)
+ max_threads = sysconf(_SC_NPROCESSORS_ONLN);
+
+ puts(
+"Computing performance of multi threaded perf event synthesis by\n"
+"synthesizing events on CPU 0:");
+
+ for (nr_threads_synthesize = min_threads;
+ nr_threads_synthesize <= max_threads;
+ nr_threads_synthesize++) {
+ if (nr_threads_synthesize == 1)
+ perf_set_singlethreaded();
+ else
+ perf_set_multithreaded();
+
+ printf(" Number of synthesis threads: %u\n",
+ nr_threads_synthesize);
+
+ err = do_run_multi_threaded(&target, nr_threads_synthesize);
+ if (err)
+ return err;
+ }
+ perf_set_singlethreaded();
+ return 0;
+}
+
+int bench_synthesize(int argc, const char **argv)
+{
+ int err = 0;
+
+ argc = parse_options(argc, argv, options, bench_usage, 0);
+ if (argc) {
+ usage_with_options(bench_usage, options);
+ exit(EXIT_FAILURE);
+ }
+
+ /*
+ * If neither single threaded or multi-threaded are specified, default
+ * to running just single threaded.
+ */
+ if (!run_st && !run_mt)
+ run_st = true;
+
+ if (run_st)
+ err = run_single_threaded();
+
+ if (!err && run_mt)
+ err = run_multi_threaded();
+
+ return err;
+}
return ui__has_annotation() || ann->use_stdio2;
}
-static int perf_evsel__add_sample(struct evsel *evsel,
- struct perf_sample *sample,
- struct addr_location *al,
- struct perf_annotate *ann,
- struct machine *machine)
+static int evsel__add_sample(struct evsel *evsel, struct perf_sample *sample,
+ struct addr_location *al, struct perf_annotate *ann,
+ struct machine *machine)
{
struct hists *hists = evsel__hists(evsel);
struct hist_entry *he;
goto out_put;
if (!al.filtered &&
- perf_evsel__add_sample(evsel, sample, &al, ann, machine)) {
+ evsel__add_sample(evsel, sample, &al, ann, machine)) {
pr_warning("problem incrementing symbol count, "
"skipping event\n");
ret = -1;
total_nr_samples += nr_samples;
hists__collapse_resort(hists, NULL);
/* Don't sort callchain */
- perf_evsel__reset_sample_bit(pos, CALLCHAIN);
+ evsel__reset_sample_bit(pos, CALLCHAIN);
perf_evsel__output_resort(pos, NULL);
- if (symbol_conf.event_group &&
- !perf_evsel__is_group_leader(pos))
+ if (symbol_conf.event_group && !evsel__is_group_leader(pos))
continue;
hists__find_annotations(hists, pos, ann);
};
#endif // HAVE_EVENTFD
+static struct bench internals_benchmarks[] = {
+ { "synthesize", "Benchmark perf event synthesis", bench_synthesize },
+ { "kallsyms-parse", "Benchmark kallsyms parsing", bench_kallsyms_parse },
+ { NULL, NULL, NULL }
+};
+
struct collection {
const char *name;
const char *summary;
#ifdef HAVE_EVENTFD
{"epoll", "Epoll stressing benchmarks", epoll_benchmarks },
#endif
+ { "internals", "Perf-internals benchmarks", internals_benchmarks },
{ "all", "All benchmarks", NULL },
{ NULL, NULL, NULL }
};
bool use_stdio;
bool stats_only;
bool symbol_full;
+ bool stitch_lbr;
/* HITM shared clines stats */
struct c2c_stats hitm_stats;
return -1;
}
+ if (c2c.stitch_lbr)
+ al.thread->lbr_stitch_enable = true;
+
ret = sample__resolve_callchain(sample, &callchain_cursor, NULL,
evsel, &al, sysctl_perf_event_max_stack);
if (ret)
if (!strcmp(dim->name, name))
return dim;
- };
+ }
return NULL;
}
FILTER_HITM(tot_hitm);
default:
break;
- };
+ }
#undef FILTER_HITM
fprintf(out, "=================================================\n");
evlist__for_each_entry(evlist, evsel) {
- fprintf(out, "%-36s: %s\n", first ? " Events" : "",
- perf_evsel__name(evsel));
+ fprintf(out, "%-36s: %s\n", first ? " Events" : "", evsel__name(evsel));
first = false;
}
fprintf(out, " Cachelines sort on : %s HITMs\n",
}
}
+ if (c2c.stitch_lbr && (mode != CALLCHAIN_LBR)) {
+ ui__warning("Can't find LBR callchain. Switch off --stitch-lbr.\n"
+ "Please apply --call-graph lbr when recording.\n");
+ c2c.stitch_lbr = false;
+ }
+
callchain_param.record_mode = mode;
callchain_param.min_percent = 0;
return 0;
OPT_STRING('c', "coalesce", &coalesce, "coalesce fields",
"coalesce fields: pid,tid,iaddr,dso"),
OPT_BOOLEAN('f', "force", &symbol_conf.force, "don't complain, do it"),
+ OPT_BOOLEAN(0, "stitch-lbr", &c2c.stitch_lbr,
+ "Enable LBR callgraph stitching approach"),
OPT_PARENT(c2c_options),
OPT_END()
};
rec_argv[i++] = "-e";
rec_argv[i++] = perf_mem_events__name(j);
- };
+ }
if (all_user)
rec_argv[i++] = "--all-user";
struct evsel *e;
evlist__for_each_entry(evlist, e) {
- if (perf_evsel__match2(evsel, e))
+ if (evsel__match2(evsel, e))
return e;
}
if (!quiet) {
fprintf(stdout, "%s# Event '%s'\n#\n", first ? "" : "\n",
- perf_evsel__name(evsel_base));
+ evsel__name(evsel_base));
}
first = false;
data__fprintf();
/* Don't sort callchain for perf diff */
- perf_evsel__reset_sample_bit(evsel_base, CALLCHAIN);
+ evsel__reset_sample_bit(evsel_base, CALLCHAIN);
hists__process(hists_base);
}
default:
BUG_ON(1);
- };
+ }
}
static void
.events = POLLIN,
};
- if (!perf_cap__capable(CAP_SYS_ADMIN)) {
+ if (!(perf_cap__capable(CAP_PERFMON) ||
+ perf_cap__capable(CAP_SYS_ADMIN))) {
pr_err("ftrace only works for %s!\n",
#ifdef HAVE_LIBCAP_SUPPORT
- "users with the SYS_ADMIN capability"
+ "users with the CAP_PERFMON or CAP_SYS_ADMIN capability"
#else
"root"
#endif
union perf_event *event_sw;
struct perf_sample sample_sw;
struct perf_inject *inject = container_of(tool, struct perf_inject, tool);
- u32 pid = perf_evsel__intval(evsel, sample, "pid");
+ u32 pid = evsel__intval(evsel, sample, "pid");
list_for_each_entry(ent, &inject->samples, node) {
if (pid == ent->tid)
return 0;
found:
event_sw = &ent->event[0];
- perf_evsel__parse_sample(evsel, event_sw, &sample_sw);
+ evsel__parse_sample(evsel, event_sw, &sample_sw);
sample_sw.period = sample->period;
sample_sw.time = sample->time;
session_done = 1;
}
-static int perf_evsel__check_stype(struct evsel *evsel,
- u64 sample_type, const char *sample_msg)
+static int evsel__check_stype(struct evsel *evsel, u64 sample_type, const char *sample_msg)
{
struct perf_event_attr *attr = &evsel->core.attr;
- const char *name = perf_evsel__name(evsel);
+ const char *name = evsel__name(evsel);
if (!(attr->sample_type & sample_type)) {
pr_err("Samples for %s event do not have %s attribute set.",
struct evsel *evsel;
evlist__for_each_entry(session->evlist, evsel) {
- const char *name = perf_evsel__name(evsel);
+ const char *name = evsel__name(evsel);
if (!strcmp(name, "sched:sched_switch")) {
- if (perf_evsel__check_stype(evsel, PERF_SAMPLE_TID, "TID"))
+ if (evsel__check_stype(evsel, PERF_SAMPLE_TID, "TID"))
return -EINVAL;
evsel->handler = perf_inject__sched_switch;
perf_header__clear_feat(&session->header,
HEADER_AUXTRACE);
- if (inject->itrace_synth_opts.last_branch)
+ if (inject->itrace_synth_opts.last_branch ||
+ inject->itrace_synth_opts.add_last_branch)
perf_header__set_feat(&session->header,
HEADER_BRANCH_STACK);
evsel = perf_evlist__id2evsel_strict(session->evlist,
inject->aux_id);
if (evsel) {
- pr_debug("Deleting %s\n",
- perf_evsel__name(evsel));
+ pr_debug("Deleting %s\n", evsel__name(evsel));
evlist__remove(session->evlist, evsel);
evsel__delete(evsel);
}
return 0;
}
-static int perf_evsel__process_alloc_event(struct evsel *evsel,
- struct perf_sample *sample)
+static int evsel__process_alloc_event(struct evsel *evsel, struct perf_sample *sample)
{
- unsigned long ptr = perf_evsel__intval(evsel, sample, "ptr"),
- call_site = perf_evsel__intval(evsel, sample, "call_site");
- int bytes_req = perf_evsel__intval(evsel, sample, "bytes_req"),
- bytes_alloc = perf_evsel__intval(evsel, sample, "bytes_alloc");
+ unsigned long ptr = evsel__intval(evsel, sample, "ptr"),
+ call_site = evsel__intval(evsel, sample, "call_site");
+ int bytes_req = evsel__intval(evsel, sample, "bytes_req"),
+ bytes_alloc = evsel__intval(evsel, sample, "bytes_alloc");
if (insert_alloc_stat(call_site, ptr, bytes_req, bytes_alloc, sample->cpu) ||
insert_caller_stat(call_site, bytes_req, bytes_alloc))
return 0;
}
-static int perf_evsel__process_alloc_node_event(struct evsel *evsel,
- struct perf_sample *sample)
+static int evsel__process_alloc_node_event(struct evsel *evsel, struct perf_sample *sample)
{
- int ret = perf_evsel__process_alloc_event(evsel, sample);
+ int ret = evsel__process_alloc_event(evsel, sample);
if (!ret) {
int node1 = cpu__get_node(sample->cpu),
- node2 = perf_evsel__intval(evsel, sample, "node");
+ node2 = evsel__intval(evsel, sample, "node");
if (node1 != node2)
nr_cross_allocs++;
return NULL;
}
-static int perf_evsel__process_free_event(struct evsel *evsel,
- struct perf_sample *sample)
+static int evsel__process_free_event(struct evsel *evsel, struct perf_sample *sample)
{
- unsigned long ptr = perf_evsel__intval(evsel, sample, "ptr");
+ unsigned long ptr = evsel__intval(evsel, sample, "ptr");
struct alloc_stat *s_alloc, *s_caller;
s_alloc = search_alloc_stat(ptr, 0, &root_alloc_stat, ptr_cmp);
return 0;
}
-static int perf_evsel__process_page_alloc_event(struct evsel *evsel,
- struct perf_sample *sample)
+static int evsel__process_page_alloc_event(struct evsel *evsel, struct perf_sample *sample)
{
u64 page;
- unsigned int order = perf_evsel__intval(evsel, sample, "order");
- unsigned int gfp_flags = perf_evsel__intval(evsel, sample, "gfp_flags");
- unsigned int migrate_type = perf_evsel__intval(evsel, sample,
+ unsigned int order = evsel__intval(evsel, sample, "order");
+ unsigned int gfp_flags = evsel__intval(evsel, sample, "gfp_flags");
+ unsigned int migrate_type = evsel__intval(evsel, sample,
"migratetype");
u64 bytes = kmem_page_size << order;
u64 callsite;
};
if (use_pfn)
- page = perf_evsel__intval(evsel, sample, "pfn");
+ page = evsel__intval(evsel, sample, "pfn");
else
- page = perf_evsel__intval(evsel, sample, "page");
+ page = evsel__intval(evsel, sample, "page");
nr_page_allocs++;
total_page_alloc_bytes += bytes;
return 0;
}
-static int perf_evsel__process_page_free_event(struct evsel *evsel,
- struct perf_sample *sample)
+static int evsel__process_page_free_event(struct evsel *evsel, struct perf_sample *sample)
{
u64 page;
- unsigned int order = perf_evsel__intval(evsel, sample, "order");
+ unsigned int order = evsel__intval(evsel, sample, "order");
u64 bytes = kmem_page_size << order;
struct page_stat *pstat;
struct page_stat this = {
};
if (use_pfn)
- page = perf_evsel__intval(evsel, sample, "pfn");
+ page = evsel__intval(evsel, sample, "pfn");
else
- page = perf_evsel__intval(evsel, sample, "page");
+ page = evsel__intval(evsel, sample, "page");
nr_page_frees++;
total_page_free_bytes += bytes;
struct evsel *evsel;
const struct evsel_str_handler kmem_tracepoints[] = {
/* slab allocator */
- { "kmem:kmalloc", perf_evsel__process_alloc_event, },
- { "kmem:kmem_cache_alloc", perf_evsel__process_alloc_event, },
- { "kmem:kmalloc_node", perf_evsel__process_alloc_node_event, },
- { "kmem:kmem_cache_alloc_node", perf_evsel__process_alloc_node_event, },
- { "kmem:kfree", perf_evsel__process_free_event, },
- { "kmem:kmem_cache_free", perf_evsel__process_free_event, },
+ { "kmem:kmalloc", evsel__process_alloc_event, },
+ { "kmem:kmem_cache_alloc", evsel__process_alloc_event, },
+ { "kmem:kmalloc_node", evsel__process_alloc_node_event, },
+ { "kmem:kmem_cache_alloc_node", evsel__process_alloc_node_event, },
+ { "kmem:kfree", evsel__process_free_event, },
+ { "kmem:kmem_cache_free", evsel__process_free_event, },
/* page allocator */
- { "kmem:mm_page_alloc", perf_evsel__process_page_alloc_event, },
- { "kmem:mm_page_free", perf_evsel__process_page_free_event, },
+ { "kmem:mm_page_alloc", evsel__process_page_alloc_event, },
+ { "kmem:mm_page_free", evsel__process_page_free_event, },
};
if (!perf_session__has_traces(session, "kmem record"))
}
evlist__for_each_entry(session->evlist, evsel) {
- if (!strcmp(perf_evsel__name(evsel), "kmem:mm_page_alloc") &&
- perf_evsel__field(evsel, "pfn")) {
+ if (!strcmp(evsel__name(evsel), "kmem:mm_page_alloc") &&
+ evsel__field(evsel, "pfn")) {
use_pfn = true;
break;
}
struct event_key *key)
{
key->info = 0;
- key->key = perf_evsel__intval(evsel, sample, kvm_exit_reason);
+ key->key = evsel__intval(evsel, sample, kvm_exit_reason);
}
bool kvm_exit_event(struct evsel *evsel)
return NULL;
}
- vcpu_record->vcpu_id = perf_evsel__intval(evsel, sample,
- vcpu_id_str);
+ vcpu_record->vcpu_id = evsel__intval(evsel, sample, vcpu_id_str);
thread__set_priv(thread, vcpu_record);
}
struct perf_event_attr *attr = &pos->core.attr;
/* make sure these *are* set */
- perf_evsel__set_sample_bit(pos, TID);
- perf_evsel__set_sample_bit(pos, TIME);
- perf_evsel__set_sample_bit(pos, CPU);
- perf_evsel__set_sample_bit(pos, RAW);
+ evsel__set_sample_bit(pos, TID);
+ evsel__set_sample_bit(pos, TIME);
+ evsel__set_sample_bit(pos, CPU);
+ evsel__set_sample_bit(pos, RAW);
/* make sure these are *not*; want as small a sample as possible */
- perf_evsel__reset_sample_bit(pos, PERIOD);
- perf_evsel__reset_sample_bit(pos, IP);
- perf_evsel__reset_sample_bit(pos, CALLCHAIN);
- perf_evsel__reset_sample_bit(pos, ADDR);
- perf_evsel__reset_sample_bit(pos, READ);
+ evsel__reset_sample_bit(pos, PERIOD);
+ evsel__reset_sample_bit(pos, IP);
+ evsel__reset_sample_bit(pos, CALLCHAIN);
+ evsel__reset_sample_bit(pos, ADDR);
+ evsel__reset_sample_bit(pos, READ);
attr->mmap = 0;
attr->comm = 0;
attr->task = 0;
struct rb_node rb; /* used for sorting */
/*
- * FIXME: perf_evsel__intval() returns u64,
+ * FIXME: evsel__intval() returns u64,
* so address of lockdep_map should be dealed as 64bit.
* Is there more better solution?
*/
struct lock_stat *ls;
struct thread_stat *ts;
struct lock_seq_stat *seq;
- const char *name = perf_evsel__strval(evsel, sample, "name");
- u64 tmp = perf_evsel__intval(evsel, sample, "lockdep_addr");
- int flag = perf_evsel__intval(evsel, sample, "flag");
+ const char *name = evsel__strval(evsel, sample, "name");
+ u64 tmp = evsel__intval(evsel, sample, "lockdep_addr");
+ int flag = evsel__intval(evsel, sample, "flag");
memcpy(&addr, &tmp, sizeof(void *));
struct thread_stat *ts;
struct lock_seq_stat *seq;
u64 contended_term;
- const char *name = perf_evsel__strval(evsel, sample, "name");
- u64 tmp = perf_evsel__intval(evsel, sample, "lockdep_addr");
+ const char *name = evsel__strval(evsel, sample, "name");
+ u64 tmp = evsel__intval(evsel, sample, "lockdep_addr");
memcpy(&addr, &tmp, sizeof(void *));
struct lock_stat *ls;
struct thread_stat *ts;
struct lock_seq_stat *seq;
- const char *name = perf_evsel__strval(evsel, sample, "name");
- u64 tmp = perf_evsel__intval(evsel, sample, "lockdep_addr");
+ const char *name = evsel__strval(evsel, sample, "name");
+ u64 tmp = evsel__intval(evsel, sample, "lockdep_addr");
memcpy(&addr, &tmp, sizeof(void *));
struct lock_stat *ls;
struct thread_stat *ts;
struct lock_seq_stat *seq;
- const char *name = perf_evsel__strval(evsel, sample, "name");
- u64 tmp = perf_evsel__intval(evsel, sample, "lockdep_addr");
+ const char *name = evsel__strval(evsel, sample, "name");
+ u64 tmp = evsel__intval(evsel, sample, "lockdep_addr");
memcpy(&addr, &tmp, sizeof(void *));
static struct trace_lock_handler *trace_handler;
-static int perf_evsel__process_lock_acquire(struct evsel *evsel,
- struct perf_sample *sample)
+static int evsel__process_lock_acquire(struct evsel *evsel, struct perf_sample *sample)
{
if (trace_handler->acquire_event)
return trace_handler->acquire_event(evsel, sample);
return 0;
}
-static int perf_evsel__process_lock_acquired(struct evsel *evsel,
- struct perf_sample *sample)
+static int evsel__process_lock_acquired(struct evsel *evsel, struct perf_sample *sample)
{
if (trace_handler->acquired_event)
return trace_handler->acquired_event(evsel, sample);
return 0;
}
-static int perf_evsel__process_lock_contended(struct evsel *evsel,
- struct perf_sample *sample)
+static int evsel__process_lock_contended(struct evsel *evsel, struct perf_sample *sample)
{
if (trace_handler->contended_event)
return trace_handler->contended_event(evsel, sample);
return 0;
}
-static int perf_evsel__process_lock_release(struct evsel *evsel,
- struct perf_sample *sample)
+static int evsel__process_lock_release(struct evsel *evsel, struct perf_sample *sample)
{
if (trace_handler->release_event)
return trace_handler->release_event(evsel, sample);
pr_info("%10d: %s\n", st->tid, thread__comm_str(t));
node = rb_next(node);
thread__put(t);
- };
+ }
}
static void dump_map(void)
}
static const struct evsel_str_handler lock_tracepoints[] = {
- { "lock:lock_acquire", perf_evsel__process_lock_acquire, }, /* CONFIG_LOCKDEP */
- { "lock:lock_acquired", perf_evsel__process_lock_acquired, }, /* CONFIG_LOCKDEP, CONFIG_LOCK_STAT */
- { "lock:lock_contended", perf_evsel__process_lock_contended, }, /* CONFIG_LOCKDEP, CONFIG_LOCK_STAT */
- { "lock:lock_release", perf_evsel__process_lock_release, }, /* CONFIG_LOCKDEP */
+ { "lock:lock_acquire", evsel__process_lock_acquire, }, /* CONFIG_LOCKDEP */
+ { "lock:lock_acquired", evsel__process_lock_acquired, }, /* CONFIG_LOCKDEP, CONFIG_LOCK_STAT */
+ { "lock:lock_contended", evsel__process_lock_contended, }, /* CONFIG_LOCKDEP, CONFIG_LOCK_STAT */
+ { "lock:lock_release", evsel__process_lock_release, }, /* CONFIG_LOCKDEP */
};
static bool force;
rec_argv[i++] = "-e";
rec_argv[i++] = perf_mem_events__name(j);
- };
+ }
if (all_user)
rec_argv[i++] = "--all-user";
#include "util/tsc.h"
#include "util/parse-branch-options.h"
#include "util/parse-regs-options.h"
+#include "util/perf_api_probe.h"
#include "util/llvm-utils.h"
#include "util/bpf-loader.h"
#include "util/trigger.h"
#include "util/time-utils.h"
#include "util/units.h"
#include "util/bpf-event.h"
+#include "util/util.h"
#include "asm/bug.h"
#include "perf.h"
#include <inttypes.h>
#include <locale.h>
#include <poll.h>
+#include <pthread.h>
#include <unistd.h>
#include <sched.h>
#include <signal.h>
struct auxtrace_record *itr;
struct evlist *evlist;
struct perf_session *session;
+ struct evlist *sb_evlist;
+ pthread_t thread_id;
int realtime_prio;
+ bool switch_output_event_set;
bool no_buildid;
bool no_buildid_set;
bool no_buildid_cache;
return record__write(rec, NULL, event, event->header.size);
}
+static int process_locked_synthesized_event(struct perf_tool *tool,
+ union perf_event *event,
+ struct perf_sample *sample __maybe_unused,
+ struct machine *machine __maybe_unused)
+{
+ static pthread_mutex_t synth_lock = PTHREAD_MUTEX_INITIALIZER;
+ int ret;
+
+ pthread_mutex_lock(&synth_lock);
+ ret = process_synthesized_event(tool, event, sample, machine);
+ pthread_mutex_unlock(&synth_lock);
+ return ret;
+}
+
static int record__pushfn(struct mmap *map, void *to, void *bf, size_t size)
{
struct record *rec = to;
evlist__for_each_entry(evlist, pos) {
try_again:
if (evsel__open(pos, pos->core.cpus, pos->core.threads) < 0) {
- if (perf_evsel__fallback(pos, errno, msg, sizeof(msg))) {
+ if (evsel__fallback(pos, errno, msg, sizeof(msg))) {
if (verbose > 0)
ui__warning("%s\n", msg);
goto try_again;
goto try_again;
}
rc = -errno;
- perf_evsel__open_strerror(pos, &opts->target,
- errno, msg, sizeof(msg));
+ evsel__open_strerror(pos, &opts->target, errno, msg, sizeof(msg));
ui__error("%s\n", msg);
goto out;
}
if (perf_evlist__apply_filters(evlist, &pos)) {
pr_err("failed to set filter \"%s\" on event %s with %d (%s)\n",
- pos->filter, perf_evsel__name(pos), errno,
+ pos->filter, evsel__name(pos), errno,
str_error_r(errno, msg, sizeof(msg)));
rc = -1;
goto out;
struct perf_tool *tool = &rec->tool;
int fd = perf_data__fd(data);
int err = 0;
+ event_op f = process_synthesized_event;
if (rec->opts.tail_synthesize != tail)
return 0;
if (err < 0)
pr_warning("Couldn't synthesize cgroup events.\n");
+ if (rec->opts.nr_threads_synthesize > 1) {
+ perf_set_multithreaded();
+ f = process_locked_synthesized_event;
+ }
+
err = __machine__synthesize_threads(machine, tool, &opts->target, rec->evlist->core.threads,
- process_synthesized_event, opts->sample_address,
- 1);
+ f, opts->sample_address,
+ rec->opts.nr_threads_synthesize);
+
+ if (rec->opts.nr_threads_synthesize > 1)
+ perf_set_singlethreaded();
+
out:
return err;
}
+static int record__process_signal_event(union perf_event *event __maybe_unused, void *data)
+{
+ struct record *rec = data;
+ pthread_kill(rec->thread_id, SIGUSR2);
+ return 0;
+}
+
+static int record__setup_sb_evlist(struct record *rec)
+{
+ struct record_opts *opts = &rec->opts;
+
+ if (rec->sb_evlist != NULL) {
+ /*
+ * We get here if --switch-output-event populated the
+ * sb_evlist, so associate a callback that will send a SIGUSR2
+ * to the main thread.
+ */
+ evlist__set_cb(rec->sb_evlist, record__process_signal_event, rec);
+ rec->thread_id = pthread_self();
+ }
+
+ if (!opts->no_bpf_event) {
+ if (rec->sb_evlist == NULL) {
+ rec->sb_evlist = evlist__new();
+
+ if (rec->sb_evlist == NULL) {
+ pr_err("Couldn't create side band evlist.\n.");
+ return -1;
+ }
+ }
+
+ if (evlist__add_bpf_sb_event(rec->sb_evlist, &rec->session->header.env)) {
+ pr_err("Couldn't ask for PERF_RECORD_BPF_EVENT side band events.\n.");
+ return -1;
+ }
+ }
+
+ if (perf_evlist__start_sb_thread(rec->sb_evlist, &rec->opts.target)) {
+ pr_debug("Couldn't start the BPF side band thread:\nBPF programs starting from now on won't be annotatable\n");
+ opts->no_bpf_event = true;
+ }
+
+ return 0;
+}
+
static int __cmd_record(struct record *rec, int argc, const char **argv)
{
int err;
struct perf_data *data = &rec->data;
struct perf_session *session;
bool disabled = false, draining = false;
- struct evlist *sb_evlist = NULL;
int fd;
float ratio = 0;
goto out_child;
}
+ err = -1;
if (!rec->no_buildid
&& !perf_header__has_feat(&session->header, HEADER_BUILD_ID)) {
pr_err("Couldn't generate buildids. "
"Use --no-buildid to profile anyway.\n");
- err = -1;
goto out_child;
}
- if (!opts->no_bpf_event)
- bpf_event__add_sb_event(&sb_evlist, &session->header.env);
-
- if (perf_evlist__start_sb_thread(sb_evlist, &rec->opts.target)) {
- pr_debug("Couldn't start the BPF side band thread:\nBPF programs starting from now on won't be annotatable\n");
- opts->no_bpf_event = true;
- }
+ err = record__setup_sb_evlist(rec);
+ if (err)
+ goto out_child;
err = record__synthesize(rec, false);
if (err < 0)
perf_session__delete(session);
if (!opts->no_bpf_event)
- perf_evlist__stop_sb_thread(sb_evlist);
+ perf_evlist__stop_sb_thread(rec->sb_evlist);
return status;
}
};
unsigned long val;
+ /*
+ * If we're using --switch-output-events, then we imply its
+ * --switch-output=signal, as we'll send a SIGUSR2 from the side band
+ * thread to its parent.
+ */
+ if (rec->switch_output_event_set)
+ goto do_signal;
+
if (!s->set)
return 0;
if (!strcmp(s->str, "signal")) {
+do_signal:
s->signal = true;
pr_debug("switch-output with SIGUSR2 signal\n");
goto enabled;
.default_per_cpu = true,
},
.mmap_flush = MMAP_FLUSH_DEFAULT,
+ .nr_threads_synthesize = 1,
},
.tool = {
.sample = process_sample_event,
&record.switch_output.set, "signal or size[BKMG] or time[smhd]",
"Switch output when receiving SIGUSR2 (signal) or cross a size or time threshold",
"signal"),
+ OPT_CALLBACK_SET(0, "switch-output-event", &record.sb_evlist, &record.switch_output_event_set, "switch output event",
+ "switch output event selector. use 'perf list' to list available events",
+ parse_events_option_new_evlist),
OPT_INTEGER(0, "switch-max-files", &record.switch_output.num_files,
"Limit number of switch output generated files"),
OPT_BOOLEAN(0, "dry-run", &dry_run,
#endif
OPT_CALLBACK(0, "max-size", &record.output_max_size,
"size", "Limit the maximum size of the output file", parse_output_max_size),
+ OPT_UINTEGER(0, "num-thread-synthesize",
+ &record.opts.nr_threads_synthesize,
+ "number of threads to run for event synthesis"),
OPT_END()
};
bool header_only;
bool nonany_branch_mode;
bool group_set;
+ bool stitch_lbr;
int max_stack;
struct perf_read_values show_threads_values;
struct annotation_options annotation_opts;
return -1;
}
+ if (rep->stitch_lbr)
+ al.thread->lbr_stitch_enable = true;
+
if (symbol_conf.hide_unresolved && al.sym == NULL)
goto out_put;
struct report *rep = container_of(tool, struct report, tool);
if (rep->show_threads) {
- const char *name = perf_evsel__name(evsel);
+ const char *name = evsel__name(evsel);
int err = perf_read_values_add_value(&rep->show_threads_values,
event->read.pid, event->read.tid,
evsel->idx,
bool is_pipe = perf_data__is_pipe(session->data);
if (session->itrace_synth_opts->callchain ||
+ session->itrace_synth_opts->add_callchain ||
(!is_pipe &&
perf_header__has_feat(&session->header, HEADER_AUXTRACE) &&
!session->itrace_synth_opts->set))
sample_type |= PERF_SAMPLE_CALLCHAIN;
- if (session->itrace_synth_opts->last_branch)
+ if (session->itrace_synth_opts->last_branch ||
+ session->itrace_synth_opts->add_last_branch)
sample_type |= PERF_SAMPLE_BRANCH_STACK;
if (!is_pipe && !(sample_type & PERF_SAMPLE_CALLCHAIN)) {
callchain_param.record_mode = CALLCHAIN_FP;
}
+ if (rep->stitch_lbr && (callchain_param.record_mode != CALLCHAIN_LBR)) {
+ ui__warning("Can't find LBR callchain. Switch off --stitch-lbr.\n"
+ "Please apply --call-graph lbr when recording.\n");
+ rep->stitch_lbr = false;
+ }
+
/* ??? handle more cases than just ANY? */
if (!(perf_evlist__combined_branch_type(session->evlist) &
PERF_SAMPLE_BRANCH_ANY))
nr_events = hists->stats.total_non_filtered_period;
}
- if (perf_evsel__is_group_event(evsel)) {
+ if (evsel__is_group_event(evsel)) {
struct evsel *pos;
- perf_evsel__group_desc(evsel, buf, size);
+ evsel__group_desc(evsel, buf, size);
evname = buf;
for_each_group_member(pos, evsel) {
evlist__for_each_entry(evlist, pos) {
struct hists *hists = evsel__hists(pos);
- const char *evname = perf_evsel__name(pos);
+ const char *evname = evsel__name(pos);
- if (symbol_conf.event_group &&
- !perf_evsel__is_group_leader(pos))
+ if (symbol_conf.event_group && !evsel__is_group_leader(pos))
continue;
hists__fprintf_nr_sample_events(hists, rep, evname, stdout);
break;
/* Non-group events are considered as leader */
- if (symbol_conf.event_group &&
- !perf_evsel__is_group_leader(pos)) {
+ if (symbol_conf.event_group && !evsel__is_group_leader(pos)) {
struct hists *leader_hists = evsel__hists(pos->leader);
hists__match(leader_hists, hists);
"Show full source file name path for source lines"),
OPT_BOOLEAN(0, "show-ref-call-graph", &symbol_conf.show_ref_callgraph,
"Show callgraph from reference event"),
+ OPT_BOOLEAN(0, "stitch-lbr", &report.stitch_lbr,
+ "Enable LBR callgraph stitching approach"),
OPT_INTEGER(0, "socket-filter", &report.socket_filter,
"only show processor socket that match with this filter"),
OPT_BOOLEAN(0, "raw-trace", &symbol_conf.raw_trace,
if (symbol_conf.cumulate_callchain && !callchain_param.order_set)
callchain_param.order = ORDER_CALLER;
- if (itrace_synth_opts.callchain &&
+ if ((itrace_synth_opts.callchain || itrace_synth_opts.add_callchain) &&
(int)itrace_synth_opts.callchain_sz > report.max_stack)
report.max_stack = itrace_synth_opts.callchain_sz;
goto error;
}
- if (itrace_synth_opts.last_branch)
+ if (itrace_synth_opts.last_branch || itrace_synth_opts.add_last_branch)
has_br_stack = true;
if (has_br_stack && branch_call_mode)
}
if (branch_call_mode) {
callchain_param.key = CCKEY_ADDRESS;
- callchain_param.branch_callstack = 1;
+ callchain_param.branch_callstack = true;
symbol_conf.use_callchain = true;
callchain_register_param(&callchain_param);
if (sort_order == NULL)
struct evsel *evsel, struct perf_sample *sample,
struct machine *machine __maybe_unused)
{
- const char *comm = perf_evsel__strval(evsel, sample, "comm");
- const u32 pid = perf_evsel__intval(evsel, sample, "pid");
+ const char *comm = evsel__strval(evsel, sample, "comm");
+ const u32 pid = evsel__intval(evsel, sample, "pid");
struct task_desc *waker, *wakee;
if (verbose > 0) {
struct perf_sample *sample,
struct machine *machine __maybe_unused)
{
- const char *prev_comm = perf_evsel__strval(evsel, sample, "prev_comm"),
- *next_comm = perf_evsel__strval(evsel, sample, "next_comm");
- const u32 prev_pid = perf_evsel__intval(evsel, sample, "prev_pid"),
- next_pid = perf_evsel__intval(evsel, sample, "next_pid");
- const u64 prev_state = perf_evsel__intval(evsel, sample, "prev_state");
+ const char *prev_comm = evsel__strval(evsel, sample, "prev_comm"),
+ *next_comm = evsel__strval(evsel, sample, "next_comm");
+ const u32 prev_pid = evsel__intval(evsel, sample, "prev_pid"),
+ next_pid = evsel__intval(evsel, sample, "next_pid");
+ const u64 prev_state = evsel__intval(evsel, sample, "prev_state");
struct task_desc *prev, __maybe_unused *next;
u64 timestamp0, timestamp = sample->time;
int cpu = sample->cpu;
struct perf_sample *sample,
struct machine *machine)
{
- const u32 prev_pid = perf_evsel__intval(evsel, sample, "prev_pid"),
- next_pid = perf_evsel__intval(evsel, sample, "next_pid");
- const u64 prev_state = perf_evsel__intval(evsel, sample, "prev_state");
+ const u32 prev_pid = evsel__intval(evsel, sample, "prev_pid"),
+ next_pid = evsel__intval(evsel, sample, "next_pid");
+ const u64 prev_state = evsel__intval(evsel, sample, "prev_state");
struct work_atoms *out_events, *in_events;
struct thread *sched_out, *sched_in;
u64 timestamp0, timestamp = sample->time;
struct perf_sample *sample,
struct machine *machine)
{
- const u32 pid = perf_evsel__intval(evsel, sample, "pid");
- const u64 runtime = perf_evsel__intval(evsel, sample, "runtime");
+ const u32 pid = evsel__intval(evsel, sample, "pid");
+ const u64 runtime = evsel__intval(evsel, sample, "runtime");
struct thread *thread = machine__findnew_thread(machine, -1, pid);
struct work_atoms *atoms = thread_atoms_search(&sched->atom_root, thread, &sched->cmp_pid);
u64 timestamp = sample->time;
struct perf_sample *sample,
struct machine *machine)
{
- const u32 pid = perf_evsel__intval(evsel, sample, "pid");
+ const u32 pid = evsel__intval(evsel, sample, "pid");
struct work_atoms *atoms;
struct work_atom *atom;
struct thread *wakee;
struct perf_sample *sample,
struct machine *machine)
{
- const u32 pid = perf_evsel__intval(evsel, sample, "pid");
+ const u32 pid = evsel__intval(evsel, sample, "pid");
u64 timestamp = sample->time;
struct work_atoms *atoms;
struct work_atom *atom;
static int map_switch_event(struct perf_sched *sched, struct evsel *evsel,
struct perf_sample *sample, struct machine *machine)
{
- const u32 next_pid = perf_evsel__intval(evsel, sample, "next_pid");
+ const u32 next_pid = evsel__intval(evsel, sample, "next_pid");
struct thread *sched_in;
struct thread_runtime *tr;
int new_shortname;
{
struct perf_sched *sched = container_of(tool, struct perf_sched, tool);
int this_cpu = sample->cpu, err = 0;
- u32 prev_pid = perf_evsel__intval(evsel, sample, "prev_pid"),
- next_pid = perf_evsel__intval(evsel, sample, "next_pid");
+ u32 prev_pid = evsel__intval(evsel, sample, "prev_pid"),
+ next_pid = evsel__intval(evsel, sample, "next_pid");
if (sched->curr_pid[this_cpu] != (u32)-1) {
/*
* returns runtime data for event, allocating memory for it the
* first time it is used.
*/
-static struct evsel_runtime *perf_evsel__get_runtime(struct evsel *evsel)
+static struct evsel_runtime *evsel__get_runtime(struct evsel *evsel)
{
struct evsel_runtime *r = evsel->priv;
/*
* save last time event was seen per cpu
*/
-static void perf_evsel__save_time(struct evsel *evsel,
- u64 timestamp, u32 cpu)
+static void evsel__save_time(struct evsel *evsel, u64 timestamp, u32 cpu)
{
- struct evsel_runtime *r = perf_evsel__get_runtime(evsel);
+ struct evsel_runtime *r = evsel__get_runtime(evsel);
if (r == NULL)
return;
}
/* returns last time this event was seen on the given cpu */
-static u64 perf_evsel__get_time(struct evsel *evsel, u32 cpu)
+static u64 evsel__get_time(struct evsel *evsel, u32 cpu)
{
- struct evsel_runtime *r = perf_evsel__get_runtime(evsel);
+ struct evsel_runtime *r = evsel__get_runtime(evsel);
if ((r == NULL) || (r->last_time == NULL) || (cpu >= r->ncpu))
return 0;
u64 t, int state)
{
struct thread_runtime *tr = thread__priv(thread);
- const char *next_comm = perf_evsel__strval(evsel, sample, "next_comm");
- const u32 next_pid = perf_evsel__intval(evsel, sample, "next_pid");
+ const char *next_comm = evsel__strval(evsel, sample, "next_comm");
+ const u32 next_pid = evsel__intval(evsel, sample, "next_pid");
u32 max_cpus = sched->max_cpu + 1;
char tstr[64];
char nstr[30];
struct evsel *evsel)
{
/* pid 0 == swapper == idle task */
- if (strcmp(perf_evsel__name(evsel), "sched:sched_switch") == 0)
- return perf_evsel__intval(evsel, sample, "prev_pid") == 0;
+ if (strcmp(evsel__name(evsel), "sched:sched_switch") == 0)
+ return evsel__intval(evsel, sample, "prev_pid") == 0;
return sample->pid == 0;
}
itr->last_thread = thread;
/* copy task callchain when entering to idle */
- if (perf_evsel__intval(evsel, sample, "next_pid") == 0)
+ if (evsel__intval(evsel, sample, "next_pid") == 0)
save_idle_callchain(sched, itr, sample);
}
}
}
if (sched->idle_hist) {
- if (strcmp(perf_evsel__name(evsel), "sched:sched_switch"))
+ if (strcmp(evsel__name(evsel), "sched:sched_switch"))
rc = true;
- else if (perf_evsel__intval(evsel, sample, "prev_pid") != 0 &&
- perf_evsel__intval(evsel, sample, "next_pid") != 0)
+ else if (evsel__intval(evsel, sample, "prev_pid") != 0 &&
+ evsel__intval(evsel, sample, "next_pid") != 0)
rc = true;
}
struct thread *thread;
struct thread_runtime *tr = NULL;
/* want pid of awakened task not pid in sample */
- const u32 pid = perf_evsel__intval(evsel, sample, "pid");
+ const u32 pid = evsel__intval(evsel, sample, "pid");
thread = machine__findnew_thread(machine, 0, pid);
if (thread == NULL)
return;
max_cpus = sched->max_cpu + 1;
- ocpu = perf_evsel__intval(evsel, sample, "orig_cpu");
- dcpu = perf_evsel__intval(evsel, sample, "dest_cpu");
+ ocpu = evsel__intval(evsel, sample, "orig_cpu");
+ dcpu = evsel__intval(evsel, sample, "dest_cpu");
thread = machine__findnew_thread(machine, sample->pid, sample->tid);
if (thread == NULL)
struct thread *thread;
struct thread_runtime *tr = NULL;
/* want pid of migrated task not pid in sample */
- const u32 pid = perf_evsel__intval(evsel, sample, "pid");
+ const u32 pid = evsel__intval(evsel, sample, "pid");
thread = machine__findnew_thread(machine, 0, pid);
if (thread == NULL)
struct thread_runtime *tr = NULL;
u64 tprev, t = sample->time;
int rc = 0;
- int state = perf_evsel__intval(evsel, sample, "prev_state");
-
+ int state = evsel__intval(evsel, sample, "prev_state");
if (machine__resolve(machine, &al, sample) < 0) {
pr_err("problem processing %d event. skipping it\n",
goto out;
}
- tprev = perf_evsel__get_time(evsel, sample->cpu);
+ tprev = evsel__get_time(evsel, sample->cpu);
/*
* If start time given:
tr->ready_to_run = 0;
}
- perf_evsel__save_time(evsel, sample->time, sample->cpu);
+ evsel__save_time(evsel, sample->time, sample->cpu);
return rc;
}
struct evsel_runtime *er;
list_for_each_entry(evsel, &evlist->core.entries, core.node) {
- er = perf_evsel__get_runtime(evsel);
+ er = evsel__get_runtime(evsel);
if (er == NULL) {
pr_err("Failed to allocate memory for evsel runtime data\n");
return -1;
struct evsel_script *es = zalloc(sizeof(*es));
if (es != NULL) {
- if (asprintf(&es->filename, "%s.%s.dump", data->file.path, perf_evsel__name(evsel)) < 0)
+ if (asprintf(&es->filename, "%s.%s.dump", data->file.path, evsel__name(evsel)) < 0)
goto out_free;
es->fp = fopen(es->filename, "w");
if (es->fp == NULL)
#define PRINT_FIELD(x) (output[output_type(attr->type)].fields & PERF_OUTPUT_##x)
-static int perf_evsel__do_check_stype(struct evsel *evsel,
- u64 sample_type, const char *sample_msg,
- enum perf_output_field field,
- bool allow_user_set)
+static int evsel__do_check_stype(struct evsel *evsel, u64 sample_type, const char *sample_msg,
+ enum perf_output_field field, bool allow_user_set)
{
struct perf_event_attr *attr = &evsel->core.attr;
int type = output_type(attr->type);
if (output[type].user_set_fields & field) {
if (allow_user_set)
return 0;
- evname = perf_evsel__name(evsel);
+ evname = evsel__name(evsel);
pr_err("Samples for '%s' event do not have %s attribute set. "
"Cannot print '%s' field.\n",
evname, sample_msg, output_field2str(field));
/* user did not ask for it explicitly so remove from the default list */
output[type].fields &= ~field;
- evname = perf_evsel__name(evsel);
+ evname = evsel__name(evsel);
pr_debug("Samples for '%s' event do not have %s attribute set. "
"Skipping '%s' field.\n",
evname, sample_msg, output_field2str(field));
return 0;
}
-static int perf_evsel__check_stype(struct evsel *evsel,
- u64 sample_type, const char *sample_msg,
- enum perf_output_field field)
+static int evsel__check_stype(struct evsel *evsel, u64 sample_type, const char *sample_msg,
+ enum perf_output_field field)
{
- return perf_evsel__do_check_stype(evsel, sample_type, sample_msg, field,
- false);
+ return evsel__do_check_stype(evsel, sample_type, sample_msg, field, false);
}
-static int perf_evsel__check_attr(struct evsel *evsel,
- struct perf_session *session)
+static int perf_evsel__check_attr(struct evsel *evsel, struct perf_session *session)
{
struct perf_event_attr *attr = &evsel->core.attr;
bool allow_user_set;
HEADER_AUXTRACE);
if (PRINT_FIELD(TRACE) &&
- !perf_session__has_traces(session, "record -R"))
+ !perf_session__has_traces(session, "record -R"))
return -EINVAL;
if (PRINT_FIELD(IP)) {
- if (perf_evsel__check_stype(evsel, PERF_SAMPLE_IP, "IP",
- PERF_OUTPUT_IP))
+ if (evsel__check_stype(evsel, PERF_SAMPLE_IP, "IP", PERF_OUTPUT_IP))
return -EINVAL;
}
if (PRINT_FIELD(ADDR) &&
- perf_evsel__do_check_stype(evsel, PERF_SAMPLE_ADDR, "ADDR",
- PERF_OUTPUT_ADDR, allow_user_set))
+ evsel__do_check_stype(evsel, PERF_SAMPLE_ADDR, "ADDR", PERF_OUTPUT_ADDR, allow_user_set))
return -EINVAL;
if (PRINT_FIELD(DATA_SRC) &&
- perf_evsel__check_stype(evsel, PERF_SAMPLE_DATA_SRC, "DATA_SRC",
- PERF_OUTPUT_DATA_SRC))
+ evsel__check_stype(evsel, PERF_SAMPLE_DATA_SRC, "DATA_SRC", PERF_OUTPUT_DATA_SRC))
return -EINVAL;
if (PRINT_FIELD(WEIGHT) &&
- perf_evsel__check_stype(evsel, PERF_SAMPLE_WEIGHT, "WEIGHT",
- PERF_OUTPUT_WEIGHT))
+ evsel__check_stype(evsel, PERF_SAMPLE_WEIGHT, "WEIGHT", PERF_OUTPUT_WEIGHT))
return -EINVAL;
if (PRINT_FIELD(SYM) &&
- !(evsel->core.attr.sample_type & (PERF_SAMPLE_IP|PERF_SAMPLE_ADDR))) {
+ !(evsel->core.attr.sample_type & (PERF_SAMPLE_IP|PERF_SAMPLE_ADDR))) {
pr_err("Display of symbols requested but neither sample IP nor "
"sample address\navailable. Hence, no addresses to convert "
"to symbols.\n");
return -EINVAL;
}
if (PRINT_FIELD(DSO) &&
- !(evsel->core.attr.sample_type & (PERF_SAMPLE_IP|PERF_SAMPLE_ADDR))) {
+ !(evsel->core.attr.sample_type & (PERF_SAMPLE_IP|PERF_SAMPLE_ADDR))) {
pr_err("Display of DSO requested but no address to convert.\n");
return -EINVAL;
}
return -EINVAL;
}
if ((PRINT_FIELD(PID) || PRINT_FIELD(TID)) &&
- perf_evsel__check_stype(evsel, PERF_SAMPLE_TID, "TID",
- PERF_OUTPUT_TID|PERF_OUTPUT_PID))
+ evsel__check_stype(evsel, PERF_SAMPLE_TID, "TID", PERF_OUTPUT_TID|PERF_OUTPUT_PID))
return -EINVAL;
if (PRINT_FIELD(TIME) &&
- perf_evsel__check_stype(evsel, PERF_SAMPLE_TIME, "TIME",
- PERF_OUTPUT_TIME))
+ evsel__check_stype(evsel, PERF_SAMPLE_TIME, "TIME", PERF_OUTPUT_TIME))
return -EINVAL;
if (PRINT_FIELD(CPU) &&
- perf_evsel__do_check_stype(evsel, PERF_SAMPLE_CPU, "CPU",
- PERF_OUTPUT_CPU, allow_user_set))
+ evsel__do_check_stype(evsel, PERF_SAMPLE_CPU, "CPU", PERF_OUTPUT_CPU, allow_user_set))
return -EINVAL;
if (PRINT_FIELD(IREGS) &&
- perf_evsel__check_stype(evsel, PERF_SAMPLE_REGS_INTR, "IREGS",
- PERF_OUTPUT_IREGS))
+ evsel__check_stype(evsel, PERF_SAMPLE_REGS_INTR, "IREGS", PERF_OUTPUT_IREGS))
return -EINVAL;
if (PRINT_FIELD(UREGS) &&
- perf_evsel__check_stype(evsel, PERF_SAMPLE_REGS_USER, "UREGS",
- PERF_OUTPUT_UREGS))
+ evsel__check_stype(evsel, PERF_SAMPLE_REGS_USER, "UREGS", PERF_OUTPUT_UREGS))
return -EINVAL;
if (PRINT_FIELD(PHYS_ADDR) &&
- perf_evsel__check_stype(evsel, PERF_SAMPLE_PHYS_ADDR, "PHYS_ADDR",
- PERF_OUTPUT_PHYS_ADDR))
+ evsel__check_stype(evsel, PERF_SAMPLE_PHYS_ADDR, "PHYS_ADDR", PERF_OUTPUT_PHYS_ADDR))
return -EINVAL;
return 0;
printed += fprintf(fp, "%5s:0x%"PRIx64" ", perf_reg_name(r), val);
}
- fprintf(fp, "\n");
-
return printed;
}
bool show_cgroup_events;
bool allocated;
bool per_event_dump;
+ bool stitch_lbr;
struct evswitch evswitch;
struct perf_cpu_map *cpus;
struct perf_thread_map *threads;
int max = 0;
evlist__for_each_entry(evlist, evsel) {
- int len = strlen(perf_evsel__name(evsel));
+ int len = strlen(evsel__name(evsel));
max = MAX(len, max);
}
fprintf(fp, "%10" PRIu64 " ", sample->period);
if (PRINT_FIELD(EVNAME)) {
- const char *evname = perf_evsel__name(evsel);
+ const char *evname = evsel__name(evsel);
if (!script->name_width)
script->name_width = perf_evlist__max_name_len(script->session->evlist);
if (PRINT_FIELD(IP)) {
struct callchain_cursor *cursor = NULL;
+ if (script->stitch_lbr)
+ al->thread->lbr_stitch_enable = true;
+
if (symbol_conf.use_callchain && sample->callchain &&
thread__resolve_callchain(al->thread, &callchain_cursor, evsel,
sample, NULL, NULL, scripting_max_stack) == 0)
else if (PRINT_FIELD(BRSTACKOFF))
perf_sample__fprintf_brstackoff(sample, thread, attr, fp);
- if (perf_evsel__is_bpf_output(evsel) && PRINT_FIELD(BPF_OUTPUT))
+ if (evsel__is_bpf_output(evsel) && PRINT_FIELD(BPF_OUTPUT))
perf_sample__fprintf_bpf_output(sample, fp);
perf_sample__fprintf_insn(sample, attr, thread, machine, fp);
static void __process_stat(struct evsel *counter, u64 tstamp)
{
int nthreads = perf_thread_map__nr(counter->core.threads);
- int ncpus = perf_evsel__nr_cpus(counter);
+ int ncpus = evsel__nr_cpus(counter);
int cpu, thread;
static int header_printed;
counts->ena,
counts->run,
tstamp,
- perf_evsel__name(counter));
+ evsel__name(counter));
}
}
}
static bool filter_cpu(struct perf_sample *sample)
{
- if (cpu_list)
+ if (cpu_list && sample->cpu != (u32)-1)
return !test_bit(sample->cpu, cpu_bitmap);
return false;
}
return err;
}
-static int process_comm_event(struct perf_tool *tool,
- union perf_event *event,
- struct perf_sample *sample,
- struct machine *machine)
+static int print_event_with_time(struct perf_tool *tool,
+ union perf_event *event,
+ struct perf_sample *sample,
+ struct machine *machine,
+ pid_t pid, pid_t tid, u64 timestamp)
{
- struct thread *thread;
struct perf_script *script = container_of(tool, struct perf_script, tool);
struct perf_session *session = script->session;
struct evsel *evsel = perf_evlist__id2evsel(session->evlist, sample->id);
- int ret = -1;
+ struct thread *thread = NULL;
- thread = machine__findnew_thread(machine, event->comm.pid, event->comm.tid);
- if (thread == NULL) {
- pr_debug("problem processing COMM event, skipping it.\n");
- return -1;
+ if (evsel && !evsel->core.attr.sample_id_all) {
+ sample->cpu = 0;
+ sample->time = timestamp;
+ sample->pid = pid;
+ sample->tid = tid;
}
- if (perf_event__process_comm(tool, event, sample, machine) < 0)
- goto out;
+ if (filter_cpu(sample))
+ return 0;
- if (!evsel->core.attr.sample_id_all) {
- sample->cpu = 0;
- sample->time = 0;
- sample->tid = event->comm.tid;
- sample->pid = event->comm.pid;
- }
- if (!filter_cpu(sample)) {
+ if (tid != -1)
+ thread = machine__findnew_thread(machine, pid, tid);
+
+ if (thread && evsel) {
perf_sample__fprintf_start(sample, thread, evsel,
- PERF_RECORD_COMM, stdout);
- perf_event__fprintf(event, stdout);
+ event->header.type, stdout);
}
- ret = 0;
-out:
+
+ perf_event__fprintf(event, stdout);
+
thread__put(thread);
- return ret;
+
+ return 0;
+}
+
+static int print_event(struct perf_tool *tool, union perf_event *event,
+ struct perf_sample *sample, struct machine *machine,
+ pid_t pid, pid_t tid)
+{
+ return print_event_with_time(tool, event, sample, machine, pid, tid, 0);
+}
+
+static int process_comm_event(struct perf_tool *tool,
+ union perf_event *event,
+ struct perf_sample *sample,
+ struct machine *machine)
+{
+ if (perf_event__process_comm(tool, event, sample, machine) < 0)
+ return -1;
+
+ return print_event(tool, event, sample, machine, event->comm.pid,
+ event->comm.tid);
}
static int process_namespaces_event(struct perf_tool *tool,
struct perf_sample *sample,
struct machine *machine)
{
- struct thread *thread;
- struct perf_script *script = container_of(tool, struct perf_script, tool);
- struct perf_session *session = script->session;
- struct evsel *evsel = perf_evlist__id2evsel(session->evlist, sample->id);
- int ret = -1;
-
- thread = machine__findnew_thread(machine, event->namespaces.pid,
- event->namespaces.tid);
- if (thread == NULL) {
- pr_debug("problem processing NAMESPACES event, skipping it.\n");
- return -1;
- }
-
if (perf_event__process_namespaces(tool, event, sample, machine) < 0)
- goto out;
+ return -1;
- if (!evsel->core.attr.sample_id_all) {
- sample->cpu = 0;
- sample->time = 0;
- sample->tid = event->namespaces.tid;
- sample->pid = event->namespaces.pid;
- }
- if (!filter_cpu(sample)) {
- perf_sample__fprintf_start(sample, thread, evsel,
- PERF_RECORD_NAMESPACES, stdout);
- perf_event__fprintf(event, stdout);
- }
- ret = 0;
-out:
- thread__put(thread);
- return ret;
+ return print_event(tool, event, sample, machine, event->namespaces.pid,
+ event->namespaces.tid);
}
static int process_cgroup_event(struct perf_tool *tool,
struct perf_sample *sample,
struct machine *machine)
{
- struct thread *thread;
- struct perf_script *script = container_of(tool, struct perf_script, tool);
- struct perf_session *session = script->session;
- struct evsel *evsel = perf_evlist__id2evsel(session->evlist, sample->id);
- int ret = -1;
-
- thread = machine__findnew_thread(machine, sample->pid, sample->tid);
- if (thread == NULL) {
- pr_debug("problem processing CGROUP event, skipping it.\n");
- return -1;
- }
-
if (perf_event__process_cgroup(tool, event, sample, machine) < 0)
- goto out;
+ return -1;
- if (!evsel->core.attr.sample_id_all) {
- sample->cpu = 0;
- sample->time = 0;
- }
- if (!filter_cpu(sample)) {
- perf_sample__fprintf_start(sample, thread, evsel,
- PERF_RECORD_CGROUP, stdout);
- perf_event__fprintf(event, stdout);
- }
- ret = 0;
-out:
- thread__put(thread);
- return ret;
+ return print_event(tool, event, sample, machine, sample->pid,
+ sample->tid);
}
static int process_fork_event(struct perf_tool *tool,
struct perf_sample *sample,
struct machine *machine)
{
- struct thread *thread;
- struct perf_script *script = container_of(tool, struct perf_script, tool);
- struct perf_session *session = script->session;
- struct evsel *evsel = perf_evlist__id2evsel(session->evlist, sample->id);
-
if (perf_event__process_fork(tool, event, sample, machine) < 0)
return -1;
- thread = machine__findnew_thread(machine, event->fork.pid, event->fork.tid);
- if (thread == NULL) {
- pr_debug("problem processing FORK event, skipping it.\n");
- return -1;
- }
-
- if (!evsel->core.attr.sample_id_all) {
- sample->cpu = 0;
- sample->time = event->fork.time;
- sample->tid = event->fork.tid;
- sample->pid = event->fork.pid;
- }
- if (!filter_cpu(sample)) {
- perf_sample__fprintf_start(sample, thread, evsel,
- PERF_RECORD_FORK, stdout);
- perf_event__fprintf(event, stdout);
- }
- thread__put(thread);
-
- return 0;
+ return print_event_with_time(tool, event, sample, machine,
+ event->fork.pid, event->fork.tid,
+ event->fork.time);
}
static int process_exit_event(struct perf_tool *tool,
union perf_event *event,
struct perf_sample *sample,
struct machine *machine)
{
- int err = 0;
- struct thread *thread;
- struct perf_script *script = container_of(tool, struct perf_script, tool);
- struct perf_session *session = script->session;
- struct evsel *evsel = perf_evlist__id2evsel(session->evlist, sample->id);
-
- thread = machine__findnew_thread(machine, event->fork.pid, event->fork.tid);
- if (thread == NULL) {
- pr_debug("problem processing EXIT event, skipping it.\n");
+ /* Print before 'exit' deletes anything */
+ if (print_event_with_time(tool, event, sample, machine, event->fork.pid,
+ event->fork.tid, event->fork.time))
return -1;
- }
-
- if (!evsel->core.attr.sample_id_all) {
- sample->cpu = 0;
- sample->time = 0;
- sample->tid = event->fork.tid;
- sample->pid = event->fork.pid;
- }
- if (!filter_cpu(sample)) {
- perf_sample__fprintf_start(sample, thread, evsel,
- PERF_RECORD_EXIT, stdout);
- perf_event__fprintf(event, stdout);
- }
-
- if (perf_event__process_exit(tool, event, sample, machine) < 0)
- err = -1;
- thread__put(thread);
- return err;
+ return perf_event__process_exit(tool, event, sample, machine);
}
static int process_mmap_event(struct perf_tool *tool,
struct perf_sample *sample,
struct machine *machine)
{
- struct thread *thread;
- struct perf_script *script = container_of(tool, struct perf_script, tool);
- struct perf_session *session = script->session;
- struct evsel *evsel = perf_evlist__id2evsel(session->evlist, sample->id);
-
if (perf_event__process_mmap(tool, event, sample, machine) < 0)
return -1;
- thread = machine__findnew_thread(machine, event->mmap.pid, event->mmap.tid);
- if (thread == NULL) {
- pr_debug("problem processing MMAP event, skipping it.\n");
- return -1;
- }
-
- if (!evsel->core.attr.sample_id_all) {
- sample->cpu = 0;
- sample->time = 0;
- sample->tid = event->mmap.tid;
- sample->pid = event->mmap.pid;
- }
- if (!filter_cpu(sample)) {
- perf_sample__fprintf_start(sample, thread, evsel,
- PERF_RECORD_MMAP, stdout);
- perf_event__fprintf(event, stdout);
- }
- thread__put(thread);
- return 0;
+ return print_event(tool, event, sample, machine, event->mmap.pid,
+ event->mmap.tid);
}
static int process_mmap2_event(struct perf_tool *tool,
struct perf_sample *sample,
struct machine *machine)
{
- struct thread *thread;
- struct perf_script *script = container_of(tool, struct perf_script, tool);
- struct perf_session *session = script->session;
- struct evsel *evsel = perf_evlist__id2evsel(session->evlist, sample->id);
-
if (perf_event__process_mmap2(tool, event, sample, machine) < 0)
return -1;
- thread = machine__findnew_thread(machine, event->mmap2.pid, event->mmap2.tid);
- if (thread == NULL) {
- pr_debug("problem processing MMAP2 event, skipping it.\n");
- return -1;
- }
-
- if (!evsel->core.attr.sample_id_all) {
- sample->cpu = 0;
- sample->time = 0;
- sample->tid = event->mmap2.tid;
- sample->pid = event->mmap2.pid;
- }
- if (!filter_cpu(sample)) {
- perf_sample__fprintf_start(sample, thread, evsel,
- PERF_RECORD_MMAP2, stdout);
- perf_event__fprintf(event, stdout);
- }
- thread__put(thread);
- return 0;
+ return print_event(tool, event, sample, machine, event->mmap2.pid,
+ event->mmap2.tid);
}
static int process_switch_event(struct perf_tool *tool,
struct perf_sample *sample,
struct machine *machine)
{
- struct thread *thread;
struct perf_script *script = container_of(tool, struct perf_script, tool);
- struct perf_session *session = script->session;
- struct evsel *evsel = perf_evlist__id2evsel(session->evlist, sample->id);
if (perf_event__process_switch(tool, event, sample, machine) < 0)
return -1;
if (!script->show_switch_events)
return 0;
- thread = machine__findnew_thread(machine, sample->pid,
- sample->tid);
- if (thread == NULL) {
- pr_debug("problem processing SWITCH event, skipping it.\n");
- return -1;
- }
-
- if (!filter_cpu(sample)) {
- perf_sample__fprintf_start(sample, thread, evsel,
- PERF_RECORD_SWITCH, stdout);
- perf_event__fprintf(event, stdout);
- }
- thread__put(thread);
- return 0;
+ return print_event(tool, event, sample, machine, sample->pid,
+ sample->tid);
}
static int
struct perf_sample *sample,
struct machine *machine)
{
- struct perf_script *script = container_of(tool, struct perf_script, tool);
- struct perf_session *session = script->session;
- struct evsel *evsel = perf_evlist__id2evsel(session->evlist, sample->id);
- struct thread *thread;
-
- thread = machine__findnew_thread(machine, sample->pid,
- sample->tid);
- if (thread == NULL)
- return -1;
-
- if (!filter_cpu(sample)) {
- perf_sample__fprintf_start(sample, thread, evsel,
- PERF_RECORD_LOST, stdout);
- perf_event__fprintf(event, stdout);
- }
- thread__put(thread);
- return 0;
+ return print_event(tool, event, sample, machine, sample->pid,
+ sample->tid);
}
static int
struct perf_sample *sample,
struct machine *machine)
{
- struct thread *thread;
- struct perf_script *script = container_of(tool, struct perf_script, tool);
- struct perf_session *session = script->session;
- struct evsel *evsel = perf_evlist__id2evsel(session->evlist, sample->id);
-
if (machine__process_ksymbol(machine, event, sample) < 0)
return -1;
- if (!evsel->core.attr.sample_id_all) {
- perf_event__fprintf(event, stdout);
- return 0;
- }
-
- thread = machine__findnew_thread(machine, sample->pid, sample->tid);
- if (thread == NULL) {
- pr_debug("problem processing MMAP event, skipping it.\n");
- return -1;
- }
-
- if (!filter_cpu(sample)) {
- perf_sample__fprintf_start(sample, thread, evsel,
- event->header.type, stdout);
- perf_event__fprintf(event, stdout);
- }
-
- thread__put(thread);
- return 0;
+ return print_event(tool, event, sample, machine, sample->pid,
+ sample->tid);
}
static void sig_handler(int sig __maybe_unused)
match = 0;
evlist__for_each_entry(session->evlist, pos) {
- if (!strcmp(perf_evsel__name(pos), evname)) {
+ if (!strcmp(evsel__name(pos), evname)) {
match = 1;
break;
}
else
callchain_param.record_mode = CALLCHAIN_FP;
}
+
+ if (script->stitch_lbr && (callchain_param.record_mode != CALLCHAIN_LBR)) {
+ pr_warning("Can't find LBR callchain. Switch off --stitch-lbr.\n"
+ "Please apply --call-graph lbr when recording.\n");
+ script->stitch_lbr = false;
+ }
}
static int process_stat_round_event(struct perf_session *session,
"file", "file saving guest os /proc/kallsyms"),
OPT_STRING(0, "guestmodules", &symbol_conf.default_guest_modules,
"file", "file saving guest os /proc/modules"),
+ OPT_BOOLEAN('\0', "stitch-lbr", &script.stitch_lbr,
+ "Enable LBR callgraph stitching approach"),
OPTS_EVSWITCH(&script.evswitch),
OPT_END()
};
return -1;
}
- if (itrace_synth_opts.callchain &&
+ if ((itrace_synth_opts.callchain || itrace_synth_opts.add_callchain) &&
itrace_synth_opts.callchain_sz > scripting_max_stack)
scripting_max_stack = itrace_synth_opts.callchain_sz;
#define SID(e, x, y) xyarray__entry(e->core.sample_id, x, y)
-static int
-perf_evsel__write_stat_event(struct evsel *counter, u32 cpu, u32 thread,
- struct perf_counts_values *count)
+static int evsel__write_stat_event(struct evsel *counter, u32 cpu, u32 thread,
+ struct perf_counts_values *count)
{
struct perf_sample_id *sid = SID(counter, cpu, thread);
count->val = val;
return 0;
}
- return perf_evsel__read_counter(counter, cpu, thread);
+ return evsel__read_counter(counter, cpu, thread);
}
/*
/*
* The leader's group read loads data into its group members
- * (via perf_evsel__read_counter()) and sets their count->loaded.
+ * (via evsel__read_counter()) and sets their count->loaded.
*/
if (!perf_counts__is_loaded(counter->counts, cpu, thread) &&
read_single_counter(counter, cpu, thread, rs)) {
perf_counts__set_loaded(counter->counts, cpu, thread, false);
if (STAT_RECORD) {
- if (perf_evsel__write_stat_event(counter, cpu, thread, count)) {
+ if (evsel__write_stat_event(counter, cpu, thread, count)) {
pr_err("failed to write stat event\n");
return -1;
}
if (verbose > 1) {
fprintf(stat_config.output,
"%s: %d: %" PRIu64 " %" PRIu64 " %" PRIu64 "\n",
- perf_evsel__name(counter),
+ evsel__name(counter),
cpu,
count->val, count->ena, count->run);
}
clock_gettime(CLOCK_MONOTONIC, &ts);
diff_timespec(&rs, &ts, &ref_time);
+ perf_stat__reset_shadow_per_stat(&rt_stat);
read_counters(&rs);
if (STAT_RECORD) {
workload_exec_errno = info->si_value.sival_int;
}
-static bool perf_evsel__should_store_id(struct evsel *counter)
+static bool evsel__should_store_id(struct evsel *counter)
{
return STAT_RECORD || counter->core.attr.read_format & PERF_FORMAT_ID;
}
errno == ENXIO) {
if (verbose > 0)
ui__warning("%s event is not supported by the kernel.\n",
- perf_evsel__name(counter));
+ evsel__name(counter));
counter->supported = false;
/*
* errored is a sticky flag that means one of the counter's
if ((counter->leader != counter) ||
!(counter->leader->core.nr_members > 1))
return COUNTER_SKIP;
- } else if (perf_evsel__fallback(counter, errno, msg, sizeof(msg))) {
+ } else if (evsel__fallback(counter, errno, msg, sizeof(msg))) {
if (verbose > 0)
ui__warning("%s\n", msg);
return COUNTER_RETRY;
}
}
- perf_evsel__open_strerror(counter, &target,
- errno, msg, sizeof(msg));
+ evsel__open_strerror(counter, &target, errno, msg, sizeof(msg));
ui__error("%s\n", msg);
if (child_pid != -1)
if (!counter->reset_group)
continue;
try_again_reset:
- pr_debug2("reopening weak %s\n", perf_evsel__name(counter));
+ pr_debug2("reopening weak %s\n", evsel__name(counter));
if (create_perf_stat_counter(counter, &stat_config, &target,
counter->cpu_iter - 1) < 0) {
if (l > stat_config.unit_width)
stat_config.unit_width = l;
- if (perf_evsel__should_store_id(counter) &&
- perf_evsel__store_ids(counter, evsel_list))
+ if (evsel__should_store_id(counter) &&
+ evsel__store_ids(counter, evsel_list))
return -1;
}
if (perf_evlist__apply_filters(evsel_list, &counter)) {
pr_err("failed to set filter \"%s\" on event %s with %d (%s)\n",
- counter->filter, perf_evsel__name(counter), errno,
+ counter->filter, evsel__name(counter), errno,
str_error_r(errno, msg, sizeof(msg)));
return -1;
}
break;
}
}
- if (child_pid != -1)
+ if (child_pid != -1) {
+ if (timeout)
+ kill(child_pid, SIGTERM);
wait4(child_pid, &status, 0, &stat_config.ru_data);
+ }
if (workload_exec_errno) {
const char *emsg = str_error_r(workload_exec_errno, msg, sizeof(msg));
struct perf_sample *sample,
const char *backtrace __maybe_unused)
{
- u32 state = perf_evsel__intval(evsel, sample, "state");
- u32 cpu_id = perf_evsel__intval(evsel, sample, "cpu_id");
+ u32 state = evsel__intval(evsel, sample, "state");
+ u32 cpu_id = evsel__intval(evsel, sample, "cpu_id");
if (state == (u32)PWR_EVENT_EXIT)
c_state_end(tchart, cpu_id, sample->time);
struct perf_sample *sample,
const char *backtrace __maybe_unused)
{
- u32 state = perf_evsel__intval(evsel, sample, "state");
- u32 cpu_id = perf_evsel__intval(evsel, sample, "cpu_id");
+ u32 state = evsel__intval(evsel, sample, "state");
+ u32 cpu_id = evsel__intval(evsel, sample, "cpu_id");
p_state_change(tchart, cpu_id, sample->time, state);
return 0;
struct perf_sample *sample,
const char *backtrace)
{
- u8 flags = perf_evsel__intval(evsel, sample, "common_flags");
- int waker = perf_evsel__intval(evsel, sample, "common_pid");
- int wakee = perf_evsel__intval(evsel, sample, "pid");
+ u8 flags = evsel__intval(evsel, sample, "common_flags");
+ int waker = evsel__intval(evsel, sample, "common_pid");
+ int wakee = evsel__intval(evsel, sample, "pid");
sched_wakeup(tchart, sample->cpu, sample->time, waker, wakee, flags, backtrace);
return 0;
struct perf_sample *sample,
const char *backtrace)
{
- int prev_pid = perf_evsel__intval(evsel, sample, "prev_pid");
- int next_pid = perf_evsel__intval(evsel, sample, "next_pid");
- u64 prev_state = perf_evsel__intval(evsel, sample, "prev_state");
+ int prev_pid = evsel__intval(evsel, sample, "prev_pid");
+ int next_pid = evsel__intval(evsel, sample, "next_pid");
+ u64 prev_state = evsel__intval(evsel, sample, "prev_state");
sched_switch(tchart, sample->cpu, sample->time, prev_pid, next_pid,
prev_state, backtrace);
struct perf_sample *sample,
const char *backtrace __maybe_unused)
{
- u64 cpu_id = perf_evsel__intval(evsel, sample, "cpu_id");
- u64 value = perf_evsel__intval(evsel, sample, "value");
+ u64 cpu_id = evsel__intval(evsel, sample, "cpu_id");
+ u64 value = evsel__intval(evsel, sample, "value");
c_state_start(cpu_id, sample->time, value);
return 0;
struct perf_sample *sample,
const char *backtrace __maybe_unused)
{
- u64 cpu_id = perf_evsel__intval(evsel, sample, "cpu_id");
- u64 value = perf_evsel__intval(evsel, sample, "value");
+ u64 cpu_id = evsel__intval(evsel, sample, "cpu_id");
+ u64 value = evsel__intval(evsel, sample, "value");
p_state_change(tchart, cpu_id, sample->time, value);
return 0;
struct evsel *evsel,
struct perf_sample *sample)
{
- long fd = perf_evsel__intval(evsel, sample, "fd");
+ long fd = evsel__intval(evsel, sample, "fd");
return pid_begin_io_sample(tchart, sample->tid, IOTYPE_READ,
sample->time, fd);
}
struct evsel *evsel,
struct perf_sample *sample)
{
- long ret = perf_evsel__intval(evsel, sample, "ret");
+ long ret = evsel__intval(evsel, sample, "ret");
return pid_end_io_sample(tchart, sample->tid, IOTYPE_READ,
sample->time, ret);
}
struct evsel *evsel,
struct perf_sample *sample)
{
- long fd = perf_evsel__intval(evsel, sample, "fd");
+ long fd = evsel__intval(evsel, sample, "fd");
return pid_begin_io_sample(tchart, sample->tid, IOTYPE_WRITE,
sample->time, fd);
}
struct evsel *evsel,
struct perf_sample *sample)
{
- long ret = perf_evsel__intval(evsel, sample, "ret");
+ long ret = evsel__intval(evsel, sample, "ret");
return pid_end_io_sample(tchart, sample->tid, IOTYPE_WRITE,
sample->time, ret);
}
struct evsel *evsel,
struct perf_sample *sample)
{
- long fd = perf_evsel__intval(evsel, sample, "fd");
+ long fd = evsel__intval(evsel, sample, "fd");
return pid_begin_io_sample(tchart, sample->tid, IOTYPE_SYNC,
sample->time, fd);
}
struct evsel *evsel,
struct perf_sample *sample)
{
- long ret = perf_evsel__intval(evsel, sample, "ret");
+ long ret = evsel__intval(evsel, sample, "ret");
return pid_end_io_sample(tchart, sample->tid, IOTYPE_SYNC,
sample->time, ret);
}
struct evsel *evsel,
struct perf_sample *sample)
{
- long fd = perf_evsel__intval(evsel, sample, "fd");
+ long fd = evsel__intval(evsel, sample, "fd");
return pid_begin_io_sample(tchart, sample->tid, IOTYPE_TX,
sample->time, fd);
}
struct evsel *evsel,
struct perf_sample *sample)
{
- long ret = perf_evsel__intval(evsel, sample, "ret");
+ long ret = evsel__intval(evsel, sample, "ret");
return pid_end_io_sample(tchart, sample->tid, IOTYPE_TX,
sample->time, ret);
}
struct evsel *evsel,
struct perf_sample *sample)
{
- long fd = perf_evsel__intval(evsel, sample, "fd");
+ long fd = evsel__intval(evsel, sample, "fd");
return pid_begin_io_sample(tchart, sample->tid, IOTYPE_RX,
sample->time, fd);
}
struct evsel *evsel,
struct perf_sample *sample)
{
- long ret = perf_evsel__intval(evsel, sample, "ret");
+ long ret = evsel__intval(evsel, sample, "ret");
return pid_end_io_sample(tchart, sample->tid, IOTYPE_RX,
sample->time, ret);
}
struct evsel *evsel,
struct perf_sample *sample)
{
- long fd = perf_evsel__intval(evsel, sample, "fd");
+ long fd = evsel__intval(evsel, sample, "fd");
return pid_begin_io_sample(tchart, sample->tid, IOTYPE_POLL,
sample->time, fd);
}
struct evsel *evsel,
struct perf_sample *sample)
{
- long ret = perf_evsel__intval(evsel, sample, "ret");
+ long ret = evsel__intval(evsel, sample, "ret");
return pid_end_io_sample(tchart, sample->tid, IOTYPE_POLL,
sample->time, ret);
}
#include "util/map.h"
#include "util/mmap.h"
#include "util/session.h"
+#include "util/thread.h"
#include "util/symbol.h"
#include "util/synthetic-events.h"
#include "util/top.h"
if (notes->src == NULL)
goto out_unlock;
- printf("Showing %s for %s\n", perf_evsel__name(top->sym_evsel), symbol->name);
+ printf("Showing %s for %s\n", evsel__name(top->sym_evsel), symbol->name);
printf(" Events Pcnt (>=%d%%)\n", top->annotation_opts.min_pcnt);
more = symbol__annotate_printf(&he->ms, top->sym_evsel, &top->annotation_opts);
hists__collapse_resort(hists, NULL);
/* Non-group events are considered as leader */
- if (symbol_conf.event_group &&
- !perf_evsel__is_group_leader(pos)) {
+ if (symbol_conf.event_group && !evsel__is_group_leader(pos)) {
struct hists *leader_hists = evsel__hists(pos->leader);
hists__match(leader_hists, hists);
fprintf(stdout, "\t[e] display entries (lines). \t(%d)\n", top->print_entries);
if (top->evlist->core.nr_entries > 1)
- fprintf(stdout, "\t[E] active event counter. \t(%s)\n", perf_evsel__name(top->sym_evsel));
+ fprintf(stdout, "\t[E] active event counter. \t(%s)\n", evsel__name(top->sym_evsel));
fprintf(stdout, "\t[f] profile display filter (count). \t(%d)\n", top->count_filter);
fprintf(stderr, "\nAvailable events:");
evlist__for_each_entry(top->evlist, top->sym_evsel)
- fprintf(stderr, "\n\t%d %s", top->sym_evsel->idx, perf_evsel__name(top->sym_evsel));
+ fprintf(stderr, "\n\t%d %s", top->sym_evsel->idx, evsel__name(top->sym_evsel));
prompt_integer(&counter, "Enter details event counter");
if (counter >= top->evlist->core.nr_entries) {
top->sym_evsel = evlist__first(top->evlist);
- fprintf(stderr, "Sorry, no such event, using %s.\n", perf_evsel__name(top->sym_evsel));
+ fprintf(stderr, "Sorry, no such event, using %s.\n", evsel__name(top->sym_evsel));
sleep(1);
break;
}
if (machine__resolve(machine, &al, sample) < 0)
return;
+ if (top->stitch_lbr)
+ al.thread->lbr_stitch_enable = true;
+
if (!machine->kptr_restrict_warned &&
symbol_conf.kptr_restrict &&
al.cpumode == PERF_RECORD_MISC_KERNEL) {
perf_top_overwrite_fallback(top, counter))
goto try_again;
- if (perf_evsel__fallback(counter, errno, msg, sizeof(msg))) {
+ if (evsel__fallback(counter, errno, msg, sizeof(msg))) {
if (verbose > 0)
ui__warning("%s\n", msg);
goto try_again;
}
- perf_evsel__open_strerror(counter, &opts->target,
- errno, msg, sizeof(msg));
+ evsel__open_strerror(counter, &opts->target, errno, msg, sizeof(msg));
ui__error("%s\n", msg);
goto out_err;
}
"Sort the output by the event at the index n in group. "
"If n is invalid, sort by the first event. "
"WARNING: should be used on grouped events."),
+ OPT_BOOLEAN(0, "stitch-lbr", &top.stitch_lbr,
+ "Enable LBR callgraph stitching approach"),
OPTS_EVSWITCH(&top.evswitch),
OPT_END()
};
- struct evlist *sb_evlist = NULL;
const char * const top_usage[] = {
"perf top [<options>]",
NULL
}
}
+ if (top.stitch_lbr && !(callchain_param.record_mode == CALLCHAIN_LBR)) {
+ pr_err("Error: --stitch-lbr must be used with --call-graph lbr\n");
+ goto out_delete_evlist;
+ }
+
if (opts->branch_stack && callchain_param.enabled)
symbol_conf.show_branchflag_count = true;
goto out_delete_evlist;
}
- if (!top.record_opts.no_bpf_event)
- bpf_event__add_sb_event(&sb_evlist, &perf_env);
+ if (!top.record_opts.no_bpf_event) {
+ top.sb_evlist = evlist__new();
+
+ if (top.sb_evlist == NULL) {
+ pr_err("Couldn't create side band evlist.\n.");
+ goto out_delete_evlist;
+ }
+
+ if (evlist__add_bpf_sb_event(top.sb_evlist, &perf_env)) {
+ pr_err("Couldn't ask for PERF_RECORD_BPF_EVENT side band events.\n.");
+ goto out_delete_evlist;
+ }
+ }
- if (perf_evlist__start_sb_thread(sb_evlist, target)) {
+ if (perf_evlist__start_sb_thread(top.sb_evlist, target)) {
pr_debug("Couldn't start the BPF side band thread:\nBPF programs starting from now on won't be annotatable\n");
opts->no_bpf_event = true;
}
status = __cmd_top(&top);
if (!opts->no_bpf_event)
- perf_evlist__stop_sb_thread(sb_evlist);
+ perf_evlist__stop_sb_thread(top.sb_evlist);
out_delete_evlist:
evlist__delete(top.evlist);
return NULL;
}
-static int perf_evsel__init_tp_uint_field(struct evsel *evsel,
- struct tp_field *field,
- const char *name)
+static int evsel__init_tp_uint_field(struct evsel *evsel, struct tp_field *field, const char *name)
{
- struct tep_format_field *format_field = perf_evsel__field(evsel, name);
+ struct tep_format_field *format_field = evsel__field(evsel, name);
if (format_field == NULL)
return -1;
#define perf_evsel__init_sc_tp_uint_field(evsel, name) \
({ struct syscall_tp *sc = __evsel__syscall_tp(evsel);\
- perf_evsel__init_tp_uint_field(evsel, &sc->name, #name); })
+ evsel__init_tp_uint_field(evsel, &sc->name, #name); })
-static int perf_evsel__init_tp_ptr_field(struct evsel *evsel,
- struct tp_field *field,
- const char *name)
+static int evsel__init_tp_ptr_field(struct evsel *evsel, struct tp_field *field, const char *name)
{
- struct tep_format_field *format_field = perf_evsel__field(evsel, name);
+ struct tep_format_field *format_field = evsel__field(evsel, name);
if (format_field == NULL)
return -1;
#define perf_evsel__init_sc_tp_ptr_field(evsel, name) \
({ struct syscall_tp *sc = __evsel__syscall_tp(evsel);\
- perf_evsel__init_tp_ptr_field(evsel, &sc->name, #name); })
+ evsel__init_tp_ptr_field(evsel, &sc->name, #name); })
static void evsel__delete_priv(struct evsel *evsel)
{
evsel__delete(evsel);
}
-static int perf_evsel__init_syscall_tp(struct evsel *evsel)
+static int evsel__init_syscall_tp(struct evsel *evsel)
{
struct syscall_tp *sc = evsel__syscall_tp(evsel);
if (sc != NULL) {
- if (perf_evsel__init_tp_uint_field(evsel, &sc->id, "__syscall_nr") &&
- perf_evsel__init_tp_uint_field(evsel, &sc->id, "nr"))
+ if (evsel__init_tp_uint_field(evsel, &sc->id, "__syscall_nr") &&
+ evsel__init_tp_uint_field(evsel, &sc->id, "nr"))
return -ENOENT;
return 0;
}
return -ENOMEM;
}
-static int perf_evsel__init_augmented_syscall_tp(struct evsel *evsel, struct evsel *tp)
+static int evsel__init_augmented_syscall_tp(struct evsel *evsel, struct evsel *tp)
{
struct syscall_tp *sc = evsel__syscall_tp(evsel);
if (sc != NULL) {
- struct tep_format_field *syscall_id = perf_evsel__field(tp, "id");
+ struct tep_format_field *syscall_id = evsel__field(tp, "id");
if (syscall_id == NULL)
- syscall_id = perf_evsel__field(tp, "__syscall_nr");
+ syscall_id = evsel__field(tp, "__syscall_nr");
if (syscall_id == NULL ||
__tp_field__init_uint(&sc->id, syscall_id->size, syscall_id->offset, evsel->needs_swap))
return -EINVAL;
return -ENOMEM;
}
-static int perf_evsel__init_augmented_syscall_tp_args(struct evsel *evsel)
+static int evsel__init_augmented_syscall_tp_args(struct evsel *evsel)
{
struct syscall_tp *sc = __evsel__syscall_tp(evsel);
return __tp_field__init_ptr(&sc->args, sc->id.offset + sizeof(u64));
}
-static int perf_evsel__init_augmented_syscall_tp_ret(struct evsel *evsel)
+static int evsel__init_augmented_syscall_tp_ret(struct evsel *evsel)
{
struct syscall_tp *sc = __evsel__syscall_tp(evsel);
return __tp_field__init_uint(&sc->ret, sizeof(u64), sc->id.offset + sizeof(u64), evsel->needs_swap);
}
-static int perf_evsel__init_raw_syscall_tp(struct evsel *evsel, void *handler)
+static int evsel__init_raw_syscall_tp(struct evsel *evsel, void *handler)
{
if (evsel__syscall_tp(evsel) != NULL) {
if (perf_evsel__init_sc_tp_uint_field(evsel, id))
if (IS_ERR(evsel))
return NULL;
- if (perf_evsel__init_raw_syscall_tp(evsel, handler))
+ if (evsel__init_raw_syscall_tp(evsel, handler))
goto out_delete;
return evsel;
return syscall__set_arg_fmts(sc);
}
-static int perf_evsel__init_tp_arg_scnprintf(struct evsel *evsel)
+static int evsel__init_tp_arg_scnprintf(struct evsel *evsel)
{
struct syscall_arg_fmt *fmt = evsel__syscall_arg_fmt(evsel);
if (verbose > 1) {
static u64 n;
fprintf(trace->output, "Invalid syscall %d id, skipping (%s, %" PRIu64 ") ...\n",
- id, perf_evsel__name(evsel), ++n);
+ id, evsel__name(evsel), ++n);
}
return NULL;
}
double ts = (double)sample->time / NSEC_PER_MSEC;
printed += fprintf(trace->output, "%22s %10.3f %s %d/%d [%d]\n",
- perf_evsel__name(evsel), ts,
+ evsel__name(evsel), ts,
thread__comm_str(thread),
sample->pid, sample->tid, sample->cpu);
}
static const char *errno_to_name(struct evsel *evsel, int err)
{
- struct perf_env *env = perf_evsel__env(evsel);
+ struct perf_env *env = evsel__env(evsel);
const char *arch_name = perf_env__arch(env);
return arch_syscalls__strerrno(arch_name, err);
if (callchain_ret > 0)
trace__fprintf_callchain(trace, sample);
else if (callchain_ret < 0)
- pr_err("Problem processing %s callchain, skipping...\n", perf_evsel__name(evsel));
+ pr_err("Problem processing %s callchain, skipping...\n", evsel__name(evsel));
out:
ttrace->entry_pending = false;
err = 0;
size_t filename_len, entry_str_len, to_move;
ssize_t remaining_space;
char *pos;
- const char *filename = perf_evsel__rawptr(evsel, sample, "pathname");
+ const char *filename = evsel__rawptr(evsel, sample, "pathname");
if (!thread)
goto out;
union perf_event *event __maybe_unused,
struct perf_sample *sample)
{
- u64 runtime = perf_evsel__intval(evsel, sample, "runtime");
+ u64 runtime = evsel__intval(evsel, sample, "runtime");
double runtime_ms = (double)runtime / NSEC_PER_MSEC;
struct thread *thread = machine__findnew_thread(trace->host,
sample->pid,
out_dump:
fprintf(trace->output, "%s: comm=%s,pid=%u,runtime=%" PRIu64 ",vruntime=%" PRIu64 ")\n",
evsel->name,
- perf_evsel__strval(evsel, sample, "comm"),
- (pid_t)perf_evsel__intval(evsel, sample, "pid"),
+ evsel__strval(evsel, sample, "comm"),
+ (pid_t)evsel__intval(evsel, sample, "pid"),
runtime,
- perf_evsel__intval(evsel, sample, "vruntime"));
+ evsel__intval(evsel, sample, "vruntime"));
goto out_put;
}
fprintf(trace->output, "%s(", evsel->name);
- if (perf_evsel__is_bpf_output(evsel)) {
+ if (evsel__is_bpf_output(evsel)) {
bpf_output__fprintf(trace, sample);
} else if (evsel->tp_format) {
if (strncmp(evsel->tp_format->name, "sys_enter_", 10) ||
if (callchain_ret > 0)
trace__fprintf_callchain(trace, sample);
else if (callchain_ret < 0)
- pr_err("Problem processing %s callchain, skipping...\n", perf_evsel__name(evsel));
+ pr_err("Problem processing %s callchain, skipping...\n", evsel__name(evsel));
++trace->nr_events_printed;
if (callchain_ret > 0)
trace__fprintf_callchain(trace, sample);
else if (callchain_ret < 0)
- pr_err("Problem processing %s callchain, skipping...\n", perf_evsel__name(evsel));
+ pr_err("Problem processing %s callchain, skipping...\n", evsel__name(evsel));
++trace->nr_events_printed;
out:
}
evlist__for_each_entry_safe(evlist, evsel, tmp) {
- if (!strstarts(perf_evsel__name(evsel), "probe:vfs_getname"))
+ if (!strstarts(evsel__name(evsel), "probe:vfs_getname"))
continue;
- if (perf_evsel__field(evsel, "pathname")) {
+ if (evsel__field(evsel, "pathname")) {
evsel->handler = trace__vfs_getname;
found = true;
continue;
if (evsel->core.attr.type == PERF_TYPE_TRACEPOINT &&
sample->raw_data == NULL) {
fprintf(trace->output, "%s sample with no payload for tid: %d, cpu %d, raw_size=%d, skipping...\n",
- perf_evsel__name(evsel), sample->tid,
+ evsel__name(evsel), sample->tid,
sample->cpu, sample->raw_size);
} else {
tracepoint_handler handler = evsel->handler;
if (perf_evsel__init_sc_tp_uint_field(sys_exit, ret))
goto out_delete_sys_exit;
- perf_evsel__config_callchain(sys_enter, &trace->opts, &callchain_param);
- perf_evsel__config_callchain(sys_exit, &trace->opts, &callchain_param);
+ evsel__config_callchain(sys_enter, &trace->opts, &callchain_param);
+ evsel__config_callchain(sys_exit, &trace->opts, &callchain_param);
evlist__add(evlist, sys_enter);
evlist__add(evlist, sys_exit);
if (filter == NULL)
goto out_enomem;
- if (!perf_evsel__append_tp_filter(trace->syscalls.events.sys_enter,
- filter)) {
+ if (!evsel__append_tp_filter(trace->syscalls.events.sys_enter, filter)) {
sys_exit = trace->syscalls.events.sys_exit;
- err = perf_evsel__append_tp_filter(sys_exit, filter);
+ err = evsel__append_tp_filter(sys_exit, filter);
}
free(filter);
return __trace__deliver_event(trace, event->event);
}
-static struct syscall_arg_fmt *perf_evsel__syscall_arg_fmt(struct evsel *evsel, char *arg)
+static struct syscall_arg_fmt *evsel__find_syscall_arg_fmt_by_name(struct evsel *evsel, char *arg)
{
struct tep_format_field *field;
struct syscall_arg_fmt *fmt = __evsel__syscall_arg_fmt(evsel);
scnprintf(arg, sizeof(arg), "%.*s", left_size, left);
- fmt = perf_evsel__syscall_arg_fmt(evsel, arg);
+ fmt = evsel__find_syscall_arg_fmt_by_name(evsel, arg);
if (fmt == NULL) {
pr_err("\"%s\" not found in \"%s\", can't set filter \"%s\"\n",
arg, evsel->name, evsel->filter);
if (new_filter != evsel->filter) {
pr_debug("New filter for %s: %s\n", evsel->name, new_filter);
- perf_evsel__set_filter(evsel, new_filter);
+ evsel__set_filter(evsel, new_filter);
free(new_filter);
}
pgfault_maj = perf_evsel__new_pgfault(PERF_COUNT_SW_PAGE_FAULTS_MAJ);
if (pgfault_maj == NULL)
goto out_error_mem;
- perf_evsel__config_callchain(pgfault_maj, &trace->opts, &callchain_param);
+ evsel__config_callchain(pgfault_maj, &trace->opts, &callchain_param);
evlist__add(evlist, pgfault_maj);
}
pgfault_min = perf_evsel__new_pgfault(PERF_COUNT_SW_PAGE_FAULTS_MIN);
if (pgfault_min == NULL)
goto out_error_mem;
- perf_evsel__config_callchain(pgfault_min, &trace->opts, &callchain_param);
+ evsel__config_callchain(pgfault_min, &trace->opts, &callchain_param);
evlist__add(evlist, pgfault_min);
}
out_error_apply_filters:
fprintf(trace->output,
"Failed to set filter \"%s\" on event %s with %d (%s)\n",
- evsel->filter, perf_evsel__name(evsel), errno,
+ evsel->filter, evsel__name(evsel), errno,
str_error_r(errno, errbuf, sizeof(errbuf)));
goto out_delete_evlist;
}
"syscalls:sys_enter");
if (evsel &&
- (perf_evsel__init_raw_syscall_tp(evsel, trace__sys_enter) < 0 ||
+ (evsel__init_raw_syscall_tp(evsel, trace__sys_enter) < 0 ||
perf_evsel__init_sc_tp_ptr_field(evsel, args))) {
pr_err("Error during initialize raw_syscalls:sys_enter event\n");
goto out;
evsel = perf_evlist__find_tracepoint_by_name(session->evlist,
"syscalls:sys_exit");
if (evsel &&
- (perf_evsel__init_raw_syscall_tp(evsel, trace__sys_exit) < 0 ||
+ (evsel__init_raw_syscall_tp(evsel, trace__sys_exit) < 0 ||
perf_evsel__init_sc_tp_uint_field(evsel, ret))) {
pr_err("Error during initialize raw_syscalls:sys_exit event\n");
goto out;
continue;
if (strcmp(evsel->tp_format->system, "syscalls")) {
- perf_evsel__init_tp_arg_scnprintf(evsel);
+ evsel__init_tp_arg_scnprintf(evsel);
continue;
}
- if (perf_evsel__init_syscall_tp(evsel))
+ if (evsel__init_syscall_tp(evsel))
return -1;
if (!strncmp(evsel->tp_format->name, "sys_enter_", 10)) {
*/
if (trace.syscalls.events.augmented) {
evlist__for_each_entry(trace.evlist, evsel) {
- bool raw_syscalls_sys_exit = strcmp(perf_evsel__name(evsel), "raw_syscalls:sys_exit") == 0;
+ bool raw_syscalls_sys_exit = strcmp(evsel__name(evsel), "raw_syscalls:sys_exit") == 0;
if (raw_syscalls_sys_exit) {
trace.raw_augmented_syscalls = true;
}
if (trace.syscalls.events.augmented->priv == NULL &&
- strstr(perf_evsel__name(evsel), "syscalls:sys_enter")) {
+ strstr(evsel__name(evsel), "syscalls:sys_enter")) {
struct evsel *augmented = trace.syscalls.events.augmented;
- if (perf_evsel__init_augmented_syscall_tp(augmented, evsel) ||
- perf_evsel__init_augmented_syscall_tp_args(augmented))
+ if (evsel__init_augmented_syscall_tp(augmented, evsel) ||
+ evsel__init_augmented_syscall_tp_args(augmented))
goto out;
/*
* Augmented is __augmented_syscalls__ BPF_OUTPUT event
* as not to filter it, then we'll handle it just like we would
* for the BPF_OUTPUT one:
*/
- if (perf_evsel__init_augmented_syscall_tp(evsel, evsel) ||
- perf_evsel__init_augmented_syscall_tp_args(evsel))
+ if (evsel__init_augmented_syscall_tp(evsel, evsel) ||
+ evsel__init_augmented_syscall_tp_args(evsel))
goto out;
evsel->handler = trace__sys_enter;
}
- if (strstarts(perf_evsel__name(evsel), "syscalls:sys_exit_")) {
+ if (strstarts(evsel__name(evsel), "syscalls:sys_exit_")) {
struct syscall_tp *sc;
init_augmented_syscall_tp:
- if (perf_evsel__init_augmented_syscall_tp(evsel, evsel))
+ if (evsel__init_augmented_syscall_tp(evsel, evsel))
goto out;
sc = __evsel__syscall_tp(evsel);
/*
*/
if (trace.raw_augmented_syscalls)
trace.raw_augmented_syscalls_args_size = (6 + 1) * sizeof(long) + sc->id.offset;
- perf_evsel__init_augmented_syscall_tp_ret(evsel);
+ evsel__init_augmented_syscall_tp_ret(evsel);
evsel->handler = trace__sys_exit;
}
}
their own tasks.
A 'pid == -1' and 'cpu == x' counter is a per CPU counter that counts
-all events on CPU-x. Per CPU counters need CAP_SYS_ADMIN privilege.
+all events on CPU-x. Per CPU counters need CAP_PERFMON or CAP_SYS_ADMIN
+privilege.
The 'flags' parameter is currently unused and must be zero.
--- /dev/null
+[
+ {
+ "MetricExpr": "(hv_24x7@PM_MCS01_128B_RD_DISP_PORT01\\,chip\\=?@ + hv_24x7@PM_MCS01_128B_RD_DISP_PORT23\\,chip\\=?@ + hv_24x7@PM_MCS23_128B_RD_DISP_PORT01\\,chip\\=?@ + hv_24x7@PM_MCS23_128B_RD_DISP_PORT23\\,chip\\=?@)",
+ "MetricName": "Memory_RD_BW_Chip",
+ "MetricGroup": "Memory_BW",
+ "ScaleUnit": "1.6e-2MB"
+ },
+ {
+ "MetricExpr": "(hv_24x7@PM_MCS01_128B_WR_DISP_PORT01\\,chip\\=?@ + hv_24x7@PM_MCS01_128B_WR_DISP_PORT23\\,chip\\=?@ + hv_24x7@PM_MCS23_128B_WR_DISP_PORT01\\,chip\\=?@ + hv_24x7@PM_MCS23_128B_WR_DISP_PORT23\\,chip\\=?@ )",
+ "MetricName": "Memory_WR_BW_Chip",
+ "MetricGroup": "Memory_BW",
+ "ScaleUnit": "1.6e-2MB"
+ },
+ {
+ "MetricExpr": "(hv_24x7@PM_PB_CYC\\,chip\\=?@ )",
+ "MetricName": "PowerBUS_Frequency",
+ "ScaleUnit": "2.5e-7GHz"
+ }
+]
* Map a CPU to its table of PMU events. The CPU is identified by the
* cpuid field, which is an arch-specific identifier for the CPU.
* The identifier specified in tools/perf/pmu-events/arch/xxx/mapfile
- * must match the get_cpustr() in tools/perf/arch/xxx/util/header.c)
+ * must match the get_cpuid_str() in tools/perf/arch/xxx/util/header.c)
*
* The cpuid can contain any character other than the comma.
*/
--- /dev/null
+#!/bin/bash
+perf record -g "$@"
--- /dev/null
+#!/bin/bash
+# description: create flame graphs
+perf script -s "$PERF_EXEC_PATH"/scripts/python/flamegraph.py -- "$@"
--- /dev/null
+# flamegraph.py - create flame graphs from perf samples
+# SPDX-License-Identifier: GPL-2.0
+#
+# Usage:
+#
+# perf record -a -g -F 99 sleep 60
+# perf script report flamegraph
+#
+# Combined:
+#
+# perf script flamegraph -a -F 99 sleep 60
+#
+# Written by Andreas Gerstmayr <agerstmayr@redhat.com>
+# Flame Graphs invented by Brendan Gregg <bgregg@netflix.com>
+# Works in tandem with d3-flame-graph by Martin Spier <mspier@netflix.com>
+
+from __future__ import print_function
+import sys
+import os
+import argparse
+import json
+
+
+class Node:
+ def __init__(self, name, libtype=""):
+ self.name = name
+ self.libtype = libtype
+ self.value = 0
+ self.children = []
+
+ def toJSON(self):
+ return {
+ "n": self.name,
+ "l": self.libtype,
+ "v": self.value,
+ "c": self.children
+ }
+
+
+class FlameGraphCLI:
+ def __init__(self, args):
+ self.args = args
+ self.stack = Node("root")
+
+ if self.args.format == "html" and \
+ not os.path.isfile(self.args.template):
+ print("Flame Graph template {} does not exist. Please install "
+ "the js-d3-flame-graph (RPM) or libjs-d3-flame-graph (deb) "
+ "package, specify an existing flame graph template "
+ "(--template PATH) or another output format "
+ "(--format FORMAT).".format(self.args.template),
+ file=sys.stderr)
+ sys.exit(1)
+
+ def find_or_create_node(self, node, name, dso):
+ libtype = "kernel" if dso == "[kernel.kallsyms]" else ""
+ if name is None:
+ name = "[unknown]"
+
+ for child in node.children:
+ if child.name == name and child.libtype == libtype:
+ return child
+
+ child = Node(name, libtype)
+ node.children.append(child)
+ return child
+
+ def process_event(self, event):
+ node = self.find_or_create_node(self.stack, event["comm"], None)
+ if "callchain" in event:
+ for entry in reversed(event['callchain']):
+ node = self.find_or_create_node(
+ node, entry.get("sym", {}).get("name"), event.get("dso"))
+ else:
+ node = self.find_or_create_node(
+ node, entry.get("symbol"), event.get("dso"))
+ node.value += 1
+
+ def trace_end(self):
+ json_str = json.dumps(self.stack, default=lambda x: x.toJSON())
+
+ if self.args.format == "html":
+ try:
+ with open(self.args.template) as f:
+ output_str = f.read().replace("/** @flamegraph_json **/",
+ json_str)
+ except IOError as e:
+ print("Error reading template file: {}".format(e), file=sys.stderr)
+ sys.exit(1)
+ output_fn = self.args.output or "flamegraph.html"
+ else:
+ output_str = json_str
+ output_fn = self.args.output or "stacks.json"
+
+ if output_fn == "-":
+ sys.stdout.write(output_str)
+ else:
+ print("dumping data to {}".format(output_fn))
+ try:
+ with open(output_fn, "w") as out:
+ out.write(output_str)
+ except IOError as e:
+ print("Error writing output file: {}".format(e), file=sys.stderr)
+ sys.exit(1)
+
+
+if __name__ == "__main__":
+ parser = argparse.ArgumentParser(description="Create flame graphs.")
+ parser.add_argument("-f", "--format",
+ default="html", choices=["json", "html"],
+ help="output file format")
+ parser.add_argument("-o", "--output",
+ help="output file name")
+ parser.add_argument("--template",
+ default="/usr/share/d3-flame-graph/d3-flamegraph-base.html",
+ help="path to flamegraph HTML template")
+ parser.add_argument("-i", "--input",
+ help=argparse.SUPPRESS)
+
+ args = parser.parse_args()
+ cli = FlameGraphCLI(args)
+
+ process_event = cli.process_event
+ trace_end = cli.trace_end
perf-y += maps.o
perf-y += time-utils-test.o
perf-y += genelf.o
+perf-y += api-io.o
$(OUTPUT)tests/llvm-src-base.c: tests/bpf-script-example.c tests/Build
$(call rule_mkdir)
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0-only
+#include <sys/types.h>
+#include <sys/stat.h>
+#include <fcntl.h>
+#include <limits.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <string.h>
+#include <unistd.h>
+
+#include "debug.h"
+#include "tests.h"
+#include <api/io.h>
+#include <linux/kernel.h>
+
+#define TEMPL "/tmp/perf-test-XXXXXX"
+
+#define EXPECT_EQUAL(val, expected) \
+do { \
+ if (val != expected) { \
+ pr_debug("%s:%d: %d != %d\n", \
+ __FILE__, __LINE__, val, expected); \
+ ret = -1; \
+ } \
+} while (0)
+
+#define EXPECT_EQUAL64(val, expected) \
+do { \
+ if (val != expected) { \
+ pr_debug("%s:%d: %lld != %lld\n", \
+ __FILE__, __LINE__, val, expected); \
+ ret = -1; \
+ } \
+} while (0)
+
+static int make_test_file(char path[PATH_MAX], const char *contents)
+{
+ ssize_t contents_len = strlen(contents);
+ int fd;
+
+ strcpy(path, TEMPL);
+ fd = mkstemp(path);
+ if (fd < 0) {
+ pr_debug("mkstemp failed");
+ return -1;
+ }
+ if (write(fd, contents, contents_len) < contents_len) {
+ pr_debug("short write");
+ close(fd);
+ unlink(path);
+ return -1;
+ }
+ close(fd);
+ return 0;
+}
+
+static int setup_test(char path[PATH_MAX], const char *contents,
+ size_t buf_size, struct io *io)
+{
+ if (make_test_file(path, contents))
+ return -1;
+
+ io->fd = open(path, O_RDONLY);
+ if (io->fd < 0) {
+ pr_debug("Failed to open '%s'\n", path);
+ unlink(path);
+ return -1;
+ }
+ io->buf = malloc(buf_size);
+ if (io->buf == NULL) {
+ pr_debug("Failed to allocate memory");
+ close(io->fd);
+ unlink(path);
+ return -1;
+ }
+ io__init(io, io->fd, io->buf, buf_size);
+ return 0;
+}
+
+static void cleanup_test(char path[PATH_MAX], struct io *io)
+{
+ free(io->buf);
+ close(io->fd);
+ unlink(path);
+}
+
+static int do_test_get_char(const char *test_string, size_t buf_size)
+{
+ char path[PATH_MAX];
+ struct io io;
+ int ch, ret = 0;
+ size_t i;
+
+ if (setup_test(path, test_string, buf_size, &io))
+ return -1;
+
+ for (i = 0; i < strlen(test_string); i++) {
+ ch = io__get_char(&io);
+
+ EXPECT_EQUAL(ch, test_string[i]);
+ EXPECT_EQUAL(io.eof, false);
+ }
+ ch = io__get_char(&io);
+ EXPECT_EQUAL(ch, -1);
+ EXPECT_EQUAL(io.eof, true);
+
+ cleanup_test(path, &io);
+ return ret;
+}
+
+static int test_get_char(void)
+{
+ int i, ret = 0;
+ size_t j;
+
+ static const char *const test_strings[] = {
+ "12345678abcdef90",
+ "a\nb\nc\nd\n",
+ "\a\b\t\v\f\r",
+ };
+ for (i = 0; i <= 10; i++) {
+ for (j = 0; j < ARRAY_SIZE(test_strings); j++) {
+ if (do_test_get_char(test_strings[j], 1 << i))
+ ret = -1;
+ }
+ }
+ return ret;
+}
+
+static int do_test_get_hex(const char *test_string,
+ __u64 val1, int ch1,
+ __u64 val2, int ch2,
+ __u64 val3, int ch3,
+ bool end_eof)
+{
+ char path[PATH_MAX];
+ struct io io;
+ int ch, ret = 0;
+ __u64 hex;
+
+ if (setup_test(path, test_string, 4, &io))
+ return -1;
+
+ ch = io__get_hex(&io, &hex);
+ EXPECT_EQUAL64(hex, val1);
+ EXPECT_EQUAL(ch, ch1);
+
+ ch = io__get_hex(&io, &hex);
+ EXPECT_EQUAL64(hex, val2);
+ EXPECT_EQUAL(ch, ch2);
+
+ ch = io__get_hex(&io, &hex);
+ EXPECT_EQUAL64(hex, val3);
+ EXPECT_EQUAL(ch, ch3);
+
+ EXPECT_EQUAL(io.eof, end_eof);
+
+ cleanup_test(path, &io);
+ return ret;
+}
+
+static int test_get_hex(void)
+{
+ int ret = 0;
+
+ if (do_test_get_hex("12345678abcdef90",
+ 0x12345678abcdef90, -1,
+ 0, -1,
+ 0, -1,
+ true))
+ ret = -1;
+
+ if (do_test_get_hex("1\n2\n3\n",
+ 1, '\n',
+ 2, '\n',
+ 3, '\n',
+ false))
+ ret = -1;
+
+ if (do_test_get_hex("12345678ABCDEF90;a;b",
+ 0x12345678abcdef90, ';',
+ 0xa, ';',
+ 0xb, -1,
+ true))
+ ret = -1;
+
+ if (do_test_get_hex("0x1x2x",
+ 0, 'x',
+ 1, 'x',
+ 2, 'x',
+ false))
+ ret = -1;
+
+ if (do_test_get_hex("x1x",
+ 0, -2,
+ 1, 'x',
+ 0, -1,
+ true))
+ ret = -1;
+
+ if (do_test_get_hex("10000000000000000000000000000abcdefgh99i",
+ 0xabcdef, 'g',
+ 0, -2,
+ 0x99, 'i',
+ false))
+ ret = -1;
+
+ return ret;
+}
+
+static int do_test_get_dec(const char *test_string,
+ __u64 val1, int ch1,
+ __u64 val2, int ch2,
+ __u64 val3, int ch3,
+ bool end_eof)
+{
+ char path[PATH_MAX];
+ struct io io;
+ int ch, ret = 0;
+ __u64 dec;
+
+ if (setup_test(path, test_string, 4, &io))
+ return -1;
+
+ ch = io__get_dec(&io, &dec);
+ EXPECT_EQUAL64(dec, val1);
+ EXPECT_EQUAL(ch, ch1);
+
+ ch = io__get_dec(&io, &dec);
+ EXPECT_EQUAL64(dec, val2);
+ EXPECT_EQUAL(ch, ch2);
+
+ ch = io__get_dec(&io, &dec);
+ EXPECT_EQUAL64(dec, val3);
+ EXPECT_EQUAL(ch, ch3);
+
+ EXPECT_EQUAL(io.eof, end_eof);
+
+ cleanup_test(path, &io);
+ return ret;
+}
+
+static int test_get_dec(void)
+{
+ int ret = 0;
+
+ if (do_test_get_dec("12345678abcdef90",
+ 12345678, 'a',
+ 0, -2,
+ 0, -2,
+ false))
+ ret = -1;
+
+ if (do_test_get_dec("1\n2\n3\n",
+ 1, '\n',
+ 2, '\n',
+ 3, '\n',
+ false))
+ ret = -1;
+
+ if (do_test_get_dec("12345678;1;2",
+ 12345678, ';',
+ 1, ';',
+ 2, -1,
+ true))
+ ret = -1;
+
+ if (do_test_get_dec("0x1x2x",
+ 0, 'x',
+ 1, 'x',
+ 2, 'x',
+ false))
+ ret = -1;
+
+ if (do_test_get_dec("x1x",
+ 0, -2,
+ 1, 'x',
+ 0, -1,
+ true))
+ ret = -1;
+
+ if (do_test_get_dec("10000000000000000000000000000000000000000000000000000000000123456789ab99c",
+ 123456789, 'a',
+ 0, -2,
+ 99, 'c',
+ false))
+ ret = -1;
+
+ return ret;
+}
+
+int test__api_io(struct test *test __maybe_unused,
+ int subtest __maybe_unused)
+{
+ int ret = 0;
+
+ if (test_get_char())
+ ret = TEST_FAIL;
+ if (test_get_hex())
+ ret = TEST_FAIL;
+ if (test_get_dec())
+ ret = TEST_FAIL;
+ return ret;
+}
.desc = "Test jit_write_elf",
.func = test__jit_write_elf,
},
+ {
+ .desc = "Test api io",
+ .func = test__api_io,
+ },
{
.desc = "maps__merge_in",
.func = test__maps__merge_in,
evsel->core.attr.disabled = 1;
- err = perf_evsel__open_per_thread(evsel, threads);
+ err = evsel__open_per_thread(evsel, threads);
if (err) {
pr_debug("Failed to open event cpu-clock:u\n");
return err;
return -1;
}
- err = perf_evsel__open_per_thread(evsel, threads);
+ err = evsel__open_per_thread(evsel, threads);
perf_thread_map__put(threads);
return err == 0 ? TEST_OK : TEST_FAIL;
evsel->core.attr.disabled = 1;
- err = perf_evsel__open_per_cpu(evsel, cpus, -1);
+ err = evsel__open_per_cpu(evsel, cpus, -1);
if (err) {
if (err == -EACCES)
return TEST_SKIP;
return -1;
}
- err = perf_evsel__open_per_cpu(evsel, cpus, -1);
+ err = evsel__open_per_cpu(evsel, cpus, -1);
if (err == -EACCES)
return TEST_SKIP;
TEST_ASSERT_VAL("failed to synthesize attr update scale",
!perf_event__synthesize_event_update_scale(NULL, evsel, process_event_scale));
- tmp.name = perf_evsel__name(evsel);
+ tmp.name = evsel__name(evsel);
TEST_ASSERT_VAL("failed to synthesize attr update name",
!perf_event__synthesize_event_update_name(&tmp.tool, evsel, process_event_name));
for (type = 0; type < PERF_COUNT_HW_CACHE_MAX; type++) {
for (op = 0; op < PERF_COUNT_HW_CACHE_OP_MAX; op++) {
/* skip invalid cache type */
- if (!perf_evsel__is_cache_op_valid(type, op))
+ if (!evsel__is_cache_op_valid(type, op))
continue;
for (i = 0; i < PERF_COUNT_HW_CACHE_RESULT_MAX; i++) {
- __perf_evsel__hw_cache_type_op_res_name(type, op, i,
- name, sizeof(name));
+ __evsel__hw_cache_type_op_res_name(type, op, i, name, sizeof(name));
err = parse_events(evlist, name, NULL);
if (err)
ret = err;
for (type = 0; type < PERF_COUNT_HW_CACHE_MAX; type++) {
for (op = 0; op < PERF_COUNT_HW_CACHE_OP_MAX; op++) {
/* skip invalid cache type */
- if (!perf_evsel__is_cache_op_valid(type, op))
+ if (!evsel__is_cache_op_valid(type, op))
continue;
for (i = 0; i < PERF_COUNT_HW_CACHE_RESULT_MAX; i++) {
- __perf_evsel__hw_cache_type_op_res_name(type, op, i,
- name, sizeof(name));
+ __evsel__hw_cache_type_op_res_name(type, op, i, name, sizeof(name));
if (evsel->idx != idx)
continue;
++idx;
- if (strcmp(perf_evsel__name(evsel), name)) {
- pr_debug("%s != %s\n", perf_evsel__name(evsel), name);
+ if (strcmp(evsel__name(evsel), name)) {
+ pr_debug("%s != %s\n", evsel__name(evsel), name);
ret = -1;
}
- evsel = perf_evsel__next(evsel);
+ evsel = evsel__next(evsel);
}
}
}
err = 0;
evlist__for_each_entry(evlist, evsel) {
- if (strcmp(perf_evsel__name(evsel), names[evsel->idx])) {
+ if (strcmp(evsel__name(evsel), names[evsel->idx])) {
--err;
- pr_debug("%s != %s\n", perf_evsel__name(evsel), names[evsel->idx]);
+ pr_debug("%s != %s\n", evsel__name(evsel), names[evsel->idx]);
}
}
static int perf_evsel__test_field(struct evsel *evsel, const char *name,
int size, bool should_be_signed)
{
- struct tep_format_field *field = perf_evsel__field(evsel, name);
+ struct tep_format_field *field = evsel__field(evsel, name);
int is_signed;
int ret = 0;
#include <string.h>
#include <linux/zalloc.h>
-static int test(struct parse_ctx *ctx, const char *e, double val2)
+static int test(struct expr_parse_ctx *ctx, const char *e, double val2)
{
double val;
- if (expr__parse(&val, ctx, e))
+ if (expr__parse(&val, ctx, e, 1))
TEST_ASSERT_VAL("parse test failed", 0);
TEST_ASSERT_VAL("unexpected value", val == val2);
return 0;
const char **other;
double val;
int i, ret;
- struct parse_ctx ctx;
+ struct expr_parse_ctx ctx;
int num_other;
expr__ctx_init(&ctx);
return ret;
p = "FOO/0";
- ret = expr__parse(&val, &ctx, p);
+ ret = expr__parse(&val, &ctx, p, 1);
TEST_ASSERT_VAL("division by zero", ret == -1);
p = "BAR/";
- ret = expr__parse(&val, &ctx, p);
+ ret = expr__parse(&val, &ctx, p, 1);
TEST_ASSERT_VAL("missing operand", ret == -1);
TEST_ASSERT_VAL("find other",
- expr__find_other("FOO + BAR + BAZ + BOZO", "FOO", &other, &num_other) == 0);
+ expr__find_other("FOO + BAR + BAZ + BOZO", "FOO", &other, &num_other, 1) == 0);
TEST_ASSERT_VAL("find other", num_other == 3);
TEST_ASSERT_VAL("find other", !strcmp(other[0], "BAR"));
TEST_ASSERT_VAL("find other", !strcmp(other[1], "BAZ"));
TEST_ASSERT_VAL("find other", !strcmp(other[2], "BOZO"));
TEST_ASSERT_VAL("find other", other[3] == NULL);
+ TEST_ASSERT_VAL("find other",
+ expr__find_other("EVENT1\\,param\\=?@ + EVENT2\\,param\\=?@", NULL,
+ &other, &num_other, 3) == 0);
+ TEST_ASSERT_VAL("find other", num_other == 2);
+ TEST_ASSERT_VAL("find other", !strcmp(other[0], "EVENT1,param=3/"));
+ TEST_ASSERT_VAL("find other", !strcmp(other[1], "EVENT2,param=3/"));
+ TEST_ASSERT_VAL("find other", other[2] == NULL);
+
for (i = 0; i < num_other; i++)
zfree(&other[i]);
free((void *)other);
symbol_conf.use_callchain = false;
symbol_conf.cumulate_callchain = false;
- perf_evsel__reset_sample_bit(evsel, CALLCHAIN);
+ evsel__reset_sample_bit(evsel, CALLCHAIN);
setup_sorting(NULL);
callchain_register_param(&callchain_param);
symbol_conf.use_callchain = true;
symbol_conf.cumulate_callchain = false;
- perf_evsel__set_sample_bit(evsel, CALLCHAIN);
+ evsel__set_sample_bit(evsel, CALLCHAIN);
setup_sorting(NULL);
callchain_register_param(&callchain_param);
symbol_conf.use_callchain = false;
symbol_conf.cumulate_callchain = true;
- perf_evsel__reset_sample_bit(evsel, CALLCHAIN);
+ evsel__reset_sample_bit(evsel, CALLCHAIN);
setup_sorting(NULL);
callchain_register_param(&callchain_param);
symbol_conf.use_callchain = true;
symbol_conf.cumulate_callchain = true;
- perf_evsel__set_sample_bit(evsel, CALLCHAIN);
+ evsel__set_sample_bit(evsel, CALLCHAIN);
setup_sorting(NULL);
}
evsels[i]->core.attr.wakeup_events = 1;
- perf_evsel__set_sample_id(evsels[i], false);
+ evsel__set_sample_id(evsels[i], false);
evlist__add(evlist, evsels[i]);
if (nr_events[evsel->idx] != expected_nr_events[evsel->idx]) {
pr_debug("expected %d %s events, got %d\n",
expected_nr_events[evsel->idx],
- perf_evsel__name(evsel), nr_events[evsel->idx]);
+ evsel__name(evsel), nr_events[evsel->idx]);
err = -1;
goto out_delete_evlist;
}
if (cpus->map[cpu] >= CPU_SETSIZE)
continue;
- if (perf_evsel__read_on_cpu(evsel, cpu, 0) < 0) {
- pr_debug("perf_evsel__read_on_cpu\n");
+ if (evsel__read_on_cpu(evsel, cpu, 0) < 0) {
+ pr_debug("evsel__read_on_cpu\n");
err = -1;
break;
}
expected = nr_openat_calls + cpu;
if (perf_counts(evsel->counts, cpu, 0)->val != expected) {
- pr_debug("perf_evsel__read_on_cpu: expected to intercept %d calls on cpu %d, got %" PRIu64 "\n",
+ pr_debug("evsel__read_on_cpu: expected to intercept %d calls on cpu %d, got %" PRIu64 "\n",
expected, cpus->map[cpu], perf_counts(evsel->counts, cpu, 0)->val);
err = -1;
}
goto out_delete_evlist;
}
- perf_evsel__config(evsel, &opts, NULL);
+ evsel__config(evsel, &opts, NULL);
perf_thread_map__set_pid(evlist->core.threads, 0, getpid());
continue;
}
- err = perf_evsel__parse_sample(evsel, event, &sample);
+ err = evsel__parse_sample(evsel, event, &sample);
if (err) {
pr_debug("Can't parse sample, err = %d\n", err);
goto out_delete_evlist;
}
- tp_flags = perf_evsel__intval(evsel, &sample, "flags");
+ tp_flags = evsel__intval(evsel, &sample, "flags");
if (flags != tp_flags) {
pr_debug("%s: Expected flags=%#x, got %#x\n",
goto out_thread_map_delete;
}
- if (perf_evsel__open_per_thread(evsel, threads) < 0) {
+ if (evsel__open_per_thread(evsel, threads) < 0) {
pr_debug("failed to open counter: %s, "
"tweak /proc/sys/kernel/perf_event_paranoid?\n",
str_error_r(errno, sbuf, sizeof(sbuf)));
close(fd);
}
- if (perf_evsel__read_on_cpu(evsel, 0, 0) < 0) {
- pr_debug("perf_evsel__read_on_cpu\n");
+ if (evsel__read_on_cpu(evsel, 0, 0) < 0) {
+ pr_debug("evsel__read_on_cpu\n");
goto out_close_fd;
}
if (perf_counts(evsel->counts, 0, 0)->val != nr_openat_calls) {
- pr_debug("perf_evsel__read_on_cpu: expected to intercept %d calls, got %" PRIu64 "\n",
+ pr_debug("evsel__read_on_cpu: expected to intercept %d calls, got %" PRIu64 "\n",
nr_openat_calls, perf_counts(evsel->counts, 0, 0)->val);
goto out_close_fd;
}
TEST_ASSERT_VAL("wrong exclude_hv", evsel->core.attr.exclude_hv);
TEST_ASSERT_VAL("wrong precise_ip", !evsel->core.attr.precise_ip);
TEST_ASSERT_VAL("wrong name",
- !strcmp(perf_evsel__name(evsel), "mem:0:u"));
+ !strcmp(evsel__name(evsel), "mem:0:u"));
return test__checkevent_breakpoint(evlist);
}
TEST_ASSERT_VAL("wrong exclude_hv", evsel->core.attr.exclude_hv);
TEST_ASSERT_VAL("wrong precise_ip", !evsel->core.attr.precise_ip);
TEST_ASSERT_VAL("wrong name",
- !strcmp(perf_evsel__name(evsel), "mem:0:x:k"));
+ !strcmp(evsel__name(evsel), "mem:0:x:k"));
return test__checkevent_breakpoint_x(evlist);
}
TEST_ASSERT_VAL("wrong exclude_hv", !evsel->core.attr.exclude_hv);
TEST_ASSERT_VAL("wrong precise_ip", evsel->core.attr.precise_ip);
TEST_ASSERT_VAL("wrong name",
- !strcmp(perf_evsel__name(evsel), "mem:0:r:hp"));
+ !strcmp(evsel__name(evsel), "mem:0:r:hp"));
return test__checkevent_breakpoint_r(evlist);
}
TEST_ASSERT_VAL("wrong exclude_hv", evsel->core.attr.exclude_hv);
TEST_ASSERT_VAL("wrong precise_ip", evsel->core.attr.precise_ip);
TEST_ASSERT_VAL("wrong name",
- !strcmp(perf_evsel__name(evsel), "mem:0:w:up"));
+ !strcmp(evsel__name(evsel), "mem:0:w:up"));
return test__checkevent_breakpoint_w(evlist);
}
TEST_ASSERT_VAL("wrong exclude_hv", evsel->core.attr.exclude_hv);
TEST_ASSERT_VAL("wrong precise_ip", evsel->core.attr.precise_ip);
TEST_ASSERT_VAL("wrong name",
- !strcmp(perf_evsel__name(evsel), "mem:0:rw:kp"));
+ !strcmp(evsel__name(evsel), "mem:0:rw:kp"));
return test__checkevent_breakpoint_rw(evlist);
}
TEST_ASSERT_VAL("wrong precise_ip", !evsel->core.attr.precise_ip);
/* syscalls:sys_enter_openat:k */
- evsel = perf_evsel__next(evsel);
+ evsel = evsel__next(evsel);
TEST_ASSERT_VAL("wrong type", PERF_TYPE_TRACEPOINT == evsel->core.attr.type);
TEST_ASSERT_VAL("wrong sample_type",
PERF_TP_SAMPLE_TYPE == evsel->core.attr.sample_type);
TEST_ASSERT_VAL("wrong precise_ip", !evsel->core.attr.precise_ip);
/* 1:1:hp */
- evsel = perf_evsel__next(evsel);
+ evsel = evsel__next(evsel);
TEST_ASSERT_VAL("wrong type", 1 == evsel->core.attr.type);
TEST_ASSERT_VAL("wrong config", 1 == evsel->core.attr.config);
TEST_ASSERT_VAL("wrong exclude_user", evsel->core.attr.exclude_user);
TEST_ASSERT_VAL("wrong number of entries", 2 == evlist->core.nr_entries);
TEST_ASSERT_VAL("wrong type", PERF_TYPE_RAW == evsel->core.attr.type);
TEST_ASSERT_VAL("wrong config", 1 == evsel->core.attr.config);
- TEST_ASSERT_VAL("wrong name", !strcmp(perf_evsel__name(evsel), "krava"));
+ TEST_ASSERT_VAL("wrong name", !strcmp(evsel__name(evsel), "krava"));
/* cpu/config=2/u" */
- evsel = perf_evsel__next(evsel);
+ evsel = evsel__next(evsel);
TEST_ASSERT_VAL("wrong number of entries", 2 == evlist->core.nr_entries);
TEST_ASSERT_VAL("wrong type", PERF_TYPE_RAW == evsel->core.attr.type);
TEST_ASSERT_VAL("wrong config", 2 == evsel->core.attr.config);
TEST_ASSERT_VAL("wrong name",
- !strcmp(perf_evsel__name(evsel), "cpu/config=2/u"));
+ !strcmp(evsel__name(evsel), "cpu/config=2/u"));
return 0;
}
TEST_ASSERT_VAL("wrong time", !(PERF_SAMPLE_TIME & evsel->core.attr.sample_type));
/* cpu/config=2,call-graph=no,time=0,period=2000/ */
- evsel = perf_evsel__next(evsel);
+ evsel = evsel__next(evsel);
TEST_ASSERT_VAL("wrong type", PERF_TYPE_RAW == evsel->core.attr.type);
TEST_ASSERT_VAL("wrong config", 2 == evsel->core.attr.config);
/*
TEST_ASSERT_VAL("wrong pinned", !evsel->core.attr.pinned);
/* cpu/pmu-event/u*/
- evsel = perf_evsel__next(evsel);
+ evsel = evsel__next(evsel);
TEST_ASSERT_VAL("wrong number of entries", 2 == evlist->core.nr_entries);
TEST_ASSERT_VAL("wrong type", PERF_TYPE_RAW == evsel->core.attr.type);
TEST_ASSERT_VAL("wrong exclude_user",
TEST_ASSERT_VAL("wrong exclude guest", !evsel->core.attr.exclude_guest);
TEST_ASSERT_VAL("wrong exclude host", !evsel->core.attr.exclude_host);
TEST_ASSERT_VAL("wrong precise_ip", !evsel->core.attr.precise_ip);
- TEST_ASSERT_VAL("wrong leader", perf_evsel__is_group_leader(evsel));
+ TEST_ASSERT_VAL("wrong leader", evsel__is_group_leader(evsel));
TEST_ASSERT_VAL("wrong core.nr_members", evsel->core.nr_members == 2);
- TEST_ASSERT_VAL("wrong group_idx", perf_evsel__group_idx(evsel) == 0);
+ TEST_ASSERT_VAL("wrong group_idx", evsel__group_idx(evsel) == 0);
TEST_ASSERT_VAL("wrong sample_read", !evsel->sample_read);
/* cycles:upp */
- evsel = perf_evsel__next(evsel);
+ evsel = evsel__next(evsel);
TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type);
TEST_ASSERT_VAL("wrong config",
PERF_COUNT_HW_CPU_CYCLES == evsel->core.attr.config);
TEST_ASSERT_VAL("wrong exclude host", !evsel->core.attr.exclude_host);
TEST_ASSERT_VAL("wrong precise_ip", evsel->core.attr.precise_ip == 2);
TEST_ASSERT_VAL("wrong leader", evsel->leader == leader);
- TEST_ASSERT_VAL("wrong group_idx", perf_evsel__group_idx(evsel) == 1);
+ TEST_ASSERT_VAL("wrong group_idx", evsel__group_idx(evsel) == 1);
TEST_ASSERT_VAL("wrong sample_read", !evsel->sample_read);
return 0;
TEST_ASSERT_VAL("wrong exclude guest", !evsel->core.attr.exclude_guest);
TEST_ASSERT_VAL("wrong exclude host", !evsel->core.attr.exclude_host);
TEST_ASSERT_VAL("wrong precise_ip", !evsel->core.attr.precise_ip);
- TEST_ASSERT_VAL("wrong leader", perf_evsel__is_group_leader(evsel));
+ TEST_ASSERT_VAL("wrong leader", evsel__is_group_leader(evsel));
TEST_ASSERT_VAL("wrong core.nr_members", evsel->core.nr_members == 2);
- TEST_ASSERT_VAL("wrong group_idx", perf_evsel__group_idx(evsel) == 0);
+ TEST_ASSERT_VAL("wrong group_idx", evsel__group_idx(evsel) == 0);
TEST_ASSERT_VAL("wrong sample_read", !evsel->sample_read);
/* cache-references + :u modifier */
- evsel = perf_evsel__next(evsel);
+ evsel = evsel__next(evsel);
TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type);
TEST_ASSERT_VAL("wrong config",
PERF_COUNT_HW_CACHE_REFERENCES == evsel->core.attr.config);
TEST_ASSERT_VAL("wrong exclude host", !evsel->core.attr.exclude_host);
TEST_ASSERT_VAL("wrong precise_ip", !evsel->core.attr.precise_ip);
TEST_ASSERT_VAL("wrong leader", evsel->leader == leader);
- TEST_ASSERT_VAL("wrong group_idx", perf_evsel__group_idx(evsel) == 1);
+ TEST_ASSERT_VAL("wrong group_idx", evsel__group_idx(evsel) == 1);
TEST_ASSERT_VAL("wrong sample_read", !evsel->sample_read);
/* cycles:k */
- evsel = perf_evsel__next(evsel);
+ evsel = evsel__next(evsel);
TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type);
TEST_ASSERT_VAL("wrong config",
PERF_COUNT_HW_CPU_CYCLES == evsel->core.attr.config);
TEST_ASSERT_VAL("wrong exclude guest", !evsel->core.attr.exclude_guest);
TEST_ASSERT_VAL("wrong exclude host", !evsel->core.attr.exclude_host);
TEST_ASSERT_VAL("wrong precise_ip", !evsel->core.attr.precise_ip);
- TEST_ASSERT_VAL("wrong leader", perf_evsel__is_group_leader(evsel));
+ TEST_ASSERT_VAL("wrong leader", evsel__is_group_leader(evsel));
TEST_ASSERT_VAL("wrong sample_read", !evsel->sample_read);
return 0;
TEST_ASSERT_VAL("wrong exclude guest", evsel->core.attr.exclude_guest);
TEST_ASSERT_VAL("wrong exclude host", !evsel->core.attr.exclude_host);
TEST_ASSERT_VAL("wrong precise_ip", !evsel->core.attr.precise_ip);
- TEST_ASSERT_VAL("wrong leader", perf_evsel__is_group_leader(evsel));
+ TEST_ASSERT_VAL("wrong leader", evsel__is_group_leader(evsel));
TEST_ASSERT_VAL("wrong group name",
!strcmp(leader->group_name, "group1"));
TEST_ASSERT_VAL("wrong core.nr_members", evsel->core.nr_members == 2);
- TEST_ASSERT_VAL("wrong group_idx", perf_evsel__group_idx(evsel) == 0);
+ TEST_ASSERT_VAL("wrong group_idx", evsel__group_idx(evsel) == 0);
TEST_ASSERT_VAL("wrong sample_read", !evsel->sample_read);
/* group1 cycles:kppp */
- evsel = perf_evsel__next(evsel);
+ evsel = evsel__next(evsel);
TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type);
TEST_ASSERT_VAL("wrong config",
PERF_COUNT_HW_CPU_CYCLES == evsel->core.attr.config);
TEST_ASSERT_VAL("wrong precise_ip", evsel->core.attr.precise_ip == 3);
TEST_ASSERT_VAL("wrong leader", evsel->leader == leader);
TEST_ASSERT_VAL("wrong group name", !evsel->group_name);
- TEST_ASSERT_VAL("wrong group_idx", perf_evsel__group_idx(evsel) == 1);
+ TEST_ASSERT_VAL("wrong group_idx", evsel__group_idx(evsel) == 1);
TEST_ASSERT_VAL("wrong sample_read", !evsel->sample_read);
/* group2 cycles + G modifier */
- evsel = leader = perf_evsel__next(evsel);
+ evsel = leader = evsel__next(evsel);
TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type);
TEST_ASSERT_VAL("wrong config",
PERF_COUNT_HW_CPU_CYCLES == evsel->core.attr.config);
TEST_ASSERT_VAL("wrong exclude guest", !evsel->core.attr.exclude_guest);
TEST_ASSERT_VAL("wrong exclude host", evsel->core.attr.exclude_host);
TEST_ASSERT_VAL("wrong precise_ip", !evsel->core.attr.precise_ip);
- TEST_ASSERT_VAL("wrong leader", perf_evsel__is_group_leader(evsel));
+ TEST_ASSERT_VAL("wrong leader", evsel__is_group_leader(evsel));
TEST_ASSERT_VAL("wrong group name",
!strcmp(leader->group_name, "group2"));
TEST_ASSERT_VAL("wrong core.nr_members", evsel->core.nr_members == 2);
- TEST_ASSERT_VAL("wrong group_idx", perf_evsel__group_idx(evsel) == 0);
+ TEST_ASSERT_VAL("wrong group_idx", evsel__group_idx(evsel) == 0);
TEST_ASSERT_VAL("wrong sample_read", !evsel->sample_read);
/* group2 1:3 + G modifier */
- evsel = perf_evsel__next(evsel);
+ evsel = evsel__next(evsel);
TEST_ASSERT_VAL("wrong type", 1 == evsel->core.attr.type);
TEST_ASSERT_VAL("wrong config", 3 == evsel->core.attr.config);
TEST_ASSERT_VAL("wrong exclude_user", !evsel->core.attr.exclude_user);
TEST_ASSERT_VAL("wrong exclude host", evsel->core.attr.exclude_host);
TEST_ASSERT_VAL("wrong precise_ip", !evsel->core.attr.precise_ip);
TEST_ASSERT_VAL("wrong leader", evsel->leader == leader);
- TEST_ASSERT_VAL("wrong group_idx", perf_evsel__group_idx(evsel) == 1);
+ TEST_ASSERT_VAL("wrong group_idx", evsel__group_idx(evsel) == 1);
TEST_ASSERT_VAL("wrong sample_read", !evsel->sample_read);
/* instructions:u */
- evsel = perf_evsel__next(evsel);
+ evsel = evsel__next(evsel);
TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type);
TEST_ASSERT_VAL("wrong config",
PERF_COUNT_HW_INSTRUCTIONS == evsel->core.attr.config);
TEST_ASSERT_VAL("wrong exclude guest", !evsel->core.attr.exclude_guest);
TEST_ASSERT_VAL("wrong exclude host", !evsel->core.attr.exclude_host);
TEST_ASSERT_VAL("wrong precise_ip", !evsel->core.attr.precise_ip);
- TEST_ASSERT_VAL("wrong leader", perf_evsel__is_group_leader(evsel));
+ TEST_ASSERT_VAL("wrong leader", evsel__is_group_leader(evsel));
TEST_ASSERT_VAL("wrong sample_read", !evsel->sample_read);
return 0;
TEST_ASSERT_VAL("wrong exclude host", !evsel->core.attr.exclude_host);
TEST_ASSERT_VAL("wrong precise_ip", evsel->core.attr.precise_ip == 1);
TEST_ASSERT_VAL("wrong group name", !evsel->group_name);
- TEST_ASSERT_VAL("wrong leader", perf_evsel__is_group_leader(evsel));
+ TEST_ASSERT_VAL("wrong leader", evsel__is_group_leader(evsel));
TEST_ASSERT_VAL("wrong core.nr_members", evsel->core.nr_members == 2);
- TEST_ASSERT_VAL("wrong group_idx", perf_evsel__group_idx(evsel) == 0);
+ TEST_ASSERT_VAL("wrong group_idx", evsel__group_idx(evsel) == 0);
TEST_ASSERT_VAL("wrong sample_read", !evsel->sample_read);
/* instructions:kp + p */
- evsel = perf_evsel__next(evsel);
+ evsel = evsel__next(evsel);
TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type);
TEST_ASSERT_VAL("wrong config",
PERF_COUNT_HW_INSTRUCTIONS == evsel->core.attr.config);
TEST_ASSERT_VAL("wrong exclude host", !evsel->core.attr.exclude_host);
TEST_ASSERT_VAL("wrong precise_ip", evsel->core.attr.precise_ip == 2);
TEST_ASSERT_VAL("wrong leader", evsel->leader == leader);
- TEST_ASSERT_VAL("wrong group_idx", perf_evsel__group_idx(evsel) == 1);
+ TEST_ASSERT_VAL("wrong group_idx", evsel__group_idx(evsel) == 1);
TEST_ASSERT_VAL("wrong sample_read", !evsel->sample_read);
return 0;
TEST_ASSERT_VAL("wrong exclude host", evsel->core.attr.exclude_host);
TEST_ASSERT_VAL("wrong precise_ip", !evsel->core.attr.precise_ip);
TEST_ASSERT_VAL("wrong group name", !evsel->group_name);
- TEST_ASSERT_VAL("wrong leader", perf_evsel__is_group_leader(evsel));
+ TEST_ASSERT_VAL("wrong leader", evsel__is_group_leader(evsel));
TEST_ASSERT_VAL("wrong core.nr_members", evsel->core.nr_members == 2);
- TEST_ASSERT_VAL("wrong group_idx", perf_evsel__group_idx(evsel) == 0);
+ TEST_ASSERT_VAL("wrong group_idx", evsel__group_idx(evsel) == 0);
TEST_ASSERT_VAL("wrong sample_read", !evsel->sample_read);
/* instructions + G */
- evsel = perf_evsel__next(evsel);
+ evsel = evsel__next(evsel);
TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type);
TEST_ASSERT_VAL("wrong config",
PERF_COUNT_HW_INSTRUCTIONS == evsel->core.attr.config);
TEST_ASSERT_VAL("wrong exclude host", evsel->core.attr.exclude_host);
TEST_ASSERT_VAL("wrong precise_ip", !evsel->core.attr.precise_ip);
TEST_ASSERT_VAL("wrong leader", evsel->leader == leader);
- TEST_ASSERT_VAL("wrong group_idx", perf_evsel__group_idx(evsel) == 1);
+ TEST_ASSERT_VAL("wrong group_idx", evsel__group_idx(evsel) == 1);
TEST_ASSERT_VAL("wrong sample_read", !evsel->sample_read);
/* cycles:G */
- evsel = leader = perf_evsel__next(evsel);
+ evsel = leader = evsel__next(evsel);
TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type);
TEST_ASSERT_VAL("wrong config",
PERF_COUNT_HW_CPU_CYCLES == evsel->core.attr.config);
TEST_ASSERT_VAL("wrong exclude host", evsel->core.attr.exclude_host);
TEST_ASSERT_VAL("wrong precise_ip", !evsel->core.attr.precise_ip);
TEST_ASSERT_VAL("wrong group name", !evsel->group_name);
- TEST_ASSERT_VAL("wrong leader", perf_evsel__is_group_leader(evsel));
+ TEST_ASSERT_VAL("wrong leader", evsel__is_group_leader(evsel));
TEST_ASSERT_VAL("wrong core.nr_members", evsel->core.nr_members == 2);
- TEST_ASSERT_VAL("wrong group_idx", perf_evsel__group_idx(evsel) == 0);
+ TEST_ASSERT_VAL("wrong group_idx", evsel__group_idx(evsel) == 0);
TEST_ASSERT_VAL("wrong sample_read", !evsel->sample_read);
/* instructions:G */
- evsel = perf_evsel__next(evsel);
+ evsel = evsel__next(evsel);
TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type);
TEST_ASSERT_VAL("wrong config",
PERF_COUNT_HW_INSTRUCTIONS == evsel->core.attr.config);
TEST_ASSERT_VAL("wrong exclude host", evsel->core.attr.exclude_host);
TEST_ASSERT_VAL("wrong precise_ip", !evsel->core.attr.precise_ip);
TEST_ASSERT_VAL("wrong leader", evsel->leader == leader);
- TEST_ASSERT_VAL("wrong group_idx", perf_evsel__group_idx(evsel) == 1);
+ TEST_ASSERT_VAL("wrong group_idx", evsel__group_idx(evsel) == 1);
/* cycles */
- evsel = perf_evsel__next(evsel);
+ evsel = evsel__next(evsel);
TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type);
TEST_ASSERT_VAL("wrong config",
PERF_COUNT_HW_CPU_CYCLES == evsel->core.attr.config);
TEST_ASSERT_VAL("wrong exclude guest", evsel->core.attr.exclude_guest);
TEST_ASSERT_VAL("wrong exclude host", !evsel->core.attr.exclude_host);
TEST_ASSERT_VAL("wrong precise_ip", !evsel->core.attr.precise_ip);
- TEST_ASSERT_VAL("wrong leader", perf_evsel__is_group_leader(evsel));
+ TEST_ASSERT_VAL("wrong leader", evsel__is_group_leader(evsel));
return 0;
}
TEST_ASSERT_VAL("wrong exclude host", !evsel->core.attr.exclude_host);
TEST_ASSERT_VAL("wrong precise_ip", !evsel->core.attr.precise_ip);
TEST_ASSERT_VAL("wrong group name", !evsel->group_name);
- TEST_ASSERT_VAL("wrong leader", perf_evsel__is_group_leader(evsel));
+ TEST_ASSERT_VAL("wrong leader", evsel__is_group_leader(evsel));
TEST_ASSERT_VAL("wrong core.nr_members", evsel->core.nr_members == 2);
- TEST_ASSERT_VAL("wrong group_idx", perf_evsel__group_idx(evsel) == 0);
+ TEST_ASSERT_VAL("wrong group_idx", evsel__group_idx(evsel) == 0);
/* cache-misses:G + :H group modifier */
- evsel = perf_evsel__next(evsel);
+ evsel = evsel__next(evsel);
TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type);
TEST_ASSERT_VAL("wrong config",
PERF_COUNT_HW_CACHE_MISSES == evsel->core.attr.config);
TEST_ASSERT_VAL("wrong exclude host", !evsel->core.attr.exclude_host);
TEST_ASSERT_VAL("wrong precise_ip", !evsel->core.attr.precise_ip);
TEST_ASSERT_VAL("wrong leader", evsel->leader == leader);
- TEST_ASSERT_VAL("wrong group_idx", perf_evsel__group_idx(evsel) == 1);
+ TEST_ASSERT_VAL("wrong group_idx", evsel__group_idx(evsel) == 1);
return 0;
}
TEST_ASSERT_VAL("wrong exclude host", evsel->core.attr.exclude_host);
TEST_ASSERT_VAL("wrong precise_ip", !evsel->core.attr.precise_ip);
TEST_ASSERT_VAL("wrong group name", !evsel->group_name);
- TEST_ASSERT_VAL("wrong leader", perf_evsel__is_group_leader(evsel));
+ TEST_ASSERT_VAL("wrong leader", evsel__is_group_leader(evsel));
TEST_ASSERT_VAL("wrong core.nr_members", evsel->core.nr_members == 2);
- TEST_ASSERT_VAL("wrong group_idx", perf_evsel__group_idx(evsel) == 0);
+ TEST_ASSERT_VAL("wrong group_idx", evsel__group_idx(evsel) == 0);
/* cache-misses:H + :G group modifier */
- evsel = perf_evsel__next(evsel);
+ evsel = evsel__next(evsel);
TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type);
TEST_ASSERT_VAL("wrong config",
PERF_COUNT_HW_CACHE_MISSES == evsel->core.attr.config);
TEST_ASSERT_VAL("wrong exclude host", !evsel->core.attr.exclude_host);
TEST_ASSERT_VAL("wrong precise_ip", !evsel->core.attr.precise_ip);
TEST_ASSERT_VAL("wrong leader", evsel->leader == leader);
- TEST_ASSERT_VAL("wrong group_idx", perf_evsel__group_idx(evsel) == 1);
+ TEST_ASSERT_VAL("wrong group_idx", evsel__group_idx(evsel) == 1);
return 0;
}
TEST_ASSERT_VAL("wrong exclude host", evsel->core.attr.exclude_host);
TEST_ASSERT_VAL("wrong precise_ip", !evsel->core.attr.precise_ip);
TEST_ASSERT_VAL("wrong group name", !evsel->group_name);
- TEST_ASSERT_VAL("wrong leader", perf_evsel__is_group_leader(evsel));
+ TEST_ASSERT_VAL("wrong leader", evsel__is_group_leader(evsel));
TEST_ASSERT_VAL("wrong core.nr_members", evsel->core.nr_members == 2);
- TEST_ASSERT_VAL("wrong group_idx", perf_evsel__group_idx(evsel) == 0);
+ TEST_ASSERT_VAL("wrong group_idx", evsel__group_idx(evsel) == 0);
/* cache-misses:H + :u group modifier */
- evsel = perf_evsel__next(evsel);
+ evsel = evsel__next(evsel);
TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type);
TEST_ASSERT_VAL("wrong config",
PERF_COUNT_HW_CACHE_MISSES == evsel->core.attr.config);
TEST_ASSERT_VAL("wrong exclude host", !evsel->core.attr.exclude_host);
TEST_ASSERT_VAL("wrong precise_ip", !evsel->core.attr.precise_ip);
TEST_ASSERT_VAL("wrong leader", evsel->leader == leader);
- TEST_ASSERT_VAL("wrong group_idx", perf_evsel__group_idx(evsel) == 1);
+ TEST_ASSERT_VAL("wrong group_idx", evsel__group_idx(evsel) == 1);
return 0;
}
TEST_ASSERT_VAL("wrong exclude host", evsel->core.attr.exclude_host);
TEST_ASSERT_VAL("wrong precise_ip", !evsel->core.attr.precise_ip);
TEST_ASSERT_VAL("wrong group name", !evsel->group_name);
- TEST_ASSERT_VAL("wrong leader", perf_evsel__is_group_leader(evsel));
+ TEST_ASSERT_VAL("wrong leader", evsel__is_group_leader(evsel));
TEST_ASSERT_VAL("wrong core.nr_members", evsel->core.nr_members == 2);
- TEST_ASSERT_VAL("wrong group_idx", perf_evsel__group_idx(evsel) == 0);
+ TEST_ASSERT_VAL("wrong group_idx", evsel__group_idx(evsel) == 0);
/* cache-misses:H + :uG group modifier */
- evsel = perf_evsel__next(evsel);
+ evsel = evsel__next(evsel);
TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type);
TEST_ASSERT_VAL("wrong config",
PERF_COUNT_HW_CACHE_MISSES == evsel->core.attr.config);
TEST_ASSERT_VAL("wrong exclude host", !evsel->core.attr.exclude_host);
TEST_ASSERT_VAL("wrong precise_ip", !evsel->core.attr.precise_ip);
TEST_ASSERT_VAL("wrong leader", evsel->leader == leader);
- TEST_ASSERT_VAL("wrong group_idx", perf_evsel__group_idx(evsel) == 1);
+ TEST_ASSERT_VAL("wrong group_idx", evsel__group_idx(evsel) == 1);
return 0;
}
TEST_ASSERT_VAL("wrong sample_read", evsel->sample_read);
/* cache-misses - not sampling */
- evsel = perf_evsel__next(evsel);
+ evsel = evsel__next(evsel);
TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type);
TEST_ASSERT_VAL("wrong config",
PERF_COUNT_HW_CACHE_MISSES == evsel->core.attr.config);
TEST_ASSERT_VAL("wrong sample_read", evsel->sample_read);
/* branch-misses - not sampling */
- evsel = perf_evsel__next(evsel);
+ evsel = evsel__next(evsel);
TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type);
TEST_ASSERT_VAL("wrong config",
PERF_COUNT_HW_BRANCH_MISSES == evsel->core.attr.config);
TEST_ASSERT_VAL("wrong sample_read", evsel->sample_read);
/* branch-misses - not sampling */
- evsel = perf_evsel__next(evsel);
+ evsel = evsel__next(evsel);
TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type);
TEST_ASSERT_VAL("wrong config",
PERF_COUNT_HW_BRANCH_MISSES == evsel->core.attr.config);
TEST_ASSERT_VAL("wrong pinned", evsel->core.attr.pinned);
/* cache-misses - can not be pinned, but will go on with the leader */
- evsel = perf_evsel__next(evsel);
+ evsel = evsel__next(evsel);
TEST_ASSERT_VAL("wrong type", PERF_TYPE_HARDWARE == evsel->core.attr.type);
TEST_ASSERT_VAL("wrong config",
PERF_COUNT_HW_CACHE_MISSES == evsel->core.attr.config);
TEST_ASSERT_VAL("wrong pinned", !evsel->core.attr.pinned);
/* branch-misses - ditto */
- evsel = perf_evsel__next(evsel);
+ evsel = evsel__next(evsel);
TEST_ASSERT_VAL("wrong config",
PERF_COUNT_HW_BRANCH_MISSES == evsel->core.attr.config);
TEST_ASSERT_VAL("wrong pinned", !evsel->core.attr.pinned);
return 0;
}
+static int test__checkevent_raw_pmu(struct evlist *evlist)
+{
+ struct evsel *evsel = evlist__first(evlist);
+
+ TEST_ASSERT_VAL("wrong number of entries", 1 == evlist->core.nr_entries);
+ TEST_ASSERT_VAL("wrong type", PERF_TYPE_SOFTWARE == evsel->core.attr.type);
+ TEST_ASSERT_VAL("wrong config", 0x1a == evsel->core.attr.config);
+ return 0;
+}
+
static int test__sym_event_slash(struct evlist *evlist)
{
struct evsel *evsel = evlist__first(evlist);
.name = "cpu/name='COMPLEX_CYCLES_NAME:orig=cycles,desc=chip-clock-ticks',period=0x1,event=0x2/ukp",
.check = test__checkevent_complex_name,
.id = 3,
- }
+ },
+ {
+ .name = "software/r1a/",
+ .check = test__checkevent_raw_pmu,
+ .id = 4,
+ },
};
struct terms_test {
* Config the evsels, setting attr->comm on the first one, etc.
*/
evsel = evlist__first(evlist);
- perf_evsel__set_sample_bit(evsel, CPU);
- perf_evsel__set_sample_bit(evsel, TID);
- perf_evsel__set_sample_bit(evsel, TIME);
+ evsel__set_sample_bit(evsel, CPU);
+ evsel__set_sample_bit(evsel, TID);
+ evsel__set_sample_bit(evsel, TIME);
perf_evlist__config(evlist, &opts, NULL);
err = sched__get_first_possible_cpu(evlist->workload.pid, &cpu_mask);
goto out_free;
}
- evsel.sample_size = __perf_evsel__sample_size(sample_type);
+ evsel.sample_size = __evsel__sample_size(sample_type);
- err = perf_evsel__parse_sample(&evsel, event, &sample_out);
+ err = evsel__parse_sample(&evsel, event, &sample_out);
if (err) {
pr_debug("%s failed for sample_type %#"PRIx64", error %d\n",
- "perf_evsel__parse_sample", sample_type, err);
+ "evsel__parse_sample", sample_type, err);
goto out_free;
}
evsel = perf_evlist__id2evsel(evlist, sample.id);
if (evsel == switch_tracking->switch_evsel) {
- next_tid = perf_evsel__intval(evsel, &sample, "next_pid");
- prev_tid = perf_evsel__intval(evsel, &sample, "prev_pid");
+ next_tid = evsel__intval(evsel, &sample, "next_pid");
+ prev_tid = evsel__intval(evsel, &sample, "prev_pid");
cpu = sample.cpu;
pr_debug3("sched_switch: cpu: %d prev_tid %d next_tid %d\n",
cpu, prev_tid, next_tid);
switch_evsel = evlist__last(evlist);
- perf_evsel__set_sample_bit(switch_evsel, CPU);
- perf_evsel__set_sample_bit(switch_evsel, TIME);
+ evsel__set_sample_bit(switch_evsel, CPU);
+ evsel__set_sample_bit(switch_evsel, TIME);
switch_evsel->core.system_wide = true;
switch_evsel->no_aux_samples = true;
goto out_err;
}
- perf_evsel__set_sample_bit(cycles_evsel, CPU);
- perf_evsel__set_sample_bit(cycles_evsel, TIME);
+ evsel__set_sample_bit(cycles_evsel, CPU);
+ evsel__set_sample_bit(cycles_evsel, TIME);
/* Fourth event */
err = parse_events(evlist, "dummy:u", NULL);
tracking_evsel->core.attr.freq = 0;
tracking_evsel->core.attr.sample_period = 1;
- perf_evsel__set_sample_bit(tracking_evsel, TIME);
+ evsel__set_sample_bit(tracking_evsel, TIME);
/* Config events */
perf_evlist__config(evlist, &opts, NULL);
int test__maps__merge_in(struct test *t, int subtest);
int test__time_utils(struct test *t, int subtest);
int test__jit_write_elf(struct test *test, int subtest);
+int test__api_io(struct test *test, int subtest);
bool test__bp_signal_is_supported(void);
bool test__bp_account_is_supported(void);
{
struct perf_session *session;
struct perf_data data = {
- .file = {
- .path = path,
- },
- .mode = PERF_DATA_MODE_WRITE,
+ .path = path,
+ .mode = PERF_DATA_MODE_WRITE,
};
session = perf_session__new(&data, false, NULL);
{
struct perf_session *session;
struct perf_data data = {
- .file = {
- .path = path,
- },
- .mode = PERF_DATA_MODE_READ,
+ .path = path,
+ .mode = PERF_DATA_MODE_READ,
};
int i;
struct hists *hists = evsel__hists(evsel);
bool current_entry = ui_browser__is_current_entry(browser, row);
unsigned long nr_events = hists->stats.nr_events[PERF_RECORD_SAMPLE];
- const char *ev_name = perf_evsel__name(evsel);
+ const char *ev_name = evsel__name(evsel);
char bf[256], unit;
const char *warn = " ";
size_t printed;
ui_browser__set_color(browser, current_entry ? HE_COLORSET_SELECTED :
HE_COLORSET_NORMAL);
- if (perf_evsel__is_group_event(evsel)) {
+ if (evsel__is_group_event(evsel)) {
struct evsel *pos;
- ev_name = perf_evsel__group_name(evsel);
+ ev_name = evsel__group_name(evsel);
for_each_group_member(pos, evsel) {
struct hists *pos_hists = evsel__hists(pos);
if (pos->core.node.next == &evlist->core.entries)
pos = evlist__first(evlist);
else
- pos = perf_evsel__next(pos);
+ pos = evsel__next(pos);
goto browse_hists;
case K_UNTAB:
if (pos->core.node.prev == &evlist->core.entries)
pos = evlist__last(evlist);
else
- pos = perf_evsel__prev(pos);
+ pos = evsel__prev(pos);
goto browse_hists;
case K_SWITCH_INPUT_DATA:
case K_RELOAD:
{
struct evsel *evsel = list_entry(entry, struct evsel, core.node);
- if (symbol_conf.event_group && !perf_evsel__is_group_leader(evsel))
+ if (symbol_conf.event_group && !evsel__is_group_leader(evsel))
return true;
return false;
ui_helpline__push("Press ESC to exit");
evlist__for_each_entry(evlist, pos) {
- const char *ev_name = perf_evsel__name(pos);
+ const char *ev_name = evsel__name(pos);
size_t line_len = strlen(ev_name) + 7;
if (menu.b.width < line_len)
nr_entries = 0;
evlist__for_each_entry(evlist, pos) {
- if (perf_evsel__is_group_leader(pos))
+ if (evsel__is_group_leader(pos))
nr_entries++;
}
size_t size)
{
struct hists *hists = evsel__hists(browser->block_evsel);
- const char *evname = perf_evsel__name(browser->block_evsel);
+ const char *evname = evsel__name(browser->block_evsel);
unsigned long nr_samples = hists->stats.nr_events[PERF_RECORD_SAMPLE];
int ret;
gtk_list_store_append(store, &iter);
- if (perf_evsel__is_group_event(evsel)) {
+ if (evsel__is_group_event(evsel)) {
for (i = 0; i < evsel->core.nr_members; i++) {
ret += perf_gtk__get_percent(s + ret,
sizeof(s) - ret,
evlist__for_each_entry(evlist, pos) {
struct hists *hists = evsel__hists(pos);
- const char *evname = perf_evsel__name(pos);
+ const char *evname = evsel__name(pos);
GtkWidget *scrolled_window;
GtkWidget *tab_label;
char buf[512];
size_t size = sizeof(buf);
if (symbol_conf.event_group) {
- if (!perf_evsel__is_group_leader(pos))
+ if (!evsel__is_group_leader(pos))
continue;
if (pos->core.nr_members > 1) {
- perf_evsel__group_desc(pos, buf, size);
+ evsel__group_desc(pos, buf, size);
evname = buf;
}
}
} else
ret = hpp__call_print_fn(hpp, print_fn, fmt, len, get_field(he));
- if (perf_evsel__is_group_event(evsel)) {
+ if (evsel__is_group_event(evsel)) {
int prev_idx, idx_delta;
struct hist_entry *pair;
int nr_members = evsel->core.nr_members;
- prev_idx = perf_evsel__group_idx(evsel);
+ prev_idx = evsel__group_idx(evsel);
list_for_each_entry(pair, &he->pairs.head, pairs.node) {
u64 period = get_field(pair);
continue;
evsel = hists_to_evsel(pair->hists);
- idx_delta = perf_evsel__group_idx(evsel) - prev_idx - 1;
+ idx_delta = evsel__group_idx(evsel) - prev_idx - 1;
while (idx_delta--) {
/*
len, period);
}
- prev_idx = perf_evsel__group_idx(evsel);
+ prev_idx = evsel__group_idx(evsel);
}
idx_delta = nr_members - prev_idx - 1;
list_for_each_entry(pair, &a->pairs.head, pairs.node) {
struct evsel *evsel = hists_to_evsel(pair->hists);
- fa[perf_evsel__group_idx(evsel)] = get_field(pair);
+ fa[evsel__group_idx(evsel)] = get_field(pair);
}
list_for_each_entry(pair, &b->pairs.head, pairs.node) {
struct evsel *evsel = hists_to_evsel(pair->hists);
- fb[perf_evsel__group_idx(evsel)] = get_field(pair);
+ fb[evsel__group_idx(evsel)] = get_field(pair);
}
*fields_a = fa;
int cmp, nr_members, ret, i;
cmp = field_cmp(get_field(a), get_field(b));
- if (!perf_evsel__is_group_event(evsel))
+ if (!evsel__is_group_event(evsel))
return cmp;
nr_members = evsel->core.nr_members;
return ret;
evsel = hists_to_evsel(a->hists);
- if (!perf_evsel__is_group_event(evsel))
+ if (!evsel__is_group_event(evsel))
return ret;
nr_members = evsel->core.nr_members;
perf-y += env.o
perf-y += event.o
perf-y += evlist.o
+perf-y += sideband_evlist.o
perf-y += evsel.o
perf-y += evsel_fprintf.o
perf-y += perf_event_attr_fprintf.o
perf-y += stat.o
perf-y += stat-shadow.o
perf-y += stat-display.o
+perf-y += perf_api_probe.o
perf-y += record.o
perf-y += srcline.o
perf-y += srccode.o
struct disasm_line *dl = NULL;
int nr = 1;
- if (perf_evsel__is_group_event(args->evsel))
+ if (evsel__is_group_event(args->evsel))
nr = args->evsel->core.nr_members;
dl = zalloc(disasm_line_size(nr));
if (queue)
return -1;
- if (perf_evsel__is_group_event(evsel))
+ if (evsel__is_group_event(evsel))
width *= evsel->core.nr_members;
if (!*al->line)
}
#endif // defined(HAVE_LIBBFD_SUPPORT) && defined(HAVE_LIBBPF_SUPPORT)
+static int
+symbol__disassemble_bpf_image(struct symbol *sym,
+ struct annotate_args *args)
+{
+ struct annotation *notes = symbol__annotation(sym);
+ struct disasm_line *dl;
+
+ args->offset = -1;
+ args->line = strdup("to be implemented");
+ args->line_nr = 0;
+ dl = disasm_line__new(args);
+ if (dl)
+ annotation_line__add(&dl->al, ¬es->src->source);
+
+ free(args->line);
+ return 0;
+}
+
/*
* Possibly create a new version of line with tabs expanded. Returns the
* existing or new line, storage is updated if a new line is allocated. If
if (dso->binary_type == DSO_BINARY_TYPE__BPF_PROG_INFO) {
return symbol__disassemble_bpf(sym, args);
+ } else if (dso->binary_type == DSO_BINARY_TYPE__BPF_IMAGE) {
+ return symbol__disassemble_bpf_image(sym, args);
} else if (dso__is_kcore(dso)) {
kce.kcore_filename = symfs_filename;
kce.addr = map__rip_2objdump(map, sym->start);
.evsel = evsel,
.options = options,
};
- struct perf_env *env = perf_evsel__env(evsel);
+ struct perf_env *env = evsel__env(evsel);
const char *arch_name = perf_env__arch(env);
struct arch *arch;
int err;
struct dso *dso = map->dso;
char *filename;
const char *d_filename;
- const char *evsel_name = perf_evsel__name(evsel);
+ const char *evsel_name = evsel__name(evsel);
struct annotation *notes = symbol__annotation(sym);
struct sym_hist *h = annotation__histogram(notes, evsel->idx);
struct annotation_line *pos, *queue = NULL;
len = symbol__size(sym);
- if (perf_evsel__is_group_event(evsel)) {
+ if (evsel__is_group_event(evsel)) {
width *= evsel->core.nr_members;
- perf_evsel__group_desc(evsel, buf, sizeof(buf));
+ evsel__group_desc(evsel, buf, sizeof(buf));
evsel_name = buf;
}
int map_symbol__annotation_dump(struct map_symbol *ms, struct evsel *evsel,
struct annotation_options *opts)
{
- const char *ev_name = perf_evsel__name(evsel);
+ const char *ev_name = evsel__name(evsel);
char buf[1024];
char *filename;
int err = -1;
if (fp == NULL)
goto out_free_filename;
- if (perf_evsel__is_group_event(evsel)) {
- perf_evsel__group_desc(evsel, buf, sizeof(buf));
+ if (evsel__is_group_event(evsel)) {
+ evsel__group_desc(evsel, buf, sizeof(buf));
ev_name = buf;
}
if (notes->offsets == NULL)
return ENOMEM;
- if (perf_evsel__is_group_event(evsel))
+ if (evsel__is_group_event(evsel))
nr_pcnt = evsel->core.nr_members;
err = symbol__annotate(ms, evsel, options, parch);
free(spe);
}
+static bool arm_spe_evsel_is_auxtrace(struct perf_session *session,
+ struct evsel *evsel)
+{
+ struct arm_spe *spe = container_of(session->auxtrace, struct arm_spe, auxtrace);
+
+ return evsel->core.attr.type == spe->pmu_type;
+}
+
static const char * const arm_spe_info_fmts[] = {
[ARM_SPE_PMU_TYPE] = " PMU Type %"PRId64"\n",
};
spe->auxtrace.flush_events = arm_spe_flush;
spe->auxtrace.free_events = arm_spe_free_events;
spe->auxtrace.free = arm_spe_free;
+ spe->auxtrace.evsel_is_auxtrace = arm_spe_evsel_is_auxtrace;
session->auxtrace = &spe->auxtrace;
arm_spe_print_info(&auxtrace_info->priv[0]);
#include "evsel.h"
#include "evsel_config.h"
#include "symbol.h"
+#include "util/perf_api_probe.h"
#include "util/synthetic-events.h"
#include "thread_map.h"
#include "asm/bug.h"
#include "symbol/kallsyms.h"
#include <internal/lib.h>
-static struct perf_pmu *perf_evsel__find_pmu(struct evsel *evsel)
-{
- struct perf_pmu *pmu = NULL;
-
- while ((pmu = perf_pmu__scan(pmu)) != NULL) {
- if (pmu->type == evsel->core.attr.type)
- break;
- }
-
- return pmu;
-}
-
-static bool perf_evsel__is_aux_event(struct evsel *evsel)
-{
- struct perf_pmu *pmu = perf_evsel__find_pmu(evsel);
-
- return pmu && pmu->auxtrace;
-}
-
/*
* Make a group from 'leader' to 'last', requiring that the events were not
* already grouped to a different leader.
struct evsel *evsel;
bool grp;
- if (!perf_evsel__is_group_leader(leader))
+ if (!evsel__is_group_leader(leader))
return -EINVAL;
grp = false;
evlist__for_each_entry(evlist, evsel) {
sz = evsel->core.attr.aux_sample_size;
- if (perf_evsel__is_group_leader(evsel)) {
- has_aux_leader = perf_evsel__is_aux_event(evsel);
+ if (evsel__is_group_leader(evsel)) {
+ has_aux_leader = evsel__is_aux_event(evsel);
if (sz) {
if (has_aux_leader)
pr_err("Cannot add AUX area sampling to an AUX area event\n");
pr_err("Cannot add AUX area sampling because group leader is not an AUX area event\n");
return -EINVAL;
}
- perf_evsel__set_sample_bit(evsel, AUX);
+ evsel__set_sample_bit(evsel, AUX);
opts->auxtrace_sample_mode = true;
} else {
- perf_evsel__reset_sample_bit(evsel, AUX);
+ evsel__reset_sample_bit(evsel, AUX);
}
}
/* Set aux_sample_size based on --aux-sample option */
evlist__for_each_entry(evlist, evsel) {
- if (perf_evsel__is_group_leader(evsel)) {
- has_aux_leader = perf_evsel__is_aux_event(evsel);
+ if (evsel__is_group_leader(evsel)) {
+ has_aux_leader = evsel__is_aux_event(evsel);
} else if (has_aux_leader) {
evsel->core.attr.aux_sample_size = sz;
}
aux_evsel = NULL;
/* Override with aux_sample_size from config term */
evlist__for_each_entry(evlist, evsel) {
- if (perf_evsel__is_aux_event(evsel))
+ if (evsel__is_aux_event(evsel))
aux_evsel = evsel;
term = perf_evsel__get_config_term(evsel, AUX_SAMPLE_SIZE);
if (term) {
return err;
}
+static void unleader_evsel(struct evlist *evlist, struct evsel *leader)
+{
+ struct evsel *new_leader = NULL;
+ struct evsel *evsel;
+
+ /* Find new leader for the group */
+ evlist__for_each_entry(evlist, evsel) {
+ if (evsel->leader != leader || evsel == leader)
+ continue;
+ if (!new_leader)
+ new_leader = evsel;
+ evsel->leader = new_leader;
+ }
+
+ /* Update group information */
+ if (new_leader) {
+ zfree(&new_leader->group_name);
+ new_leader->group_name = leader->group_name;
+ leader->group_name = NULL;
+
+ new_leader->core.nr_members = leader->core.nr_members - 1;
+ leader->core.nr_members = 1;
+ }
+}
+
+static void unleader_auxtrace(struct perf_session *session)
+{
+ struct evsel *evsel;
+
+ evlist__for_each_entry(session->evlist, evsel) {
+ if (auxtrace__evsel_is_auxtrace(session, evsel) &&
+ evsel__is_group_leader(evsel)) {
+ unleader_evsel(session->evlist, evsel);
+ }
+ }
+}
+
int perf_event__process_auxtrace_info(struct perf_session *session,
union perf_event *event)
{
enum auxtrace_type type = event->auxtrace_info.type;
+ int err;
if (dump_trace)
fprintf(stdout, " type: %u\n", type);
switch (type) {
case PERF_AUXTRACE_INTEL_PT:
- return intel_pt_process_auxtrace_info(event, session);
+ err = intel_pt_process_auxtrace_info(event, session);
+ break;
case PERF_AUXTRACE_INTEL_BTS:
- return intel_bts_process_auxtrace_info(event, session);
+ err = intel_bts_process_auxtrace_info(event, session);
+ break;
case PERF_AUXTRACE_ARM_SPE:
- return arm_spe_process_auxtrace_info(event, session);
+ err = arm_spe_process_auxtrace_info(event, session);
+ break;
case PERF_AUXTRACE_CS_ETM:
- return cs_etm__process_auxtrace_info(event, session);
+ err = cs_etm__process_auxtrace_info(event, session);
+ break;
case PERF_AUXTRACE_S390_CPUMSF:
- return s390_cpumsf_process_auxtrace_info(event, session);
+ err = s390_cpumsf_process_auxtrace_info(event, session);
+ break;
case PERF_AUXTRACE_UNKNOWN:
default:
return -EINVAL;
}
+
+ if (err)
+ return err;
+
+ unleader_auxtrace(session);
+
+ return 0;
}
s64 perf_event__process_auxtrace(struct perf_session *session,
synth_opts->branches = true;
synth_opts->returns = true;
break;
+ case 'G':
case 'g':
- synth_opts->callchain = true;
+ if (p[-1] == 'G')
+ synth_opts->add_callchain = true;
+ else
+ synth_opts->callchain = true;
synth_opts->callchain_sz =
PERF_ITRACE_DEFAULT_CALLCHAIN_SZ;
while (*p == ' ' || *p == ',')
synth_opts->callchain_sz = val;
}
break;
+ case 'L':
case 'l':
- synth_opts->last_branch = true;
+ if (p[-1] == 'L')
+ synth_opts->add_last_branch = true;
+ else
+ synth_opts->last_branch = true;
synth_opts->last_branch_sz =
PERF_ITRACE_DEFAULT_LAST_BRANCH_SZ;
while (*p == ' ' || *p == ',')
goto out_exit;
}
- if (perf_evsel__append_addr_filter(evsel, new_filter)) {
+ if (evsel__append_addr_filter(evsel, new_filter)) {
err = -ENOMEM;
goto out_exit;
}
return err;
}
-static int perf_evsel__nr_addr_filter(struct evsel *evsel)
+static int evsel__nr_addr_filter(struct evsel *evsel)
{
- struct perf_pmu *pmu = perf_evsel__find_pmu(evsel);
+ struct perf_pmu *pmu = evsel__find_pmu(evsel);
int nr_addr_filters = 0;
if (!pmu)
evlist__for_each_entry(evlist, evsel) {
filter = evsel->filter;
- max_nr = perf_evsel__nr_addr_filter(evsel);
+ max_nr = evsel__nr_addr_filter(evsel);
if (!filter || !max_nr)
continue;
evsel->filter = NULL;
return session->auxtrace->free(session);
}
+
+bool auxtrace__evsel_is_auxtrace(struct perf_session *session,
+ struct evsel *evsel)
+{
+ if (!session->auxtrace || !session->auxtrace->evsel_is_auxtrace)
+ return false;
+
+ return session->auxtrace->evsel_is_auxtrace(session, evsel);
+}
union perf_event;
struct perf_session;
struct evlist;
+struct evsel;
struct perf_tool;
struct mmap;
struct perf_sample;
* @calls: limit branch samples to calls (can be combined with @returns)
* @returns: limit branch samples to returns (can be combined with @calls)
* @callchain: add callchain to 'instructions' events
+ * @add_callchain: add callchain to existing event records
* @thread_stack: feed branches to the thread_stack
* @last_branch: add branch context to 'instruction' events
+ * @add_last_branch: add branch context to existing event records
* @callchain_sz: maximum callchain size
* @last_branch_sz: branch context size
* @period: 'instructions' events period
bool calls;
bool returns;
bool callchain;
+ bool add_callchain;
bool thread_stack;
bool last_branch;
+ bool add_last_branch;
unsigned int callchain_sz;
unsigned int last_branch_sz;
unsigned long long period;
struct perf_tool *tool);
void (*free_events)(struct perf_session *session);
void (*free)(struct perf_session *session);
+ bool (*evsel_is_auxtrace)(struct perf_session *session,
+ struct evsel *evsel);
};
/**
int auxtrace__flush_events(struct perf_session *session, struct perf_tool *tool);
void auxtrace__free_events(struct perf_session *session);
void auxtrace__free(struct perf_session *session);
+bool auxtrace__evsel_is_auxtrace(struct perf_session *session,
+ struct evsel *evsel);
#define ITRACE_HELP \
" i: synthesize instructions events\n" \
{
}
+static inline
+bool auxtrace__evsel_is_auxtrace(struct perf_session *session __maybe_unused,
+ struct evsel *evsel __maybe_unused)
+{
+ return false;
+}
+
static inline
int auxtrace_parse_filters(struct evlist *evlist __maybe_unused)
{
#include <bpf/libbpf.h>
#include <linux/btf.h>
#include <linux/err.h>
+#include <linux/string.h>
+#include <internal/lib.h>
+#include <symbol/kallsyms.h>
#include "bpf-event.h"
#include "debug.h"
#include "dso.h"
return err ? -1 : 0;
}
+struct kallsyms_parse {
+ union perf_event *event;
+ perf_event__handler_t process;
+ struct machine *machine;
+ struct perf_tool *tool;
+};
+
+static int
+process_bpf_image(char *name, u64 addr, struct kallsyms_parse *data)
+{
+ struct machine *machine = data->machine;
+ union perf_event *event = data->event;
+ struct perf_record_ksymbol *ksymbol;
+ int len;
+
+ ksymbol = &event->ksymbol;
+
+ *ksymbol = (struct perf_record_ksymbol) {
+ .header = {
+ .type = PERF_RECORD_KSYMBOL,
+ .size = offsetof(struct perf_record_ksymbol, name),
+ },
+ .addr = addr,
+ .len = page_size,
+ .ksym_type = PERF_RECORD_KSYMBOL_TYPE_BPF,
+ .flags = 0,
+ };
+
+ len = scnprintf(ksymbol->name, KSYM_NAME_LEN, "%s", name);
+ ksymbol->header.size += PERF_ALIGN(len + 1, sizeof(u64));
+ memset((void *) event + event->header.size, 0, machine->id_hdr_size);
+ event->header.size += machine->id_hdr_size;
+
+ return perf_tool__process_synth_event(data->tool, event, machine,
+ data->process);
+}
+
+static int
+kallsyms_process_symbol(void *data, const char *_name,
+ char type __maybe_unused, u64 start)
+{
+ char disp[KSYM_NAME_LEN];
+ char *module, *name;
+ unsigned long id;
+ int err = 0;
+
+ module = strchr(_name, '\t');
+ if (!module)
+ return 0;
+
+ /* We are going after [bpf] module ... */
+ if (strcmp(module + 1, "[bpf]"))
+ return 0;
+
+ name = memdup(_name, (module - _name) + 1);
+ if (!name)
+ return -ENOMEM;
+
+ name[module - _name] = 0;
+
+ /* .. and only for trampolines and dispatchers */
+ if ((sscanf(name, "bpf_trampoline_%lu", &id) == 1) ||
+ (sscanf(name, "bpf_dispatcher_%s", disp) == 1))
+ err = process_bpf_image(name, start, data);
+
+ free(name);
+ return err;
+}
+
int perf_event__synthesize_bpf_events(struct perf_session *session,
perf_event__handler_t process,
struct machine *machine,
struct record_opts *opts)
{
+ const char *kallsyms_filename = "/proc/kallsyms";
+ struct kallsyms_parse arg;
union perf_event *event;
__u32 id = 0;
int err;
event = malloc(sizeof(event->bpf) + KSYM_NAME_LEN + machine->id_hdr_size);
if (!event)
return -1;
+
+ /* Synthesize all the bpf programs in system. */
while (true) {
err = bpf_prog_get_next_id(id, &id);
if (err) {
break;
}
}
+
+ /* Synthesize all the bpf images - trampolines/dispatchers. */
+ if (symbol_conf.kallsyms_name != NULL)
+ kallsyms_filename = symbol_conf.kallsyms_name;
+
+ arg = (struct kallsyms_parse) {
+ .event = event,
+ .process = process,
+ .machine = machine,
+ .tool = session->tool,
+ };
+
+ if (kallsyms__parse(kallsyms_filename, &arg, kallsyms_process_symbol)) {
+ pr_err("%s: failed to synthesize bpf images: %s\n",
+ __func__, strerror(errno));
+ }
+
free(event);
return err;
}
return 0;
}
-int bpf_event__add_sb_event(struct evlist **evlist,
- struct perf_env *env)
+int evlist__add_bpf_sb_event(struct evlist *evlist, struct perf_env *env)
{
struct perf_event_attr attr = {
.type = PERF_TYPE_SOFTWARE,
#ifdef HAVE_LIBBPF_SUPPORT
int machine__process_bpf(struct machine *machine, union perf_event *event,
struct perf_sample *sample);
-int bpf_event__add_sb_event(struct evlist **evlist,
- struct perf_env *env);
+int evlist__add_bpf_sb_event(struct evlist *evlist, struct perf_env *env);
void bpf_event__print_bpf_prog_info(struct bpf_prog_info *info,
struct perf_env *env,
FILE *fp);
return 0;
}
-static inline int bpf_event__add_sb_event(struct evlist **evlist __maybe_unused,
- struct perf_env *env __maybe_unused)
+static inline int evlist__add_bpf_sb_event(struct evlist *evlist __maybe_unused,
+ struct perf_env *env __maybe_unused)
{
return 0;
}
return -BPF_LOADER_ERRNO__OBJCONF_MAP_EVTINH;
}
- if (perf_evsel__is_bpf_output(evsel))
+ if (evsel__is_bpf_output(evsel))
check_pass = true;
if (attr->type == PERF_TYPE_RAW)
check_pass = true;
#include "event.h"
struct branch_flags {
- u64 mispred:1;
- u64 predicted:1;
- u64 in_tx:1;
- u64 abort:1;
- u64 cycles:16;
- u64 type:4;
- u64 reserved:40;
+ union {
+ u64 value;
+ struct {
+ u64 mispred:1;
+ u64 predicted:1;
+ u64 in_tx:1;
+ u64 abort:1;
+ u64 cycles:16;
+ u64 type:4;
+ u64 reserved:40;
+ };
+ };
};
struct branch_info {
u64 ip;
struct map_symbol ms;
const char *srcline;
+ /* Indicate valid cursor node for LBR stitch */
+ bool valid;
+
bool branch;
struct branch_flags branch_flags;
u64 branch_from;
struct callchain_cursor_node *next;
};
+struct stitch_list {
+ struct list_head node;
+ struct callchain_cursor_node cursor;
+};
+
struct callchain_cursor {
u64 nr;
struct callchain_cursor_node *first;
#define CAP_SYSLOG 34
#endif
+#ifndef CAP_PERFMON
+#define CAP_PERFMON 38
+#endif
+
#endif /* __PERF_CAP_H */
static void cgroup__delete(struct cgroup *cgroup)
{
- close(cgroup->fd);
+ if (cgroup->fd >= 0)
+ close(cgroup->fd);
zfree(&cgroup->name);
free(cgroup);
}
static int perf_flag_probe(void)
{
- /* use 'safest' configuration as used in perf_evsel__fallback() */
+ /* use 'safest' configuration as used in evsel__fallback() */
struct perf_event_attr attr = {
.type = PERF_TYPE_SOFTWARE,
.config = PERF_COUNT_SW_CPU_CLOCK,
resp = cs_etm_decoder__set_tid(etmq, packet_queue,
elem, trace_chan_id);
break;
+ /* Unused packet types */
+ case OCSD_GEN_TRC_ELEM_I_RANGE_NOPATH:
case OCSD_GEN_TRC_ELEM_ADDR_NACC:
case OCSD_GEN_TRC_ELEM_CYCLE_COUNT:
case OCSD_GEN_TRC_ELEM_ADDR_UNKNOWN:
struct cs_etm_traceid_queue **traceid_queues;
};
+/* RB tree for quick conversion between traceID and metadata pointers */
+static struct intlist *traceid_list;
+
static int cs_etm__update_queues(struct cs_etm_auxtrace *etm);
static int cs_etm__process_queues(struct cs_etm_auxtrace *etm);
static int cs_etm__process_timeless_queues(struct cs_etm_auxtrace *etm,
zfree(&aux);
}
+static bool cs_etm__evsel_is_auxtrace(struct perf_session *session,
+ struct evsel *evsel)
+{
+ struct cs_etm_auxtrace *aux = container_of(session->auxtrace,
+ struct cs_etm_auxtrace,
+ auxtrace);
+
+ return evsel->core.attr.type == aux->pmu_type;
+}
+
static u8 cs_etm__cpu_mode(struct cs_etm_queue *etmq, u64 address)
{
struct machine *machine;
etm->auxtrace.flush_events = cs_etm__flush_events;
etm->auxtrace.free_events = cs_etm__free_events;
etm->auxtrace.free = cs_etm__free;
+ etm->auxtrace.evsel_is_auxtrace = cs_etm__evsel_is_auxtrace;
session->auxtrace = &etm->auxtrace;
etm->unknown_thread = thread__new(999999999, 999999999);
CS_ETM_ISA_T32,
};
-/* RB tree for quick conversion between traceID and metadata pointers */
-struct intlist *traceid_list;
-
struct cs_etm_queue;
struct cs_etm_packet {
return -1;
}
- if (perf_evsel__is_bpf_output(evsel)) {
+ if (evsel__is_bpf_output(evsel)) {
ret = add_bpf_output_values(event_class, event, sample);
if (ret)
return -1;
{
struct bt_ctf_event_class *event_class;
struct evsel_priv *priv;
- const char *name = perf_evsel__name(evsel);
+ const char *name = evsel__name(evsel);
int ret;
pr("Adding event '%s' (type %d)\n", name, evsel->core.attr.type);
goto err;
}
- if (perf_evsel__is_bpf_output(evsel)) {
+ if (evsel__is_bpf_output(evsel)) {
ret = add_bpf_output_types(cw, event_class);
if (ret)
goto err;
case DSO_BINARY_TYPE__GUEST_KALLSYMS:
case DSO_BINARY_TYPE__JAVA_JIT:
case DSO_BINARY_TYPE__BPF_PROG_INFO:
+ case DSO_BINARY_TYPE__BPF_IMAGE:
case DSO_BINARY_TYPE__NOT_FOUND:
ret = -1;
break;
DSO_BINARY_TYPE__GUEST_KCORE,
DSO_BINARY_TYPE__OPENEMBEDDED_DEBUGINFO,
DSO_BINARY_TYPE__BPF_PROG_INFO,
+ DSO_BINARY_TYPE__BPF_IMAGE,
DSO_BINARY_TYPE__NOT_FOUND,
};
char *cpuid;
unsigned long long total_mem;
unsigned int msr_pmu_type;
+ unsigned int max_branches;
int nr_cmdline;
int nr_sibling_cores;
int nr_memory_nodes;
int nr_pmu_mappings;
int nr_groups;
+ int nr_cpu_pmu_caps;
char *cmdline;
const char **cmdline_argv;
char *sibling_cores;
char *sibling_dies;
char *sibling_threads;
char *pmu_mappings;
+ char *cpu_pmu_caps;
struct cpu_topology_map *cpu;
struct cpu_cache_level *caches;
int caches_cnt;
ret = strlist__has_entry(symbol_conf.sym_list,
al->sym->name);
}
- if (!(ret && al->sym)) {
+ if (!ret && al->sym) {
snprintf(al_addr_str, sz, "0x%"PRIx64,
al->map->unmap_ip(al->map, al->sym->start));
ret = strlist__has_entry(symbol_conf.sym_list,
#include "asm/bug.h"
#include "bpf-event.h"
#include "util/string2.h"
+#include "util/perf_api_probe.h"
#include <signal.h>
#include <unistd.h>
#include <sched.h>
struct evsel *evsel;
evlist__for_each_entry(evlist, evsel)
- perf_evsel__calc_id_pos(evsel);
+ evsel__calc_id_pos(evsel);
perf_evlist__set_id_pos(evlist);
}
evlist__for_each_entry(evlist, pos) {
if (evsel__cpu_iter_skip(pos, cpu))
continue;
- if (pos->disabled || !perf_evsel__is_group_leader(pos) || !pos->core.fd)
+ if (pos->disabled || !evsel__is_group_leader(pos) || !pos->core.fd)
continue;
evsel__disable_cpu(pos, pos->cpu_iter - 1);
}
}
affinity__cleanup(&affinity);
evlist__for_each_entry(evlist, pos) {
- if (!perf_evsel__is_group_leader(pos) || !pos->core.fd)
+ if (!evsel__is_group_leader(pos) || !pos->core.fd)
continue;
pos->disabled = true;
}
evlist__for_each_entry(evlist, pos) {
if (evsel__cpu_iter_skip(pos, cpu))
continue;
- if (!perf_evsel__is_group_leader(pos) || !pos->core.fd)
+ if (!evsel__is_group_leader(pos) || !pos->core.fd)
continue;
evsel__enable_cpu(pos, pos->cpu_iter - 1);
}
}
affinity__cleanup(&affinity);
evlist__for_each_entry(evlist, pos) {
- if (!perf_evsel__is_group_leader(pos) || !pos->core.fd)
+ if (!evsel__is_group_leader(pos) || !pos->core.fd)
continue;
pos->disabled = false;
}
struct evsel *evsel;
evlist__for_each_entry(evlist, evsel)
- __perf_evsel__set_sample_bit(evsel, bit);
+ __evsel__set_sample_bit(evsel, bit);
}
void __perf_evlist__reset_sample_bit(struct evlist *evlist,
struct evsel *evsel;
evlist__for_each_entry(evlist, evsel)
- __perf_evsel__reset_sample_bit(evsel, bit);
+ __evsel__reset_sample_bit(evsel, bit);
}
int perf_evlist__apply_filters(struct evlist *evlist, struct evsel **err_evsel)
if (evsel->core.attr.type != PERF_TYPE_TRACEPOINT)
continue;
- err = perf_evsel__set_filter(evsel, filter);
+ err = evsel__set_filter(evsel, filter);
if (err)
break;
}
if (evsel->core.attr.type != PERF_TYPE_TRACEPOINT)
continue;
- err = perf_evsel__append_tp_filter(evsel, filter);
+ err = evsel__append_tp_filter(evsel, filter);
if (err)
break;
}
u64 sample_type = first->core.attr.sample_type;
evlist__for_each_entry(evlist, pos) {
- if (read_format != pos->core.attr.read_format)
- return false;
+ if (read_format != pos->core.attr.read_format) {
+ pr_debug("Read format differs %#" PRIx64 " vs %#" PRIx64 "\n",
+ read_format, (u64)pos->core.attr.read_format);
+ }
}
/* PERF_SAMPLE_READ imples PERF_FORMAT_ID. */
if (!evsel)
return -EFAULT;
- return perf_evsel__parse_sample(evsel, event, sample);
+ return evsel__parse_sample(evsel, event, sample);
}
int perf_evlist__parse_sample_timestamp(struct evlist *evlist,
if (!evsel)
return -EFAULT;
- return perf_evsel__parse_sample_timestamp(evsel, event, timestamp);
+ return evsel__parse_sample_timestamp(evsel, event, timestamp);
}
int perf_evlist__strerror_open(struct evlist *evlist,
}
return leader;
}
-
-int perf_evlist__add_sb_event(struct evlist **evlist,
- struct perf_event_attr *attr,
- perf_evsel__sb_cb_t cb,
- void *data)
-{
- struct evsel *evsel;
- bool new_evlist = (*evlist) == NULL;
-
- if (*evlist == NULL)
- *evlist = evlist__new();
- if (*evlist == NULL)
- return -1;
-
- if (!attr->sample_id_all) {
- pr_warning("enabling sample_id_all for all side band events\n");
- attr->sample_id_all = 1;
- }
-
- evsel = perf_evsel__new_idx(attr, (*evlist)->core.nr_entries);
- if (!evsel)
- goto out_err;
-
- evsel->side_band.cb = cb;
- evsel->side_band.data = data;
- evlist__add(*evlist, evsel);
- return 0;
-
-out_err:
- if (new_evlist) {
- evlist__delete(*evlist);
- *evlist = NULL;
- }
- return -1;
-}
-
-static void *perf_evlist__poll_thread(void *arg)
-{
- struct evlist *evlist = arg;
- bool draining = false;
- int i, done = 0;
- /*
- * In order to read symbols from other namespaces perf to needs to call
- * setns(2). This isn't permitted if the struct_fs has multiple users.
- * unshare(2) the fs so that we may continue to setns into namespaces
- * that we're observing when, for instance, reading the build-ids at
- * the end of a 'perf record' session.
- */
- unshare(CLONE_FS);
-
- while (!done) {
- bool got_data = false;
-
- if (evlist->thread.done)
- draining = true;
-
- if (!draining)
- evlist__poll(evlist, 1000);
-
- for (i = 0; i < evlist->core.nr_mmaps; i++) {
- struct mmap *map = &evlist->mmap[i];
- union perf_event *event;
-
- if (perf_mmap__read_init(&map->core))
- continue;
- while ((event = perf_mmap__read_event(&map->core)) != NULL) {
- struct evsel *evsel = perf_evlist__event2evsel(evlist, event);
-
- if (evsel && evsel->side_band.cb)
- evsel->side_band.cb(event, evsel->side_band.data);
- else
- pr_warning("cannot locate proper evsel for the side band event\n");
-
- perf_mmap__consume(&map->core);
- got_data = true;
- }
- perf_mmap__read_done(&map->core);
- }
-
- if (draining && !got_data)
- break;
- }
- return NULL;
-}
-
-int perf_evlist__start_sb_thread(struct evlist *evlist,
- struct target *target)
-{
- struct evsel *counter;
-
- if (!evlist)
- return 0;
-
- if (perf_evlist__create_maps(evlist, target))
- goto out_delete_evlist;
-
- evlist__for_each_entry(evlist, counter) {
- if (evsel__open(counter, evlist->core.cpus,
- evlist->core.threads) < 0)
- goto out_delete_evlist;
- }
-
- if (evlist__mmap(evlist, UINT_MAX))
- goto out_delete_evlist;
-
- evlist__for_each_entry(evlist, counter) {
- if (evsel__enable(counter))
- goto out_delete_evlist;
- }
-
- evlist->thread.done = 0;
- if (pthread_create(&evlist->thread.th, NULL, perf_evlist__poll_thread, evlist))
- goto out_delete_evlist;
-
- return 0;
-
-out_delete_evlist:
- evlist__delete(evlist);
- evlist = NULL;
- return -1;
-}
-
-void perf_evlist__stop_sb_thread(struct evlist *evlist)
-{
- if (!evlist)
- return;
- evlist->thread.done = 1;
- pthread_join(evlist->thread.th, NULL);
- evlist__delete(evlist);
-}
int perf_evlist__add_dummy(struct evlist *evlist);
-int perf_evlist__add_sb_event(struct evlist **evlist,
+int perf_evlist__add_sb_event(struct evlist *evlist,
struct perf_event_attr *attr,
- perf_evsel__sb_cb_t cb,
+ evsel__sb_cb_t cb,
void *data);
+void evlist__set_cb(struct evlist *evlist, evsel__sb_cb_t cb, void *data);
int perf_evlist__start_sb_thread(struct evlist *evlist,
struct target *target);
void perf_evlist__stop_sb_thread(struct evlist *evlist);
struct callchain_param;
void perf_evlist__set_id_pos(struct evlist *evlist);
-bool perf_can_sample_identifier(void);
-bool perf_can_record_switch_events(void);
-bool perf_can_record_cpu_wide(void);
-bool perf_can_aux_sample(void);
void perf_evlist__config(struct evlist *evlist, struct record_opts *opts,
struct callchain_param *callchain);
int record_opts__config(struct record_opts *opts);
#define FD(e, x, y) (*(int *)xyarray__entry(e->core.fd, x, y))
-int __perf_evsel__sample_size(u64 sample_type)
+int __evsel__sample_size(u64 sample_type)
{
u64 mask = sample_type & PERF_SAMPLE_MASK;
int size = 0;
return idx;
}
-void perf_evsel__calc_id_pos(struct evsel *evsel)
+void evsel__calc_id_pos(struct evsel *evsel)
{
evsel->id_pos = __perf_evsel__calc_id_pos(evsel->core.attr.sample_type);
evsel->is_pos = __perf_evsel__calc_is_pos(evsel->core.attr.sample_type);
}
-void __perf_evsel__set_sample_bit(struct evsel *evsel,
+void __evsel__set_sample_bit(struct evsel *evsel,
enum perf_event_sample_format bit)
{
if (!(evsel->core.attr.sample_type & bit)) {
evsel->core.attr.sample_type |= bit;
evsel->sample_size += sizeof(u64);
- perf_evsel__calc_id_pos(evsel);
+ evsel__calc_id_pos(evsel);
}
}
-void __perf_evsel__reset_sample_bit(struct evsel *evsel,
+void __evsel__reset_sample_bit(struct evsel *evsel,
enum perf_event_sample_format bit)
{
if (evsel->core.attr.sample_type & bit) {
evsel->core.attr.sample_type &= ~bit;
evsel->sample_size -= sizeof(u64);
- perf_evsel__calc_id_pos(evsel);
+ evsel__calc_id_pos(evsel);
}
}
-void perf_evsel__set_sample_id(struct evsel *evsel,
+void evsel__set_sample_id(struct evsel *evsel,
bool can_sample_identifier)
{
if (can_sample_identifier) {
- perf_evsel__reset_sample_bit(evsel, ID);
- perf_evsel__set_sample_bit(evsel, IDENTIFIER);
+ evsel__reset_sample_bit(evsel, ID);
+ evsel__set_sample_bit(evsel, IDENTIFIER);
} else {
- perf_evsel__set_sample_bit(evsel, ID);
+ evsel__set_sample_bit(evsel, ID);
}
evsel->core.attr.read_format |= PERF_FORMAT_ID;
}
/**
- * perf_evsel__is_function_event - Return whether given evsel is a function
+ * evsel__is_function_event - Return whether given evsel is a function
* trace event
*
* @evsel - evsel selector to be tested
*
* Return %true if event is function trace event
*/
-bool perf_evsel__is_function_event(struct evsel *evsel)
+bool evsel__is_function_event(struct evsel *evsel)
{
#define FUNCTION_EVENT "ftrace:function"
evsel->bpf_fd = -1;
INIT_LIST_HEAD(&evsel->config_terms);
perf_evsel__object.init(evsel);
- evsel->sample_size = __perf_evsel__sample_size(attr->sample_type);
- perf_evsel__calc_id_pos(evsel);
+ evsel->sample_size = __evsel__sample_size(attr->sample_type);
+ evsel__calc_id_pos(evsel);
evsel->cmdline_group_boundary = false;
evsel->metric_expr = NULL;
evsel->metric_name = NULL;
return NULL;
evsel__init(evsel, attr, idx);
- if (perf_evsel__is_bpf_output(evsel)) {
+ if (evsel__is_bpf_output(evsel)) {
evsel->core.attr.sample_type |= (PERF_SAMPLE_RAW | PERF_SAMPLE_TIME |
PERF_SAMPLE_CPU | PERF_SAMPLE_PERIOD),
evsel->core.attr.sample_period = 1;
}
- if (perf_evsel__is_clock(evsel)) {
+ if (evsel__is_clock(evsel)) {
/*
* The evsel->unit points to static alias->unit
* so it's ok to use static string in here.
"ref-cycles",
};
-static const char *__perf_evsel__hw_name(u64 config)
+static const char *__evsel__hw_name(u64 config)
{
if (config < PERF_COUNT_HW_MAX && perf_evsel__hw_names[config])
return perf_evsel__hw_names[config];
return r;
}
-static int perf_evsel__hw_name(struct evsel *evsel, char *bf, size_t size)
+static int evsel__hw_name(struct evsel *evsel, char *bf, size_t size)
{
- int r = scnprintf(bf, size, "%s", __perf_evsel__hw_name(evsel->core.attr.config));
+ int r = scnprintf(bf, size, "%s", __evsel__hw_name(evsel->core.attr.config));
return r + perf_evsel__add_modifiers(evsel, bf + r, size - r);
}
"dummy",
};
-static const char *__perf_evsel__sw_name(u64 config)
+static const char *__evsel__sw_name(u64 config)
{
if (config < PERF_COUNT_SW_MAX && perf_evsel__sw_names[config])
return perf_evsel__sw_names[config];
return "unknown-software";
}
-static int perf_evsel__sw_name(struct evsel *evsel, char *bf, size_t size)
+static int evsel__sw_name(struct evsel *evsel, char *bf, size_t size)
{
- int r = scnprintf(bf, size, "%s", __perf_evsel__sw_name(evsel->core.attr.config));
+ int r = scnprintf(bf, size, "%s", __evsel__sw_name(evsel->core.attr.config));
return r + perf_evsel__add_modifiers(evsel, bf + r, size - r);
}
-static int __perf_evsel__bp_name(char *bf, size_t size, u64 addr, u64 type)
+static int __evsel__bp_name(char *bf, size_t size, u64 addr, u64 type)
{
int r;
return r;
}
-static int perf_evsel__bp_name(struct evsel *evsel, char *bf, size_t size)
+static int evsel__bp_name(struct evsel *evsel, char *bf, size_t size)
{
struct perf_event_attr *attr = &evsel->core.attr;
- int r = __perf_evsel__bp_name(bf, size, attr->bp_addr, attr->bp_type);
+ int r = __evsel__bp_name(bf, size, attr->bp_addr, attr->bp_type);
return r + perf_evsel__add_modifiers(evsel, bf + r, size - r);
}
[C(NODE)] = (CACHE_READ | CACHE_WRITE | CACHE_PREFETCH),
};
-bool perf_evsel__is_cache_op_valid(u8 type, u8 op)
+bool evsel__is_cache_op_valid(u8 type, u8 op)
{
if (perf_evsel__hw_cache_stat[type] & COP(op))
return true; /* valid */
return false; /* invalid */
}
-int __perf_evsel__hw_cache_type_op_res_name(u8 type, u8 op, u8 result,
- char *bf, size_t size)
+int __evsel__hw_cache_type_op_res_name(u8 type, u8 op, u8 result, char *bf, size_t size)
{
if (result) {
return scnprintf(bf, size, "%s-%s-%s", perf_evsel__hw_cache[type][0],
perf_evsel__hw_cache_op[op][1]);
}
-static int __perf_evsel__hw_cache_name(u64 config, char *bf, size_t size)
+static int __evsel__hw_cache_name(u64 config, char *bf, size_t size)
{
u8 op, result, type = (config >> 0) & 0xff;
const char *err = "unknown-ext-hardware-cache-type";
goto out_err;
err = "invalid-cache";
- if (!perf_evsel__is_cache_op_valid(type, op))
+ if (!evsel__is_cache_op_valid(type, op))
goto out_err;
- return __perf_evsel__hw_cache_type_op_res_name(type, op, result, bf, size);
+ return __evsel__hw_cache_type_op_res_name(type, op, result, bf, size);
out_err:
return scnprintf(bf, size, "%s", err);
}
-static int perf_evsel__hw_cache_name(struct evsel *evsel, char *bf, size_t size)
+static int evsel__hw_cache_name(struct evsel *evsel, char *bf, size_t size)
{
- int ret = __perf_evsel__hw_cache_name(evsel->core.attr.config, bf, size);
+ int ret = __evsel__hw_cache_name(evsel->core.attr.config, bf, size);
return ret + perf_evsel__add_modifiers(evsel, bf + ret, size - ret);
}
-static int perf_evsel__raw_name(struct evsel *evsel, char *bf, size_t size)
+static int evsel__raw_name(struct evsel *evsel, char *bf, size_t size)
{
int ret = scnprintf(bf, size, "raw 0x%" PRIx64, evsel->core.attr.config);
return ret + perf_evsel__add_modifiers(evsel, bf + ret, size - ret);
}
-static int perf_evsel__tool_name(char *bf, size_t size)
+static int evsel__tool_name(char *bf, size_t size)
{
int ret = scnprintf(bf, size, "duration_time");
return ret;
}
-const char *perf_evsel__name(struct evsel *evsel)
+const char *evsel__name(struct evsel *evsel)
{
char bf[128];
switch (evsel->core.attr.type) {
case PERF_TYPE_RAW:
- perf_evsel__raw_name(evsel, bf, sizeof(bf));
+ evsel__raw_name(evsel, bf, sizeof(bf));
break;
case PERF_TYPE_HARDWARE:
- perf_evsel__hw_name(evsel, bf, sizeof(bf));
+ evsel__hw_name(evsel, bf, sizeof(bf));
break;
case PERF_TYPE_HW_CACHE:
- perf_evsel__hw_cache_name(evsel, bf, sizeof(bf));
+ evsel__hw_cache_name(evsel, bf, sizeof(bf));
break;
case PERF_TYPE_SOFTWARE:
if (evsel->tool_event)
- perf_evsel__tool_name(bf, sizeof(bf));
+ evsel__tool_name(bf, sizeof(bf));
else
- perf_evsel__sw_name(evsel, bf, sizeof(bf));
+ evsel__sw_name(evsel, bf, sizeof(bf));
break;
case PERF_TYPE_TRACEPOINT:
break;
case PERF_TYPE_BREAKPOINT:
- perf_evsel__bp_name(evsel, bf, sizeof(bf));
+ evsel__bp_name(evsel, bf, sizeof(bf));
break;
default:
return "unknown";
}
-const char *perf_evsel__group_name(struct evsel *evsel)
+const char *evsel__group_name(struct evsel *evsel)
{
return evsel->group_name ?: "anon group";
}
* For record -e 'cycles,instructions' and report --group
* 'cycles:u, instructions:u'
*/
-int perf_evsel__group_desc(struct evsel *evsel, char *buf, size_t size)
+int evsel__group_desc(struct evsel *evsel, char *buf, size_t size)
{
int ret = 0;
struct evsel *pos;
- const char *group_name = perf_evsel__group_name(evsel);
+ const char *group_name = evsel__group_name(evsel);
if (!evsel->forced_leader)
ret = scnprintf(buf, size, "%s { ", group_name);
- ret += scnprintf(buf + ret, size - ret, "%s",
- perf_evsel__name(evsel));
+ ret += scnprintf(buf + ret, size - ret, "%s", evsel__name(evsel));
for_each_group_member(pos, evsel)
- ret += scnprintf(buf + ret, size - ret, ", %s",
- perf_evsel__name(pos));
+ ret += scnprintf(buf + ret, size - ret, ", %s", evsel__name(pos));
if (!evsel->forced_leader)
ret += scnprintf(buf + ret, size - ret, " }");
return ret;
}
-static void __perf_evsel__config_callchain(struct evsel *evsel,
- struct record_opts *opts,
- struct callchain_param *param)
+static void __evsel__config_callchain(struct evsel *evsel, struct record_opts *opts,
+ struct callchain_param *param)
{
- bool function = perf_evsel__is_function_event(evsel);
+ bool function = evsel__is_function_event(evsel);
struct perf_event_attr *attr = &evsel->core.attr;
- perf_evsel__set_sample_bit(evsel, CALLCHAIN);
+ evsel__set_sample_bit(evsel, CALLCHAIN);
attr->sample_max_stack = param->max_stack;
"to get user callchain information. "
"Falling back to framepointers.\n");
} else {
- perf_evsel__set_sample_bit(evsel, BRANCH_STACK);
+ evsel__set_sample_bit(evsel, BRANCH_STACK);
attr->branch_sample_type = PERF_SAMPLE_BRANCH_USER |
PERF_SAMPLE_BRANCH_CALL_STACK |
PERF_SAMPLE_BRANCH_NO_CYCLES |
if (param->record_mode == CALLCHAIN_DWARF) {
if (!function) {
- perf_evsel__set_sample_bit(evsel, REGS_USER);
- perf_evsel__set_sample_bit(evsel, STACK_USER);
+ evsel__set_sample_bit(evsel, REGS_USER);
+ evsel__set_sample_bit(evsel, STACK_USER);
if (opts->sample_user_regs && DWARF_MINIMAL_REGS != PERF_REGS_MASK) {
attr->sample_regs_user |= DWARF_MINIMAL_REGS;
pr_warning("WARNING: The use of --call-graph=dwarf may require all the user registers, "
}
}
-void perf_evsel__config_callchain(struct evsel *evsel,
- struct record_opts *opts,
- struct callchain_param *param)
+void evsel__config_callchain(struct evsel *evsel, struct record_opts *opts,
+ struct callchain_param *param)
{
if (param->enabled)
- return __perf_evsel__config_callchain(evsel, opts, param);
+ return __evsel__config_callchain(evsel, opts, param);
}
static void
{
struct perf_event_attr *attr = &evsel->core.attr;
- perf_evsel__reset_sample_bit(evsel, CALLCHAIN);
+ evsel__reset_sample_bit(evsel, CALLCHAIN);
if (param->record_mode == CALLCHAIN_LBR) {
- perf_evsel__reset_sample_bit(evsel, BRANCH_STACK);
+ evsel__reset_sample_bit(evsel, BRANCH_STACK);
attr->branch_sample_type &= ~(PERF_SAMPLE_BRANCH_USER |
PERF_SAMPLE_BRANCH_CALL_STACK |
PERF_SAMPLE_BRANCH_HW_INDEX);
}
if (param->record_mode == CALLCHAIN_DWARF) {
- perf_evsel__reset_sample_bit(evsel, REGS_USER);
- perf_evsel__reset_sample_bit(evsel, STACK_USER);
+ evsel__reset_sample_bit(evsel, REGS_USER);
+ evsel__reset_sample_bit(evsel, STACK_USER);
}
}
if (!(term->weak && opts->user_interval != ULLONG_MAX)) {
attr->sample_period = term->val.period;
attr->freq = 0;
- perf_evsel__reset_sample_bit(evsel, PERIOD);
+ evsel__reset_sample_bit(evsel, PERIOD);
}
break;
case PERF_EVSEL__CONFIG_TERM_FREQ:
if (!(term->weak && opts->user_freq != UINT_MAX)) {
attr->sample_freq = term->val.freq;
attr->freq = 1;
- perf_evsel__set_sample_bit(evsel, PERIOD);
+ evsel__set_sample_bit(evsel, PERIOD);
}
break;
case PERF_EVSEL__CONFIG_TERM_TIME:
if (term->val.time)
- perf_evsel__set_sample_bit(evsel, TIME);
+ evsel__set_sample_bit(evsel, TIME);
else
- perf_evsel__reset_sample_bit(evsel, TIME);
+ evsel__reset_sample_bit(evsel, TIME);
break;
case PERF_EVSEL__CONFIG_TERM_CALLGRAPH:
callgraph_buf = term->val.str;
break;
case PERF_EVSEL__CONFIG_TERM_BRANCH:
if (term->val.str && strcmp(term->val.str, "no")) {
- perf_evsel__set_sample_bit(evsel, BRANCH_STACK);
+ evsel__set_sample_bit(evsel, BRANCH_STACK);
parse_branch_str(term->val.str,
&attr->branch_sample_type);
} else
- perf_evsel__reset_sample_bit(evsel, BRANCH_STACK);
+ evsel__reset_sample_bit(evsel, BRANCH_STACK);
break;
case PERF_EVSEL__CONFIG_TERM_STACK_USER:
dump_size = term->val.stack_user;
case PERF_EVSEL__CONFIG_TERM_INHERIT:
/*
* attr->inherit should has already been set by
- * perf_evsel__config. If user explicitly set
+ * evsel__config. If user explicitly set
* inherit using config terms, override global
* opt->no_inherit setting.
*/
/* set perf-event callgraph */
if (param.enabled) {
if (sample_address) {
- perf_evsel__set_sample_bit(evsel, ADDR);
- perf_evsel__set_sample_bit(evsel, DATA_SRC);
+ evsel__set_sample_bit(evsel, ADDR);
+ evsel__set_sample_bit(evsel, DATA_SRC);
evsel->core.attr.mmap_data = track;
}
- perf_evsel__config_callchain(evsel, opts, ¶m);
+ evsel__config_callchain(evsel, opts, ¶m);
}
}
}
* enable/disable events specifically, as there's no
* initial traced exec call.
*/
-void perf_evsel__config(struct evsel *evsel, struct record_opts *opts,
- struct callchain_param *callchain)
+void evsel__config(struct evsel *evsel, struct record_opts *opts,
+ struct callchain_param *callchain)
{
struct evsel *leader = evsel->leader;
struct perf_event_attr *attr = &evsel->core.attr;
attr->inherit = !opts->no_inherit;
attr->write_backward = opts->overwrite ? 1 : 0;
- perf_evsel__set_sample_bit(evsel, IP);
- perf_evsel__set_sample_bit(evsel, TID);
+ evsel__set_sample_bit(evsel, IP);
+ evsel__set_sample_bit(evsel, TID);
if (evsel->sample_read) {
- perf_evsel__set_sample_bit(evsel, READ);
+ evsel__set_sample_bit(evsel, READ);
/*
* We need ID even in case of single event, because
* PERF_SAMPLE_READ process ID specific data.
*/
- perf_evsel__set_sample_id(evsel, false);
+ evsel__set_sample_id(evsel, false);
/*
* Apply group format only if we belong to group
if (!attr->sample_period || (opts->user_freq != UINT_MAX ||
opts->user_interval != ULLONG_MAX)) {
if (opts->freq) {
- perf_evsel__set_sample_bit(evsel, PERIOD);
+ evsel__set_sample_bit(evsel, PERIOD);
attr->freq = 1;
attr->sample_freq = opts->freq;
} else {
}
}
- /*
- * Disable sampling for all group members other
- * than leader in case leader 'leads' the sampling.
- */
- if ((leader != evsel) && leader->sample_read) {
- attr->freq = 0;
- attr->sample_freq = 0;
- attr->sample_period = 0;
- attr->write_backward = 0;
-
- /*
- * We don't get sample for slave events, we make them
- * when delivering group leader sample. Set the slave
- * event to follow the master sample_type to ease up
- * report.
- */
- attr->sample_type = leader->core.attr.sample_type;
- }
-
if (opts->no_samples)
attr->sample_freq = 0;
}
if (opts->sample_address) {
- perf_evsel__set_sample_bit(evsel, ADDR);
+ evsel__set_sample_bit(evsel, ADDR);
attr->mmap_data = track;
}
* event, due to issues with page faults while tracing page
* fault handler and its overall trickiness nature.
*/
- if (perf_evsel__is_function_event(evsel))
+ if (evsel__is_function_event(evsel))
evsel->core.attr.exclude_callchain_user = 1;
if (callchain && callchain->enabled && !evsel->no_aux_samples)
- perf_evsel__config_callchain(evsel, opts, callchain);
+ evsel__config_callchain(evsel, opts, callchain);
if (opts->sample_intr_regs) {
attr->sample_regs_intr = opts->sample_intr_regs;
- perf_evsel__set_sample_bit(evsel, REGS_INTR);
+ evsel__set_sample_bit(evsel, REGS_INTR);
}
if (opts->sample_user_regs) {
attr->sample_regs_user |= opts->sample_user_regs;
- perf_evsel__set_sample_bit(evsel, REGS_USER);
+ evsel__set_sample_bit(evsel, REGS_USER);
}
if (target__has_cpu(&opts->target) || opts->sample_cpu)
- perf_evsel__set_sample_bit(evsel, CPU);
+ evsel__set_sample_bit(evsel, CPU);
/*
* When the user explicitly disabled time don't force it here.
(!perf_missing_features.sample_id_all &&
(!opts->no_inherit || target__has_cpu(&opts->target) || per_cpu ||
opts->sample_time_set)))
- perf_evsel__set_sample_bit(evsel, TIME);
+ evsel__set_sample_bit(evsel, TIME);
if (opts->raw_samples && !evsel->no_aux_samples) {
- perf_evsel__set_sample_bit(evsel, TIME);
- perf_evsel__set_sample_bit(evsel, RAW);
- perf_evsel__set_sample_bit(evsel, CPU);
+ evsel__set_sample_bit(evsel, TIME);
+ evsel__set_sample_bit(evsel, RAW);
+ evsel__set_sample_bit(evsel, CPU);
}
if (opts->sample_address)
- perf_evsel__set_sample_bit(evsel, DATA_SRC);
+ evsel__set_sample_bit(evsel, DATA_SRC);
if (opts->sample_phys_addr)
- perf_evsel__set_sample_bit(evsel, PHYS_ADDR);
+ evsel__set_sample_bit(evsel, PHYS_ADDR);
if (opts->no_buffering) {
attr->watermark = 0;
attr->wakeup_events = 1;
}
if (opts->branch_stack && !evsel->no_aux_samples) {
- perf_evsel__set_sample_bit(evsel, BRANCH_STACK);
+ evsel__set_sample_bit(evsel, BRANCH_STACK);
attr->branch_sample_type = opts->branch_stack;
}
if (opts->sample_weight)
- perf_evsel__set_sample_bit(evsel, WEIGHT);
+ evsel__set_sample_bit(evsel, WEIGHT);
attr->task = track;
attr->mmap = track;
if (opts->record_cgroup) {
attr->cgroup = track && !perf_missing_features.cgroup;
- perf_evsel__set_sample_bit(evsel, CGROUP);
+ evsel__set_sample_bit(evsel, CGROUP);
}
if (opts->record_switch_events)
attr->context_switch = track;
if (opts->sample_transaction)
- perf_evsel__set_sample_bit(evsel, TRANSACTION);
+ evsel__set_sample_bit(evsel, TRANSACTION);
if (opts->running_time) {
evsel->core.attr.read_format |=
* Disabling only independent events or group leaders,
* keeping group members enabled.
*/
- if (perf_evsel__is_group_leader(evsel))
+ if (evsel__is_group_leader(evsel))
attr->disabled = 1;
/*
* Setting enable_on_exec for independent events and
* group leaders for traced executed by perf.
*/
- if (target__none(&opts->target) && perf_evsel__is_group_leader(evsel) &&
- !opts->initial_delay)
+ if (target__none(&opts->target) && evsel__is_group_leader(evsel) &&
+ !opts->initial_delay)
attr->enable_on_exec = 1;
if (evsel->immediate) {
/* The --period option takes the precedence. */
if (opts->period_set) {
if (opts->period)
- perf_evsel__set_sample_bit(evsel, PERIOD);
+ evsel__set_sample_bit(evsel, PERIOD);
else
- perf_evsel__reset_sample_bit(evsel, PERIOD);
+ evsel__reset_sample_bit(evsel, PERIOD);
}
/*
* if BRANCH_STACK bit is set.
*/
if (opts->initial_delay && is_dummy_event(evsel))
- perf_evsel__reset_sample_bit(evsel, BRANCH_STACK);
+ evsel__reset_sample_bit(evsel, BRANCH_STACK);
}
-int perf_evsel__set_filter(struct evsel *evsel, const char *filter)
+int evsel__set_filter(struct evsel *evsel, const char *filter)
{
char *new_filter = strdup(filter);
return -1;
}
-static int perf_evsel__append_filter(struct evsel *evsel,
- const char *fmt, const char *filter)
+static int evsel__append_filter(struct evsel *evsel, const char *fmt, const char *filter)
{
char *new_filter;
if (evsel->filter == NULL)
- return perf_evsel__set_filter(evsel, filter);
+ return evsel__set_filter(evsel, filter);
if (asprintf(&new_filter, fmt, evsel->filter, filter) > 0) {
free(evsel->filter);
return -1;
}
-int perf_evsel__append_tp_filter(struct evsel *evsel, const char *filter)
+int evsel__append_tp_filter(struct evsel *evsel, const char *filter)
{
- return perf_evsel__append_filter(evsel, "(%s) && (%s)", filter);
+ return evsel__append_filter(evsel, "(%s) && (%s)", filter);
}
-int perf_evsel__append_addr_filter(struct evsel *evsel, const char *filter)
+int evsel__append_addr_filter(struct evsel *evsel, const char *filter)
{
- return perf_evsel__append_filter(evsel, "%s,%s", filter);
+ return evsel__append_filter(evsel, "%s,%s", filter);
}
/* Caller has to clear disabled after going through all CPUs. */
}
}
-void perf_evsel__exit(struct evsel *evsel)
+void evsel__exit(struct evsel *evsel)
{
assert(list_empty(&evsel->core.node));
assert(evsel->evlist == NULL);
void evsel__delete(struct evsel *evsel)
{
- perf_evsel__exit(evsel);
+ evsel__exit(evsel);
free(evsel);
}
-void perf_evsel__compute_deltas(struct evsel *evsel, int cpu, int thread,
- struct perf_counts_values *count)
+void evsel__compute_deltas(struct evsel *evsel, int cpu, int thread,
+ struct perf_counts_values *count)
{
struct perf_counts_values tmp;
*pscaled = scaled;
}
-static int
-perf_evsel__read_one(struct evsel *evsel, int cpu, int thread)
+static int evsel__read_one(struct evsel *evsel, int cpu, int thread)
{
struct perf_counts_values *count = perf_counts(evsel->counts, cpu, thread);
return 0;
}
-static int
-perf_evsel__read_group(struct evsel *leader, int cpu, int thread)
+static int evsel__read_group(struct evsel *leader, int cpu, int thread)
{
struct perf_stat_evsel *ps = leader->stats;
u64 read_format = leader->core.attr.read_format;
if (!(read_format & PERF_FORMAT_ID))
return -EINVAL;
- if (!perf_evsel__is_group_leader(leader))
+ if (!evsel__is_group_leader(leader))
return -EINVAL;
if (!data) {
return perf_evsel__process_group_data(leader, cpu, thread, data);
}
-int perf_evsel__read_counter(struct evsel *evsel, int cpu, int thread)
+int evsel__read_counter(struct evsel *evsel, int cpu, int thread)
{
u64 read_format = evsel->core.attr.read_format;
if (read_format & PERF_FORMAT_GROUP)
- return perf_evsel__read_group(evsel, cpu, thread);
- else
- return perf_evsel__read_one(evsel, cpu, thread);
+ return evsel__read_group(evsel, cpu, thread);
+
+ return evsel__read_one(evsel, cpu, thread);
}
-int __perf_evsel__read_on_cpu(struct evsel *evsel,
- int cpu, int thread, bool scale)
+int __evsel__read_on_cpu(struct evsel *evsel, int cpu, int thread, bool scale)
{
struct perf_counts_values count;
size_t nv = scale ? 3 : 1;
if (readn(FD(evsel, cpu, thread), &count, nv * sizeof(u64)) <= 0)
return -errno;
- perf_evsel__compute_deltas(evsel, cpu, thread, &count);
+ evsel__compute_deltas(evsel, cpu, thread, &count);
perf_counts_values__scale(&count, scale, NULL);
*perf_counts(evsel->counts, cpu, thread) = count;
return 0;
struct evsel *leader = evsel->leader;
int fd;
- if (perf_evsel__is_group_leader(evsel))
+ if (evsel__is_group_leader(evsel))
return -1;
/*
/*
* If we succeeded but had to kill clockid, fail and
- * have perf_evsel__open_strerror() print us a nice
- * error.
+ * have evsel__open_strerror() print us a nice error.
*/
if (perf_missing_features.clockid ||
perf_missing_features.clockid_wrong) {
} else if (!perf_missing_features.group_read &&
evsel->core.attr.inherit &&
(evsel->core.attr.read_format & PERF_FORMAT_GROUP) &&
- perf_evsel__is_group_leader(evsel)) {
+ evsel__is_group_leader(evsel)) {
perf_missing_features.group_read = true;
pr_debug2_peo("switching off group read\n");
goto fallback_missing_features;
perf_evsel__free_id(&evsel->core);
}
-int perf_evsel__open_per_cpu(struct evsel *evsel,
- struct perf_cpu_map *cpus,
- int cpu)
+int evsel__open_per_cpu(struct evsel *evsel, struct perf_cpu_map *cpus, int cpu)
{
if (cpu == -1)
return evsel__open_cpu(evsel, cpus, NULL, 0,
return evsel__open_cpu(evsel, cpus, NULL, cpu, cpu + 1);
}
-int perf_evsel__open_per_thread(struct evsel *evsel,
- struct perf_thread_map *threads)
+int evsel__open_per_thread(struct evsel *evsel, struct perf_thread_map *threads)
{
return evsel__open(evsel, NULL, threads);
}
return 0;
}
-int perf_evsel__parse_sample(struct evsel *evsel, union perf_event *event,
- struct perf_sample *data)
+int evsel__parse_sample(struct evsel *evsel, union perf_event *event,
+ struct perf_sample *data)
{
u64 type = evsel->core.attr.sample_type;
bool swapped = evsel->needs_swap;
}
}
- if (evsel__has_callchain(evsel)) {
+ if (type & PERF_SAMPLE_CALLCHAIN) {
const u64 max_callchain_nr = UINT64_MAX / sizeof(u64);
OVERFLOW_CHECK_u64(array);
return -EFAULT;
sz = data->branch_stack->nr * sizeof(struct branch_entry);
- if (perf_evsel__has_branch_hw_idx(evsel))
+ if (evsel__has_branch_hw_idx(evsel))
sz += sizeof(u64);
else
data->no_hw_idx = true;
return 0;
}
-int perf_evsel__parse_sample_timestamp(struct evsel *evsel,
- union perf_event *event,
- u64 *timestamp)
+int evsel__parse_sample_timestamp(struct evsel *evsel, union perf_event *event,
+ u64 *timestamp)
{
u64 type = evsel->core.attr.sample_type;
const __u64 *array;
return 0;
}
-struct tep_format_field *perf_evsel__field(struct evsel *evsel, const char *name)
+struct tep_format_field *evsel__field(struct evsel *evsel, const char *name)
{
return tep_find_field(evsel->tp_format, name);
}
-void *perf_evsel__rawptr(struct evsel *evsel, struct perf_sample *sample,
- const char *name)
+void *evsel__rawptr(struct evsel *evsel, struct perf_sample *sample, const char *name)
{
- struct tep_format_field *field = perf_evsel__field(evsel, name);
+ struct tep_format_field *field = evsel__field(evsel, name);
int offset;
if (!field)
return 0;
}
-u64 perf_evsel__intval(struct evsel *evsel, struct perf_sample *sample,
- const char *name)
+u64 evsel__intval(struct evsel *evsel, struct perf_sample *sample, const char *name)
{
- struct tep_format_field *field = perf_evsel__field(evsel, name);
+ struct tep_format_field *field = evsel__field(evsel, name);
if (!field)
return 0;
return field ? format_field__intval(field, sample, evsel->needs_swap) : 0;
}
-bool perf_evsel__fallback(struct evsel *evsel, int err,
- char *msg, size_t msgsize)
+bool evsel__fallback(struct evsel *evsel, int err, char *msg, size_t msgsize)
{
int paranoid;
return true;
} else if (err == EACCES && !evsel->core.attr.exclude_kernel &&
(paranoid = perf_event_paranoid()) > 1) {
- const char *name = perf_evsel__name(evsel);
+ const char *name = evsel__name(evsel);
char *new_name;
const char *sep = ":";
+ /* If event has exclude user then don't exclude kernel. */
+ if (evsel->core.attr.exclude_user)
+ return false;
+
/* Is there already the separator in the name. */
if (strchr(name, '/') ||
strchr(name, ':'))
return ret ? false : true;
}
-int perf_evsel__open_strerror(struct evsel *evsel, struct target *target,
- int err, char *msg, size_t size)
+int evsel__open_strerror(struct evsel *evsel, struct target *target,
+ int err, char *msg, size_t size)
{
char sbuf[STRERR_BUFSIZE];
int printed = 0;
case EACCES:
if (err == EPERM)
printed = scnprintf(msg, size,
- "No permission to enable %s event.\n\n",
- perf_evsel__name(evsel));
+ "No permission to enable %s event.\n\n", evsel__name(evsel));
return scnprintf(msg + printed, size - printed,
"You may not have permission to collect %sstats.\n\n"
"Consider tweaking /proc/sys/kernel/perf_event_paranoid,\n"
"which controls use of the performance events system by\n"
- "unprivileged users (without CAP_SYS_ADMIN).\n\n"
+ "unprivileged users (without CAP_PERFMON or CAP_SYS_ADMIN).\n\n"
"The current value is %d:\n\n"
" -1: Allow use of (almost) all events by all users\n"
" Ignore mlock limit after perf_event_mlock_kb without CAP_IPC_LOCK\n"
- ">= 0: Disallow ftrace function tracepoint by users without CAP_SYS_ADMIN\n"
- " Disallow raw tracepoint access by users without CAP_SYS_ADMIN\n"
- ">= 1: Disallow CPU event access by users without CAP_SYS_ADMIN\n"
- ">= 2: Disallow kernel profiling by users without CAP_SYS_ADMIN\n\n"
+ ">= 0: Disallow ftrace function tracepoint by users without CAP_PERFMON or CAP_SYS_ADMIN\n"
+ " Disallow raw tracepoint access by users without CAP_SYS_PERFMON or CAP_SYS_ADMIN\n"
+ ">= 1: Disallow CPU event access by users without CAP_PERFMON or CAP_SYS_ADMIN\n"
+ ">= 2: Disallow kernel profiling by users without CAP_PERFMON or CAP_SYS_ADMIN\n\n"
"To make this setting permanent, edit /etc/sysctl.conf too, e.g.:\n\n"
" kernel.perf_event_paranoid = -1\n" ,
target->system_wide ? "system-wide " : "",
perf_event_paranoid());
case ENOENT:
- return scnprintf(msg, size, "The %s event is not supported.",
- perf_evsel__name(evsel));
+ return scnprintf(msg, size, "The %s event is not supported.", evsel__name(evsel));
case EMFILE:
return scnprintf(msg, size, "%s",
"Too many events are opened.\n"
if (evsel->core.attr.sample_period != 0)
return scnprintf(msg, size,
"%s: PMU Hardware doesn't support sampling/overflow-interrupts. Try 'perf stat'",
- perf_evsel__name(evsel));
+ evsel__name(evsel));
if (evsel->core.attr.precise_ip)
return scnprintf(msg, size, "%s",
"\'precise\' request may not be supported. Try removing 'p' modifier.");
return scnprintf(msg, size,
"The sys_perf_event_open() syscall returned with %d (%s) for event (%s).\n"
"/bin/dmesg | grep -i perf may provide additional information.\n",
- err, str_error_r(err, sbuf, sizeof(sbuf)),
- perf_evsel__name(evsel));
+ err, str_error_r(err, sbuf, sizeof(sbuf)), evsel__name(evsel));
}
-struct perf_env *perf_evsel__env(struct evsel *evsel)
+struct perf_env *evsel__env(struct evsel *evsel)
{
if (evsel && evsel->evlist)
return evsel->evlist->env;
return 0;
}
-int perf_evsel__store_ids(struct evsel *evsel, struct evlist *evlist)
+int evsel__store_ids(struct evsel *evsel, struct evlist *evlist)
{
struct perf_cpu_map *cpus = evsel->core.cpus;
struct perf_thread_map *threads = evsel->core.threads;
struct perf_stat_evsel;
union perf_event;
-typedef int (perf_evsel__sb_cb_t)(union perf_event *event, void *data);
+typedef int (evsel__sb_cb_t)(union perf_event *event, void *data);
enum perf_tool_event {
PERF_TOOL_NONE = 0,
int cpu_iter;
const char *pmu_name;
struct {
- perf_evsel__sb_cb_t *cb;
- void *data;
+ evsel__sb_cb_t *cb;
+ void *data;
} side_band;
+ /*
+ * For reporting purposes, an evsel sample can have a callchain
+ * synthesized from AUX area data. Keep track of synthesized sample
+ * types here. Note, the recorded sample_type cannot be changed because
+ * it is needed to continue to parse events.
+ * See also evsel__has_callchain().
+ */
+ __u64 synth_sample_type;
};
struct perf_missing_features {
return perf_evsel__cpus(&evsel->core);
}
-static inline int perf_evsel__nr_cpus(struct evsel *evsel)
+static inline int evsel__nr_cpus(struct evsel *evsel)
{
return evsel__cpus(evsel)->nr;
}
void perf_counts_values__scale(struct perf_counts_values *count,
bool scale, s8 *pscaled);
-void perf_evsel__compute_deltas(struct evsel *evsel, int cpu, int thread,
- struct perf_counts_values *count);
+void evsel__compute_deltas(struct evsel *evsel, int cpu, int thread,
+ struct perf_counts_values *count);
int perf_evsel__object_config(size_t object_size,
int (*init)(struct evsel *evsel),
void (*fini)(struct evsel *evsel));
+struct perf_pmu *evsel__find_pmu(struct evsel *evsel);
+bool evsel__is_aux_event(struct evsel *evsel);
+
struct evsel *perf_evsel__new_idx(struct perf_event_attr *attr, int idx);
static inline struct evsel *evsel__new(struct perf_event_attr *attr)
struct tep_event *event_format__new(const char *sys, const char *name);
void evsel__init(struct evsel *evsel, struct perf_event_attr *attr, int idx);
-void perf_evsel__exit(struct evsel *evsel);
+void evsel__exit(struct evsel *evsel);
void evsel__delete(struct evsel *evsel);
struct callchain_param;
-void perf_evsel__config(struct evsel *evsel,
- struct record_opts *opts,
- struct callchain_param *callchain);
-void perf_evsel__config_callchain(struct evsel *evsel,
- struct record_opts *opts,
- struct callchain_param *callchain);
+void evsel__config(struct evsel *evsel, struct record_opts *opts,
+ struct callchain_param *callchain);
+void evsel__config_callchain(struct evsel *evsel, struct record_opts *opts,
+ struct callchain_param *callchain);
-int __perf_evsel__sample_size(u64 sample_type);
-void perf_evsel__calc_id_pos(struct evsel *evsel);
+int __evsel__sample_size(u64 sample_type);
+void evsel__calc_id_pos(struct evsel *evsel);
-bool perf_evsel__is_cache_op_valid(u8 type, u8 op);
+bool evsel__is_cache_op_valid(u8 type, u8 op);
#define PERF_EVSEL__MAX_ALIASES 8
[PERF_EVSEL__MAX_ALIASES];
extern const char *perf_evsel__hw_names[PERF_COUNT_HW_MAX];
extern const char *perf_evsel__sw_names[PERF_COUNT_SW_MAX];
-int __perf_evsel__hw_cache_type_op_res_name(u8 type, u8 op, u8 result,
- char *bf, size_t size);
-const char *perf_evsel__name(struct evsel *evsel);
+int __evsel__hw_cache_type_op_res_name(u8 type, u8 op, u8 result, char *bf, size_t size);
+const char *evsel__name(struct evsel *evsel);
-const char *perf_evsel__group_name(struct evsel *evsel);
-int perf_evsel__group_desc(struct evsel *evsel, char *buf, size_t size);
+const char *evsel__group_name(struct evsel *evsel);
+int evsel__group_desc(struct evsel *evsel, char *buf, size_t size);
-void __perf_evsel__set_sample_bit(struct evsel *evsel,
- enum perf_event_sample_format bit);
-void __perf_evsel__reset_sample_bit(struct evsel *evsel,
- enum perf_event_sample_format bit);
+void __evsel__set_sample_bit(struct evsel *evsel, enum perf_event_sample_format bit);
+void __evsel__reset_sample_bit(struct evsel *evsel, enum perf_event_sample_format bit);
-#define perf_evsel__set_sample_bit(evsel, bit) \
- __perf_evsel__set_sample_bit(evsel, PERF_SAMPLE_##bit)
+#define evsel__set_sample_bit(evsel, bit) \
+ __evsel__set_sample_bit(evsel, PERF_SAMPLE_##bit)
-#define perf_evsel__reset_sample_bit(evsel, bit) \
- __perf_evsel__reset_sample_bit(evsel, PERF_SAMPLE_##bit)
+#define evsel__reset_sample_bit(evsel, bit) \
+ __evsel__reset_sample_bit(evsel, PERF_SAMPLE_##bit)
-void perf_evsel__set_sample_id(struct evsel *evsel,
- bool use_sample_identifier);
+void evsel__set_sample_id(struct evsel *evsel, bool use_sample_identifier);
-int perf_evsel__set_filter(struct evsel *evsel, const char *filter);
-int perf_evsel__append_tp_filter(struct evsel *evsel, const char *filter);
-int perf_evsel__append_addr_filter(struct evsel *evsel,
- const char *filter);
+int evsel__set_filter(struct evsel *evsel, const char *filter);
+int evsel__append_tp_filter(struct evsel *evsel, const char *filter);
+int evsel__append_addr_filter(struct evsel *evsel, const char *filter);
int evsel__enable_cpu(struct evsel *evsel, int cpu);
int evsel__enable(struct evsel *evsel);
int evsel__disable(struct evsel *evsel);
int evsel__disable_cpu(struct evsel *evsel, int cpu);
-int perf_evsel__open_per_cpu(struct evsel *evsel,
- struct perf_cpu_map *cpus,
- int cpu);
-int perf_evsel__open_per_thread(struct evsel *evsel,
- struct perf_thread_map *threads);
+int evsel__open_per_cpu(struct evsel *evsel, struct perf_cpu_map *cpus, int cpu);
+int evsel__open_per_thread(struct evsel *evsel, struct perf_thread_map *threads);
int evsel__open(struct evsel *evsel, struct perf_cpu_map *cpus,
struct perf_thread_map *threads);
void evsel__close(struct evsel *evsel);
struct perf_sample;
-void *perf_evsel__rawptr(struct evsel *evsel, struct perf_sample *sample,
- const char *name);
-u64 perf_evsel__intval(struct evsel *evsel, struct perf_sample *sample,
- const char *name);
+void *evsel__rawptr(struct evsel *evsel, struct perf_sample *sample, const char *name);
+u64 evsel__intval(struct evsel *evsel, struct perf_sample *sample, const char *name);
-static inline char *perf_evsel__strval(struct evsel *evsel,
- struct perf_sample *sample,
- const char *name)
+static inline char *evsel__strval(struct evsel *evsel, struct perf_sample *sample, const char *name)
{
- return perf_evsel__rawptr(evsel, sample, name);
+ return evsel__rawptr(evsel, sample, name);
}
struct tep_format_field;
u64 format_field__intval(struct tep_format_field *field, struct perf_sample *sample, bool needs_swap);
-struct tep_format_field *perf_evsel__field(struct evsel *evsel, const char *name);
+struct tep_format_field *evsel__field(struct evsel *evsel, const char *name);
-#define perf_evsel__match(evsel, t, c) \
+#define evsel__match(evsel, t, c) \
(evsel->core.attr.type == PERF_TYPE_##t && \
evsel->core.attr.config == PERF_COUNT_##c)
-static inline bool perf_evsel__match2(struct evsel *e1,
- struct evsel *e2)
+static inline bool evsel__match2(struct evsel *e1, struct evsel *e2)
{
return (e1->core.attr.type == e2->core.attr.type) &&
(e1->core.attr.config == e2->core.attr.config);
}
-#define perf_evsel__cmp(a, b) \
- ((a) && \
- (b) && \
- (a)->core.attr.type == (b)->core.attr.type && \
- (a)->core.attr.config == (b)->core.attr.config)
-
-int perf_evsel__read_counter(struct evsel *evsel, int cpu, int thread);
+int evsel__read_counter(struct evsel *evsel, int cpu, int thread);
-int __perf_evsel__read_on_cpu(struct evsel *evsel,
- int cpu, int thread, bool scale);
+int __evsel__read_on_cpu(struct evsel *evsel, int cpu, int thread, bool scale);
/**
- * perf_evsel__read_on_cpu - Read out the results on a CPU and thread
+ * evsel__read_on_cpu - Read out the results on a CPU and thread
*
* @evsel - event selector to read value
* @cpu - CPU of interest
* @thread - thread of interest
*/
-static inline int perf_evsel__read_on_cpu(struct evsel *evsel,
- int cpu, int thread)
+static inline int evsel__read_on_cpu(struct evsel *evsel, int cpu, int thread)
{
- return __perf_evsel__read_on_cpu(evsel, cpu, thread, false);
+ return __evsel__read_on_cpu(evsel, cpu, thread, false);
}
/**
- * perf_evsel__read_on_cpu_scaled - Read out the results on a CPU and thread, scaled
+ * evsel__read_on_cpu_scaled - Read out the results on a CPU and thread, scaled
*
* @evsel - event selector to read value
* @cpu - CPU of interest
* @thread - thread of interest
*/
-static inline int perf_evsel__read_on_cpu_scaled(struct evsel *evsel,
- int cpu, int thread)
+static inline int evsel__read_on_cpu_scaled(struct evsel *evsel, int cpu, int thread)
{
- return __perf_evsel__read_on_cpu(evsel, cpu, thread, true);
+ return __evsel__read_on_cpu(evsel, cpu, thread, true);
}
-int perf_evsel__parse_sample(struct evsel *evsel, union perf_event *event,
- struct perf_sample *sample);
+int evsel__parse_sample(struct evsel *evsel, union perf_event *event,
+ struct perf_sample *sample);
-int perf_evsel__parse_sample_timestamp(struct evsel *evsel,
- union perf_event *event,
- u64 *timestamp);
+int evsel__parse_sample_timestamp(struct evsel *evsel, union perf_event *event,
+ u64 *timestamp);
-static inline struct evsel *perf_evsel__next(struct evsel *evsel)
+static inline struct evsel *evsel__next(struct evsel *evsel)
{
return list_entry(evsel->core.node.next, struct evsel, core.node);
}
-static inline struct evsel *perf_evsel__prev(struct evsel *evsel)
+static inline struct evsel *evsel__prev(struct evsel *evsel)
{
return list_entry(evsel->core.node.prev, struct evsel, core.node);
}
/**
- * perf_evsel__is_group_leader - Return whether given evsel is a leader event
+ * evsel__is_group_leader - Return whether given evsel is a leader event
*
* @evsel - evsel selector to be tested
*
* Return %true if @evsel is a group leader or a stand-alone event
*/
-static inline bool perf_evsel__is_group_leader(const struct evsel *evsel)
+static inline bool evsel__is_group_leader(const struct evsel *evsel)
{
return evsel->leader == evsel;
}
/**
- * perf_evsel__is_group_event - Return whether given evsel is a group event
+ * evsel__is_group_event - Return whether given evsel is a group event
*
* @evsel - evsel selector to be tested
*
* Return %true iff event group view is enabled and @evsel is a actual group
* leader which has other members in the group
*/
-static inline bool perf_evsel__is_group_event(struct evsel *evsel)
+static inline bool evsel__is_group_event(struct evsel *evsel)
{
if (!symbol_conf.event_group)
return false;
- return perf_evsel__is_group_leader(evsel) && evsel->core.nr_members > 1;
+ return evsel__is_group_leader(evsel) && evsel->core.nr_members > 1;
}
-bool perf_evsel__is_function_event(struct evsel *evsel);
+bool evsel__is_function_event(struct evsel *evsel);
-static inline bool perf_evsel__is_bpf_output(struct evsel *evsel)
+static inline bool evsel__is_bpf_output(struct evsel *evsel)
{
- return perf_evsel__match(evsel, SOFTWARE, SW_BPF_OUTPUT);
+ return evsel__match(evsel, SOFTWARE, SW_BPF_OUTPUT);
}
-static inline bool perf_evsel__is_clock(struct evsel *evsel)
+static inline bool evsel__is_clock(struct evsel *evsel)
{
- return perf_evsel__match(evsel, SOFTWARE, SW_CPU_CLOCK) ||
- perf_evsel__match(evsel, SOFTWARE, SW_TASK_CLOCK);
+ return evsel__match(evsel, SOFTWARE, SW_CPU_CLOCK) ||
+ evsel__match(evsel, SOFTWARE, SW_TASK_CLOCK);
}
-bool perf_evsel__fallback(struct evsel *evsel, int err,
- char *msg, size_t msgsize);
-int perf_evsel__open_strerror(struct evsel *evsel, struct target *target,
- int err, char *msg, size_t size);
+bool evsel__fallback(struct evsel *evsel, int err, char *msg, size_t msgsize);
+int evsel__open_strerror(struct evsel *evsel, struct target *target,
+ int err, char *msg, size_t size);
-static inline int perf_evsel__group_idx(struct evsel *evsel)
+static inline int evsel__group_idx(struct evsel *evsel)
{
return evsel->idx - evsel->leader->idx;
}
(_evsel) && (_evsel)->leader == (_leader); \
(_evsel) = list_entry((_evsel)->core.node.next, struct evsel, core.node))
-static inline bool perf_evsel__has_branch_callstack(const struct evsel *evsel)
+static inline bool evsel__has_branch_callstack(const struct evsel *evsel)
{
return evsel->core.attr.branch_sample_type & PERF_SAMPLE_BRANCH_CALL_STACK;
}
-static inline bool perf_evsel__has_branch_hw_idx(const struct evsel *evsel)
+static inline bool evsel__has_branch_hw_idx(const struct evsel *evsel)
{
return evsel->core.attr.branch_sample_type & PERF_SAMPLE_BRANCH_HW_INDEX;
}
static inline bool evsel__has_callchain(const struct evsel *evsel)
{
- return (evsel->core.attr.sample_type & PERF_SAMPLE_CALLCHAIN) != 0;
+ /*
+ * For reporting purposes, an evsel sample can have a recorded callchain
+ * or a callchain synthesized from AUX area data.
+ */
+ return evsel->core.attr.sample_type & PERF_SAMPLE_CALLCHAIN ||
+ evsel->synth_sample_type & PERF_SAMPLE_CALLCHAIN;
+}
+
+static inline bool evsel__has_br_stack(const struct evsel *evsel)
+{
+ /*
+ * For reporting purposes, an evsel sample can have a recorded branch
+ * stack or a branch stack synthesized from AUX area data.
+ */
+ return evsel->core.attr.sample_type & PERF_SAMPLE_BRANCH_STACK ||
+ evsel->synth_sample_type & PERF_SAMPLE_BRANCH_STACK;
}
-struct perf_env *perf_evsel__env(struct evsel *evsel);
+struct perf_env *evsel__env(struct evsel *evsel);
-int perf_evsel__store_ids(struct evsel *evsel, struct evlist *evlist);
+int evsel__store_ids(struct evsel *evsel, struct evlist *evlist);
#endif /* __PERF_EVSEL_H */
/*
* The 'struct perf_evsel_config_term' is used to pass event
- * specific configuration data to perf_evsel__config routine.
+ * specific configuration data to evsel__config routine.
* It is allocated within event parsing and attached to
* perf_evsel::config_terms list head.
*/
if (details->event_group) {
struct evsel *pos;
- if (!perf_evsel__is_group_leader(evsel))
+ if (!evsel__is_group_leader(evsel))
return 0;
if (evsel->core.nr_members > 1)
printed += fprintf(fp, "%s{", evsel->group_name ?: "");
- printed += fprintf(fp, "%s", perf_evsel__name(evsel));
+ printed += fprintf(fp, "%s", evsel__name(evsel));
for_each_group_member(pos, evsel)
- printed += fprintf(fp, ",%s", perf_evsel__name(pos));
+ printed += fprintf(fp, ",%s", evsel__name(pos));
if (evsel->core.nr_members > 1)
printed += fprintf(fp, "}");
goto out;
}
- printed += fprintf(fp, "%s", perf_evsel__name(evsel));
+ printed += fprintf(fp, "%s", evsel__name(evsel));
if (details->verbose) {
printed += perf_event_attr__fprintf(fp, &evsel->core.attr,
#include <assert.h>
#include "expr.h"
#include "expr-bison.h"
-#define YY_EXTRA_TYPE int
#include "expr-flex.h"
#ifdef PARSER_DEBUG
#endif
/* Caller must make sure id is allocated */
-void expr__add_id(struct parse_ctx *ctx, const char *name, double val)
+void expr__add_id(struct expr_parse_ctx *ctx, const char *name, double val)
{
int idx;
ctx->ids[idx].val = val;
}
-void expr__ctx_init(struct parse_ctx *ctx)
+void expr__ctx_init(struct expr_parse_ctx *ctx)
{
ctx->num_ids = 0;
}
static int
-__expr__parse(double *val, struct parse_ctx *ctx, const char *expr,
- int start)
+__expr__parse(double *val, struct expr_parse_ctx *ctx, const char *expr,
+ int start, int runtime)
{
+ struct expr_scanner_ctx scanner_ctx = {
+ .start_token = start,
+ .runtime = runtime,
+ };
YY_BUFFER_STATE buffer;
void *scanner;
int ret;
- ret = expr_lex_init_extra(start, &scanner);
+ ret = expr_lex_init_extra(&scanner_ctx, &scanner);
if (ret)
return ret;
return ret;
}
-int expr__parse(double *final_val, struct parse_ctx *ctx, const char *expr)
+int expr__parse(double *final_val, struct expr_parse_ctx *ctx, const char *expr, int runtime)
{
- return __expr__parse(final_val, ctx, expr, EXPR_PARSE) ? -1 : 0;
+ return __expr__parse(final_val, ctx, expr, EXPR_PARSE, runtime) ? -1 : 0;
}
static bool
}
int expr__find_other(const char *expr, const char *one, const char ***other,
- int *num_other)
+ int *num_other, int runtime)
{
int err, i = 0, j = 0;
- struct parse_ctx ctx;
+ struct expr_parse_ctx ctx;
expr__ctx_init(&ctx);
- err = __expr__parse(NULL, &ctx, expr, EXPR_OTHER);
+ err = __expr__parse(NULL, &ctx, expr, EXPR_OTHER, runtime);
if (err)
return -1;
#define EXPR_MAX_OTHER 20
#define MAX_PARSE_ID EXPR_MAX_OTHER
-struct parse_id {
+struct expr_parse_id {
const char *name;
double val;
};
-struct parse_ctx {
+struct expr_parse_ctx {
int num_ids;
- struct parse_id ids[MAX_PARSE_ID];
+ struct expr_parse_id ids[MAX_PARSE_ID];
};
-void expr__ctx_init(struct parse_ctx *ctx);
-void expr__add_id(struct parse_ctx *ctx, const char *id, double val);
-int expr__parse(double *final_val, struct parse_ctx *ctx, const char *expr);
+struct expr_scanner_ctx {
+ int start_token;
+ int runtime;
+};
+
+void expr__ctx_init(struct expr_parse_ctx *ctx);
+void expr__add_id(struct expr_parse_ctx *ctx, const char *id, double val);
+int expr__parse(double *final_val, struct expr_parse_ctx *ctx, const char *expr, int runtime);
int expr__find_other(const char *expr, const char *one, const char ***other,
- int *num_other);
+ int *num_other, int runtime);
#endif
* Allow @ instead of / to be able to specify pmu/event/ without
* conflicts with normal division.
*/
-static char *normalize(char *str)
+static char *normalize(char *str, int runtime)
{
char *ret = str;
char *dst = str;
*dst++ = '/';
else if (*str == '\\')
*dst++ = *++str;
+ else if (*str == '?') {
+ char *paramval;
+ int i = 0;
+ int size = asprintf(¶mval, "%d", runtime);
+
+ if (size < 0)
+ *dst++ = '0';
+ else {
+ while (i < size)
+ *dst++ = paramval[i++];
+ free(paramval);
+ }
+ }
else
*dst++ = *str;
str++;
return ret;
}
-static int str(yyscan_t scanner, int token)
+static int str(yyscan_t scanner, int token, int runtime)
{
YYSTYPE *yylval = expr_get_lval(scanner);
char *text = expr_get_text(scanner);
- yylval->str = normalize(strdup(text));
+ yylval->str = normalize(strdup(text), runtime);
if (!yylval->str)
return EXPR_ERROR;
- yylval->str = normalize(yylval->str);
+ yylval->str = normalize(yylval->str, runtime);
return token;
}
%}
sch [-,=]
spec \\{sch}
-sym [0-9a-zA-Z_\.:@]+
-symbol {spec}*{sym}*{spec}*{sym}*
+sym [0-9a-zA-Z_\.:@?]+
+symbol {spec}*{sym}*{spec}*{sym}*{spec}*{sym}
%%
- {
- int start_token;
+ struct expr_scanner_ctx *sctx = expr_get_extra(yyscanner);
- start_token = expr_get_extra(yyscanner);
+ {
+ int start_token = sctx->start_token;
- if (start_token) {
- expr_set_extra(NULL, yyscanner);
+ if (sctx->start_token) {
+ sctx->start_token = 0;
return start_token;
}
}
else { return ELSE; }
#smt_on { return SMT_ON; }
{number} { return value(yyscanner, 10); }
-{symbol} { return str(yyscanner, ID); }
+{symbol} { return str(yyscanner, ID, sctx->runtime); }
"|" { return '|'; }
"^" { return '^'; }
"&" { return '&'; }
%define api.pure full
%parse-param { double *final_val }
-%parse-param { struct parse_ctx *ctx }
+%parse-param { struct expr_parse_ctx *ctx }
%parse-param {void *scanner}
%lex-param {void* scanner}
%{
static void expr_error(double *final_val __maybe_unused,
- struct parse_ctx *ctx __maybe_unused,
+ struct expr_parse_ctx *ctx __maybe_unused,
void *scanner,
const char *s)
{
pr_debug("%s\n", s);
}
-static int lookup_id(struct parse_ctx *ctx, char *id, double *val)
+static int lookup_id(struct expr_parse_ctx *ctx, char *id, double *val)
{
int i;
/*
* write event string as passed on cmdline
*/
- ret = do_write_string(ff, perf_evsel__name(evsel));
+ ret = do_write_string(ff, evsel__name(evsel));
if (ret < 0)
return ret;
/*
return ret;
evlist__for_each_entry(evlist, evsel) {
- if (perf_evsel__is_group_leader(evsel) &&
- evsel->core.nr_members > 1) {
+ if (evsel__is_group_leader(evsel) && evsel->core.nr_members > 1) {
const char *name = evsel->group_name ?: "{anon_group}";
u32 leader_idx = evsel->idx;
u32 nr_members = evsel->core.nr_members;
return do_write(ff, &(ff->ph->env.comp_mmap_len), sizeof(ff->ph->env.comp_mmap_len));
}
+static int write_cpu_pmu_caps(struct feat_fd *ff,
+ struct evlist *evlist __maybe_unused)
+{
+ struct perf_pmu *cpu_pmu = perf_pmu__find("cpu");
+ struct perf_pmu_caps *caps = NULL;
+ int nr_caps;
+ int ret;
+
+ if (!cpu_pmu)
+ return -ENOENT;
+
+ nr_caps = perf_pmu__caps_parse(cpu_pmu);
+ if (nr_caps < 0)
+ return nr_caps;
+
+ ret = do_write(ff, &nr_caps, sizeof(nr_caps));
+ if (ret < 0)
+ return ret;
+
+ list_for_each_entry(caps, &cpu_pmu->caps, list) {
+ ret = do_write_string(ff, caps->name);
+ if (ret < 0)
+ return ret;
+
+ ret = do_write_string(ff, caps->value);
+ if (ret < 0)
+ return ret;
+ }
+
+ return ret;
+}
+
static void print_hostname(struct feat_fd *ff, FILE *fp)
{
fprintf(fp, "# hostname : %s\n", ff->ph->env.hostname);
ff->ph->env.comp_level, ff->ph->env.comp_ratio);
}
+static void print_cpu_pmu_caps(struct feat_fd *ff, FILE *fp)
+{
+ const char *delimiter = "# cpu pmu capabilities: ";
+ u32 nr_caps = ff->ph->env.nr_cpu_pmu_caps;
+ char *str;
+
+ if (!nr_caps) {
+ fprintf(fp, "# cpu pmu capabilities: not available\n");
+ return;
+ }
+
+ str = ff->ph->env.cpu_pmu_caps;
+ while (nr_caps--) {
+ fprintf(fp, "%s%s", delimiter, str);
+ delimiter = ", ";
+ str += strlen(str) + 1;
+ }
+
+ fprintf(fp, "\n");
+}
+
static void print_pmu_mappings(struct feat_fd *ff, FILE *fp)
{
const char *delimiter = "# pmu mappings: ";
session = container_of(ff->ph, struct perf_session, header);
evlist__for_each_entry(session->evlist, evsel) {
- if (perf_evsel__is_group_leader(evsel) &&
- evsel->core.nr_members > 1) {
- fprintf(fp, "# group: %s{%s", evsel->group_name ?: "",
- perf_evsel__name(evsel));
+ if (evsel__is_group_leader(evsel) && evsel->core.nr_members > 1) {
+ fprintf(fp, "# group: %s{%s", evsel->group_name ?: "", evsel__name(evsel));
nr = evsel->core.nr_members - 1;
} else if (nr) {
- fprintf(fp, ",%s", perf_evsel__name(evsel));
+ fprintf(fp, ",%s", evsel__name(evsel));
if (--nr == 0)
fprintf(fp, "}\n");
return 0;
}
+static int process_cpu_pmu_caps(struct feat_fd *ff,
+ void *data __maybe_unused)
+{
+ char *name, *value;
+ struct strbuf sb;
+ u32 nr_caps;
+
+ if (do_read_u32(ff, &nr_caps))
+ return -1;
+
+ if (!nr_caps) {
+ pr_debug("cpu pmu capabilities not available\n");
+ return 0;
+ }
+
+ ff->ph->env.nr_cpu_pmu_caps = nr_caps;
+
+ if (strbuf_init(&sb, 128) < 0)
+ return -1;
+
+ while (nr_caps--) {
+ name = do_read_string(ff);
+ if (!name)
+ goto error;
+
+ value = do_read_string(ff);
+ if (!value)
+ goto free_name;
+
+ if (strbuf_addf(&sb, "%s=%s", name, value) < 0)
+ goto free_value;
+
+ /* include a NULL character at the end */
+ if (strbuf_add(&sb, "", 1) < 0)
+ goto free_value;
+
+ if (!strcmp(name, "branches"))
+ ff->ph->env.max_branches = atoi(value);
+
+ free(value);
+ free(name);
+ }
+ ff->ph->env.cpu_pmu_caps = strbuf_detach(&sb, NULL);
+ return 0;
+
+free_value:
+ free(value);
+free_name:
+ free(name);
+error:
+ strbuf_release(&sb);
+ return -1;
+}
+
#define FEAT_OPR(n, func, __full_only) \
[HEADER_##n] = { \
.name = __stringify(n), \
FEAT_OPR(BPF_PROG_INFO, bpf_prog_info, false),
FEAT_OPR(BPF_BTF, bpf_btf, false),
FEAT_OPR(COMPRESSED, compressed, false),
+ FEAT_OPR(CPU_PMU_CAPS, cpu_pmu_caps, false),
};
struct header_print_data {
HEADER_BPF_PROG_INFO,
HEADER_BPF_BTF,
HEADER_COMPRESSED,
+ HEADER_CPU_PMU_CAPS,
HEADER_LAST_FEATURE,
HEADER_FEAT_BITS = 256,
};
return fill_callchain_info(al, node, iter->hide_unresolved);
}
+static bool
+hist_entry__fast__sym_diff(struct hist_entry *left,
+ struct hist_entry *right)
+{
+ struct symbol *sym_l = left->ms.sym;
+ struct symbol *sym_r = right->ms.sym;
+
+ if (!sym_l && !sym_r)
+ return left->ip != right->ip;
+
+ return !!_sort__sym_cmp(sym_l, sym_r);
+}
+
+
static int
iter_add_next_cumulative_entry(struct hist_entry_iter *iter,
struct addr_location *al)
};
int i;
struct callchain_cursor cursor;
+ bool fast = hists__has(he_tmp.hists, sym);
callchain_cursor_snapshot(&cursor, &callchain_cursor);
* It's possible that it has cycles or recursive calls.
*/
for (i = 0; i < iter->curr; i++) {
+ /*
+ * For most cases, there are no duplicate entries in callchain.
+ * The symbols are usually different. Do a quick check for
+ * symbols first.
+ */
+ if (fast && hist_entry__fast__sym_diff(he_cache[i], &he_tmp))
+ continue;
+
if (hist_entry__cmp(he_cache[i], &he_tmp) == 0) {
/* to avoid calling callback function */
iter->he = NULL;
size_t ret = 0;
evlist__for_each_entry(evlist, pos) {
- ret += fprintf(fp, "%s stats:\n", perf_evsel__name(pos));
+ ret += fprintf(fp, "%s stats:\n", evsel__name(pos));
ret += events_stats__fprintf(&evsel__hists(pos)->stats, fp);
}
unsigned long nr_samples = hists->stats.nr_events[PERF_RECORD_SAMPLE];
u64 nr_events = hists->stats.total_period;
struct evsel *evsel = hists_to_evsel(hists);
- const char *ev_name = perf_evsel__name(evsel);
+ const char *ev_name = evsel__name(evsel);
char buf[512], sample_freq_str[64] = "";
size_t buflen = sizeof(buf);
char ref[30] = " show reference callgraph, ";
nr_events = hists->stats.total_non_filtered_period;
}
- if (perf_evsel__is_group_event(evsel)) {
+ if (evsel__is_group_event(evsel)) {
struct evsel *pos;
- perf_evsel__group_desc(evsel, buf, buflen);
+ evsel__group_desc(evsel, buf, buflen);
ev_name = buf;
for_each_group_member(pos, evsel) {
le64_to_cpu(branch->from),
le64_to_cpu(branch->to),
btsq->intel_pt_insn.length,
- buffer->buffer_nr + 1);
+ buffer->buffer_nr + 1, true, 0, 0);
if (filter && !(filter & btsq->sample_flags))
continue;
err = intel_bts_synth_branch_sample(btsq, branch);
free(bts);
}
+static bool intel_bts_evsel_is_auxtrace(struct perf_session *session,
+ struct evsel *evsel)
+{
+ struct intel_bts *bts = container_of(session->auxtrace, struct intel_bts,
+ auxtrace);
+
+ return evsel->core.attr.type == bts->pmu_type;
+}
+
struct intel_bts_synth {
struct perf_tool dummy_tool;
struct perf_session *session;
bts->branches_id = id;
/*
* We only use sample types from PERF_SAMPLE_MASK so we can use
- * __perf_evsel__sample_size() here.
+ * __evsel__sample_size() here.
*/
bts->branches_event_size = sizeof(struct perf_record_sample) +
- __perf_evsel__sample_size(attr.sample_type);
+ __evsel__sample_size(attr.sample_type);
}
return 0;
bts->auxtrace.flush_events = intel_bts_flush;
bts->auxtrace.free_events = intel_bts_free_events;
bts->auxtrace.free = intel_bts_free;
+ bts->auxtrace.evsel_is_auxtrace = intel_bts_evsel_is_auxtrace;
session->auxtrace = &bts->auxtrace;
intel_bts_print_info(&auxtrace_info->priv[0], INTEL_BTS_PMU_TYPE,
break;
default:
break;
- };
+ }
if (!(byte & BIT(0))) {
if (byte == 0)
#include "tsc.h"
#include "intel-pt.h"
#include "config.h"
+#include "util/perf_api_probe.h"
#include "util/synthetic-events.h"
#include "time-utils.h"
bool est_tsc;
bool sync_switch;
bool mispred_all;
+ bool use_thread_stack;
+ bool callstack;
+ unsigned int br_stack_sz;
+ unsigned int br_stack_sz_plus;
int have_sched_switch;
u32 pmu_type;
u64 kernel_start;
struct range *time_ranges;
unsigned int range_cnt;
+
+ struct ip_callchain *chain;
+ struct branch_stack *br_stack;
};
enum switch_state {
const struct intel_pt_state *state;
struct ip_callchain *chain;
struct branch_stack *last_branch;
- struct branch_stack *last_branch_rb;
- size_t last_branch_pos;
union perf_event *event_buf;
bool on_heap;
bool stop;
pt->tc.time_mult;
}
+static struct ip_callchain *intel_pt_alloc_chain(struct intel_pt *pt)
+{
+ size_t sz = sizeof(struct ip_callchain);
+
+ /* Add 1 to callchain_sz for callchain context */
+ sz += (pt->synth_opts.callchain_sz + 1) * sizeof(u64);
+ return zalloc(sz);
+}
+
+static int intel_pt_callchain_init(struct intel_pt *pt)
+{
+ struct evsel *evsel;
+
+ evlist__for_each_entry(pt->session->evlist, evsel) {
+ if (!(evsel->core.attr.sample_type & PERF_SAMPLE_CALLCHAIN))
+ evsel->synth_sample_type |= PERF_SAMPLE_CALLCHAIN;
+ }
+
+ pt->chain = intel_pt_alloc_chain(pt);
+ if (!pt->chain)
+ return -ENOMEM;
+
+ return 0;
+}
+
+static void intel_pt_add_callchain(struct intel_pt *pt,
+ struct perf_sample *sample)
+{
+ struct thread *thread = machine__findnew_thread(pt->machine,
+ sample->pid,
+ sample->tid);
+
+ thread_stack__sample_late(thread, sample->cpu, pt->chain,
+ pt->synth_opts.callchain_sz + 1, sample->ip,
+ pt->kernel_start);
+
+ sample->callchain = pt->chain;
+}
+
+static struct branch_stack *intel_pt_alloc_br_stack(struct intel_pt *pt)
+{
+ size_t sz = sizeof(struct branch_stack);
+
+ sz += pt->br_stack_sz * sizeof(struct branch_entry);
+ return zalloc(sz);
+}
+
+static int intel_pt_br_stack_init(struct intel_pt *pt)
+{
+ struct evsel *evsel;
+
+ evlist__for_each_entry(pt->session->evlist, evsel) {
+ if (!(evsel->core.attr.sample_type & PERF_SAMPLE_BRANCH_STACK))
+ evsel->synth_sample_type |= PERF_SAMPLE_BRANCH_STACK;
+ }
+
+ pt->br_stack = intel_pt_alloc_br_stack(pt);
+ if (!pt->br_stack)
+ return -ENOMEM;
+
+ return 0;
+}
+
+static void intel_pt_add_br_stack(struct intel_pt *pt,
+ struct perf_sample *sample)
+{
+ struct thread *thread = machine__findnew_thread(pt->machine,
+ sample->pid,
+ sample->tid);
+
+ thread_stack__br_sample_late(thread, sample->cpu, pt->br_stack,
+ pt->br_stack_sz, sample->ip,
+ pt->kernel_start);
+
+ sample->branch_stack = pt->br_stack;
+}
+
static struct intel_pt_queue *intel_pt_alloc_queue(struct intel_pt *pt,
unsigned int queue_nr)
{
return NULL;
if (pt->synth_opts.callchain) {
- size_t sz = sizeof(struct ip_callchain);
-
- /* Add 1 to callchain_sz for callchain context */
- sz += (pt->synth_opts.callchain_sz + 1) * sizeof(u64);
- ptq->chain = zalloc(sz);
+ ptq->chain = intel_pt_alloc_chain(pt);
if (!ptq->chain)
goto out_free;
}
if (pt->synth_opts.last_branch) {
- size_t sz = sizeof(struct branch_stack);
-
- sz += pt->synth_opts.last_branch_sz *
- sizeof(struct branch_entry);
- ptq->last_branch = zalloc(sz);
+ ptq->last_branch = intel_pt_alloc_br_stack(pt);
if (!ptq->last_branch)
goto out_free;
- ptq->last_branch_rb = zalloc(sz);
- if (!ptq->last_branch_rb)
- goto out_free;
}
ptq->event_buf = malloc(PERF_SAMPLE_MAX_SIZE);
out_free:
zfree(&ptq->event_buf);
zfree(&ptq->last_branch);
- zfree(&ptq->last_branch_rb);
zfree(&ptq->chain);
free(ptq);
return NULL;
intel_pt_decoder_free(ptq->decoder);
zfree(&ptq->event_buf);
zfree(&ptq->last_branch);
- zfree(&ptq->last_branch_rb);
zfree(&ptq->chain);
free(ptq);
}
return 0;
}
-static inline void intel_pt_copy_last_branch_rb(struct intel_pt_queue *ptq)
-{
- struct branch_stack *bs_src = ptq->last_branch_rb;
- struct branch_stack *bs_dst = ptq->last_branch;
- size_t nr = 0;
-
- bs_dst->nr = bs_src->nr;
-
- if (!bs_src->nr)
- return;
-
- nr = ptq->pt->synth_opts.last_branch_sz - ptq->last_branch_pos;
- memcpy(&bs_dst->entries[0],
- &bs_src->entries[ptq->last_branch_pos],
- sizeof(struct branch_entry) * nr);
-
- if (bs_src->nr >= ptq->pt->synth_opts.last_branch_sz) {
- memcpy(&bs_dst->entries[nr],
- &bs_src->entries[0],
- sizeof(struct branch_entry) * ptq->last_branch_pos);
- }
-}
-
-static inline void intel_pt_reset_last_branch_rb(struct intel_pt_queue *ptq)
-{
- ptq->last_branch_pos = 0;
- ptq->last_branch_rb->nr = 0;
-}
-
-static void intel_pt_update_last_branch_rb(struct intel_pt_queue *ptq)
-{
- const struct intel_pt_state *state = ptq->state;
- struct branch_stack *bs = ptq->last_branch_rb;
- struct branch_entry *be;
-
- if (!ptq->last_branch_pos)
- ptq->last_branch_pos = ptq->pt->synth_opts.last_branch_sz;
-
- ptq->last_branch_pos -= 1;
-
- be = &bs->entries[ptq->last_branch_pos];
- be->from = state->from_ip;
- be->to = state->to_ip;
- be->flags.abort = !!(state->flags & INTEL_PT_ABORT_TX);
- be->flags.in_tx = !!(state->flags & INTEL_PT_IN_TX);
- /* No support for mispredict */
- be->flags.mispred = ptq->pt->mispred_all;
-
- if (bs->nr < ptq->pt->synth_opts.last_branch_sz)
- bs->nr += 1;
-}
-
static inline bool intel_pt_skip_event(struct intel_pt *pt)
{
return pt->synth_opts.initial_skip &&
return intel_pt_inject_event(event, sample, type);
}
-static int intel_pt_deliver_synth_b_event(struct intel_pt *pt,
- union perf_event *event,
- struct perf_sample *sample, u64 type)
+static int intel_pt_deliver_synth_event(struct intel_pt *pt,
+ union perf_event *event,
+ struct perf_sample *sample, u64 type)
{
int ret;
ptq->last_br_cyc_cnt = ptq->ipc_cyc_cnt;
}
- return intel_pt_deliver_synth_b_event(pt, event, &sample,
- pt->branches_sample_type);
+ return intel_pt_deliver_synth_event(pt, event, &sample,
+ pt->branches_sample_type);
}
static void intel_pt_prep_sample(struct intel_pt *pt,
}
if (pt->synth_opts.last_branch) {
- intel_pt_copy_last_branch_rb(ptq);
+ thread_stack__br_sample(ptq->thread, ptq->cpu, ptq->last_branch,
+ pt->br_stack_sz);
sample->branch_stack = ptq->last_branch;
}
}
-static inline int intel_pt_deliver_synth_event(struct intel_pt *pt,
- struct intel_pt_queue *ptq,
- union perf_event *event,
- struct perf_sample *sample,
- u64 type)
-{
- int ret;
-
- ret = intel_pt_deliver_synth_b_event(pt, event, sample, type);
-
- if (pt->synth_opts.last_branch)
- intel_pt_reset_last_branch_rb(ptq);
-
- return ret;
-}
-
static int intel_pt_synth_instruction_sample(struct intel_pt_queue *ptq)
{
struct intel_pt *pt = ptq->pt;
ptq->last_insn_cnt = ptq->state->tot_insn_cnt;
- return intel_pt_deliver_synth_event(pt, ptq, event, &sample,
+ return intel_pt_deliver_synth_event(pt, event, &sample,
pt->instructions_sample_type);
}
sample.id = ptq->pt->transactions_id;
sample.stream_id = ptq->pt->transactions_id;
- return intel_pt_deliver_synth_event(pt, ptq, event, &sample,
+ return intel_pt_deliver_synth_event(pt, event, &sample,
pt->transactions_sample_type);
}
sample.raw_size = perf_synth__raw_size(raw);
sample.raw_data = perf_synth__raw_data(&raw);
- return intel_pt_deliver_synth_event(pt, ptq, event, &sample,
+ return intel_pt_deliver_synth_event(pt, event, &sample,
pt->ptwrites_sample_type);
}
sample.raw_size = perf_synth__raw_size(raw);
sample.raw_data = perf_synth__raw_data(&raw);
- return intel_pt_deliver_synth_event(pt, ptq, event, &sample,
+ return intel_pt_deliver_synth_event(pt, event, &sample,
pt->pwr_events_sample_type);
}
sample.raw_size = perf_synth__raw_size(raw);
sample.raw_data = perf_synth__raw_data(&raw);
- return intel_pt_deliver_synth_event(pt, ptq, event, &sample,
+ return intel_pt_deliver_synth_event(pt, event, &sample,
pt->pwr_events_sample_type);
}
sample.raw_size = perf_synth__raw_size(raw);
sample.raw_data = perf_synth__raw_data(&raw);
- return intel_pt_deliver_synth_event(pt, ptq, event, &sample,
+ return intel_pt_deliver_synth_event(pt, event, &sample,
pt->pwr_events_sample_type);
}
sample.raw_size = perf_synth__raw_size(raw);
sample.raw_data = perf_synth__raw_data(&raw);
- return intel_pt_deliver_synth_event(pt, ptq, event, &sample,
+ return intel_pt_deliver_synth_event(pt, event, &sample,
pt->pwr_events_sample_type);
}
sample.raw_size = perf_synth__raw_size(raw);
sample.raw_data = perf_synth__raw_data(&raw);
- return intel_pt_deliver_synth_event(pt, ptq, event, &sample,
+ return intel_pt_deliver_synth_event(pt, event, &sample,
pt->pwr_events_sample_type);
}
union {
struct branch_flags flags;
u64 result;
- } u = {
- .flags = {
- .mispred = !!(info & LBR_INFO_MISPRED),
- .predicted = !(info & LBR_INFO_MISPRED),
- .in_tx = !!(info & LBR_INFO_IN_TX),
- .abort = !!(info & LBR_INFO_ABORT),
- .cycles = info & LBR_INFO_CYCLES,
- }
- };
+ } u;
+
+ u.result = 0;
+ u.flags.mispred = !!(info & LBR_INFO_MISPRED);
+ u.flags.predicted = !(info & LBR_INFO_MISPRED);
+ u.flags.in_tx = !!(info & LBR_INFO_IN_TX);
+ u.flags.abort = !!(info & LBR_INFO_ABORT);
+ u.flags.cycles = info & LBR_INFO_CYCLES;
return u.result;
}
intel_pt_add_lbrs(&br.br_stack, items);
sample.branch_stack = &br.br_stack;
} else if (pt->synth_opts.last_branch) {
- intel_pt_copy_last_branch_rb(ptq);
+ thread_stack__br_sample(ptq->thread, ptq->cpu,
+ ptq->last_branch,
+ pt->br_stack_sz);
sample.branch_stack = ptq->last_branch;
} else {
br.br_stack.nr = 0;
sample.transaction = txn;
}
- return intel_pt_deliver_synth_event(pt, ptq, event, &sample, sample_type);
+ return intel_pt_deliver_synth_event(pt, event, &sample, sample_type);
}
static int intel_pt_synth_error(struct intel_pt *pt, int code, int cpu,
if (!(state->type & INTEL_PT_BRANCH))
return 0;
- if (pt->synth_opts.callchain || pt->synth_opts.thread_stack)
- thread_stack__event(ptq->thread, ptq->cpu, ptq->flags, state->from_ip,
- state->to_ip, ptq->insn_len,
- state->trace_nr);
- else
+ if (pt->use_thread_stack) {
+ thread_stack__event(ptq->thread, ptq->cpu, ptq->flags,
+ state->from_ip, state->to_ip, ptq->insn_len,
+ state->trace_nr, pt->callstack,
+ pt->br_stack_sz_plus,
+ pt->mispred_all);
+ } else {
thread_stack__set_trace_nr(ptq->thread, ptq->cpu, state->trace_nr);
+ }
if (pt->sample_branches) {
err = intel_pt_synth_branch_sample(ptq);
return err;
}
- if (pt->synth_opts.last_branch)
- intel_pt_update_last_branch_rb(ptq);
-
if (!ptq->sync_switch)
return 0;
if (evsel != pt->switch_evsel)
return 0;
- tid = perf_evsel__intval(evsel, sample, "next_pid");
+ tid = evsel__intval(evsel, sample, "next_pid");
cpu = sample->cpu;
intel_pt_log("sched_switch: cpu %d tid %d time %"PRIu64" tsc %#"PRIx64"\n",
if (err)
return err;
+ if (event->header.type == PERF_RECORD_SAMPLE) {
+ if (pt->synth_opts.add_callchain && !sample->callchain)
+ intel_pt_add_callchain(pt, sample);
+ if (pt->synth_opts.add_last_branch && !sample->branch_stack)
+ intel_pt_add_br_stack(pt, sample);
+ }
+
if (event->header.type == PERF_RECORD_AUX &&
(event->aux.flags & PERF_AUX_FLAG_TRUNCATED) &&
pt->synth_opts.errors) {
session->auxtrace = NULL;
thread__put(pt->unknown_thread);
addr_filters__exit(&pt->filts);
+ zfree(&pt->chain);
zfree(&pt->filter);
zfree(&pt->time_ranges);
free(pt);
}
+static bool intel_pt_evsel_is_auxtrace(struct perf_session *session,
+ struct evsel *evsel)
+{
+ struct intel_pt *pt = container_of(session->auxtrace, struct intel_pt,
+ auxtrace);
+
+ return evsel->core.attr.type == pt->pmu_type;
+}
+
static int intel_pt_process_auxtrace_event(struct perf_session *session,
union perf_event *event,
struct perf_tool *tool __maybe_unused)
struct evsel *evsel;
evlist__for_each_entry_reverse(evlist, evsel) {
- const char *name = perf_evsel__name(evsel);
+ const char *name = evsel__name(evsel);
if (!strcmp(name, "sched:sched_switch"))
return evsel;
pt->auxtrace.flush_events = intel_pt_flush;
pt->auxtrace.free_events = intel_pt_free_events;
pt->auxtrace.free = intel_pt_free;
+ pt->auxtrace.evsel_is_auxtrace = intel_pt_evsel_is_auxtrace;
session->auxtrace = &pt->auxtrace;
if (dump_trace)
!session->itrace_synth_opts->inject) {
pt->synth_opts.branches = false;
pt->synth_opts.callchain = true;
+ pt->synth_opts.add_callchain = true;
}
pt->synth_opts.thread_stack =
session->itrace_synth_opts->thread_stack;
pt->branches_filter |= PERF_IP_FLAG_RETURN |
PERF_IP_FLAG_TRACE_BEGIN;
- if (pt->synth_opts.callchain && !symbol_conf.use_callchain) {
+ if ((pt->synth_opts.callchain || pt->synth_opts.add_callchain) &&
+ !symbol_conf.use_callchain) {
symbol_conf.use_callchain = true;
if (callchain_register_param(&callchain_param) < 0) {
symbol_conf.use_callchain = false;
pt->synth_opts.callchain = false;
+ pt->synth_opts.add_callchain = false;
}
}
+ if (pt->synth_opts.add_callchain) {
+ err = intel_pt_callchain_init(pt);
+ if (err)
+ goto err_delete_thread;
+ }
+
+ if (pt->synth_opts.last_branch || pt->synth_opts.add_last_branch) {
+ pt->br_stack_sz = pt->synth_opts.last_branch_sz;
+ pt->br_stack_sz_plus = pt->br_stack_sz;
+ }
+
+ if (pt->synth_opts.add_last_branch) {
+ err = intel_pt_br_stack_init(pt);
+ if (err)
+ goto err_delete_thread;
+ /*
+ * Additional branch stack size to cater for tracing from the
+ * actual sample ip to where the sample time is recorded.
+ * Measured at about 200 branches, but generously set to 1024.
+ * If kernel space is not being traced, then add just 1 for the
+ * branch to kernel space.
+ */
+ if (intel_pt_tracing_kernel(pt))
+ pt->br_stack_sz_plus += 1024;
+ else
+ pt->br_stack_sz_plus += 1;
+ }
+
+ pt->use_thread_stack = pt->synth_opts.callchain ||
+ pt->synth_opts.add_callchain ||
+ pt->synth_opts.thread_stack ||
+ pt->synth_opts.last_branch ||
+ pt->synth_opts.add_last_branch;
+
+ pt->callstack = pt->synth_opts.callchain ||
+ pt->synth_opts.add_callchain ||
+ pt->synth_opts.thread_stack;
+
err = intel_pt_synth_events(pt, session);
if (err)
goto err_delete_thread;
return 0;
err_delete_thread:
+ zfree(&pt->chain);
thread__zput(pt->unknown_thread);
err_free_queues:
intel_pt_log_disable();
return 0;
}
+static int is_bpf_image(const char *name)
+{
+ return strncmp(name, "bpf_trampoline_", sizeof("bpf_trampoline_") - 1) ||
+ strncmp(name, "bpf_dispatcher_", sizeof("bpf_dispatcher_") - 1);
+}
+
static int machine__process_ksymbol_register(struct machine *machine,
union perf_event *event,
struct perf_sample *sample __maybe_unused)
map->start = event->ksymbol.addr;
map->end = map->start + event->ksymbol.len;
maps__insert(&machine->kmaps, map);
+ dso__set_loaded(dso);
+
+ if (is_bpf_image(event->ksymbol.name)) {
+ dso->binary_type = DSO_BINARY_TYPE__BPF_IMAGE;
+ dso__set_long_name(dso, "", false);
+ }
}
sym = symbol__new(map->map_ip(map, map->start),
return nr;
}
+static int lbr_callchain_add_kernel_ip(struct thread *thread,
+ struct callchain_cursor *cursor,
+ struct perf_sample *sample,
+ struct symbol **parent,
+ struct addr_location *root_al,
+ u64 branch_from,
+ bool callee, int end)
+{
+ struct ip_callchain *chain = sample->callchain;
+ u8 cpumode = PERF_RECORD_MISC_USER;
+ int err, i;
+
+ if (callee) {
+ for (i = 0; i < end + 1; i++) {
+ err = add_callchain_ip(thread, cursor, parent,
+ root_al, &cpumode, chain->ips[i],
+ false, NULL, NULL, branch_from);
+ if (err)
+ return err;
+ }
+ return 0;
+ }
+
+ for (i = end; i >= 0; i--) {
+ err = add_callchain_ip(thread, cursor, parent,
+ root_al, &cpumode, chain->ips[i],
+ false, NULL, NULL, branch_from);
+ if (err)
+ return err;
+ }
+
+ return 0;
+}
+
+static void save_lbr_cursor_node(struct thread *thread,
+ struct callchain_cursor *cursor,
+ int idx)
+{
+ struct lbr_stitch *lbr_stitch = thread->lbr_stitch;
+
+ if (!lbr_stitch)
+ return;
+
+ if (cursor->pos == cursor->nr) {
+ lbr_stitch->prev_lbr_cursor[idx].valid = false;
+ return;
+ }
+
+ if (!cursor->curr)
+ cursor->curr = cursor->first;
+ else
+ cursor->curr = cursor->curr->next;
+ memcpy(&lbr_stitch->prev_lbr_cursor[idx], cursor->curr,
+ sizeof(struct callchain_cursor_node));
+
+ lbr_stitch->prev_lbr_cursor[idx].valid = true;
+ cursor->pos++;
+}
+
+static int lbr_callchain_add_lbr_ip(struct thread *thread,
+ struct callchain_cursor *cursor,
+ struct perf_sample *sample,
+ struct symbol **parent,
+ struct addr_location *root_al,
+ u64 *branch_from,
+ bool callee)
+{
+ struct branch_stack *lbr_stack = sample->branch_stack;
+ struct branch_entry *entries = perf_sample__branch_entries(sample);
+ u8 cpumode = PERF_RECORD_MISC_USER;
+ int lbr_nr = lbr_stack->nr;
+ struct branch_flags *flags;
+ int err, i;
+ u64 ip;
+
+ /*
+ * The curr and pos are not used in writing session. They are cleared
+ * in callchain_cursor_commit() when the writing session is closed.
+ * Using curr and pos to track the current cursor node.
+ */
+ if (thread->lbr_stitch) {
+ cursor->curr = NULL;
+ cursor->pos = cursor->nr;
+ if (cursor->nr) {
+ cursor->curr = cursor->first;
+ for (i = 0; i < (int)(cursor->nr - 1); i++)
+ cursor->curr = cursor->curr->next;
+ }
+ }
+
+ if (callee) {
+ /* Add LBR ip from first entries.to */
+ ip = entries[0].to;
+ flags = &entries[0].flags;
+ *branch_from = entries[0].from;
+ err = add_callchain_ip(thread, cursor, parent,
+ root_al, &cpumode, ip,
+ true, flags, NULL,
+ *branch_from);
+ if (err)
+ return err;
+
+ /*
+ * The number of cursor node increases.
+ * Move the current cursor node.
+ * But does not need to save current cursor node for entry 0.
+ * It's impossible to stitch the whole LBRs of previous sample.
+ */
+ if (thread->lbr_stitch && (cursor->pos != cursor->nr)) {
+ if (!cursor->curr)
+ cursor->curr = cursor->first;
+ else
+ cursor->curr = cursor->curr->next;
+ cursor->pos++;
+ }
+
+ /* Add LBR ip from entries.from one by one. */
+ for (i = 0; i < lbr_nr; i++) {
+ ip = entries[i].from;
+ flags = &entries[i].flags;
+ err = add_callchain_ip(thread, cursor, parent,
+ root_al, &cpumode, ip,
+ true, flags, NULL,
+ *branch_from);
+ if (err)
+ return err;
+ save_lbr_cursor_node(thread, cursor, i);
+ }
+ return 0;
+ }
+
+ /* Add LBR ip from entries.from one by one. */
+ for (i = lbr_nr - 1; i >= 0; i--) {
+ ip = entries[i].from;
+ flags = &entries[i].flags;
+ err = add_callchain_ip(thread, cursor, parent,
+ root_al, &cpumode, ip,
+ true, flags, NULL,
+ *branch_from);
+ if (err)
+ return err;
+ save_lbr_cursor_node(thread, cursor, i);
+ }
+
+ /* Add LBR ip from first entries.to */
+ ip = entries[0].to;
+ flags = &entries[0].flags;
+ *branch_from = entries[0].from;
+ err = add_callchain_ip(thread, cursor, parent,
+ root_al, &cpumode, ip,
+ true, flags, NULL,
+ *branch_from);
+ if (err)
+ return err;
+
+ return 0;
+}
+
+static int lbr_callchain_add_stitched_lbr_ip(struct thread *thread,
+ struct callchain_cursor *cursor)
+{
+ struct lbr_stitch *lbr_stitch = thread->lbr_stitch;
+ struct callchain_cursor_node *cnode;
+ struct stitch_list *stitch_node;
+ int err;
+
+ list_for_each_entry(stitch_node, &lbr_stitch->lists, node) {
+ cnode = &stitch_node->cursor;
+
+ err = callchain_cursor_append(cursor, cnode->ip,
+ &cnode->ms,
+ cnode->branch,
+ &cnode->branch_flags,
+ cnode->nr_loop_iter,
+ cnode->iter_cycles,
+ cnode->branch_from,
+ cnode->srcline);
+ if (err)
+ return err;
+ }
+ return 0;
+}
+
+static struct stitch_list *get_stitch_node(struct thread *thread)
+{
+ struct lbr_stitch *lbr_stitch = thread->lbr_stitch;
+ struct stitch_list *stitch_node;
+
+ if (!list_empty(&lbr_stitch->free_lists)) {
+ stitch_node = list_first_entry(&lbr_stitch->free_lists,
+ struct stitch_list, node);
+ list_del(&stitch_node->node);
+
+ return stitch_node;
+ }
+
+ return malloc(sizeof(struct stitch_list));
+}
+
+static bool has_stitched_lbr(struct thread *thread,
+ struct perf_sample *cur,
+ struct perf_sample *prev,
+ unsigned int max_lbr,
+ bool callee)
+{
+ struct branch_stack *cur_stack = cur->branch_stack;
+ struct branch_entry *cur_entries = perf_sample__branch_entries(cur);
+ struct branch_stack *prev_stack = prev->branch_stack;
+ struct branch_entry *prev_entries = perf_sample__branch_entries(prev);
+ struct lbr_stitch *lbr_stitch = thread->lbr_stitch;
+ int i, j, nr_identical_branches = 0;
+ struct stitch_list *stitch_node;
+ u64 cur_base, distance;
+
+ if (!cur_stack || !prev_stack)
+ return false;
+
+ /* Find the physical index of the base-of-stack for current sample. */
+ cur_base = max_lbr - cur_stack->nr + cur_stack->hw_idx + 1;
+
+ distance = (prev_stack->hw_idx > cur_base) ? (prev_stack->hw_idx - cur_base) :
+ (max_lbr + prev_stack->hw_idx - cur_base);
+ /* Previous sample has shorter stack. Nothing can be stitched. */
+ if (distance + 1 > prev_stack->nr)
+ return false;
+
+ /*
+ * Check if there are identical LBRs between two samples.
+ * Identicall LBRs must have same from, to and flags values. Also,
+ * they have to be saved in the same LBR registers (same physical
+ * index).
+ *
+ * Starts from the base-of-stack of current sample.
+ */
+ for (i = distance, j = cur_stack->nr - 1; (i >= 0) && (j >= 0); i--, j--) {
+ if ((prev_entries[i].from != cur_entries[j].from) ||
+ (prev_entries[i].to != cur_entries[j].to) ||
+ (prev_entries[i].flags.value != cur_entries[j].flags.value))
+ break;
+ nr_identical_branches++;
+ }
+
+ if (!nr_identical_branches)
+ return false;
+
+ /*
+ * Save the LBRs between the base-of-stack of previous sample
+ * and the base-of-stack of current sample into lbr_stitch->lists.
+ * These LBRs will be stitched later.
+ */
+ for (i = prev_stack->nr - 1; i > (int)distance; i--) {
+
+ if (!lbr_stitch->prev_lbr_cursor[i].valid)
+ continue;
+
+ stitch_node = get_stitch_node(thread);
+ if (!stitch_node)
+ return false;
+
+ memcpy(&stitch_node->cursor, &lbr_stitch->prev_lbr_cursor[i],
+ sizeof(struct callchain_cursor_node));
+
+ if (callee)
+ list_add(&stitch_node->node, &lbr_stitch->lists);
+ else
+ list_add_tail(&stitch_node->node, &lbr_stitch->lists);
+ }
+
+ return true;
+}
+
+static bool alloc_lbr_stitch(struct thread *thread, unsigned int max_lbr)
+{
+ if (thread->lbr_stitch)
+ return true;
+
+ thread->lbr_stitch = zalloc(sizeof(*thread->lbr_stitch));
+ if (!thread->lbr_stitch)
+ goto err;
+
+ thread->lbr_stitch->prev_lbr_cursor = calloc(max_lbr + 1, sizeof(struct callchain_cursor_node));
+ if (!thread->lbr_stitch->prev_lbr_cursor)
+ goto free_lbr_stitch;
+
+ INIT_LIST_HEAD(&thread->lbr_stitch->lists);
+ INIT_LIST_HEAD(&thread->lbr_stitch->free_lists);
+
+ return true;
+
+free_lbr_stitch:
+ zfree(&thread->lbr_stitch);
+err:
+ pr_warning("Failed to allocate space for stitched LBRs. Disable LBR stitch\n");
+ thread->lbr_stitch_enable = false;
+ return false;
+}
+
/*
* Recolve LBR callstack chain sample
* Return:
struct perf_sample *sample,
struct symbol **parent,
struct addr_location *root_al,
- int max_stack)
+ int max_stack,
+ unsigned int max_lbr)
{
+ bool callee = (callchain_param.order == ORDER_CALLEE);
struct ip_callchain *chain = sample->callchain;
int chain_nr = min(max_stack, (int)chain->nr), i;
- u8 cpumode = PERF_RECORD_MISC_USER;
- u64 ip, branch_from = 0;
+ struct lbr_stitch *lbr_stitch;
+ bool stitched_lbr = false;
+ u64 branch_from = 0;
+ int err;
for (i = 0; i < chain_nr; i++) {
if (chain->ips[i] == PERF_CONTEXT_USER)
}
/* LBR only affects the user callchain */
- if (i != chain_nr) {
- struct branch_stack *lbr_stack = sample->branch_stack;
- struct branch_entry *entries = perf_sample__branch_entries(sample);
- int lbr_nr = lbr_stack->nr, j, k;
- bool branch;
- struct branch_flags *flags;
- /*
- * LBR callstack can only get user call chain.
- * The mix_chain_nr is kernel call chain
- * number plus LBR user call chain number.
- * i is kernel call chain number,
- * 1 is PERF_CONTEXT_USER,
- * lbr_nr + 1 is the user call chain number.
- * For details, please refer to the comments
- * in callchain__printf
- */
- int mix_chain_nr = i + 1 + lbr_nr + 1;
+ if (i == chain_nr)
+ return 0;
- for (j = 0; j < mix_chain_nr; j++) {
- int err;
- branch = false;
- flags = NULL;
+ if (thread->lbr_stitch_enable && !sample->no_hw_idx &&
+ (max_lbr > 0) && alloc_lbr_stitch(thread, max_lbr)) {
+ lbr_stitch = thread->lbr_stitch;
- if (callchain_param.order == ORDER_CALLEE) {
- if (j < i + 1)
- ip = chain->ips[j];
- else if (j > i + 1) {
- k = j - i - 2;
- ip = entries[k].from;
- branch = true;
- flags = &entries[k].flags;
- } else {
- ip = entries[0].to;
- branch = true;
- flags = &entries[0].flags;
- branch_from = entries[0].from;
- }
- } else {
- if (j < lbr_nr) {
- k = lbr_nr - j - 1;
- ip = entries[k].from;
- branch = true;
- flags = &entries[k].flags;
- }
- else if (j > lbr_nr)
- ip = chain->ips[i + 1 - (j - lbr_nr)];
- else {
- ip = entries[0].to;
- branch = true;
- flags = &entries[0].flags;
- branch_from = entries[0].from;
- }
- }
+ stitched_lbr = has_stitched_lbr(thread, sample,
+ &lbr_stitch->prev_sample,
+ max_lbr, callee);
- err = add_callchain_ip(thread, cursor, parent,
- root_al, &cpumode, ip,
- branch, flags, NULL,
- branch_from);
+ if (!stitched_lbr && !list_empty(&lbr_stitch->lists)) {
+ list_replace_init(&lbr_stitch->lists,
+ &lbr_stitch->free_lists);
+ }
+ memcpy(&lbr_stitch->prev_sample, sample, sizeof(*sample));
+ }
+
+ if (callee) {
+ /* Add kernel ip */
+ err = lbr_callchain_add_kernel_ip(thread, cursor, sample,
+ parent, root_al, branch_from,
+ true, i);
+ if (err)
+ goto error;
+
+ err = lbr_callchain_add_lbr_ip(thread, cursor, sample, parent,
+ root_al, &branch_from, true);
+ if (err)
+ goto error;
+
+ if (stitched_lbr) {
+ err = lbr_callchain_add_stitched_lbr_ip(thread, cursor);
if (err)
- return (err < 0) ? err : 0;
+ goto error;
}
- return 1;
+
+ } else {
+ if (stitched_lbr) {
+ err = lbr_callchain_add_stitched_lbr_ip(thread, cursor);
+ if (err)
+ goto error;
+ }
+ err = lbr_callchain_add_lbr_ip(thread, cursor, sample, parent,
+ root_al, &branch_from, false);
+ if (err)
+ goto error;
+
+ /* Add kernel ip */
+ err = lbr_callchain_add_kernel_ip(thread, cursor, sample,
+ parent, root_al, branch_from,
+ false, i);
+ if (err)
+ goto error;
}
+ return 1;
- return 0;
+error:
+ return (err < 0) ? err : 0;
}
static int find_prev_cpumode(struct ip_callchain *chain, struct thread *thread,
if (chain)
chain_nr = chain->nr;
- if (perf_evsel__has_branch_callstack(evsel)) {
+ if (evsel__has_branch_callstack(evsel)) {
+ struct perf_env *env = evsel__env(evsel);
+
err = resolve_lbr_callchain_sample(thread, cursor, sample, parent,
- root_al, max_stack);
+ root_al, max_stack,
+ !env ? 0 : env->max_branches);
if (err)
return (err < 0) ? err : 0;
}
#include <errno.h>
#include <inttypes.h>
+#include <asm/bug.h>
#include <linux/bitmap.h>
#include <linux/kernel.h>
#include <linux/zalloc.h>
/* Cut unused entries, due to merging. */
tmp_entries = realloc(entries, sizeof(*entries) * j);
- if (tmp_entries)
+ if (tmp_entries || WARN_ON_ONCE(j == 0))
entries = tmp_entries;
for (i = 0; i < j; i++) {
const char *metric_name;
const char *metric_expr;
const char *metric_unit;
+ int runtime;
};
static struct evsel *find_evsel_group(struct evlist *perf_evlist,
expr->metric_name = eg->metric_name;
expr->metric_unit = eg->metric_unit;
expr->metric_events = metric_events;
+ expr->runtime = eg->runtime;
list_add(&expr->nd, &me->head);
}
return false;
}
+int __weak arch_get_runtimeparam(void)
+{
+ return 1;
+}
+
+static int __metricgroup__add_metric(struct strbuf *events,
+ struct list_head *group_list, struct pmu_event *pe, int runtime)
+{
+
+ const char **ids;
+ int idnum;
+ struct egroup *eg;
+
+ if (expr__find_other(pe->metric_expr, NULL, &ids, &idnum, runtime) < 0)
+ return -EINVAL;
+
+ if (events->len > 0)
+ strbuf_addf(events, ",");
+
+ if (metricgroup__has_constraint(pe))
+ metricgroup__add_metric_non_group(events, ids, idnum);
+ else
+ metricgroup__add_metric_weak_group(events, ids, idnum);
+
+ eg = malloc(sizeof(*eg));
+ if (!eg)
+ return -ENOMEM;
+
+ eg->ids = ids;
+ eg->idnum = idnum;
+ eg->metric_name = pe->metric_name;
+ eg->metric_expr = pe->metric_expr;
+ eg->metric_unit = pe->unit;
+ eg->runtime = runtime;
+ list_add_tail(&eg->nd, group_list);
+
+ return 0;
+}
+
static int metricgroup__add_metric(const char *metric, struct strbuf *events,
struct list_head *group_list)
{
continue;
if (match_metric(pe->metric_group, metric) ||
match_metric(pe->metric_name, metric)) {
- const char **ids;
- int idnum;
- struct egroup *eg;
pr_debug("metric expr %s for %s\n", pe->metric_expr, pe->metric_name);
- if (expr__find_other(pe->metric_expr,
- NULL, &ids, &idnum) < 0)
- continue;
- if (events->len > 0)
- strbuf_addf(events, ",");
+ if (!strstr(pe->metric_expr, "?")) {
+ ret = __metricgroup__add_metric(events, group_list, pe, 1);
+ } else {
+ int j, count;
- if (metricgroup__has_constraint(pe))
- metricgroup__add_metric_non_group(events, ids, idnum);
- else
- metricgroup__add_metric_weak_group(events, ids, idnum);
+ count = arch_get_runtimeparam();
- eg = malloc(sizeof(struct egroup));
- if (!eg) {
- ret = -ENOMEM;
- break;
+ /* This loop is added to create multiple
+ * events depend on count value and add
+ * those events to group_list.
+ */
+
+ for (j = 0; j < count; j++)
+ ret = __metricgroup__add_metric(events, group_list, pe, j);
}
- eg->ids = ids;
- eg->idnum = idnum;
- eg->metric_name = pe->metric_name;
- eg->metric_expr = pe->metric_expr;
- eg->metric_unit = pe->unit;
- list_add_tail(&eg->nd, group_list);
- ret = 0;
+ if (ret == -ENOMEM)
+ break;
}
}
return ret;
const char *metric_name;
const char *metric_unit;
struct evsel **metric_events;
+ int runtime;
};
struct metric_event *metricgroup__lookup(struct rblist *metric_events,
void metricgroup__print(bool metrics, bool groups, char *filter,
bool raw, bool details);
bool metricgroup__has_metric(const char *metric);
+int arch_get_runtimeparam(void);
#endif
case OE_FLUSH__NONE:
default:
break;
- };
+ }
pr_oe_time(oe->next_flush, "next_flush - ordered_events__flush PRE %s, nr_events %u\n",
str[how], oe->nr_events);
cache_op = parse_aliases(str, perf_evsel__hw_cache_op,
PERF_COUNT_HW_CACHE_OP_MAX);
if (cache_op >= 0) {
- if (!perf_evsel__is_cache_op_valid(cache_type, cache_op))
+ if (!evsel__is_cache_op_valid(cache_type, cache_op))
return -EINVAL;
continue;
}
list_for_each_entry_safe(pos, tmp, &config_terms, list) {
list_del_init(&pos->list);
+ zfree(&pos->val.str);
free(pos);
}
return -EINVAL;
evsel->precise_max = mod.precise_max;
evsel->weak_group = mod.weak;
- if (perf_evsel__is_group_leader(evsel))
+ if (evsel__is_group_leader(evsel))
evsel->core.attr.pinned = mod.pinned;
}
return ret;
}
+int parse_events_option_new_evlist(const struct option *opt, const char *str, int unset)
+{
+ struct evlist **evlistp = opt->value;
+ int ret;
+
+ if (*evlistp == NULL) {
+ *evlistp = evlist__new();
+
+ if (*evlistp == NULL) {
+ fprintf(stderr, "Not enough memory to create evlist\n");
+ return -1;
+ }
+ }
+
+ ret = parse_events_option(opt, str, unset);
+ if (ret) {
+ evlist__delete(*evlistp);
+ *evlistp = NULL;
+ }
+
+ return ret;
+}
+
static int
foreach_evsel_in_last_glob(struct evlist *evlist,
int (*func)(struct evsel *evsel,
}
if (evsel->core.attr.type == PERF_TYPE_TRACEPOINT) {
- if (perf_evsel__append_tp_filter(evsel, str) < 0) {
+ if (evsel__append_tp_filter(evsel, str) < 0) {
fprintf(stderr,
"not enough memory to hold filter string\n");
return -1;
return -1;
}
- if (perf_evsel__append_addr_filter(evsel, str) < 0) {
+ if (evsel__append_addr_filter(evsel, str) < 0) {
fprintf(stderr,
"not enough memory to hold filter string\n");
return -1;
snprintf(new_filter, sizeof(new_filter), "common_pid != %d", getpid());
- if (perf_evsel__append_tp_filter(evsel, new_filter) < 0) {
+ if (evsel__append_tp_filter(evsel, new_filter) < 0) {
fprintf(stderr,
"not enough memory to hold filter string\n");
return -1;
for (type = 0; type < PERF_COUNT_HW_CACHE_MAX; type++) {
for (op = 0; op < PERF_COUNT_HW_CACHE_OP_MAX; op++) {
/* skip invalid cache type */
- if (!perf_evsel__is_cache_op_valid(type, op))
+ if (!evsel__is_cache_op_valid(type, op))
continue;
for (i = 0; i < PERF_COUNT_HW_CACHE_RESULT_MAX; i++) {
- __perf_evsel__hw_cache_type_op_res_name(type, op, i,
- name, sizeof(name));
+ __evsel__hw_cache_type_op_res_name(type, op, i, name, sizeof(name));
if (event_glob != NULL && !strglobmatch(name, event_glob))
continue;
const char *event_type(int type);
int parse_events_option(const struct option *opt, const char *str, int unset);
+int parse_events_option_new_evlist(const struct option *opt, const char *str, int unset);
int parse_events(struct evlist *evlist, const char *str,
struct parse_events_error *error);
int parse_events_terms(struct list_head *terms, const char *str);
percore { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_PERCORE); }
aux-output { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_AUX_OUTPUT); }
aux-sample-size { return term(yyscanner, PARSE_EVENTS__TERM_TYPE_AUX_SAMPLE_SIZE); }
+r{num_raw_hex} { return raw(yyscanner); }
, { return ','; }
"/" { BEGIN(INITIAL); return '/'; }
{name_minus} { return str(yyscanner, PE_NAME); }
list_for_each_entry_safe(evsel, tmp, list_evsel, core.node) {
list_del_init(&evsel->core.node);
- perf_evsel__delete(evsel);
+ evsel__delete(evsel);
}
free(list_evsel);
}
}
parse_events_terms__delete($2);
parse_events_terms__delete(orig_terms);
+ free(pattern);
free($1);
$$ = list;
#undef CLEANUP_YYABORT
}
event_term:
+PE_RAW
+{
+ struct parse_events_term *term;
+
+ ABORT_ON(parse_events_term__num(&term, PARSE_EVENTS__TERM_TYPE_CONFIG,
+ NULL, $1, false, &@1, NULL));
+ $$ = term;
+}
+|
PE_NAME '=' PE_NAME
{
struct parse_events_term *term;
--- /dev/null
+/* SPDX-License-Identifier: GPL-2.0 */
+
+#include "perf-sys.h"
+#include "util/cloexec.h"
+#include "util/evlist.h"
+#include "util/evsel.h"
+#include "util/parse-events.h"
+#include "util/perf_api_probe.h"
+#include <perf/cpumap.h>
+#include <errno.h>
+
+typedef void (*setup_probe_fn_t)(struct evsel *evsel);
+
+static int perf_do_probe_api(setup_probe_fn_t fn, int cpu, const char *str)
+{
+ struct evlist *evlist;
+ struct evsel *evsel;
+ unsigned long flags = perf_event_open_cloexec_flag();
+ int err = -EAGAIN, fd;
+ static pid_t pid = -1;
+
+ evlist = evlist__new();
+ if (!evlist)
+ return -ENOMEM;
+
+ if (parse_events(evlist, str, NULL))
+ goto out_delete;
+
+ evsel = evlist__first(evlist);
+
+ while (1) {
+ fd = sys_perf_event_open(&evsel->core.attr, pid, cpu, -1, flags);
+ if (fd < 0) {
+ if (pid == -1 && errno == EACCES) {
+ pid = 0;
+ continue;
+ }
+ goto out_delete;
+ }
+ break;
+ }
+ close(fd);
+
+ fn(evsel);
+
+ fd = sys_perf_event_open(&evsel->core.attr, pid, cpu, -1, flags);
+ if (fd < 0) {
+ if (errno == EINVAL)
+ err = -EINVAL;
+ goto out_delete;
+ }
+ close(fd);
+ err = 0;
+
+out_delete:
+ evlist__delete(evlist);
+ return err;
+}
+
+static bool perf_probe_api(setup_probe_fn_t fn)
+{
+ const char *try[] = {"cycles:u", "instructions:u", "cpu-clock:u", NULL};
+ struct perf_cpu_map *cpus;
+ int cpu, ret, i = 0;
+
+ cpus = perf_cpu_map__new(NULL);
+ if (!cpus)
+ return false;
+ cpu = cpus->map[0];
+ perf_cpu_map__put(cpus);
+
+ do {
+ ret = perf_do_probe_api(fn, cpu, try[i++]);
+ if (!ret)
+ return true;
+ } while (ret == -EAGAIN && try[i]);
+
+ return false;
+}
+
+static void perf_probe_sample_identifier(struct evsel *evsel)
+{
+ evsel->core.attr.sample_type |= PERF_SAMPLE_IDENTIFIER;
+}
+
+static void perf_probe_comm_exec(struct evsel *evsel)
+{
+ evsel->core.attr.comm_exec = 1;
+}
+
+static void perf_probe_context_switch(struct evsel *evsel)
+{
+ evsel->core.attr.context_switch = 1;
+}
+
+bool perf_can_sample_identifier(void)
+{
+ return perf_probe_api(perf_probe_sample_identifier);
+}
+
+bool perf_can_comm_exec(void)
+{
+ return perf_probe_api(perf_probe_comm_exec);
+}
+
+bool perf_can_record_switch_events(void)
+{
+ return perf_probe_api(perf_probe_context_switch);
+}
+
+bool perf_can_record_cpu_wide(void)
+{
+ struct perf_event_attr attr = {
+ .type = PERF_TYPE_SOFTWARE,
+ .config = PERF_COUNT_SW_CPU_CLOCK,
+ .exclude_kernel = 1,
+ };
+ struct perf_cpu_map *cpus;
+ int cpu, fd;
+
+ cpus = perf_cpu_map__new(NULL);
+ if (!cpus)
+ return false;
+ cpu = cpus->map[0];
+ perf_cpu_map__put(cpus);
+
+ fd = sys_perf_event_open(&attr, -1, cpu, -1, 0);
+ if (fd < 0)
+ return false;
+ close(fd);
+
+ return true;
+}
+
+/*
+ * Architectures are expected to know if AUX area sampling is supported by the
+ * hardware. Here we check for kernel support.
+ */
+bool perf_can_aux_sample(void)
+{
+ struct perf_event_attr attr = {
+ .size = sizeof(struct perf_event_attr),
+ .exclude_kernel = 1,
+ /*
+ * Non-zero value causes the kernel to calculate the effective
+ * attribute size up to that byte.
+ */
+ .aux_sample_size = 1,
+ };
+ int fd;
+
+ fd = sys_perf_event_open(&attr, -1, 0, -1, 0);
+ /*
+ * If the kernel attribute is big enough to contain aux_sample_size
+ * then we assume that it is supported. We are relying on the kernel to
+ * validate the attribute size before anything else that could be wrong.
+ */
+ if (fd < 0 && errno == E2BIG)
+ return false;
+ if (fd >= 0)
+ close(fd);
+
+ return true;
+}
--- /dev/null
+
+/* SPDX-License-Identifier: GPL-2.0 */
+#ifndef __PERF_API_PROBE_H
+#define __PERF_API_PROBE_H
+
+#include <stdbool.h>
+
+bool perf_can_aux_sample(void);
+bool perf_can_comm_exec(void);
+bool perf_can_record_cpu_wide(void);
+bool perf_can_record_switch_events(void);
+bool perf_can_sample_identifier(void);
+
+#endif // __PERF_API_PROBE_H
#include <regex.h>
#include <perf/cpumap.h>
#include "debug.h"
+#include "evsel.h"
#include "pmu.h"
#include "parse-events.h"
#include "header.h"
INIT_LIST_HEAD(&pmu->format);
INIT_LIST_HEAD(&pmu->aliases);
+ INIT_LIST_HEAD(&pmu->caps);
list_splice(&format, &pmu->format);
list_splice(&aliases, &pmu->aliases);
list_add_tail(&pmu->list, &pmus);
return NULL;
}
+struct perf_pmu *perf_pmu__find_by_type(unsigned int type)
+{
+ struct perf_pmu *pmu;
+
+ list_for_each_entry(pmu, &pmus, list)
+ if (pmu->type == type)
+ return pmu;
+
+ return NULL;
+}
+
struct perf_pmu *perf_pmu__scan(struct perf_pmu *pmu)
{
/*
return NULL;
}
+struct perf_pmu *evsel__find_pmu(struct evsel *evsel)
+{
+ struct perf_pmu *pmu = NULL;
+
+ while ((pmu = perf_pmu__scan(pmu)) != NULL) {
+ if (pmu->type == evsel->core.attr.type)
+ break;
+ }
+
+ return pmu;
+}
+
+bool evsel__is_aux_event(struct evsel *evsel)
+{
+ struct perf_pmu *pmu = evsel__find_pmu(evsel);
+
+ return pmu && pmu->auxtrace;
+}
+
struct perf_pmu *perf_pmu__find(const char *name)
{
struct perf_pmu *pmu;
va_end(args);
return ret;
}
+
+static int perf_pmu__new_caps(struct list_head *list, char *name, char *value)
+{
+ struct perf_pmu_caps *caps = zalloc(sizeof(*caps));
+
+ if (!caps)
+ return -ENOMEM;
+
+ caps->name = strdup(name);
+ if (!caps->name)
+ goto free_caps;
+ caps->value = strndup(value, strlen(value) - 1);
+ if (!caps->value)
+ goto free_name;
+ list_add_tail(&caps->list, list);
+ return 0;
+
+free_name:
+ zfree(caps->name);
+free_caps:
+ free(caps);
+
+ return -ENOMEM;
+}
+
+/*
+ * Reading/parsing the given pmu capabilities, which should be located at:
+ * /sys/bus/event_source/devices/<dev>/caps as sysfs group attributes.
+ * Return the number of capabilities
+ */
+int perf_pmu__caps_parse(struct perf_pmu *pmu)
+{
+ struct stat st;
+ char caps_path[PATH_MAX];
+ const char *sysfs = sysfs__mountpoint();
+ DIR *caps_dir;
+ struct dirent *evt_ent;
+ int nr_caps = 0;
+
+ if (!sysfs)
+ return -1;
+
+ snprintf(caps_path, PATH_MAX,
+ "%s" EVENT_SOURCE_DEVICE_PATH "%s/caps", sysfs, pmu->name);
+
+ if (stat(caps_path, &st) < 0)
+ return 0; /* no error if caps does not exist */
+
+ caps_dir = opendir(caps_path);
+ if (!caps_dir)
+ return -EINVAL;
+
+ while ((evt_ent = readdir(caps_dir)) != NULL) {
+ char path[PATH_MAX + NAME_MAX + 1];
+ char *name = evt_ent->d_name;
+ char value[128];
+ FILE *file;
+
+ if (!strcmp(name, ".") || !strcmp(name, ".."))
+ continue;
+
+ snprintf(path, sizeof(path), "%s/%s", caps_path, name);
+
+ file = fopen(path, "r");
+ if (!file)
+ continue;
+
+ if (!fgets(value, sizeof(value), file) ||
+ (perf_pmu__new_caps(&pmu->caps, name, value) < 0)) {
+ fclose(file);
+ continue;
+ }
+
+ nr_caps++;
+ fclose(file);
+ }
+
+ closedir(caps_dir);
+
+ return nr_caps;
+}
struct perf_event_attr;
+struct perf_pmu_caps {
+ char *name;
+ char *value;
+ struct list_head list;
+};
+
struct perf_pmu {
char *name;
__u32 type;
struct perf_cpu_map *cpus;
struct list_head format; /* HEAD struct perf_pmu_format -> list */
struct list_head aliases; /* HEAD struct perf_pmu_alias -> list */
+ struct list_head caps; /* HEAD struct perf_pmu_caps -> list */
struct list_head list; /* ELEM */
};
};
struct perf_pmu *perf_pmu__find(const char *name);
+struct perf_pmu *perf_pmu__find_by_type(unsigned int type);
int perf_pmu__config(struct perf_pmu *pmu, struct perf_event_attr *attr,
struct list_head *head_terms,
struct parse_events_error *error);
int perf_pmu__convert_scale(const char *scale, char **end, double *sval);
+int perf_pmu__caps_parse(struct perf_pmu *pmu);
+
#endif /* __PMU_H */
static void pyrf_evsel__delete(struct pyrf_evsel *pevsel)
{
- perf_evsel__exit(&pevsel->evsel);
+ evsel__exit(&pevsel->evsel);
Py_TYPE(pevsel)->tp_free((PyObject*)pevsel);
}
pevent->evsel = evsel;
- err = perf_evsel__parse_sample(evsel, event, &pevent->sample);
+ err = evsel__parse_sample(evsel, event, &pevent->sample);
/* Consume the even only after we parsed it out. */
perf_mmap__consume(&md->core);
#include <subcmd/parse-options.h>
#include <perf/cpumap.h>
#include "cloexec.h"
+#include "util/perf_api_probe.h"
#include "record.h"
#include "../perf-sys.h"
-typedef void (*setup_probe_fn_t)(struct evsel *evsel);
-
-static int perf_do_probe_api(setup_probe_fn_t fn, int cpu, const char *str)
+/*
+ * evsel__config_leader_sampling() uses special rules for leader sampling.
+ * However, if the leader is an AUX area event, then assume the event to sample
+ * is the next event.
+ */
+static struct evsel *evsel__read_sampler(struct evsel *evsel, struct evlist *evlist)
{
- struct evlist *evlist;
- struct evsel *evsel;
- unsigned long flags = perf_event_open_cloexec_flag();
- int err = -EAGAIN, fd;
- static pid_t pid = -1;
-
- evlist = evlist__new();
- if (!evlist)
- return -ENOMEM;
-
- if (parse_events(evlist, str, NULL))
- goto out_delete;
-
- evsel = evlist__first(evlist);
+ struct evsel *leader = evsel->leader;
- while (1) {
- fd = sys_perf_event_open(&evsel->core.attr, pid, cpu, -1, flags);
- if (fd < 0) {
- if (pid == -1 && errno == EACCES) {
- pid = 0;
- continue;
- }
- goto out_delete;
+ if (evsel__is_aux_event(leader)) {
+ evlist__for_each_entry(evlist, evsel) {
+ if (evsel->leader == leader && evsel != evsel->leader)
+ return evsel;
}
- break;
- }
- close(fd);
-
- fn(evsel);
-
- fd = sys_perf_event_open(&evsel->core.attr, pid, cpu, -1, flags);
- if (fd < 0) {
- if (errno == EINVAL)
- err = -EINVAL;
- goto out_delete;
}
- close(fd);
- err = 0;
-
-out_delete:
- evlist__delete(evlist);
- return err;
-}
-
-static bool perf_probe_api(setup_probe_fn_t fn)
-{
- const char *try[] = {"cycles:u", "instructions:u", "cpu-clock:u", NULL};
- struct perf_cpu_map *cpus;
- int cpu, ret, i = 0;
-
- cpus = perf_cpu_map__new(NULL);
- if (!cpus)
- return false;
- cpu = cpus->map[0];
- perf_cpu_map__put(cpus);
-
- do {
- ret = perf_do_probe_api(fn, cpu, try[i++]);
- if (!ret)
- return true;
- } while (ret == -EAGAIN && try[i]);
- return false;
+ return leader;
}
-static void perf_probe_sample_identifier(struct evsel *evsel)
+static void evsel__config_leader_sampling(struct evsel *evsel, struct evlist *evlist)
{
- evsel->core.attr.sample_type |= PERF_SAMPLE_IDENTIFIER;
-}
-
-static void perf_probe_comm_exec(struct evsel *evsel)
-{
- evsel->core.attr.comm_exec = 1;
-}
-
-static void perf_probe_context_switch(struct evsel *evsel)
-{
- evsel->core.attr.context_switch = 1;
-}
-
-bool perf_can_sample_identifier(void)
-{
- return perf_probe_api(perf_probe_sample_identifier);
-}
+ struct perf_event_attr *attr = &evsel->core.attr;
+ struct evsel *leader = evsel->leader;
+ struct evsel *read_sampler;
-static bool perf_can_comm_exec(void)
-{
- return perf_probe_api(perf_probe_comm_exec);
-}
+ if (!leader->sample_read)
+ return;
-bool perf_can_record_switch_events(void)
-{
- return perf_probe_api(perf_probe_context_switch);
-}
+ read_sampler = evsel__read_sampler(evsel, evlist);
-bool perf_can_record_cpu_wide(void)
-{
- struct perf_event_attr attr = {
- .type = PERF_TYPE_SOFTWARE,
- .config = PERF_COUNT_SW_CPU_CLOCK,
- .exclude_kernel = 1,
- };
- struct perf_cpu_map *cpus;
- int cpu, fd;
-
- cpus = perf_cpu_map__new(NULL);
- if (!cpus)
- return false;
- cpu = cpus->map[0];
- perf_cpu_map__put(cpus);
+ if (evsel == read_sampler)
+ return;
- fd = sys_perf_event_open(&attr, -1, cpu, -1, 0);
- if (fd < 0)
- return false;
- close(fd);
-
- return true;
-}
-
-/*
- * Architectures are expected to know if AUX area sampling is supported by the
- * hardware. Here we check for kernel support.
- */
-bool perf_can_aux_sample(void)
-{
- struct perf_event_attr attr = {
- .size = sizeof(struct perf_event_attr),
- .exclude_kernel = 1,
- /*
- * Non-zero value causes the kernel to calculate the effective
- * attribute size up to that byte.
- */
- .aux_sample_size = 1,
- };
- int fd;
-
- fd = sys_perf_event_open(&attr, -1, 0, -1, 0);
/*
- * If the kernel attribute is big enough to contain aux_sample_size
- * then we assume that it is supported. We are relying on the kernel to
- * validate the attribute size before anything else that could be wrong.
+ * Disable sampling for all group members other than the leader in
+ * case the leader 'leads' the sampling, except when the leader is an
+ * AUX area event, in which case the 2nd event in the group is the one
+ * that 'leads' the sampling.
*/
- if (fd < 0 && errno == E2BIG)
- return false;
- if (fd >= 0)
- close(fd);
+ attr->freq = 0;
+ attr->sample_freq = 0;
+ attr->sample_period = 0;
+ attr->write_backward = 0;
- return true;
+ /*
+ * We don't get a sample for slave events, we make them when delivering
+ * the group leader sample. Set the slave event to follow the master
+ * sample_type to ease up reporting.
+ * An AUX area event also has sample_type requirements, so also include
+ * the sample type bits from the leader's sample_type to cover that
+ * case.
+ */
+ attr->sample_type = read_sampler->core.attr.sample_type |
+ leader->core.attr.sample_type;
}
void perf_evlist__config(struct evlist *evlist, struct record_opts *opts,
use_comm_exec = perf_can_comm_exec();
evlist__for_each_entry(evlist, evsel) {
- perf_evsel__config(evsel, opts, callchain);
+ evsel__config(evsel, opts, callchain);
if (evsel->tracking && use_comm_exec)
evsel->core.attr.comm_exec = 1;
}
+ /* Configure leader sampling here now that the sample type is known */
+ evlist__for_each_entry(evlist, evsel)
+ evsel__config_leader_sampling(evsel, evlist);
+
if (opts->full_auxtrace) {
/*
* Need to be able to synthesize and parse selected events with
if (sample_id) {
evlist__for_each_entry(evlist, evsel)
- perf_evsel__set_sample_id(evsel, use_sample_identifier);
+ evsel__set_sample_id(evsel, use_sample_identifier);
}
perf_evlist__set_id_pos(evlist);
int affinity;
int mmap_flush;
unsigned int comp_level;
+ unsigned int nr_threads_synthesize;
};
extern const char * const *record_usage;
#define S390_CPUMCF_DIAG_DEF 0xfeef /* Counter diagnostic entry ID */
#define PERF_EVENT_CPUM_CF_DIAG 0xBC000 /* Event: Counter sets */
+#define PERF_EVENT_CPUM_SF_DIAG 0xBD000 /* Event: Combined-sampling */
struct cf_ctrset_entry { /* CPU-M CF counter set entry (8 byte) */
unsigned int def:16; /* 0-15 Data Entry Format */
free(sf);
}
+static bool
+s390_cpumsf_evsel_is_auxtrace(struct perf_session *session __maybe_unused,
+ struct evsel *evsel)
+{
+ return evsel->core.attr.type == PERF_TYPE_RAW &&
+ evsel->core.attr.config == PERF_EVENT_CPUM_SF_DIAG;
+}
+
static int s390_cpumsf_get_type(const char *cpuid)
{
int ret, family = 0;
itops->pwr_events || itops->errors ||
itops->dont_decode || itops->calls || itops->returns ||
itops->callchain || itops->thread_stack ||
- itops->last_branch;
+ itops->last_branch || itops->add_callchain ||
+ itops->add_last_branch;
if (!ison)
return true;
pr_err("Unsupported --itrace options specified\n");
sf->auxtrace.flush_events = s390_cpumsf_flush;
sf->auxtrace.free_events = s390_cpumsf_free_events;
sf->auxtrace.free = s390_cpumsf_free;
+ sf->auxtrace.evsel_is_auxtrace = s390_cpumsf_evsel_is_auxtrace;
session->auxtrace = &sf->auxtrace;
if (dump_trace)
if (!dict_sample)
Py_FatalError("couldn't create Python dictionary");
- pydict_set_item_string_decref(dict, "ev_name", _PyUnicode_FromString(perf_evsel__name(evsel)));
+ pydict_set_item_string_decref(dict, "ev_name", _PyUnicode_FromString(evsel__name(evsel)));
pydict_set_item_string_decref(dict, "attr", _PyBytes_FromStringAndSize((const char *)&evsel->core.attr, sizeof(evsel->core.attr)));
pydict_set_item_string_decref(dict_sample, "pid",
t = tuple_new(2);
tuple_set_u64(t, 0, evsel->db_id);
- tuple_set_string(t, 1, perf_evsel__name(evsel));
+ tuple_set_string(t, 1, evsel__name(evsel));
call_object(tables->evsel_handler, t, "evsel_table");
{
char *p = str;
- scnprintf(str, size, "stat__%s", perf_evsel__name(evsel));
+ scnprintf(str, size, "stat__%s", evsel__name(evsel));
while ((p = strchr(p, ':'))) {
*p = '_';
unsigned int i;
struct ip_callchain *callchain = sample->callchain;
- if (perf_evsel__has_branch_callstack(evsel))
+ if (evsel__has_branch_callstack(evsel))
callchain__lbr_callstack_printf(sample);
printf("... FP chain: nr:%" PRIu64 "\n", callchain->nr);
if (evsel__has_callchain(evsel))
callchain__printf(evsel, sample);
- if (sample_type & PERF_SAMPLE_BRANCH_STACK)
- branch_stack__printf(sample, perf_evsel__has_branch_callstack(evsel));
+ if (evsel__has_br_stack(evsel))
+ branch_stack__printf(sample, evsel__has_branch_callstack(evsel));
if (sample_type & PERF_SAMPLE_REGS_USER)
regs_user__printf(sample);
return;
printf(": %d %d %s %" PRI_lu64 "\n", event->read.pid, event->read.tid,
- perf_evsel__name(evsel),
- event->read.value);
+ evsel__name(evsel), event->read.value);
if (!evsel)
return;
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0-only
+
+#include "util/debug.h"
+#include "util/evlist.h"
+#include "util/evsel.h"
+#include "util/mmap.h"
+#include "util/perf_api_probe.h"
+#include <perf/mmap.h>
+#include <linux/perf_event.h>
+#include <limits.h>
+#include <pthread.h>
+#include <sched.h>
+#include <stdbool.h>
+
+int perf_evlist__add_sb_event(struct evlist *evlist, struct perf_event_attr *attr,
+ evsel__sb_cb_t cb, void *data)
+{
+ struct evsel *evsel;
+
+ if (!attr->sample_id_all) {
+ pr_warning("enabling sample_id_all for all side band events\n");
+ attr->sample_id_all = 1;
+ }
+
+ evsel = perf_evsel__new_idx(attr, evlist->core.nr_entries);
+ if (!evsel)
+ return -1;
+
+ evsel->side_band.cb = cb;
+ evsel->side_band.data = data;
+ evlist__add(evlist, evsel);
+ return 0;
+}
+
+static void *perf_evlist__poll_thread(void *arg)
+{
+ struct evlist *evlist = arg;
+ bool draining = false;
+ int i, done = 0;
+ /*
+ * In order to read symbols from other namespaces perf to needs to call
+ * setns(2). This isn't permitted if the struct_fs has multiple users.
+ * unshare(2) the fs so that we may continue to setns into namespaces
+ * that we're observing when, for instance, reading the build-ids at
+ * the end of a 'perf record' session.
+ */
+ unshare(CLONE_FS);
+
+ while (!done) {
+ bool got_data = false;
+
+ if (evlist->thread.done)
+ draining = true;
+
+ if (!draining)
+ evlist__poll(evlist, 1000);
+
+ for (i = 0; i < evlist->core.nr_mmaps; i++) {
+ struct mmap *map = &evlist->mmap[i];
+ union perf_event *event;
+
+ if (perf_mmap__read_init(&map->core))
+ continue;
+ while ((event = perf_mmap__read_event(&map->core)) != NULL) {
+ struct evsel *evsel = perf_evlist__event2evsel(evlist, event);
+
+ if (evsel && evsel->side_band.cb)
+ evsel->side_band.cb(event, evsel->side_band.data);
+ else
+ pr_warning("cannot locate proper evsel for the side band event\n");
+
+ perf_mmap__consume(&map->core);
+ got_data = true;
+ }
+ perf_mmap__read_done(&map->core);
+ }
+
+ if (draining && !got_data)
+ break;
+ }
+ return NULL;
+}
+
+void evlist__set_cb(struct evlist *evlist, evsel__sb_cb_t cb, void *data)
+{
+ struct evsel *evsel;
+
+ evlist__for_each_entry(evlist, evsel) {
+ evsel->core.attr.sample_id_all = 1;
+ evsel->core.attr.watermark = 1;
+ evsel->core.attr.wakeup_watermark = 1;
+ evsel->side_band.cb = cb;
+ evsel->side_band.data = data;
+ }
+}
+
+int perf_evlist__start_sb_thread(struct evlist *evlist, struct target *target)
+{
+ struct evsel *counter;
+
+ if (!evlist)
+ return 0;
+
+ if (perf_evlist__create_maps(evlist, target))
+ goto out_delete_evlist;
+
+ if (evlist->core.nr_entries > 1) {
+ bool can_sample_identifier = perf_can_sample_identifier();
+
+ evlist__for_each_entry(evlist, counter)
+ evsel__set_sample_id(counter, can_sample_identifier);
+
+ perf_evlist__set_id_pos(evlist);
+ }
+
+ evlist__for_each_entry(evlist, counter) {
+ if (evsel__open(counter, evlist->core.cpus, evlist->core.threads) < 0)
+ goto out_delete_evlist;
+ }
+
+ if (evlist__mmap(evlist, UINT_MAX))
+ goto out_delete_evlist;
+
+ evlist__for_each_entry(evlist, counter) {
+ if (evsel__enable(counter))
+ goto out_delete_evlist;
+ }
+
+ evlist->thread.done = 0;
+ if (pthread_create(&evlist->thread.th, NULL, perf_evlist__poll_thread, evlist))
+ goto out_delete_evlist;
+
+ return 0;
+
+out_delete_evlist:
+ evlist__delete(evlist);
+ evlist = NULL;
+ return -1;
+}
+
+void perf_evlist__stop_sb_thread(struct evlist *evlist)
+{
+ if (!evlist)
+ return;
+ evlist->thread.done = 1;
+ pthread_join(evlist->thread.th, NULL);
+ evlist__delete(evlist);
+}
if (cached)
return cached_result;
+ if (sysfs__read_int("devices/system/cpu/smt/active", &cached_result) > 0)
+ goto done;
+
ncpu = sysconf(_SC_NPROCESSORS_CONF);
for (cpu = 0; cpu < ncpu; cpu++) {
unsigned long long siblings;
snprintf(fn, sizeof fn,
"devices/system/cpu/cpu%d/topology/core_cpus", cpu);
- if (access(fn, F_OK) == -1) {
+ if (sysfs__read_str(fn, &str, &strlen) < 0) {
snprintf(fn, sizeof fn,
"devices/system/cpu/cpu%d/topology/thread_siblings",
cpu);
+ if (sysfs__read_str(fn, &str, &strlen) < 0)
+ continue;
}
- if (sysfs__read_str(fn, &str, &strlen) < 0)
- continue;
/* Entry is hex, but does not have 0x, so need custom parser */
siblings = strtoull(str, NULL, 16);
free(str);
}
if (!cached) {
cached_result = 0;
+done:
cached = true;
}
return cached_result;
return (int64_t)(right_ip - left_ip);
}
-static int64_t _sort__sym_cmp(struct symbol *sym_l, struct symbol *sym_r)
+int64_t _sort__sym_cmp(struct symbol *sym_l, struct symbol *sym_r)
{
if (!sym_l || !sym_r)
return cmp_null(sym_l, sym_r);
if (verbose > 0) {
char o = map ? dso__symtab_origin(map->dso) : '!';
+ u64 rip = ip;
+
+ if (map && map->dso && map->dso->kernel
+ && map->dso->adjust_symbols)
+ rip = map->unmap_ip(map, ip);
+
ret += repsep_snprintf(bf, size, "%-#*llx %c ",
- BITS_PER_LONG / 4 + 2, ip, o);
+ BITS_PER_LONG / 4 + 2, rip, o);
}
ret += repsep_snprintf(bf + ret, size - ret, "[%c] ", level);
evsel = evlist__first(evlist);
while (--nr > 0)
- evsel = perf_evsel__next(evsel);
+ evsel = evsel__next(evsel);
return evsel;
}
sort__daddr_cmp(struct hist_entry *left, struct hist_entry *right);
int64_t
sort__dcacheline_cmp(struct hist_entry *left, struct hist_entry *right);
+int64_t
+_sort__sym_cmp(struct symbol *sym_l, struct symbol *sym_r);
char *hist_entry__srcline(struct hist_entry *he);
#endif /* __PERF_SORT_H */
if (!unit)
return false;
if (strstr(unit, "/sec") ||
- strstr(unit, "hz") ||
- strstr(unit, "Hz") ||
strstr(unit, "CPUs utilized"))
return false;
return true;
const char *unit)
{
if (!strncmp(unit, "of all", 6)) {
- snprintf(buf, 1024, "%s %s", perf_evsel__name(evsel),
+ snprintf(buf, 1024, "%s %s", evsel__name(evsel),
unit);
return buf;
}
if (config->aggr_mode == AGGR_GLOBAL)
return 0;
- for (i = 0; i < perf_evsel__nr_cpus(evsel); i++) {
+ for (i = 0; i < evsel__nr_cpus(evsel); i++) {
int cpu2 = evsel__cpus(evsel)->map[i];
if (config->aggr_get_id(config, evlist->core.cpus, cpu2) == id)
config->csv_output ? 0 : config->unit_width,
evsel->unit, config->csv_sep);
- fprintf(output, "%-*s", config->csv_output ? 0 : 25, perf_evsel__name(evsel));
+ fprintf(output, "%-*s", config->csv_output ? 0 : 25, evsel__name(evsel));
print_cgroup(config, evsel);
}
counter->unit, config->csv_sep);
fprintf(config->output, "%*s",
- config->csv_output ? 0 : -25,
- perf_evsel__name(counter));
+ config->csv_output ? 0 : -25, evsel__name(counter));
print_cgroup(config, counter);
id = config->aggr_map->map[s];
evlist__for_each_entry(evlist, counter) {
val = 0;
- for (cpu = 0; cpu < perf_evsel__nr_cpus(counter); cpu++) {
+ for (cpu = 0; cpu < evsel__nr_cpus(counter); cpu++) {
s2 = config->aggr_get_id(config, evlist->core.cpus, cpu);
if (s2 != id)
continue;
alias = list_prepare_entry(counter, &(evlist->core.entries), core.node);
list_for_each_entry_continue (alias, &evlist->core.entries, core.node) {
- if (strcmp(perf_evsel__name(alias), perf_evsel__name(counter)) ||
+ if (strcmp(evsel__name(alias), evsel__name(counter)) ||
alias->scale != counter->scale ||
alias->cgrp != counter->cgrp ||
strcmp(alias->unit, counter->unit) ||
- perf_evsel__is_clock(alias) != perf_evsel__is_clock(counter) ||
+ evsel__is_clock(alias) != evsel__is_clock(counter) ||
!strcmp(alias->pmu_name, counter->pmu_name))
break;
alias->merged_stat = true;
struct aggr_data *ad = data;
int cpu, s2;
- for (cpu = 0; cpu < perf_evsel__nr_cpus(counter); cpu++) {
+ for (cpu = 0; cpu < evsel__nr_cpus(counter); cpu++) {
struct perf_counts_values *counts;
s2 = config->aggr_get_id(config, evsel__cpus(counter), cpu);
double uval;
int cpu;
- for (cpu = 0; cpu < perf_evsel__nr_cpus(counter); cpu++) {
+ for (cpu = 0; cpu < evsel__nr_cpus(counter); cpu++) {
struct aggr_data ad = { .cpu = cpu };
if (!collect_data(config, counter, counter_cb, &ad))
int s, s2, id;
bool first = true;
- for (int i = 0; i < perf_evsel__nr_cpus(counter); i++) {
+ for (int i = 0; i < evsel__nr_cpus(counter); i++) {
s2 = config->aggr_get_id(config, evsel__cpus(counter), i);
for (s = 0; s < config->aggr_map->nr; s++) {
id = config->aggr_map->map[s];
count *= counter->scale;
- if (perf_evsel__is_clock(counter))
+ if (evsel__is_clock(counter))
update_runtime_stat(st, STAT_NSECS, 0, cpu, count_ns);
- else if (perf_evsel__match(counter, HARDWARE, HW_CPU_CYCLES))
+ else if (evsel__match(counter, HARDWARE, HW_CPU_CYCLES))
update_runtime_stat(st, STAT_CYCLES, ctx, cpu, count);
else if (perf_stat_evsel__is(counter, CYCLES_IN_TX))
update_runtime_stat(st, STAT_CYCLES_IN_TX, ctx, cpu, count);
else if (perf_stat_evsel__is(counter, TOPDOWN_RECOVERY_BUBBLES))
update_runtime_stat(st, STAT_TOPDOWN_RECOVERY_BUBBLES,
ctx, cpu, count);
- else if (perf_evsel__match(counter, HARDWARE, HW_STALLED_CYCLES_FRONTEND))
+ else if (evsel__match(counter, HARDWARE, HW_STALLED_CYCLES_FRONTEND))
update_runtime_stat(st, STAT_STALLED_CYCLES_FRONT,
ctx, cpu, count);
- else if (perf_evsel__match(counter, HARDWARE, HW_STALLED_CYCLES_BACKEND))
+ else if (evsel__match(counter, HARDWARE, HW_STALLED_CYCLES_BACKEND))
update_runtime_stat(st, STAT_STALLED_CYCLES_BACK,
ctx, cpu, count);
- else if (perf_evsel__match(counter, HARDWARE, HW_BRANCH_INSTRUCTIONS))
+ else if (evsel__match(counter, HARDWARE, HW_BRANCH_INSTRUCTIONS))
update_runtime_stat(st, STAT_BRANCHES, ctx, cpu, count);
- else if (perf_evsel__match(counter, HARDWARE, HW_CACHE_REFERENCES))
+ else if (evsel__match(counter, HARDWARE, HW_CACHE_REFERENCES))
update_runtime_stat(st, STAT_CACHEREFS, ctx, cpu, count);
- else if (perf_evsel__match(counter, HW_CACHE, HW_CACHE_L1D))
+ else if (evsel__match(counter, HW_CACHE, HW_CACHE_L1D))
update_runtime_stat(st, STAT_L1_DCACHE, ctx, cpu, count);
- else if (perf_evsel__match(counter, HW_CACHE, HW_CACHE_L1I))
+ else if (evsel__match(counter, HW_CACHE, HW_CACHE_L1I))
update_runtime_stat(st, STAT_L1_ICACHE, ctx, cpu, count);
- else if (perf_evsel__match(counter, HW_CACHE, HW_CACHE_LL))
+ else if (evsel__match(counter, HW_CACHE, HW_CACHE_LL))
update_runtime_stat(st, STAT_LL_CACHE, ctx, cpu, count);
- else if (perf_evsel__match(counter, HW_CACHE, HW_CACHE_DTLB))
+ else if (evsel__match(counter, HW_CACHE, HW_CACHE_DTLB))
update_runtime_stat(st, STAT_DTLB_CACHE, ctx, cpu, count);
- else if (perf_evsel__match(counter, HW_CACHE, HW_CACHE_ITLB))
+ else if (evsel__match(counter, HW_CACHE, HW_CACHE_ITLB))
update_runtime_stat(st, STAT_ITLB_CACHE, ctx, cpu, count);
else if (perf_stat_evsel__is(counter, SMI_NUM))
update_runtime_stat(st, STAT_SMI_NUM, ctx, cpu, count);
metric_events = counter->metric_events;
if (!metric_events) {
if (expr__find_other(counter->metric_expr, counter->name,
- &metric_names, &num_metric_names) < 0)
+ &metric_names, &num_metric_names, 1) < 0)
continue;
metric_events = calloc(sizeof(struct evsel *),
char *name,
const char *metric_name,
const char *metric_unit,
+ int runtime,
double avg,
int cpu,
struct perf_stat_output_ctx *out,
struct runtime_stat *st)
{
print_metric_t print_metric = out->print_metric;
- struct parse_ctx pctx;
+ struct expr_parse_ctx pctx;
double ratio, scale;
int i;
void *ctxp = out->ctx;
}
if (!metric_events[i]) {
- if (expr__parse(&ratio, &pctx, metric_expr) == 0) {
+ if (expr__parse(&ratio, &pctx, metric_expr, runtime) == 0) {
char *unit;
char metric_bf[64];
&unit, &scale) >= 0) {
ratio *= scale;
}
-
- scnprintf(metric_bf, sizeof(metric_bf),
+ if (strstr(metric_expr, "?"))
+ scnprintf(metric_bf, sizeof(metric_bf),
+ "%s %s_%d", unit, metric_name, runtime);
+ else
+ scnprintf(metric_bf, sizeof(metric_bf),
"%s %s", unit, metric_name);
+
print_metric(config, ctxp, NULL, "%8.1f",
metric_bf, ratio);
} else {
struct metric_event *me;
int num = 1;
- if (perf_evsel__match(evsel, HARDWARE, HW_INSTRUCTIONS)) {
+ if (evsel__match(evsel, HARDWARE, HW_INSTRUCTIONS)) {
total = runtime_stat_avg(st, STAT_CYCLES, ctx, cpu);
if (total) {
"stalled cycles per insn",
ratio);
}
- } else if (perf_evsel__match(evsel, HARDWARE, HW_BRANCH_MISSES)) {
+ } else if (evsel__match(evsel, HARDWARE, HW_BRANCH_MISSES)) {
if (runtime_stat_n(st, STAT_BRANCHES, ctx, cpu) != 0)
print_branch_misses(config, cpu, evsel, avg, out, st);
else
print_ll_cache_misses(config, cpu, evsel, avg, out, st);
else
print_metric(config, ctxp, NULL, NULL, "of all LL-cache hits", 0);
- } else if (perf_evsel__match(evsel, HARDWARE, HW_CACHE_MISSES)) {
+ } else if (evsel__match(evsel, HARDWARE, HW_CACHE_MISSES)) {
total = runtime_stat_avg(st, STAT_CACHEREFS, ctx, cpu);
if (total)
"of all cache refs", ratio);
else
print_metric(config, ctxp, NULL, NULL, "of all cache refs", 0);
- } else if (perf_evsel__match(evsel, HARDWARE, HW_STALLED_CYCLES_FRONTEND)) {
+ } else if (evsel__match(evsel, HARDWARE, HW_STALLED_CYCLES_FRONTEND)) {
print_stalled_cycles_frontend(config, cpu, evsel, avg, out, st);
- } else if (perf_evsel__match(evsel, HARDWARE, HW_STALLED_CYCLES_BACKEND)) {
+ } else if (evsel__match(evsel, HARDWARE, HW_STALLED_CYCLES_BACKEND)) {
print_stalled_cycles_backend(config, cpu, evsel, avg, out, st);
- } else if (perf_evsel__match(evsel, HARDWARE, HW_CPU_CYCLES)) {
+ } else if (evsel__match(evsel, HARDWARE, HW_CPU_CYCLES)) {
total = runtime_stat_avg(st, STAT_NSECS, 0, cpu);
if (total) {
ratio = total / avg;
print_metric(config, ctxp, NULL, "%8.0f", "cycles / elision", ratio);
- } else if (perf_evsel__is_clock(evsel)) {
+ } else if (evsel__is_clock(evsel)) {
if ((ratio = avg_stats(&walltime_nsecs_stats)) != 0)
print_metric(config, ctxp, NULL, "%8.3f", "CPUs utilized",
avg / (ratio * evsel->scale));
print_metric(config, ctxp, NULL, NULL, name, 0);
} else if (evsel->metric_expr) {
generic_metric(config, evsel->metric_expr, evsel->metric_events, evsel->name,
- evsel->metric_name, NULL, avg, cpu, out, st);
+ evsel->metric_name, NULL, 1, avg, cpu, out, st);
} else if (runtime_stat_n(st, STAT_NSECS, 0, cpu) != 0) {
char unit = 'M';
char unit_buf[10];
out->new_line(config, ctxp);
generic_metric(config, mexp->metric_expr, mexp->metric_events,
evsel->name, mexp->metric_name,
- mexp->metric_unit, avg, cpu, out, st);
+ mexp->metric_unit, mexp->runtime, avg, cpu, out, st);
}
}
if (num == 0)
/* ps->id is 0 hence PERF_STAT_EVSEL_ID__NONE by default */
for (i = 0; i < PERF_STAT_EVSEL_ID__MAX; i++) {
- if (!strcmp(perf_evsel__name(evsel), id_str[i])) {
+ if (!strcmp(evsel__name(evsel), id_str[i])) {
ps->id = i;
break;
}
static int perf_evsel__alloc_stats(struct evsel *evsel, bool alloc_raw)
{
- int ncpus = perf_evsel__nr_cpus(evsel);
+ int ncpus = evsel__nr_cpus(evsel);
int nthreads = perf_thread_map__nr(evsel->core.threads);
if (perf_evsel__alloc_stat_priv(evsel) < 0 ||
case AGGR_NODE:
case AGGR_NONE:
if (!evsel->snapshot)
- perf_evsel__compute_deltas(evsel, cpu, thread, count);
+ evsel__compute_deltas(evsel, cpu, thread, count);
perf_counts_values__scale(count, config->scale, NULL);
if ((config->aggr_mode == AGGR_NONE) && (!evsel->percore)) {
perf_stat__update_shadow_stats(evsel, count->val,
struct evsel *counter)
{
int nthreads = perf_thread_map__nr(counter->core.threads);
- int ncpus = perf_evsel__nr_cpus(counter);
+ int ncpus = evsel__nr_cpus(counter);
int cpu, thread;
if (counter->core.system_wide)
* interval mode, otherwise overall avg running
* averages will be shown for each interval.
*/
- if (config->interval)
- init_stats(ps->res_stats);
+ if (config->interval) {
+ for (i = 0; i < 3; i++)
+ init_stats(&ps->res_stats[i]);
+ }
if (counter->per_pkg)
zero_per_pkg(counter);
return 0;
if (!counter->snapshot)
- perf_evsel__compute_deltas(counter, -1, -1, aggr);
+ evsel__compute_deltas(counter, -1, -1, aggr);
perf_counts_values__scale(aggr, config->scale, &counter->counts->scaled);
for (i = 0; i < 3; i++)
if (verbose > 0) {
fprintf(config->output, "%s: %" PRIu64 " %" PRIu64 " %" PRIu64 "\n",
- perf_evsel__name(counter), count[0], count[1], count[2]);
+ evsel__name(counter), count[0], count[1], count[2]);
}
/*
* either manually by us or by kernel via enable_on_exec
* set later.
*/
- if (perf_evsel__is_group_leader(evsel)) {
+ if (evsel__is_group_leader(evsel)) {
attr->disabled = 1;
/*
}
if (target__has_cpu(target) && !target__has_per_thread(target))
- return perf_evsel__open_per_cpu(evsel, evsel__cpus(evsel), cpu);
+ return evsel__open_per_cpu(evsel, evsel__cpus(evsel), cpu);
- return perf_evsel__open_per_thread(evsel, evsel->core.threads);
+ return evsel__open_per_thread(evsel, evsel->core.threads);
}
return symbols__sort_by_name(&dso->symbol_names, &dso->symbols);
}
+/*
+ * While we find nice hex chars, build a long_val.
+ * Return number of chars processed.
+ */
+static int hex2u64(const char *ptr, u64 *long_val)
+{
+ char *p;
+
+ *long_val = strtoull(ptr, &p, 16);
+
+ return p - ptr;
+}
+
+
int modules__parse(const char *filename, void *arg,
int (*process_module)(void *arg, const char *name,
u64 start, u64 size))
return true;
case DSO_BINARY_TYPE__BPF_PROG_INFO:
+ case DSO_BINARY_TYPE__BPF_IMAGE:
case DSO_BINARY_TYPE__NOT_FOUND:
default:
return false;
#include <string.h>
#include <uapi/linux/mman.h> /* To get things like MAP_HUGETLB even on older libc headers */
#include <api/fs/fs.h>
+#include <api/io.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
static int perf_event__get_comm_ids(pid_t pid, char *comm, size_t len,
pid_t *tgid, pid_t *ppid)
{
- char filename[PATH_MAX];
char bf[4096];
int fd;
size_t size = 0;
*tgid = -1;
*ppid = -1;
- snprintf(filename, sizeof(filename), "/proc/%d/status", pid);
+ snprintf(bf, sizeof(bf), "/proc/%d/status", pid);
- fd = open(filename, O_RDONLY);
+ fd = open(bf, O_RDONLY);
if (fd < 0) {
- pr_debug("couldn't open %s\n", filename);
+ pr_debug("couldn't open %s\n", bf);
return -1;
}
return 0;
}
+static bool read_proc_maps_line(struct io *io, __u64 *start, __u64 *end,
+ u32 *prot, u32 *flags, __u64 *offset,
+ u32 *maj, u32 *min,
+ __u64 *inode,
+ ssize_t pathname_size, char *pathname)
+{
+ __u64 temp;
+ int ch;
+ char *start_pathname = pathname;
+
+ if (io__get_hex(io, start) != '-')
+ return false;
+ if (io__get_hex(io, end) != ' ')
+ return false;
+
+ /* map protection and flags bits */
+ *prot = 0;
+ ch = io__get_char(io);
+ if (ch == 'r')
+ *prot |= PROT_READ;
+ else if (ch != '-')
+ return false;
+ ch = io__get_char(io);
+ if (ch == 'w')
+ *prot |= PROT_WRITE;
+ else if (ch != '-')
+ return false;
+ ch = io__get_char(io);
+ if (ch == 'x')
+ *prot |= PROT_EXEC;
+ else if (ch != '-')
+ return false;
+ ch = io__get_char(io);
+ if (ch == 's')
+ *flags = MAP_SHARED;
+ else if (ch == 'p')
+ *flags = MAP_PRIVATE;
+ else
+ return false;
+ if (io__get_char(io) != ' ')
+ return false;
+
+ if (io__get_hex(io, offset) != ' ')
+ return false;
+
+ if (io__get_hex(io, &temp) != ':')
+ return false;
+ *maj = temp;
+ if (io__get_hex(io, &temp) != ' ')
+ return false;
+ *min = temp;
+
+ ch = io__get_dec(io, inode);
+ if (ch != ' ') {
+ *pathname = '\0';
+ return ch == '\n';
+ }
+ do {
+ ch = io__get_char(io);
+ } while (ch == ' ');
+ while (true) {
+ if (ch < 0)
+ return false;
+ if (ch == '\0' || ch == '\n' ||
+ (pathname + 1 - start_pathname) >= pathname_size) {
+ *pathname = '\0';
+ return true;
+ }
+ *pathname++ = ch;
+ ch = io__get_char(io);
+ }
+}
+
int perf_event__synthesize_mmap_events(struct perf_tool *tool,
union perf_event *event,
pid_t pid, pid_t tgid,
struct machine *machine,
bool mmap_data)
{
- char filename[PATH_MAX];
- FILE *fp;
unsigned long long t;
+ char bf[BUFSIZ];
+ struct io io;
bool truncation = false;
unsigned long long timeout = proc_map_timeout * 1000000ULL;
int rc = 0;
if (machine__is_default_guest(machine))
return 0;
- snprintf(filename, sizeof(filename), "%s/proc/%d/task/%d/maps",
- machine->root_dir, pid, pid);
+ snprintf(bf, sizeof(bf), "%s/proc/%d/task/%d/maps",
+ machine->root_dir, pid, pid);
- fp = fopen(filename, "r");
- if (fp == NULL) {
+ io.fd = open(bf, O_RDONLY, 0);
+ if (io.fd < 0) {
/*
* We raced with a task exiting - just return:
*/
- pr_debug("couldn't open %s\n", filename);
+ pr_debug("couldn't open %s\n", bf);
return -1;
}
+ io__init(&io, io.fd, bf, sizeof(bf));
event->header.type = PERF_RECORD_MMAP2;
t = rdclock();
- while (1) {
- char bf[BUFSIZ];
- char prot[5];
- char execname[PATH_MAX];
- char anonstr[] = "//anon";
- unsigned int ino;
+ while (!io.eof) {
+ static const char anonstr[] = "//anon";
size_t size;
- ssize_t n;
- if (fgets(bf, sizeof(bf), fp) == NULL)
- break;
+ /* ensure null termination since stack will be reused. */
+ event->mmap2.filename[0] = '\0';
+
+ /* 00400000-0040c000 r-xp 00000000 fd:01 41038 /bin/cat */
+ if (!read_proc_maps_line(&io,
+ &event->mmap2.start,
+ &event->mmap2.len,
+ &event->mmap2.prot,
+ &event->mmap2.flags,
+ &event->mmap2.pgoff,
+ &event->mmap2.maj,
+ &event->mmap2.min,
+ &event->mmap2.ino,
+ sizeof(event->mmap2.filename),
+ event->mmap2.filename))
+ continue;
if ((rdclock() - t) > timeout) {
- pr_warning("Reading %s time out. "
+ pr_warning("Reading %s/proc/%d/task/%d/maps time out. "
"You may want to increase "
"the time limit by --proc-map-timeout\n",
- filename);
+ machine->root_dir, pid, pid);
truncation = true;
goto out;
}
- /* ensure null termination since stack will be reused. */
- strcpy(execname, "");
-
- /* 00400000-0040c000 r-xp 00000000 fd:01 41038 /bin/cat */
- n = sscanf(bf, "%"PRI_lx64"-%"PRI_lx64" %s %"PRI_lx64" %x:%x %u %[^\n]\n",
- &event->mmap2.start, &event->mmap2.len, prot,
- &event->mmap2.pgoff, &event->mmap2.maj,
- &event->mmap2.min,
- &ino, execname);
-
- /*
- * Anon maps don't have the execname.
- */
- if (n < 7)
- continue;
-
- event->mmap2.ino = (u64)ino;
event->mmap2.ino_generation = 0;
/*
else
event->header.misc = PERF_RECORD_MISC_GUEST_USER;
- /* map protection and flags bits */
- event->mmap2.prot = 0;
- event->mmap2.flags = 0;
- if (prot[0] == 'r')
- event->mmap2.prot |= PROT_READ;
- if (prot[1] == 'w')
- event->mmap2.prot |= PROT_WRITE;
- if (prot[2] == 'x')
- event->mmap2.prot |= PROT_EXEC;
-
- if (prot[3] == 's')
- event->mmap2.flags |= MAP_SHARED;
- else
- event->mmap2.flags |= MAP_PRIVATE;
-
- if (prot[2] != 'x') {
- if (!mmap_data || prot[0] != 'r')
+ if ((event->mmap2.prot & PROT_EXEC) == 0) {
+ if (!mmap_data || (event->mmap2.prot & PROT_READ) == 0)
continue;
event->header.misc |= PERF_RECORD_MISC_MMAP_DATA;
if (truncation)
event->header.misc |= PERF_RECORD_MISC_PROC_MAP_PARSE_TIMEOUT;
- if (!strcmp(execname, ""))
- strcpy(execname, anonstr);
+ if (!strcmp(event->mmap2.filename, ""))
+ strcpy(event->mmap2.filename, anonstr);
if (hugetlbfs_mnt_len &&
- !strncmp(execname, hugetlbfs_mnt, hugetlbfs_mnt_len)) {
- strcpy(execname, anonstr);
+ !strncmp(event->mmap2.filename, hugetlbfs_mnt,
+ hugetlbfs_mnt_len)) {
+ strcpy(event->mmap2.filename, anonstr);
event->mmap2.flags |= MAP_HUGETLB;
}
- size = strlen(execname) + 1;
- memcpy(event->mmap2.filename, execname, size);
+ size = strlen(event->mmap2.filename) + 1;
size = PERF_ALIGN(size, sizeof(u64));
event->mmap2.len -= event->mmap.start;
event->mmap2.header.size = (sizeof(event->mmap2) -
break;
}
- fclose(fp);
+ close(io.fd);
return rc;
}
synthesize_mask((struct perf_record_record_cpu_map *)data->data, map, max);
default:
break;
- };
+ }
}
static struct perf_record_cpu_map *cpu_map_event__new(struct perf_cpu_map *map)
* @comm: current comm
* @arr_sz: size of array if this is the first element of an array
* @rstate: used to detect retpolines
+ * @br_stack_rb: branch stack (ring buffer)
+ * @br_stack_sz: maximum branch stack size
+ * @br_stack_pos: current position in @br_stack_rb
+ * @mispred_all: mark all branches as mispredicted
*/
struct thread_stack {
struct thread_stack_entry *stack;
struct comm *comm;
unsigned int arr_sz;
enum retpoline_state_t rstate;
+ struct branch_stack *br_stack_rb;
+ unsigned int br_stack_sz;
+ unsigned int br_stack_pos;
+ bool mispred_all;
};
/*
}
static int thread_stack__init(struct thread_stack *ts, struct thread *thread,
- struct call_return_processor *crp)
+ struct call_return_processor *crp,
+ bool callstack, unsigned int br_stack_sz)
{
int err;
- err = thread_stack__grow(ts);
- if (err)
- return err;
+ if (callstack) {
+ err = thread_stack__grow(ts);
+ if (err)
+ return err;
+ }
+
+ if (br_stack_sz) {
+ size_t sz = sizeof(struct branch_stack);
+
+ sz += br_stack_sz * sizeof(struct branch_entry);
+ ts->br_stack_rb = zalloc(sz);
+ if (!ts->br_stack_rb)
+ return -ENOMEM;
+ ts->br_stack_sz = br_stack_sz;
+ }
if (thread->maps && thread->maps->machine) {
struct machine *machine = thread->maps->machine;
}
static struct thread_stack *thread_stack__new(struct thread *thread, int cpu,
- struct call_return_processor *crp)
+ struct call_return_processor *crp,
+ bool callstack,
+ unsigned int br_stack_sz)
{
struct thread_stack *ts = thread->ts, *new_ts;
unsigned int old_sz = ts ? ts->arr_sz : 0;
ts += cpu;
if (!ts->stack &&
- thread_stack__init(ts, thread, crp))
+ thread_stack__init(ts, thread, crp, callstack, br_stack_sz))
return NULL;
return ts;
if (!crp) {
ts->cnt = 0;
+ ts->br_stack_pos = 0;
+ if (ts->br_stack_rb)
+ ts->br_stack_rb->nr = 0;
return 0;
}
return err;
}
+static void thread_stack__update_br_stack(struct thread_stack *ts, u32 flags,
+ u64 from_ip, u64 to_ip)
+{
+ struct branch_stack *bs = ts->br_stack_rb;
+ struct branch_entry *be;
+
+ if (!ts->br_stack_pos)
+ ts->br_stack_pos = ts->br_stack_sz;
+
+ ts->br_stack_pos -= 1;
+
+ be = &bs->entries[ts->br_stack_pos];
+ be->from = from_ip;
+ be->to = to_ip;
+ be->flags.value = 0;
+ be->flags.abort = !!(flags & PERF_IP_FLAG_TX_ABORT);
+ be->flags.in_tx = !!(flags & PERF_IP_FLAG_IN_TX);
+ /* No support for mispredict */
+ be->flags.mispred = ts->mispred_all;
+
+ if (bs->nr < ts->br_stack_sz)
+ bs->nr += 1;
+}
+
int thread_stack__event(struct thread *thread, int cpu, u32 flags, u64 from_ip,
- u64 to_ip, u16 insn_len, u64 trace_nr)
+ u64 to_ip, u16 insn_len, u64 trace_nr, bool callstack,
+ unsigned int br_stack_sz, bool mispred_all)
{
struct thread_stack *ts = thread__stack(thread, cpu);
return -EINVAL;
if (!ts) {
- ts = thread_stack__new(thread, cpu, NULL);
+ ts = thread_stack__new(thread, cpu, NULL, callstack, br_stack_sz);
if (!ts) {
pr_warning("Out of memory: no thread stack\n");
return -ENOMEM;
}
ts->trace_nr = trace_nr;
+ ts->mispred_all = mispred_all;
}
/*
ts->trace_nr = trace_nr;
}
- /* Stop here if thread_stack__process() is in use */
- if (ts->crp)
+ if (br_stack_sz)
+ thread_stack__update_br_stack(ts, flags, from_ip, to_ip);
+
+ /*
+ * Stop here if thread_stack__process() is in use, or not recording call
+ * stack.
+ */
+ if (ts->crp || !callstack)
return 0;
if (flags & PERF_IP_FLAG_CALL) {
{
__thread_stack__flush(thread, ts);
zfree(&ts->stack);
+ zfree(&ts->br_stack_rb);
}
static void thread_stack__reset(struct thread *thread, struct thread_stack *ts)
chain->nr = i;
}
+/*
+ * Hardware sample records, created some time after the event occurred, need to
+ * have subsequent addresses removed from the call chain.
+ */
+void thread_stack__sample_late(struct thread *thread, int cpu,
+ struct ip_callchain *chain, size_t sz,
+ u64 sample_ip, u64 kernel_start)
+{
+ struct thread_stack *ts = thread__stack(thread, cpu);
+ u64 sample_context = callchain_context(sample_ip, kernel_start);
+ u64 last_context, context, ip;
+ size_t nr = 0, j;
+
+ if (sz < 2) {
+ chain->nr = 0;
+ return;
+ }
+
+ if (!ts)
+ goto out;
+
+ /*
+ * When tracing kernel space, kernel addresses occur at the top of the
+ * call chain after the event occurred but before tracing stopped.
+ * Skip them.
+ */
+ for (j = 1; j <= ts->cnt; j++) {
+ ip = ts->stack[ts->cnt - j].ret_addr;
+ context = callchain_context(ip, kernel_start);
+ if (context == PERF_CONTEXT_USER ||
+ (context == sample_context && ip == sample_ip))
+ break;
+ }
+
+ last_context = sample_ip; /* Use sample_ip as an invalid context */
+
+ for (; nr < sz && j <= ts->cnt; nr++, j++) {
+ ip = ts->stack[ts->cnt - j].ret_addr;
+ context = callchain_context(ip, kernel_start);
+ if (context != last_context) {
+ if (nr >= sz - 1)
+ break;
+ chain->ips[nr++] = context;
+ last_context = context;
+ }
+ chain->ips[nr] = ip;
+ }
+out:
+ if (nr) {
+ chain->nr = nr;
+ } else {
+ chain->ips[0] = sample_context;
+ chain->ips[1] = sample_ip;
+ chain->nr = 2;
+ }
+}
+
+void thread_stack__br_sample(struct thread *thread, int cpu,
+ struct branch_stack *dst, unsigned int sz)
+{
+ struct thread_stack *ts = thread__stack(thread, cpu);
+ const size_t bsz = sizeof(struct branch_entry);
+ struct branch_stack *src;
+ struct branch_entry *be;
+ unsigned int nr;
+
+ dst->nr = 0;
+
+ if (!ts)
+ return;
+
+ src = ts->br_stack_rb;
+ if (!src->nr)
+ return;
+
+ dst->nr = min((unsigned int)src->nr, sz);
+
+ be = &dst->entries[0];
+ nr = min(ts->br_stack_sz - ts->br_stack_pos, (unsigned int)dst->nr);
+ memcpy(be, &src->entries[ts->br_stack_pos], bsz * nr);
+
+ if (src->nr >= ts->br_stack_sz) {
+ sz -= nr;
+ be = &dst->entries[nr];
+ nr = min(ts->br_stack_pos, sz);
+ memcpy(be, &src->entries[0], bsz * ts->br_stack_pos);
+ }
+}
+
+/* Start of user space branch entries */
+static bool us_start(struct branch_entry *be, u64 kernel_start, bool *start)
+{
+ if (!*start)
+ *start = be->to && be->to < kernel_start;
+
+ return *start;
+}
+
+/*
+ * Start of branch entries after the ip fell in between 2 branches, or user
+ * space branch entries.
+ */
+static bool ks_start(struct branch_entry *be, u64 sample_ip, u64 kernel_start,
+ bool *start, struct branch_entry *nb)
+{
+ if (!*start) {
+ *start = (nb && sample_ip >= be->to && sample_ip <= nb->from) ||
+ be->from < kernel_start ||
+ (be->to && be->to < kernel_start);
+ }
+
+ return *start;
+}
+
+/*
+ * Hardware sample records, created some time after the event occurred, need to
+ * have subsequent addresses removed from the branch stack.
+ */
+void thread_stack__br_sample_late(struct thread *thread, int cpu,
+ struct branch_stack *dst, unsigned int sz,
+ u64 ip, u64 kernel_start)
+{
+ struct thread_stack *ts = thread__stack(thread, cpu);
+ struct branch_entry *d, *s, *spos, *ssz;
+ struct branch_stack *src;
+ unsigned int nr = 0;
+ bool start = false;
+
+ dst->nr = 0;
+
+ if (!ts)
+ return;
+
+ src = ts->br_stack_rb;
+ if (!src->nr)
+ return;
+
+ spos = &src->entries[ts->br_stack_pos];
+ ssz = &src->entries[ts->br_stack_sz];
+
+ d = &dst->entries[0];
+ s = spos;
+
+ if (ip < kernel_start) {
+ /*
+ * User space sample: start copying branch entries when the
+ * branch is in user space.
+ */
+ for (s = spos; s < ssz && nr < sz; s++) {
+ if (us_start(s, kernel_start, &start)) {
+ *d++ = *s;
+ nr += 1;
+ }
+ }
+
+ if (src->nr >= ts->br_stack_sz) {
+ for (s = &src->entries[0]; s < spos && nr < sz; s++) {
+ if (us_start(s, kernel_start, &start)) {
+ *d++ = *s;
+ nr += 1;
+ }
+ }
+ }
+ } else {
+ struct branch_entry *nb = NULL;
+
+ /*
+ * Kernel space sample: start copying branch entries when the ip
+ * falls in between 2 branches (or the branch is in user space
+ * because then the start must have been missed).
+ */
+ for (s = spos; s < ssz && nr < sz; s++) {
+ if (ks_start(s, ip, kernel_start, &start, nb)) {
+ *d++ = *s;
+ nr += 1;
+ }
+ nb = s;
+ }
+
+ if (src->nr >= ts->br_stack_sz) {
+ for (s = &src->entries[0]; s < spos && nr < sz; s++) {
+ if (ks_start(s, ip, kernel_start, &start, nb)) {
+ *d++ = *s;
+ nr += 1;
+ }
+ nb = s;
+ }
+ }
+ }
+
+ dst->nr = nr;
+}
+
struct call_return_processor *
call_return_processor__new(int (*process)(struct call_return *cr, u64 *parent_db_id, void *data),
void *data)
}
if (!ts) {
- ts = thread_stack__new(thread, sample->cpu, crp);
+ ts = thread_stack__new(thread, sample->cpu, crp, true, 0);
if (!ts)
return -ENOMEM;
ts->comm = comm;
};
int thread_stack__event(struct thread *thread, int cpu, u32 flags, u64 from_ip,
- u64 to_ip, u16 insn_len, u64 trace_nr);
+ u64 to_ip, u16 insn_len, u64 trace_nr, bool callstack,
+ unsigned int br_stack_sz, bool mispred_all);
void thread_stack__set_trace_nr(struct thread *thread, int cpu, u64 trace_nr);
void thread_stack__sample(struct thread *thread, int cpu, struct ip_callchain *chain,
size_t sz, u64 ip, u64 kernel_start);
+void thread_stack__sample_late(struct thread *thread, int cpu,
+ struct ip_callchain *chain, size_t sz, u64 ip,
+ u64 kernel_start);
+void thread_stack__br_sample(struct thread *thread, int cpu,
+ struct branch_stack *dst, unsigned int sz);
+void thread_stack__br_sample_late(struct thread *thread, int cpu,
+ struct branch_stack *dst, unsigned int sz,
+ u64 sample_ip, u64 kernel_start);
int thread_stack__flush(struct thread *thread);
void thread_stack__free(struct thread *thread);
size_t thread_stack__depth(struct thread *thread, int cpu);
thread->tid = tid;
thread->ppid = -1;
thread->cpu = -1;
+ thread->lbr_stitch_enable = false;
INIT_LIST_HEAD(&thread->namespaces_list);
INIT_LIST_HEAD(&thread->comm_list);
init_rwsem(&thread->namespaces_lock);
exit_rwsem(&thread->namespaces_lock);
exit_rwsem(&thread->comm_lock);
+ thread__free_stitch_list(thread);
free(thread);
}
return dso__data_read_offset(al.map->dso, machine, offset, buf, len);
}
+
+void thread__free_stitch_list(struct thread *thread)
+{
+ struct lbr_stitch *lbr_stitch = thread->lbr_stitch;
+ struct stitch_list *pos, *tmp;
+
+ if (!lbr_stitch)
+ return;
+
+ list_for_each_entry_safe(pos, tmp, &lbr_stitch->lists, node) {
+ list_del_init(&pos->node);
+ free(pos);
+ }
+
+ list_for_each_entry_safe(pos, tmp, &lbr_stitch->free_lists, node) {
+ list_del_init(&pos->node);
+ free(pos);
+ }
+
+ zfree(&lbr_stitch->prev_lbr_cursor);
+ zfree(&thread->lbr_stitch);
+}
#include <strlist.h>
#include <intlist.h>
#include "rwsem.h"
+#include "event.h"
+#include "callchain.h"
struct addr_location;
struct map;
struct thread_stack;
struct unwind_libunwind_ops;
+struct lbr_stitch {
+ struct list_head lists;
+ struct list_head free_lists;
+ struct perf_sample prev_sample;
+ struct callchain_cursor_node *prev_lbr_cursor;
+};
+
struct thread {
union {
struct rb_node rb_node;
struct srccode_state srccode_state;
bool filter;
int filter_entry_depth;
+
+ /* LBR call stack stitch */
+ bool lbr_stitch_enable;
+ struct lbr_stitch *lbr_stitch;
};
struct machine;
return false;
}
+void thread__free_stitch_list(struct thread *thread);
+
#endif /* __PERF_THREAD_H */
opts->freq ? "Hz" : "");
}
- ret += SNPRINTF(bf + ret, size - ret, "%s", perf_evsel__name(top->sym_evsel));
+ ret += SNPRINTF(bf + ret, size - ret, "%s", evsel__name(top->sym_evsel));
ret += SNPRINTF(bf + ret, size - ret, "], ");
struct perf_top {
struct perf_tool tool;
- struct evlist *evlist;
+ struct evlist *evlist, *sb_evlist;
struct record_opts record_opts;
struct annotation_options annotation_opts;
struct evswitch evswitch;
bool use_tui, use_stdio;
bool vmlinux_warned;
bool dump_symtab;
+ bool stitch_lbr;
struct hist_entry *sym_filter_entry;
struct evsel *sym_evsel;
struct perf_session *session;
r = size > BUFSIZ ? BUFSIZ : size;
do_read(buf, r);
size -= r;
- };
+ }
}
static unsigned int read4(struct tep_handle *pevent)
bool perf_event_paranoid_check(int max_level)
{
return perf_cap__capable(CAP_SYS_ADMIN) ||
+ perf_cap__capable(CAP_PERFMON) ||
perf_event_paranoid() <= max_level;
}
suspend_console:
acpi_pm_prepare:
syscore_suspend:
-arch_enable_nonboot_cpus_end:
+arch_thaw_secondary_cpus_end:
syscore_resume:
acpi_pm_finish:
resume_console:
'suspend_console': {},
'acpi_pm_prepare': {},
'syscore_suspend': {},
- 'arch_enable_nonboot_cpus_end': {},
+ 'arch_thaw_secondary_cpus_end': {},
'syscore_resume': {},
'acpi_pm_finish': {},
'resume_console': {},
clean:
rm -f $(ALL_PROGRAMS)
rm -rf $(OUTPUT)include/
- find $(if $(OUTPUT),$(OUTPUT),.) -name '*.o' -delete -o -name '\.*.d' -delete
+ find $(if $(OUTPUT),$(OUTPUT),.) -name '*.o' -delete
+ find $(if $(OUTPUT),$(OUTPUT),.) -name '\.*.o.d' -delete
+ find $(if $(OUTPUT),$(OUTPUT),.) -name '\.*.o.cmd' -delete
install: $(ALL_PROGRAMS)
install -d -m 755 $(DESTDIR)$(bindir); \
.bits_per_word = bits,
};
- if (mode & SPI_TX_QUAD)
+ if (mode & SPI_TX_OCTAL)
+ tr.tx_nbits = 8;
+ else if (mode & SPI_TX_QUAD)
tr.tx_nbits = 4;
else if (mode & SPI_TX_DUAL)
tr.tx_nbits = 2;
- if (mode & SPI_RX_QUAD)
+ if (mode & SPI_RX_OCTAL)
+ tr.rx_nbits = 8;
+ else if (mode & SPI_RX_QUAD)
tr.rx_nbits = 4;
else if (mode & SPI_RX_DUAL)
tr.rx_nbits = 2;
if (!(mode & SPI_LOOP)) {
- if (mode & (SPI_TX_QUAD | SPI_TX_DUAL))
+ if (mode & (SPI_TX_OCTAL | SPI_TX_QUAD | SPI_TX_DUAL))
tr.rx_buf = 0;
- else if (mode & (SPI_RX_QUAD | SPI_RX_DUAL))
+ else if (mode & (SPI_RX_OCTAL | SPI_RX_QUAD | SPI_RX_DUAL))
tr.tx_buf = 0;
}
" -R --ready slave pulls low to pause\n"
" -2 --dual dual transfer\n"
" -4 --quad quad transfer\n"
+ " -8 --octal octal transfer\n"
" -S --size transfer size\n"
" -I --iter iterations\n");
exit(1);
{ "dual", 0, 0, '2' },
{ "verbose", 0, 0, 'v' },
{ "quad", 0, 0, '4' },
+ { "octal", 0, 0, '8' },
{ "size", 1, 0, 'S' },
{ "iter", 1, 0, 'I' },
{ NULL, 0, 0, 0 },
};
int c;
- c = getopt_long(argc, argv, "D:s:d:b:i:o:lHOLC3NR24p:vS:I:",
+ c = getopt_long(argc, argv, "D:s:d:b:i:o:lHOLC3NR248p:vS:I:",
lopts, NULL);
if (c == -1)
case '4':
mode |= SPI_TX_QUAD;
break;
+ case '8':
+ mode |= SPI_TX_OCTAL;
+ break;
case 'S':
transfer_size = atoi(optarg);
break;
mode |= SPI_RX_DUAL;
if (mode & SPI_TX_QUAD)
mode |= SPI_RX_QUAD;
+ if (mode & SPI_TX_OCTAL)
+ mode |= SPI_RX_OCTAL;
}
}
const size_t map_sz = roundup_page(sizeof(struct map_data));
const int zero = 0, one = 1, two = 2, far = 1500;
const long page_size = sysconf(_SC_PAGE_SIZE);
- int err, duration = 0, i, data_map_fd, data_map_id, tmp_fd;
+ int err, duration = 0, i, data_map_fd, data_map_id, tmp_fd, rdmap_fd;
struct bpf_map *data_map, *bss_map;
void *bss_mmaped = NULL, *map_mmaped = NULL, *tmp1, *tmp2;
struct test_mmap__bss *bss_data;
data_map = skel->maps.data_map;
data_map_fd = bpf_map__fd(data_map);
+ rdmap_fd = bpf_map__fd(skel->maps.rdonly_map);
+ tmp1 = mmap(NULL, 4096, PROT_READ | PROT_WRITE, MAP_SHARED, rdmap_fd, 0);
+ if (CHECK(tmp1 != MAP_FAILED, "rdonly_write_mmap", "unexpected success\n")) {
+ munmap(tmp1, 4096);
+ goto cleanup;
+ }
+ /* now double-check if it's mmap()'able at all */
+ tmp1 = mmap(NULL, 4096, PROT_READ, MAP_SHARED, rdmap_fd, 0);
+ if (CHECK(tmp1 == MAP_FAILED, "rdonly_read_mmap", "failed: %d\n", errno))
+ goto cleanup;
+
/* get map's ID */
memset(&map_info, 0, map_info_sz);
err = bpf_obj_get_info_by_fd(data_map_fd, &map_info, &map_info_sz);
munmap(tmp2, 4 * page_size);
+ /* map all 4 pages, but with pg_off=1 page, should fail */
+ tmp1 = mmap(NULL, 4 * page_size, PROT_READ, MAP_SHARED | MAP_FIXED,
+ data_map_fd, page_size /* initial page shift */);
+ if (CHECK(tmp1 != MAP_FAILED, "adv_mmap7", "unexpected success")) {
+ munmap(tmp1, 4 * page_size);
+ goto cleanup;
+ }
+
tmp1 = mmap(NULL, map_sz, PROT_READ, MAP_SHARED, data_map_fd, 0);
if (CHECK(tmp1 == MAP_FAILED, "last_mmap", "failed %d\n", errno))
goto cleanup;
char _license[] SEC("license") = "GPL";
+struct {
+ __uint(type, BPF_MAP_TYPE_ARRAY);
+ __uint(max_entries, 4096);
+ __uint(map_flags, BPF_F_MMAPABLE | BPF_F_RDONLY_PROG);
+ __type(key, __u32);
+ __type(value, char);
+} rdonly_map SEC(".maps");
+
struct {
__uint(type, BPF_MAP_TYPE_ARRAY);
__uint(max_entries, 512 * 4); /* at least 4 pages of data */
SEC("fentry/__set_task_comm")
int BPF_PROG(prog4, struct task_struct *tsk, const char *buf, bool exec)
{
- return !tsk;
+ return 0;
}
SEC("fexit/__set_task_comm")
int BPF_PROG(prog5, struct task_struct *tsk, const char *buf, bool exec)
{
- return !tsk;
+ return 0;
}
char _license[] SEC("license") = "GPL";
BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
BPF_LD_MAP_FD(BPF_REG_1, 0),
BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
- BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 9),
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 8),
/* r1 = [0x00, 0xff] */
BPF_LDX_MEM(BPF_B, BPF_REG_1, BPF_REG_0, 0),
BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, 0xffffff80 >> 1),
* [0xffff'ffff'0000'0080, 0xffff'ffff'ffff'ffff]
*/
BPF_ALU64_IMM(BPF_SUB, BPF_REG_1, 0xffffff80 >> 1),
- /* r1 = 0 or
- * [0x00ff'ffff'ff00'0000, 0x00ff'ffff'ffff'ffff]
- */
- BPF_ALU64_IMM(BPF_RSH, BPF_REG_1, 8),
/* error on OOB pointer computation */
BPF_ALU64_REG(BPF_ADD, BPF_REG_0, BPF_REG_1),
/* exit */
},
.fixup_map_hash_8b = { 3 },
/* not actually fully unbounded, but the bound is very high */
- .errstr = "value 72057594021150720 makes map_value pointer be out of bounds",
- .result = REJECT
+ .errstr_unpriv = "R1 has unknown scalar with mixed signed bounds, pointer arithmetic with it prohibited for !root",
+ .result_unpriv = REJECT,
+ .errstr = "value -4294967168 makes map_value pointer be out of bounds",
+ .result = REJECT,
},
{
"bounds check after truncation of boundary-crossing range (2)",
BPF_ALU64_IMM(BPF_ADD, BPF_REG_2, -8),
BPF_LD_MAP_FD(BPF_REG_1, 0),
BPF_RAW_INSN(BPF_JMP | BPF_CALL, 0, 0, 0, BPF_FUNC_map_lookup_elem),
- BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 9),
+ BPF_JMP_IMM(BPF_JEQ, BPF_REG_0, 0, 8),
/* r1 = [0x00, 0xff] */
BPF_LDX_MEM(BPF_B, BPF_REG_1, BPF_REG_0, 0),
BPF_ALU64_IMM(BPF_ADD, BPF_REG_1, 0xffffff80 >> 1),
* [0xffff'ffff'0000'0080, 0xffff'ffff'ffff'ffff]
*/
BPF_ALU64_IMM(BPF_SUB, BPF_REG_1, 0xffffff80 >> 1),
- /* r1 = 0 or
- * [0x00ff'ffff'ff00'0000, 0x00ff'ffff'ffff'ffff]
- */
- BPF_ALU64_IMM(BPF_RSH, BPF_REG_1, 8),
/* error on OOB pointer computation */
BPF_ALU64_REG(BPF_ADD, BPF_REG_0, BPF_REG_1),
/* exit */
},
.fixup_map_hash_8b = { 3 },
/* not actually fully unbounded, but the bound is very high */
- .errstr = "value 72057594021150720 makes map_value pointer be out of bounds",
- .result = REJECT
+ .errstr_unpriv = "R1 has unknown scalar with mixed signed bounds, pointer arithmetic with it prohibited for !root",
+ .result_unpriv = REJECT,
+ .errstr = "value -4294967168 makes map_value pointer be out of bounds",
+ .result = REJECT,
},
{
"bounds check after wrapping 32-bit addition",
},
.result = ACCEPT
},
+{
+ "assigning 32bit bounds to 64bit for wA = 0, wB = wA",
+ .insns = {
+ BPF_LDX_MEM(BPF_W, BPF_REG_8, BPF_REG_1,
+ offsetof(struct __sk_buff, data_end)),
+ BPF_LDX_MEM(BPF_W, BPF_REG_7, BPF_REG_1,
+ offsetof(struct __sk_buff, data)),
+ BPF_MOV32_IMM(BPF_REG_9, 0),
+ BPF_MOV32_REG(BPF_REG_2, BPF_REG_9),
+ BPF_MOV64_REG(BPF_REG_6, BPF_REG_7),
+ BPF_ALU64_REG(BPF_ADD, BPF_REG_6, BPF_REG_2),
+ BPF_MOV64_REG(BPF_REG_3, BPF_REG_6),
+ BPF_ALU64_IMM(BPF_ADD, BPF_REG_3, 8),
+ BPF_JMP_REG(BPF_JGT, BPF_REG_3, BPF_REG_8, 1),
+ BPF_LDX_MEM(BPF_W, BPF_REG_5, BPF_REG_6, 0),
+ BPF_MOV64_IMM(BPF_REG_0, 0),
+ BPF_EXIT_INSN(),
+ },
+ .prog_type = BPF_PROG_TYPE_SCHED_CLS,
+ .result = ACCEPT,
+ .flags = F_NEEDS_EFFICIENT_UNALIGNED_ACCESS,
+},
}
printf("Expected error checking passed\n");
+ ret = 0;
out:
if (dmabuf_fd >= 0)
close(dmabuf_fd);
local i
for ((i = 0; i < attempts; ++i)); do
- if $ARPING -c 1 -I $h1 -b 192.0.2.66 -q -w 0.1; then
+ if $ARPING -c 1 -I $h1 -b 192.0.2.66 -q -w 1; then
((passes++))
fi
local packets_t0
local packets_t1
+ RET=0
+
if [ $(devlink_trap_policers_num_get) -eq 0 ]; then
check_err 1 "Failed to dump policers"
fi
trap_policer_bind_test()
{
+ RET=0
+
devlink trap group set $DEVLINK_DEV group l2_drops policer 1
check_err $? "Failed to bind a valid policer"
if [ $(devlink_trap_group_policer_get "l2_drops") -ne 1 ]; then
exit_unsupported
}
-modprobe $MOD || unsup "$MOD module not available"
+unres() { #msg
+ reset_tracer
+ rmmod $MOD || true
+ echo $1
+ exit_unresolved
+}
+
+modprobe $MOD || unres "$MOD module not available"
rmmod $MOD
grep -q "preemptoff" available_tracers || unsup "preemptoff tracer not enabled"
TEST_GEN_PROGS_x86_64 += x86_64/vmx_set_nested_state_test
TEST_GEN_PROGS_x86_64 += x86_64/vmx_tsc_adjust_test
TEST_GEN_PROGS_x86_64 += x86_64/xss_msr_test
+TEST_GEN_PROGS_x86_64 += x86_64/debug_regs
TEST_GEN_PROGS_x86_64 += clear_dirty_log_test
TEST_GEN_PROGS_x86_64 += demand_paging_test
TEST_GEN_PROGS_x86_64 += dirty_log_test
void vcpu_run(struct kvm_vm *vm, uint32_t vcpuid);
int _vcpu_run(struct kvm_vm *vm, uint32_t vcpuid);
void vcpu_run_complete_io(struct kvm_vm *vm, uint32_t vcpuid);
+void vcpu_set_guest_debug(struct kvm_vm *vm, uint32_t vcpuid,
+ struct kvm_guest_debug *debug);
void vcpu_set_mp_state(struct kvm_vm *vm, uint32_t vcpuid,
struct kvm_mp_state *mp_state);
void vcpu_regs_get(struct kvm_vm *vm, uint32_t vcpuid, struct kvm_regs *regs);
ret, errno);
}
+void vcpu_set_guest_debug(struct kvm_vm *vm, uint32_t vcpuid,
+ struct kvm_guest_debug *debug)
+{
+ struct vcpu *vcpu = vcpu_find(vm, vcpuid);
+ int ret = ioctl(vcpu->fd, KVM_SET_GUEST_DEBUG, debug);
+
+ TEST_ASSERT(ret == 0, "KVM_SET_GUEST_DEBUG failed: %d", ret);
+}
+
/*
* VM VCPU Set MP State
*
--- /dev/null
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * KVM guest debug register tests
+ *
+ * Copyright (C) 2020, Red Hat, Inc.
+ */
+#include <stdio.h>
+#include <string.h>
+#include "kvm_util.h"
+#include "processor.h"
+
+#define VCPU_ID 0
+
+#define DR6_BD (1 << 13)
+#define DR7_GD (1 << 13)
+
+/* For testing data access debug BP */
+uint32_t guest_value;
+
+extern unsigned char sw_bp, hw_bp, write_data, ss_start, bd_start;
+
+static void guest_code(void)
+{
+ /*
+ * Software BP tests.
+ *
+ * NOTE: sw_bp need to be before the cmd here, because int3 is an
+ * exception rather than a normal trap for KVM_SET_GUEST_DEBUG (we
+ * capture it using the vcpu exception bitmap).
+ */
+ asm volatile("sw_bp: int3");
+
+ /* Hardware instruction BP test */
+ asm volatile("hw_bp: nop");
+
+ /* Hardware data BP test */
+ asm volatile("mov $1234,%%rax;\n\t"
+ "mov %%rax,%0;\n\t write_data:"
+ : "=m" (guest_value) : : "rax");
+
+ /* Single step test, covers 2 basic instructions and 2 emulated */
+ asm volatile("ss_start: "
+ "xor %%rax,%%rax\n\t"
+ "cpuid\n\t"
+ "movl $0x1a0,%%ecx\n\t"
+ "rdmsr\n\t"
+ : : : "rax", "ecx");
+
+ /* DR6.BD test */
+ asm volatile("bd_start: mov %%dr0, %%rax" : : : "rax");
+ GUEST_DONE();
+}
+
+#define CLEAR_DEBUG() memset(&debug, 0, sizeof(debug))
+#define APPLY_DEBUG() vcpu_set_guest_debug(vm, VCPU_ID, &debug)
+#define CAST_TO_RIP(v) ((unsigned long long)&(v))
+#define SET_RIP(v) do { \
+ vcpu_regs_get(vm, VCPU_ID, ®s); \
+ regs.rip = (v); \
+ vcpu_regs_set(vm, VCPU_ID, ®s); \
+ } while (0)
+#define MOVE_RIP(v) SET_RIP(regs.rip + (v));
+
+int main(void)
+{
+ struct kvm_guest_debug debug;
+ unsigned long long target_dr6, target_rip;
+ struct kvm_regs regs;
+ struct kvm_run *run;
+ struct kvm_vm *vm;
+ struct ucall uc;
+ uint64_t cmd;
+ int i;
+ /* Instruction lengths starting at ss_start */
+ int ss_size[4] = {
+ 3, /* xor */
+ 2, /* cpuid */
+ 5, /* mov */
+ 2, /* rdmsr */
+ };
+
+ if (!kvm_check_cap(KVM_CAP_SET_GUEST_DEBUG)) {
+ print_skip("KVM_CAP_SET_GUEST_DEBUG not supported");
+ return 0;
+ }
+
+ vm = vm_create_default(VCPU_ID, 0, guest_code);
+ vcpu_set_cpuid(vm, VCPU_ID, kvm_get_supported_cpuid());
+ run = vcpu_state(vm, VCPU_ID);
+
+ /* Test software BPs - int3 */
+ CLEAR_DEBUG();
+ debug.control = KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_SW_BP;
+ APPLY_DEBUG();
+ vcpu_run(vm, VCPU_ID);
+ TEST_ASSERT(run->exit_reason == KVM_EXIT_DEBUG &&
+ run->debug.arch.exception == BP_VECTOR &&
+ run->debug.arch.pc == CAST_TO_RIP(sw_bp),
+ "INT3: exit %d exception %d rip 0x%llx (should be 0x%llx)",
+ run->exit_reason, run->debug.arch.exception,
+ run->debug.arch.pc, CAST_TO_RIP(sw_bp));
+ MOVE_RIP(1);
+
+ /* Test instruction HW BP over DR[0-3] */
+ for (i = 0; i < 4; i++) {
+ CLEAR_DEBUG();
+ debug.control = KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_HW_BP;
+ debug.arch.debugreg[i] = CAST_TO_RIP(hw_bp);
+ debug.arch.debugreg[7] = 0x400 | (1UL << (2*i+1));
+ APPLY_DEBUG();
+ vcpu_run(vm, VCPU_ID);
+ target_dr6 = 0xffff0ff0 | (1UL << i);
+ TEST_ASSERT(run->exit_reason == KVM_EXIT_DEBUG &&
+ run->debug.arch.exception == DB_VECTOR &&
+ run->debug.arch.pc == CAST_TO_RIP(hw_bp) &&
+ run->debug.arch.dr6 == target_dr6,
+ "INS_HW_BP (DR%d): exit %d exception %d rip 0x%llx "
+ "(should be 0x%llx) dr6 0x%llx (should be 0x%llx)",
+ i, run->exit_reason, run->debug.arch.exception,
+ run->debug.arch.pc, CAST_TO_RIP(hw_bp),
+ run->debug.arch.dr6, target_dr6);
+ }
+ /* Skip "nop" */
+ MOVE_RIP(1);
+
+ /* Test data access HW BP over DR[0-3] */
+ for (i = 0; i < 4; i++) {
+ CLEAR_DEBUG();
+ debug.control = KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_HW_BP;
+ debug.arch.debugreg[i] = CAST_TO_RIP(guest_value);
+ debug.arch.debugreg[7] = 0x00000400 | (1UL << (2*i+1)) |
+ (0x000d0000UL << (4*i));
+ APPLY_DEBUG();
+ vcpu_run(vm, VCPU_ID);
+ target_dr6 = 0xffff0ff0 | (1UL << i);
+ TEST_ASSERT(run->exit_reason == KVM_EXIT_DEBUG &&
+ run->debug.arch.exception == DB_VECTOR &&
+ run->debug.arch.pc == CAST_TO_RIP(write_data) &&
+ run->debug.arch.dr6 == target_dr6,
+ "DATA_HW_BP (DR%d): exit %d exception %d rip 0x%llx "
+ "(should be 0x%llx) dr6 0x%llx (should be 0x%llx)",
+ i, run->exit_reason, run->debug.arch.exception,
+ run->debug.arch.pc, CAST_TO_RIP(write_data),
+ run->debug.arch.dr6, target_dr6);
+ /* Rollback the 4-bytes "mov" */
+ MOVE_RIP(-7);
+ }
+ /* Skip the 4-bytes "mov" */
+ MOVE_RIP(7);
+
+ /* Test single step */
+ target_rip = CAST_TO_RIP(ss_start);
+ target_dr6 = 0xffff4ff0ULL;
+ vcpu_regs_get(vm, VCPU_ID, ®s);
+ for (i = 0; i < (sizeof(ss_size) / sizeof(ss_size[0])); i++) {
+ target_rip += ss_size[i];
+ CLEAR_DEBUG();
+ debug.control = KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_SINGLESTEP;
+ debug.arch.debugreg[7] = 0x00000400;
+ APPLY_DEBUG();
+ vcpu_run(vm, VCPU_ID);
+ TEST_ASSERT(run->exit_reason == KVM_EXIT_DEBUG &&
+ run->debug.arch.exception == DB_VECTOR &&
+ run->debug.arch.pc == target_rip &&
+ run->debug.arch.dr6 == target_dr6,
+ "SINGLE_STEP[%d]: exit %d exception %d rip 0x%llx "
+ "(should be 0x%llx) dr6 0x%llx (should be 0x%llx)",
+ i, run->exit_reason, run->debug.arch.exception,
+ run->debug.arch.pc, target_rip, run->debug.arch.dr6,
+ target_dr6);
+ }
+
+ /* Finally test global disable */
+ CLEAR_DEBUG();
+ debug.control = KVM_GUESTDBG_ENABLE | KVM_GUESTDBG_USE_HW_BP;
+ debug.arch.debugreg[7] = 0x400 | DR7_GD;
+ APPLY_DEBUG();
+ vcpu_run(vm, VCPU_ID);
+ target_dr6 = 0xffff0ff0 | DR6_BD;
+ TEST_ASSERT(run->exit_reason == KVM_EXIT_DEBUG &&
+ run->debug.arch.exception == DB_VECTOR &&
+ run->debug.arch.pc == CAST_TO_RIP(bd_start) &&
+ run->debug.arch.dr6 == target_dr6,
+ "DR7.GD: exit %d exception %d rip 0x%llx "
+ "(should be 0x%llx) dr6 0x%llx (should be 0x%llx)",
+ run->exit_reason, run->debug.arch.exception,
+ run->debug.arch.pc, target_rip, run->debug.arch.dr6,
+ target_dr6);
+
+ /* Disable all debug controls, run to the end */
+ CLEAR_DEBUG();
+ APPLY_DEBUG();
+
+ vcpu_run(vm, VCPU_ID);
+ TEST_ASSERT(run->exit_reason == KVM_EXIT_IO, "KVM_EXIT_IO");
+ cmd = get_ucall(vm, VCPU_ID, &uc);
+ TEST_ASSERT(cmd == UCALL_DONE, "UCALL_DONE");
+
+ kvm_vm_free(vm);
+
+ return 0;
+}
# Figure out which test to run from our script name.
test=$(basename $0 .sh)
# Look up details about the test from master list of LKDTM tests.
-line=$(egrep '^#?'"$test"'\b' tests.txt)
+line=$(grep -E '^#?'"$test"'\b' tests.txt)
if [ -z "$line" ]; then
echo "Skipped: missing test '$test' in tests.txt"
exit $KSELFTEST_SKIP_TEST
fi
# Check that the test is known to LKDTM.
-if ! egrep -q '^'"$test"'$' "$TRIGGER" ; then
+if ! grep -E -q '^'"$test"'$' "$TRIGGER" ; then
echo "Skipped: test '$test' missing in $TRIGGER!"
exit $KSELFTEST_SKIP_TEST
fi
expect="call trace:"
fi
-# Clear out dmesg for output reporting
-dmesg -c >/dev/null
-
# Prepare log for report checking
-LOG=$(mktemp --tmpdir -t lkdtm-XXXXXX)
+LOG=$(mktemp --tmpdir -t lkdtm-log-XXXXXX)
+DMESG=$(mktemp --tmpdir -t lkdtm-dmesg-XXXXXX)
cleanup() {
- rm -f "$LOG"
+ rm -f "$LOG" "$DMESG"
}
trap cleanup EXIT
+# Save existing dmesg so we can detect new content below
+dmesg > "$DMESG"
+
# Most shells yell about signals and we're expecting the "cat" process
# to usually be killed by the kernel. So we have to run it in a sub-shell
# and silence errors.
($SHELL -c 'cat <(echo '"$test"') >'"$TRIGGER" 2>/dev/null) || true
# Record and dump the results
-dmesg -c >"$LOG"
+dmesg | diff --changed-group-format='%>' --unchanged-group-format='' "$DMESG" - > "$LOG" || true
+
cat "$LOG"
# Check for expected output
-if egrep -qi "$expect" "$LOG" ; then
+if grep -E -qi "$expect" "$LOG" ; then
echo "$test: saw '$expect': ok"
exit 0
else
- if egrep -qi XFAIL: "$LOG" ; then
+ if grep -E -qi XFAIL: "$LOG" ; then
echo "$test: saw 'XFAIL': [SKIP]"
exit $KSELFTEST_SKIP_TEST
else
cleanup()
{
- rm -f $out
+ rm -f $err
ip netns del $ns1
}
#define __stack_aligned__ __attribute__((aligned(16)))
struct cr_clone_arg {
char stack[128] __stack_aligned__;
- char stack_ptr[0];
+ char stack_ptr[];
};
static int child(void *args)
. ./common_tests
prlog -n "Checking pstore console is registered ... "
-dmesg | grep -q "console \[pstore"
+dmesg | grep -Eq "console \[(pstore|${backend})"
show_result $?
prlog -n "Checking /dev/pmsg0 exists ... "
--- /dev/null
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0+
+#
+# If this was a KCSAN run, collapse the reports in the various console.log
+# files onto pairs of functions.
+#
+# Usage: kcsan-collapse.sh resultsdir
+#
+# Copyright (C) 2020 Facebook, Inc.
+#
+# Authors: Paul E. McKenney <paulmck@kernel.org>
+
+if test -z "$TORTURE_KCONFIG_KCSAN_ARG"
+then
+ exit 0
+fi
+cat $1/*/console.log |
+ grep "BUG: KCSAN: " |
+ sed -e 's/^\[[^]]*] //' |
+ sort |
+ uniq -c |
+ sort -k1nr > $1/kcsan.sum
title="$title ($ngpsps/s)"
fi
echo $title $stopstate $fwdprog
- nclosecalls=`grep --binary-files=text 'torture: Reader Batch' $i/console.log | tail -1 | awk '{for (i=NF-8;i<=NF;i++) sum+=$i; } END {print sum}'`
+ nclosecalls=`grep --binary-files=text 'torture: Reader Batch' $i/console.log | tail -1 | \
+ awk -v sum=0 '
+ {
+ for (i = 0; i <= NF; i++) {
+ sum += $i;
+ if ($i ~ /Batch:/) {
+ sum = 0;
+ i = i + 2;
+ }
+ }
+ }
+
+ END {
+ print sum
+ }'`
if test -z "$nclosecalls"
then
exit 0
fi
fi
done
+ if test -f "$rd/kcsan.sum"
+ then
+ if test -s "$rd/kcsan.sum"
+ then
+ echo KCSAN summary in $rd/kcsan.sum
+ else
+ echo Clean KCSAN run in $rd
+ fi
+ fi
done
EDITOR=echo kvm-find-errors.sh "${@: -1}" > $T 2>&1
ret=$?
fi
echo ' ---' `date`: Starting build
echo ' ---' Kconfig fragment at: $config_template >> $resdir/log
-touch $resdir/ConfigFragment.input $resdir/ConfigFragment
-if test -r "$config_dir/CFcommon"
-then
- echo " --- $config_dir/CFcommon" >> $resdir/ConfigFragment.input
- cat < $config_dir/CFcommon >> $resdir/ConfigFragment.input
- config_override.sh $config_dir/CFcommon $config_template > $T/Kc1
- grep '#CHECK#' $config_dir/CFcommon >> $resdir/ConfigFragment
-else
- cp $config_template $T/Kc1
-fi
-echo " --- $config_template" >> $resdir/ConfigFragment.input
-cat $config_template >> $resdir/ConfigFragment.input
-grep '#CHECK#' $config_template >> $resdir/ConfigFragment
-if test -n "$TORTURE_KCONFIG_ARG"
-then
- echo $TORTURE_KCONFIG_ARG | tr -s " " "\012" > $T/cmdline
- echo " --- --kconfig argument" >> $resdir/ConfigFragment.input
- cat $T/cmdline >> $resdir/ConfigFragment.input
- config_override.sh $T/Kc1 $T/cmdline > $T/Kc2
- # Note that "#CHECK#" is not permitted on commandline.
-else
- cp $T/Kc1 $T/Kc2
-fi
-cat $T/Kc2 >> $resdir/ConfigFragment
+touch $resdir/ConfigFragment.input
+
+# Combine additional Kconfig options into an existing set such that
+# newer options win. The first argument is the Kconfig source ID, the
+# second the to-be-updated file within $T, and the third and final the
+# list of additional Kconfig options. Note that a $2.tmp file is
+# created when doing the update.
+config_override_param () {
+ if test -n "$3"
+ then
+ echo $3 | sed -e 's/^ *//' -e 's/ *$//' | tr -s " " "\012" > $T/Kconfig_args
+ echo " --- $1" >> $resdir/ConfigFragment.input
+ cat $T/Kconfig_args >> $resdir/ConfigFragment.input
+ config_override.sh $T/$2 $T/Kconfig_args > $T/$2.tmp
+ mv $T/$2.tmp $T/$2
+ # Note that "#CHECK#" is not permitted on commandline.
+ fi
+}
+
+echo > $T/KcList
+config_override_param "$config_dir/CFcommon" KcList "`cat $config_dir/CFcommon 2> /dev/null`"
+config_override_param "$config_template" KcList "`cat $config_template 2> /dev/null`"
+config_override_param "--kasan options" KcList "$TORTURE_KCONFIG_KASAN_ARG"
+config_override_param "--kcsan options" KcList "$TORTURE_KCONFIG_KCSAN_ARG"
+config_override_param "--kconfig argument" KcList "$TORTURE_KCONFIG_ARG"
+cp $T/KcList $resdir/ConfigFragment
base_resdir=`echo $resdir | sed -e 's/\.[0-9]\+$//'`
if test "$base_resdir" != "$resdir" -a -f $base_resdir/bzImage -a -f $base_resdir/vmlinux
ln -s $base_resdir/.config $resdir # for kvm-recheck.sh
# Arch-independent indicator
touch $resdir/builtkernel
-elif kvm-build.sh $T/Kc2 $resdir
+elif kvm-build.sh $T/KcList $resdir
then
# Had to build a kernel for this test.
QEMU="`identify_qemu vmlinux`"
TORTURE_BOOT_IMAGE=""
TORTURE_INITRD="$KVM/initrd"; export TORTURE_INITRD
TORTURE_KCONFIG_ARG=""
+TORTURE_KCONFIG_KASAN_ARG=""
+TORTURE_KCONFIG_KCSAN_ARG=""
TORTURE_KMAKE_ARG=""
TORTURE_QEMU_MEM=512
TORTURE_SHUTDOWN_GRACE=180
TORTURE_KCONFIG_ARG="$2"
shift
;;
+ --kasan)
+ TORTURE_KCONFIG_KASAN_ARG="CONFIG_DEBUG_INFO=y CONFIG_KASAN=y"; export TORTURE_KCONFIG_KASAN_ARG
+ ;;
+ --kcsan)
+ TORTURE_KCONFIG_KCSAN_ARG="CONFIG_DEBUG_INFO=y CONFIG_KCSAN=y CONFIG_KCSAN_ASSUME_PLAIN_WRITES_ATOMIC=n CONFIG_KCSAN_REPORT_VALUE_CHANGE_ONLY=n CONFIG_KCSAN_REPORT_ONCE_IN_MS=100000 CONFIG_KCSAN_VERBOSE=y CONFIG_KCSAN_INTERRUPT_WATCHER=y"; export TORTURE_KCONFIG_KCSAN_ARG
+ ;;
--kmake-arg)
checkarg --kmake-arg "(kernel make arguments)" $# "$2" '.*' '^error$'
TORTURE_KMAKE_ARG="$2"
TORTURE_DEFCONFIG="$TORTURE_DEFCONFIG"; export TORTURE_DEFCONFIG
TORTURE_INITRD="$TORTURE_INITRD"; export TORTURE_INITRD
TORTURE_KCONFIG_ARG="$TORTURE_KCONFIG_ARG"; export TORTURE_KCONFIG_ARG
+TORTURE_KCONFIG_KASAN_ARG="$TORTURE_KCONFIG_KASAN_ARG"; export TORTURE_KCONFIG_KASAN_ARG
+TORTURE_KCONFIG_KCSAN_ARG="$TORTURE_KCONFIG_KCSAN_ARG"; export TORTURE_KCONFIG_KCSAN_ARG
TORTURE_KMAKE_ARG="$TORTURE_KMAKE_ARG"; export TORTURE_KMAKE_ARG
TORTURE_QEMU_CMD="$TORTURE_QEMU_CMD"; export TORTURE_QEMU_CMD
TORTURE_QEMU_INTERACTIVE="$TORTURE_QEMU_INTERACTIVE"; export TORTURE_QEMU_INTERACTIVE
echo
echo " --- `date` Test summary:"
echo Results directory: $resdir/$ds
+kcsan-collapse.sh $resdir/$ds
kvm-recheck.sh $resdir/$ds
___EOF___
TASKS01
TASKS02
TASKS03
+RUDE01
+TRACE01
+TRACE02
--- /dev/null
+CONFIG_SMP=y
+CONFIG_NR_CPUS=2
+CONFIG_HOTPLUG_CPU=y
+CONFIG_PREEMPT_NONE=n
+CONFIG_PREEMPT_VOLUNTARY=n
+CONFIG_PREEMPT=y
+CONFIG_DEBUG_LOCK_ALLOC=y
+CONFIG_PROVE_LOCKING=y
+#CHECK#CONFIG_PROVE_RCU=y
+CONFIG_RCU_EXPERT=y
--- /dev/null
+rcutorture.torture_type=tasks-rude
--- /dev/null
+CONFIG_SMP=y
+CONFIG_NR_CPUS=4
+CONFIG_HOTPLUG_CPU=y
+CONFIG_PREEMPT_NONE=y
+CONFIG_PREEMPT_VOLUNTARY=n
+CONFIG_PREEMPT=n
+CONFIG_DEBUG_LOCK_ALLOC=y
+CONFIG_PROVE_LOCKING=y
+#CHECK#CONFIG_PROVE_RCU=y
+CONFIG_TASKS_TRACE_RCU_READ_MB=y
+CONFIG_RCU_EXPERT=y
--- /dev/null
+rcutorture.torture_type=tasks-tracing
--- /dev/null
+CONFIG_SMP=y
+CONFIG_NR_CPUS=4
+CONFIG_HOTPLUG_CPU=y
+CONFIG_PREEMPT_NONE=n
+CONFIG_PREEMPT_VOLUNTARY=n
+CONFIG_PREEMPT=y
+CONFIG_DEBUG_LOCK_ALLOC=n
+CONFIG_PROVE_LOCKING=n
+#CHECK#CONFIG_PROVE_RCU=n
+CONFIG_TASKS_TRACE_RCU_READ_MB=n
+CONFIG_RCU_EXPERT=y
--- /dev/null
+rcutorture.torture_type=tasks-tracing
CONFIG_SMP=y
-CONFIG_NR_CPUS=100
+CONFIG_NR_CPUS=56
CONFIG_PREEMPT_NONE=y
CONFIG_PREEMPT_VOLUNTARY=n
CONFIG_PREEMPT=n
--- /dev/null
+[
+ {
+ "id": "83be",
+ "name": "Create FQ-PIE with invalid number of flows",
+ "category": [
+ "qdisc",
+ "fq_pie"
+ ],
+ "setup": [
+ "$IP link add dev $DUMMY type dummy || /bin/true"
+ ],
+ "cmdUnderTest": "$TC qdisc add dev $DUMMY root fq_pie flows 65536",
+ "expExitCode": "2",
+ "verifyCmd": "$TC qdisc show dev $DUMMY",
+ "matchPattern": "qdisc",
+ "matchCount": "0",
+ "teardown": [
+ "$IP link del dev $DUMMY"
+ ]
+ }
+]
thuge-gen
compaction_test
mlock2-tests
+mremap_dontunmap
on-fault-limit
transhuge-stress
userfaultfd
int write = 0;
int reserve = 1;
- unsigned long i;
-
if (signal(SIGINT, sig_handler) == SIG_ERR)
err(1, "\ncan't catch SIGINT\n");
$(eval $(call tar_download,MUSL,musl,1.2.0,.tar.gz,https://musl.libc.org/releases/,c6de7b191139142d3f9a7b5b702c9cae1b5ee6e7f57e582da9328629408fd4e8))
$(eval $(call tar_download,IPERF,iperf,3.7,.tar.gz,https://downloads.es.net/pub/iperf/,d846040224317caf2f75c843d309a950a7db23f9b44b94688ccbe557d6d1710c))
$(eval $(call tar_download,BASH,bash,5.0,.tar.gz,https://ftp.gnu.org/gnu/bash/,b4a80f2ac66170b2913efbfb9f2594f1f76c7b1afd11f799e22035d63077fb4d))
-$(eval $(call tar_download,IPROUTE2,iproute2,5.4.0,.tar.xz,https://www.kernel.org/pub/linux/utils/net/iproute2/,fe97aa60a0d4c5ac830be18937e18dc3400ca713a33a89ad896ff1e3d46086ae))
+$(eval $(call tar_download,IPROUTE2,iproute2,5.6.0,.tar.xz,https://www.kernel.org/pub/linux/utils/net/iproute2/,1b5b0e25ce6e23da7526ea1da044e814ad85ba761b10dd29c2b027c056b04692))
$(eval $(call tar_download,IPTABLES,iptables,1.8.4,.tar.bz2,https://www.netfilter.org/projects/iptables/files/,993a3a5490a544c2cbf2ef15cf7e7ed21af1845baf228318d5c36ef8827e157c))
$(eval $(call tar_download,NMAP,nmap,7.80,.tar.bz2,https://nmap.org/dist/,fcfa5a0e42099e12e4bf7a68ebe6fde05553383a682e816a7ec9256ab4773faa))
$(eval $(call tar_download,IPUTILS,iputils,s20190709,.tar.gz,https://github.com/iputils/iputils/archive/s20190709.tar.gz/#,a15720dd741d7538dd2645f9f516d193636ae4300ff7dbc8bfca757bf166490a))
}
bool kvm_make_vcpus_request_mask(struct kvm *kvm, unsigned int req,
+ struct kvm_vcpu *except,
unsigned long *vcpu_bitmap, cpumask_var_t tmp)
{
int i, cpu, me;
me = get_cpu();
kvm_for_each_vcpu(i, vcpu, kvm) {
- if (vcpu_bitmap && !test_bit(i, vcpu_bitmap))
+ if ((vcpu_bitmap && !test_bit(i, vcpu_bitmap)) ||
+ vcpu == except)
continue;
kvm_make_request(req, vcpu);
return called;
}
-bool kvm_make_all_cpus_request(struct kvm *kvm, unsigned int req)
+bool kvm_make_all_cpus_request_except(struct kvm *kvm, unsigned int req,
+ struct kvm_vcpu *except)
{
cpumask_var_t cpus;
bool called;
zalloc_cpumask_var(&cpus, GFP_ATOMIC);
- called = kvm_make_vcpus_request_mask(kvm, req, NULL, cpus);
+ called = kvm_make_vcpus_request_mask(kvm, req, except, NULL, cpus);
free_cpumask_var(cpus);
return called;
}
+bool kvm_make_all_cpus_request(struct kvm *kvm, unsigned int req)
+{
+ return kvm_make_all_cpus_request_except(kvm, req, NULL);
+}
+
#ifndef CONFIG_HAVE_KVM_ARCH_TLB_FLUSH_ALL
void kvm_flush_remote_tlbs(struct kvm *kvm)
{