Peter Maydell [Wed, 13 Mar 2019 14:44:28 +0000 (14:44 +0000)]
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches:
- file-posix: Make auto-read-only dynamic
- Add x-blockdev-reopen QMP command
- Finalize block-latency-histogram QMP command
- gluster: Build fixes for newer lib version
# gpg: Signature made Tue 12 Mar 2019 19:30:31 GMT
# gpg: using RSA key 7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>" [full]
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream: (28 commits)
qemu-iotests: Test the x-blockdev-reopen QMP command
block: Add an 'x-blockdev-reopen' QMP command
block: Remove the AioContext parameter from bdrv_reopen_multiple()
block: Add bdrv_reset_options_allowed()
block: Add a 'mutable_opts' field to BlockDriver
block: Allow changing the backing file on reopen
block: Allow omitting the 'backing' option in certain cases
block: Handle child references in bdrv_reopen_queue()
block: Add 'keep_old_opts' parameter to bdrv_reopen_queue()
block: Freeze the backing chain for the duration of the stream job
block: Freeze the backing chain for the duration of the mirror job
block: Freeze the backing chain for the duration of the commit job
block: Allow freezing BdrvChild links
nvme: fix write zeroes offset and count
file-posix: Make auto-read-only dynamic
file-posix: Prepare permission code for fd switching
file-posix: Lock new fd in raw_reopen_prepare()
file-posix: Store BDRVRawState.reopen_state during reopen
file-posix: Factor out raw_reconfigure_getfd()
file-posix: Fix bdrv_open_flags() for snapshot=on
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Wed, 13 Mar 2019 13:09:38 +0000 (13:09 +0000)]
Merge remote-tracking branch 'remotes/rth/tags/pull-dt-20190312' into staging
Break out documentation to docs/devel/.
Add support for pattern groups.
Other misc cleanups for multiple decode functions.
# gpg: Signature made Tue 12 Mar 2019 16:59:37 GMT
# gpg: using RSA key 64DF38E8AF7E215F
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [full]
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A 05C0 64DF 38E8 AF7E 215F
* remotes/rth/tags/pull-dt-20190312:
decodetree: Properly diagnose fields overflowing an insn
decodetree: Prefix extract function names with decode_function
decodetree: Allow +- to begin a number initializing a field
decodetree: Produce clean output for an empty input file
decodetree: Add --static-decode option
test/decode: Add tests for PatternGroups
decodetree: Allow grouping of overlapping patterns
decodetree: Do not unconditionaly return from Pattern.output_code
decodetree: Ensure build_tree does not include values outside insnmask
decodetree: Document the usefulness of argument sets
decodetree: Move documentation to docs/devel/decodetree.rst
MAINTAINERS: Add scripts/decodetree.py to the TCG section
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* remotes/huth-gitlab/tags/pull-request-2019-03-12:
scripts/qemugdb: re-license timers.py to GPLv2 or later
hw/sd/sdhci: Move PCI-related code into a separate file
ahci-test: Drop dependence on global_qtest
tests: test-announce-self: fix memory leak
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Alberto Garcia [Tue, 12 Mar 2019 16:48:51 +0000 (18:48 +0200)]
block: Add an 'x-blockdev-reopen' QMP command
This command allows reopening an arbitrary BlockDriverState with a
new set of options. Some options (e.g node-name) cannot be changed
and some block drivers don't allow reopening, but otherwise this
command is modelled after 'blockdev-add' and the state of the reopened
BlockDriverState should generally be the same as if it had just been
added by 'blockdev-add' with the same set of options.
One notable exception is the 'backing' option: 'x-blockdev-reopen'
requires that it is always present unless the BlockDriverState in
question doesn't have a current or default backing file.
This command allows reconfiguring the graph by using the appropriate
options to change the children of a node. At the moment it's possible
to change a backing file by setting the 'backing' option to the name
of the new node, but it should also be possible to add a similar
functionality to other block drivers (e.g. Quorum, blkverify).
Although the API is unlikely to change, this command is marked
experimental for the time being so there's room to see if the
semantics need changes.
Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Alberto Garcia [Tue, 12 Mar 2019 16:48:49 +0000 (18:48 +0200)]
block: Add bdrv_reset_options_allowed()
bdrv_reopen_prepare() receives a BDRVReopenState with (among other
things) a new set of options to be applied to that BlockDriverState.
If an option is missing then it means that we want to reset it to its
default value rather than keep the previous one. This way the state
of the block device after being reopened is comparable to that of a
device added with "blockdev-add" using the same set of options.
Not all options from all drivers can be changed this way, however.
If the user attempts to reset an immutable option to its default value
using this method then we must forbid it.
This new function takes a BlockDriverState and a new set of options
and checks if there's any option that was previously set but is
missing from the new set of options.
If the option is present in both sets we don't need to check that they
have the same value. The loop at the end of bdrv_reopen_prepare()
already takes care of that.
Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Alberto Garcia [Tue, 12 Mar 2019 16:48:48 +0000 (18:48 +0200)]
block: Add a 'mutable_opts' field to BlockDriver
If we reopen a BlockDriverState and there is an option that is present
in bs->options but missing from the new set of options then we have to
return an error unless the driver is able to reset it to its default
value.
This patch adds a new 'mutable_opts' field to BlockDriver. This is
a list of runtime options that can be modified during reopen. If an
option in this list is unspecified on reopen then it must be reset (or
return an error).
Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Alberto Garcia [Tue, 12 Mar 2019 16:48:47 +0000 (18:48 +0200)]
block: Allow changing the backing file on reopen
This patch allows the user to change the backing file of an image that
is being reopened. Here's what it does:
- In bdrv_reopen_prepare(): check that the value of 'backing' points
to an existing node or is null. If it points to an existing node it
also needs to make sure that replacing the backing file will not
create a cycle in the node graph (i.e. you cannot reach the parent
from the new backing file).
- In bdrv_reopen_commit(): perform the actual node replacement by
calling bdrv_set_backing_hd().
There may be temporary implicit nodes between a BDS and its backing
file (e.g. a commit filter node). In these cases bdrv_reopen_prepare()
looks for the real (non-implicit) backing file and requires that the
'backing' option points to it. Replacing or detaching a backing file
is forbidden if there are implicit nodes in the middle.
Although x-blockdev-reopen is meant to be used like blockdev-add,
there's an important thing that must be taken into account: the only
way to set a new backing file is by using a reference to an existing
node (previously added with e.g. blockdev-add). If 'backing' contains
a dictionary with a new set of options ({"driver": "qcow2", "file": {
... }}) then it is interpreted that the _existing_ backing file must
be reopened with those options.
Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Alberto Garcia [Tue, 12 Mar 2019 16:48:46 +0000 (18:48 +0200)]
block: Allow omitting the 'backing' option in certain cases
Of all options of type BlockdevRef used to specify children in
BlockdevOptions, 'backing' is the only one that is optional.
For "x-blockdev-reopen" we want that if an option is omitted then it
must be reset to its default value. The default value of 'backing'
means that QEMU opens the backing file specified in the image
metadata, but this is not something that we want to support for the
reopen operation.
Because of this the 'backing' option has to be specified during
reopen, pointing to the existing backing file if we want to keep it,
or pointing to a different one (or NULL) if we want to replace it (to
be implemented in a subsequent patch).
In order to simplify things a bit and not to require that the user
passes the 'backing' option to every single block device even when
it's clearly not necessary, this patch allows omitting this option if
the block device being reopened doesn't have a backing file attached
_and_ no default backing file is specified in the image metadata.
Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Alberto Garcia [Tue, 12 Mar 2019 16:48:45 +0000 (18:48 +0200)]
block: Handle child references in bdrv_reopen_queue()
Children in QMP are specified with BlockdevRef / BlockdevRefOrNull,
which can contain a set of child options, a child reference, or
NULL. In optional attributes like "backing" it can also be missing.
Only the first case (set of child options) is being handled properly
by bdrv_reopen_queue(). This patch deals with all the others.
Here's how these cases should be handled when bdrv_reopen_queue() is
deciding what to do with each child of a BlockDriverState:
1) Set of child options: if the child was implicitly created (i.e
inherits_from points to the parent) then the options are removed
from the parent's options QDict and are passed to the child with
a recursive bdrv_reopen_queue() call. This case was already
working fine.
2) Child reference: there's two possibilites here.
2a) Reference to the current child: if the child was implicitly
created then it is put in the reopen queue, keeping its
current set of options (since this was a child reference
there was no way to specify a different set of options).
If the child is not implicit then it keeps its current set
of options but it is not reopened (and therefore does not
inherit any new option from the parent).
2b) Reference to a different BDS: the current child is not put
in the reopen queue at all. Passing a reference to a
different BDS can be used to replace a child, although at
the moment no driver implements this, so it results in an
error. In any case, the current child is not going to be
reopened (and might in fact disappear if it's replaced)
3) NULL: This is similar to (2b). Although no driver allows this
yet it can be used to detach the current child so it should not
be put in the reopen queue.
4) Missing option: at the moment "backing" is the only case where
this can happen. With "blockdev-add", leaving "backing" out
means that the default backing file is opened. We don't want to
open a new image during reopen, so we require that "backing" is
always present. We'll relax this requirement a bit in the next
patch. If keep_old_opts is true and "backing" is missing then
this behaves like 2a (the current child is reopened).
Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Alberto Garcia [Tue, 12 Mar 2019 16:48:44 +0000 (18:48 +0200)]
block: Add 'keep_old_opts' parameter to bdrv_reopen_queue()
The bdrv_reopen_queue() function is used to create a queue with
the BDSs that are going to be reopened and their new options. Once
the queue is ready bdrv_reopen_multiple() is called to perform the
operation.
The original options from each one of the BDSs are kept, with the new
options passed to bdrv_reopen_queue() applied on top of them.
For "x-blockdev-reopen" we want a function that behaves much like
"blockdev-add". We want to ignore the previous set of options so that
only the ones actually specified by the user are applied, with the
rest having their default values.
One of the things that we need is a way to tell bdrv_reopen_queue()
whether we want to keep the old set of options or not, and that's what
this patch does. All current callers are setting this new parameter to
true and x-blockdev-reopen will set it to false.
Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Alberto Garcia [Tue, 12 Mar 2019 16:48:40 +0000 (18:48 +0200)]
block: Allow freezing BdrvChild links
Our permission system is useful to define what operations are allowed
on a certain block node and includes things like BLK_PERM_WRITE or
BLK_PERM_RESIZE among others.
One of the permissions is BLK_PERM_GRAPH_MOD which allows "changing
the node that this BdrvChild points to". The exact meaning of this has
never been very clear, but it can be understood as "change any of the
links connected to the node". This can be used to prevent changing a
backing link, but it's too coarse.
This patch adds a new 'frozen' attribute to BdrvChild, which forbids
detaching the link from the node it points to, and new API to freeze
and unfreeze a backing chain.
After this change a few functions can fail, so they need additional
checks.
Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Keith Busch [Mon, 11 Mar 2019 15:11:53 +0000 (09:11 -0600)]
nvme: fix write zeroes offset and count
The implementation used blocks units rather than the expected bytes.
Fixes: c03e7ef12a9 ("nvme: Implement Write Zeroes") Reported-by: Ming Lei <ming.lei@redhat.com> Signed-off-by: Keith Busch <keith.busch@intel.com> Reviewed-by: Christoph Hellwig <hch@lst.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Kevin Wolf [Fri, 1 Mar 2019 21:15:11 +0000 (22:15 +0100)]
file-posix: Make auto-read-only dynamic
Until now, with auto-read-only=on we tried to open the file read-write
first and if that failed, read-only was tried. This is actually not good
enough for libvirt, which gives QEMU SELinux permissions for read-write
only as soon as it actually intends to write to the image. So we need to
be able to switch between read-only and read-write at runtime.
This patch makes auto-read-only dynamic, i.e. the file is opened
read-only as long as no user of the node has requested write
permissions, but it is automatically reopened read-write as soon as the
first writer is attached. Conversely, if the last writer goes away, the
file is reopened read-only again.
bs->read_only is no longer set for auto-read-only=on files even if the
file descriptor is opened read-only because it will be transparently
upgraded as soon as a writer is attached. This changes the output of
qemu-iotests 232.
Kevin Wolf [Fri, 8 Mar 2019 14:40:40 +0000 (15:40 +0100)]
file-posix: Prepare permission code for fd switching
In order to be able to dynamically reopen the file read-only or
read-write, depending on the users that are attached, we need to be able
to switch to a different file descriptor during the permission change.
This interacts with reopen, which also creates a new file descriptor and
performs permission changes internally. In this case, the permission
change code must reuse the reopen file descriptor instead of creating a
third one.
In turn, reopen can drop its code to copy file locks to the new file
descriptor because that is now done when applying the new permissions.
Kevin Wolf [Fri, 1 Mar 2019 21:48:45 +0000 (22:48 +0100)]
file-posix: Lock new fd in raw_reopen_prepare()
There is no reason why we can take locks on the new file descriptor only
in raw_reopen_commit() where error handling isn't possible any more.
Instead, we can already do this in raw_reopen_prepare().
Kevin Wolf [Mon, 11 Mar 2019 15:13:16 +0000 (16:13 +0100)]
file-posix: Fix bdrv_open_flags() for snapshot=on
Using a different read-only setting for bs->open_flags than for the
flags to the driver's open function is just inconsistent and a bad idea.
After this patch, the temporary snapshot keeps being opened read-only if
read-only=on,snapshot=on is passed.
If we wanted to change this behaviour to make only the orginal image
file read-only, but the temporary overlay read-write (as the comment in
the removed code suggests), that change would have to be made in
bdrv_temp_snapshot_options() (where the comment suggests otherwise).
Addressing this inconsistency before introducing dynamic auto-read-only
is important because otherwise we would immediately try to reopen the
temporary overlay even though the file is already unlinked.
Kevin Wolf [Tue, 5 Mar 2019 16:18:22 +0000 (17:18 +0100)]
block: Make permission changes in reopen less wrong
The way that reopen interacts with permission changes has one big
problem: Both operations are recursive, and the permissions are changes
for each node in the reopen queue.
For a simple graph that consists just of parent and child,
.bdrv_check_perm will be called twice for the child, once recursively
when adjusting the permissions of parent, and once again when the child
itself is reopened.
Even worse, the first .bdrv_check_perm call happens before
.bdrv_reopen_prepare was called for the child and the second one is
called afterwards.
Making sure that .bdrv_check_perm (and the other permission callbacks)
are called only once is hard. We can cope with multiple calls right now,
but as soon as file-posix gets a dynamic auto-read-only that may need to
open a new file descriptor, we get the additional requirement that all
of them are after the .bdrv_reopen_prepare call.
So reorder things in bdrv_reopen_multiple() to first call
.bdrv_reopen_prepare for all involved nodes and only then adjust
permissions.
Kevin Wolf [Tue, 5 Mar 2019 13:14:52 +0000 (14:14 +0100)]
tests/virtio-blk-test: Disable auto-read-only
tests/virtio-blk-test uses a temporary image file that it deletes while
QEMU is still running, so it can't be reopened when writers are
attached or detached. Disable auto-read-only to keep it always writable.
Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>
This adds one test that supposed to succeed to test deep nesting
of pattern groups which is rarely exercised by targets using decode
tree. The remaining tests exercise various fail conditions.
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Message-Id: <20190227120217.20794-1-kbastian@mail.uni-paderborn.de> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
MAINTAINERS: Add scripts/decodetree.py to the TCG section
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20181110211313.6922-2-f4bug@amsat.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
* remotes/kraxel/tags/audio-20190312-pull-request:
audio: -audiodev command line option: cleanup
wavaudio: port to -audiodev config
spiceaudio: port to -audiodev config
sdlaudio: port to -audiodev config
paaudio: port to -audiodev config
ossaudio: port to -audiodev config
noaudio: port to -audiodev config
dsoundaudio: port to -audiodev config
coreaudio: port to -audiodev config
alsaaudio: port to -audiodev config
audio: -audiodev command line option basic implementation
audio: -audiodev command line option: documentation
audio: use qapi AudioFormat instead of audfmt_e
qapi: qapi for audio backends
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
# Conflicts:
# qemu-deprecated.texi
Sven Schnelle [Mon, 11 Mar 2019 19:16:01 +0000 (20:16 +0100)]
target/hppa: exit TB if either Data or Instruction TLB changes
The current code assumes that we don't need to exit the TB
if a Data Cache Flush or Insert has happend. However, as we
have a shared Data/Instruction TLB, a Data cache flush also
flushes Instruction TLB entries, and a Data cache TLB insert
might also evict a Instruction TLB entry.
So exit the TB in all cases if Instruction translation is enabled.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-Id: <20190311191602.25796-11-svens@stackframe.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Sven Schnelle [Mon, 11 Mar 2019 19:16:00 +0000 (20:16 +0100)]
target/hppa: add TLB protection id check
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-Id: <20190311191602.25796-10-svens@stackframe.org>
[rth: Add required tlb flushing when prot id registers change.] Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Sven Schnelle [Mon, 11 Mar 2019 19:15:59 +0000 (20:15 +0100)]
target/hppa: allow multiple itlbp without itlba
The ODE software calls itlbp on existing TLB entries without
calling itlba first, so this seems to be valid.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-Id: <20190311191602.25796-9-svens@stackframe.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Sven Schnelle [Mon, 11 Mar 2019 19:15:58 +0000 (20:15 +0100)]
target/hppa: fix b,gate instruction
b,gate does GR[t] ← cat(GR[t]{0..29},IAOQ_Front{30..31});
instead of saving the link address to register t.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-Id: <20190311191602.25796-8-svens@stackframe.org>
[rth: Move link check outside of ifndef CONFIG_USER_ONLY;
use ctx->privilege; nullify the insn earlier.] Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Sven Schnelle [Mon, 11 Mar 2019 19:15:57 +0000 (20:15 +0100)]
target/hppa: ignore DIAG opcode
DIAG is usually only used by diagnostics software as it's CPU
specific. In most of the cases it's better to ignore it and log
a message that it's not implemented.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-Id: <20190311191602.25796-7-svens@stackframe.org>
[rth: Free the nullify condition.] Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Sven Schnelle [Mon, 11 Mar 2019 19:15:56 +0000 (20:15 +0100)]
target/hppa: remove PSW I/R/Q bit check
HP ODE use rfi to set the Q bit, and i don't see anything in the
documentation that this is forbidden. So remove it.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-Id: <20190311191602.25796-6-svens@stackframe.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Sven Schnelle [Mon, 11 Mar 2019 19:15:55 +0000 (20:15 +0100)]
target/hppa: add TLB trace events
To ease TLB debugging add a few trace events, which are disabled
by default so that there's no performance impact.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-Id: <20190311191602.25796-5-svens@stackframe.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Sven Schnelle [Mon, 11 Mar 2019 19:15:54 +0000 (20:15 +0100)]
target/hppa: report ITLB_EXCP_MISS for ITLB misses
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-Id: <20190311191602.25796-4-svens@stackframe.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This will purge the whole TLB and add an entry for page 0. However
the current TLB implementation in helper_iitlba() will store to
the last empty TLB entry, while helper_iitlbp() will write to the
first empty entry. That is because an empty entry will match address
0 in helper_iitlba()
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-Id: <20190311191602.25796-3-svens@stackframe.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Sven Schnelle [Mon, 11 Mar 2019 19:15:52 +0000 (20:15 +0100)]
target/hppa: fix overwriting source reg in addb
When one of the source registers is the same as the destination register,
the source register gets overwritten with the destionation value before
do_add_sv() is called, which leads to unexpection condition matches.
Signed-off-by: Sven Schnelle <svens@stackframe.org>
Message-Id: <20190311191602.25796-2-svens@stackframe.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
# gpg: Signature made Tue 12 Mar 2019 00:57:41 GMT
# gpg: using RSA key 2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>" [full]
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6
* remotes/ehabkost/tags/machine-next-pull-request:
memfd: improve error messages
memfd: set up correct errno if not supported
memfd: always check for MFD_CLOEXEC
hostmem-memfd: disable for systems without sealing support
machine: Move nvdimms state into struct MachineState
nvdimm: Rename AcpiNVDIMMState into NVDIMMState
hostmem-file: reject invalid pmem file sizes
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Niels de Vos [Tue, 5 Mar 2019 15:46:34 +0000 (16:46 +0100)]
gluster: the glfs_io_cbk callback function pointer adds pre/post stat args
The glfs_*_async() functions do a callback once finished. This callback
has changed its arguments, pre- and post-stat structures have been
added. This makes it possible to improve caching, which is useful for
Samba and NFS-Ganesha, but not so much for QEMU. Gluster 6 is the first
release that includes these new arguments.
With an additional detection in ./configure, the new arguments can
conditionally get included in the glfs_io_cbk handler.
Signed-off-by: Niels de Vos <ndevos@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
New versions of Glusters libgfapi.so have an updated glfs_ftruncate()
function that returns additional 'struct stat' structures to enable
advanced caching of attributes. This is useful for file servers, not so
much for QEMU. Nevertheless, the API has changed and needs to be
adopted.
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com> Signed-off-by: Niels de Vos <ndevos@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>
* remotes/vivier2/tags/trivial-branch-pull-request:
hw/nvram/fw_cfg: Use the ldst API
hw/arm/virt: Remove null-check in virt_build_smbios()
hw/i386: Remove unused include
hw/nvram/fw_cfg: Remove the unnecessary boot_splash_filedata_size
thunk: improve readability of allocation loop
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Peter Maydell [Tue, 12 Mar 2019 10:15:00 +0000 (10:15 +0000)]
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-4.0-20190312' into staging
ppc patch queue for 2019-03-10
This pull requests supersedes ppc-for-4.0-20190310. Changes are:
* Fixed a bunch of minor style problems
* Suppressed warnings about Spectre/Meltdown mitigations with TCG
* Added one more patch, a preliminary fix towards the not-quite-ready
support for NVLink VFIO passthrough.
This is a final pull request before the 4.0 soft freeze. Changes
include:
* A Great Renaming to use camel case properly in spapr code
* Optimization of some vector instructions
* Support for POWER9 cpus in the powernv machine
* Fixes a regression from the last pull request in handling VSX
instructions with mixed operands from the FPR and VMX parts of the
register array
* Optimization hack to avoid scanning all the (empty) entries on a
new IOMMU window
* Add FSL I2C controller model for E500
* Support for KVM acceleration of the H_PAGE_INIT hypercall on spapr
* Update u-boot image for E500
* Enable Specre/Meltdown mitigations by default on the new machine type
* Enable large decrementer support for POWER9
* remotes/dgibson/tags/ppc-for-4.0-20190312: (62 commits)
vfio: Make vfio_get_region_info_cap public
Suppress test warnings about missing Spectre/Meltdown mitigations with TCG
spapr: Use CamelCase properly
target/ppc: Optimize x[sv]xsigdp using deposit_i64()
target/ppc: Optimize xviexpdp() using deposit_i64()
target/ppc: add HV support for POWER9
ppc/pnv: add a "ibm,opal/power-mgt" device tree node on POWER9
ppc/pnv: add more dummy XSCOM addresses
ppc/pnv: activate XSCOM tests for POWER9
ppc/pnv: POWER9 XSCOM quad support
ppc/pnv: extend XSCOM core support for POWER9
ppc/pnv: add a OCC model for POWER9
ppc/pnv: add a OCC model class
ppc/pnv: add SerIRQ routing registers
ppc/pnv: add a LPC Controller model for POWER9
ppc/pnv: add a 'dt_isa_nodename' to the chip
ppc/pnv: add a LPC Controller class model
ppc/pnv: lpc: fix OPB address ranges
ppc/pnv: add a PSI bridge model for POWER9
ppc/pnv: add a PSI bridge class model
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Thomas Huth [Wed, 20 Feb 2019 11:19:14 +0000 (12:19 +0100)]
hw/sd/sdhci: Move PCI-related code into a separate file
Some machines have an SDHCI device, but no PCI. To be able to
compile hw/sd/sdhci.c without CONFIG_PCI, we must not call functions
like pci_get_address_space() and pci_allocate_irq() there. Thus
move the PCI-related code into a separate file.
This is required for the new Kconfig-like build system, e.g. it is
needed if a user wants to compile a QEMU binary with just one machine
that has SDHCI, but no PCI, like the ARM "raspi" machines for example.
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
Eric Blake [Mon, 11 Sep 2017 17:20:02 +0000 (12:20 -0500)]
ahci-test: Drop dependence on global_qtest
Managing parallel connections to two different monitors via
the implicit global_qtest makes it hard to copy-and-paste code
to tests that are not aware of the implicit state; the
management of global_qtest is even harder to follow because
it was masked behind set_context().
Instead, explicitly pass QTestState* around (generally, by
reusing the member already present in ahci->parent QOSState),
and call explicit qtest_* functions on all places that
interact with a monitor.
We can assert that the conversion is correct by checking that
global_qtest remains NULL throughout the test (a later patch
that changes global_qtest to not be a public global variable
will drop the assertions).
Signed-off-by: Eric Blake <eblake@redhat.com> Acked-by: John Snow <jsnow@redhat.com>
[thuth: rebased patch to current master branch] Signed-off-by: Thomas Huth <thuth@redhat.com>
Li Qiang [Mon, 11 Mar 2019 14:14:57 +0000 (07:14 -0700)]
tests: test-announce-self: fix memory leak
Spotted by ASAN while running 'make check'.
Fixes: 4b9b7000 ("tests: Add a test for qemu self announcements") Suggested-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Li Qiang <liq3ea@163.com> Signed-off-by: Thomas Huth <thuth@redhat.com>
David Gibson [Tue, 12 Mar 2019 05:07:14 +0000 (16:07 +1100)]
Suppress test warnings about missing Spectre/Meltdown mitigations with TCG
The new pseries-4.0 machine type defaults to enabling Spectre/Meltdown
mitigations. Unfortunately those mitigations aren't implemented for TCG
because we're not yet sure if they're necessary or how to implement them.
We don't fail fatally, but we do warn in this case, because it is quite
plausible that Spectre/Meltdown can be exploited through TCG (at least for
the guest to get access to the qemu address space).
This create noise in our testcases though. So, modify the affected tests
to explicitly disable the mitigations to suppress these warnings.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
David Gibson [Wed, 6 Mar 2019 04:35:37 +0000 (15:35 +1100)]
spapr: Use CamelCase properly
The qemu coding standard is to use CamelCase for type and structure names,
and the pseries code follows that... sort of. There are quite a lot of
places where we bend the rules in order to preserve the capitalization of
internal acronyms like "PHB", "TCE", "DIMM" and most commonly "sPAPR".
That was a bad idea - it frequently leads to names ending up with hard to
read clusters of capital letters, and means they don't catch the eye as
type identifiers, which is kind of the point of the CamelCase convention in
the first place.
In short, keeping type identifiers look like CamelCase is more important
than preserving standard capitalization of internal "words". So, this
patch renames a heap of spapr internal type names to a more standard
CamelCase.
In addition to case changes, we also make some other identifier renames:
VIOsPAPR* -> SpaprVio*
The reverse word ordering was only ever used to mitigate the capital
cluster, so revert to the natural ordering.
VIOsPAPRVTYDevice -> SpaprVioVty
VIOsPAPRVLANDevice -> SpaprVioVlan
Brevity, since the "Device" didn't add useful information
sPAPRDRConnector -> SpaprDrc
sPAPRDRConnectorClass -> SpaprDrcClass
Brevity, and makes it clearer this is the same thing as a "DRC"
mentioned in many other places in the code
This is 100% a mechanical search-and-replace patch. It will, however,
conflict with essentially any and all outstanding patches touching the
spapr code.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
target/ppc: Optimize x[sv]xsigdp using deposit_i64()
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20190309214255.9952-3-f4bug@amsat.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
target/ppc: Optimize xviexpdp() using deposit_i64()
The t0 tcg_temp register is now unused, remove it.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20190309214255.9952-2-f4bug@amsat.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The POWER9 processor does not support per-core frequency control. The
cores are arranged in groups of four, along with their respective L2
and L3 caches, into a structure known as a Quad. The frequency must be
managed at the Quad level.
Provide a basic Quad model to fake the settings done by the firmware
on the Non-Cacheable Unit (NCU). Each core pair (EX) needs a special
BAR setting for the TIMA area of XIVE because it resides on the same
address on all chips.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190307223548.20516-12-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Provide a new class attribute to define XSCOM operations per CPU
family and add a couple of XSCOM addresses controlling the power
management states of the core on POWER9.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190307223548.20516-11-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
To ease the introduction of the OCC model for POWER9, provide a new
class attributes to define XSCOM operations per CPU family and a PSI
IRQ number.
Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190307223548.20516-9-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The LPC Controller on POWER9 is very similar to the one found on
POWER8 but accesses are now done via on MMIOs, without the XSCOM and
ECCB logic. The device tree is populated differently so we add a
specific POWER9 routine for the purpose.
SerIRQ routing is yet to be done.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190307223548.20516-7-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The ISA bus has a different DT nodename on POWER9. Compute the name
when the PnvChip is realized, that is before it is used by the machine
to populate the device tree with the ISA devices.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190307223548.20516-6-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
It will ease the introduction of the LPC Controller model for POWER9.
Signed-off-by: Cédric Le Goater <clg@kaod.org> Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20190307223548.20516-5-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The PowerNV LPC Controller exposes different sets of registers for
each of the functional units it encompasses, among which the OPB
(On-Chip Peripheral Bus) Master and Arbitrer and the LPC HOST
Controller.
The mapping addresses of each register range are correct but the sizes
are too large. Fix the sizes and define the OPB Arbitrer range to fill
the gap between the OPB Master registers and the LPC HOST Controller
registers.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190307223548.20516-4-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The PSI bridge on POWER9 is very similar to POWER8. The BAR is still
set through XSCOM but the controls are now entirely done with MMIOs.
More interrupts are defined and the interrupt controller interface has
changed to XIVE. The POWER9 model is a first example of the usage of
the notify() handler of the XiveNotifier interface, linking the PSI
XiveSource to its owning device model.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190307223548.20516-3-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
To ease the introduction of the PSI bridge model for POWER9, abstract
the POWER chip differences in a PnvPsi class model and introduce a
specific Pnv8Psi type for POWER8. POWER8 interface to the interrupt
controller is still XICS whereas POWER9 uses the new XIVE model.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190307223548.20516-2-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
mac_newworld: use node name instead of alias name for hd device in FWPathProvider
When using -drive to configure the hd drive for the New World machine, the node
name "disk" should be used instead of the "hd" alias.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20190307212058.4890-3-mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
mac_oldworld: use node name instead of alias name for hd device in FWPathProvider
When using -drive to configure the hd drive for the Old World machine, the node
name "disk" should be used instead of the "hd" alias.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20190307212058.4890-2-mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
target/ppc: introduce vsr64_offset() to simplify get_cpu_vsr{l,h}() and set_cpu_vsr{l,h}()
Now that all VSX registers are stored in host endian order, there is no need
to go via different accessors depending upon the register number. Instead we
introduce vsr64_offset() and use it directly from within get_cpu_vsr{l,h}() and
set_cpu_vsr{l,h}().
This also allows us to rewrite avr64_offset() and fpr_offset() in terms of the
new vsr64_offset() function to more clearly express the relationship between the
VSX, FPR and VMX registers, and also remove vsrl_offset() which is no longer
required.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20190307180520.13868-8-mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
target/ppc: switch fpr/vsrl registers so all VSX registers are in host endian order
When VSX support was initially added, the fpr registers were added at
offset 0 of the VSR register and the vsrl registers were added at offset
1. This is in contrast to the VMX registers (the last 32 VSX registers) which
are stored in host-endian order.
Switch the fpr/vsrl registers so that the lower 32 VSX registers are now also
stored in host endian order to match the VMX registers. This ensures that TCG
vector operations involving mixed VMX and VSX registers will function
correctly.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20190307180520.13868-7-mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
target/ppc: improve avr64_offset() and use it to simplify get_avr64()/set_avr64()
By using the VsrD macro in avr64_offset() the same offset calculation can be
used regardless of the host endian. This allows get_avr64() and set_avr64() to
be simplified accordingly.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20190307180520.13868-6-mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
All TCG vector operations require pointers to the base address of the vector
rather than separate access to the top and bottom 64-bits. Convert the VMX TCG
instructions to use a new avr_full_offset() function instead of avr64_offset()
which can then itself be written as a simple wrapper onto vsr_full_offset().
This same function can also reused in cpu_avr_ptr() to avoid having more than
one copy of the offset calculation logic.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20190307180520.13868-5-mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
target/ppc: move Vsr* macros from internal.h to cpu.h
It isn't possible to include internal.h from cpu.h so move the Vsr* macros
into cpu.h alongside the other VMX/VSX register access functions.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20190307180520.13868-4-mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
target/ppc: introduce single vsrl_offset() function
Instead of having multiple copies of the offset calculation logic, move it to a
single vsrl_offset() function.
This commit also renames the existing get_vsr()/set_vsr() functions to
get_vsrl()/set_vsrl() which better describes their purpose.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20190307180520.13868-3-mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
target/ppc: introduce single fpr_offset() function
Instead of having multiple copies of the offset calculation logic, move it to a
single fpr_offset() function.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk> Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20190307180520.13868-2-mark.cave-ayland@ilande.co.uk> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
spapr_iommu: Do not replay mappings from just created DMA window
On sPAPR vfio_listener_region_add() is called in 2 situations:
1. a new listener is registered from vfio_connect_container();
2. a new IOMMU Memory Region is added from rtas_ibm_create_pe_dma_window().
In both cases vfio_listener_region_add() calls
memory_region_iommu_replay() to notify newly registered IOMMU notifiers
about existing mappings which is totally desirable for case 1.
However for case 2 it is nothing but noop as the window has just been
created and has no valid mappings so replaying those does not do anything.
It is barely noticeable with usual guests but if the window happens to be
really big, such no-op replay might take minutes and trigger RCU stall
warnings in the guest.
For example, a upcoming GPU RAM memory region mapped at 64TiB (right
after SPAPR_PCI_LIMIT) causes a 64bit DMA window to be at least 128TiB
which is (128<<40)/0x10000=2.147.483.648 TCEs to replay.
This mitigates the problem by adding an "skipping_replay" flag to
sPAPRTCETable and defining sPAPR own IOMMU MR replay() hook which does
exactly the same thing as the generic one except it returns early if
@skipping_replay==true.
Another way of fixing this would be delaying replay till the very first
H_PUT_TCE but this does not work if in-kernel H_PUT_TCE handler is
enabled (a likely case).
When "ibm,create-pe-dma-window" is complete, the guest will map only
required regions of the huge DMA window.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20190307050518.64968-2-aik@ozlabs.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The NSR register of the HV ring has a different, although similar, bit
layout. TM_QW3_NSR_HE_PHYS bit should now be raised when the
Hypervisor interrupt line is signaled. Other bits TM_QW3_NSR_HE_POOL
and TM_QW3_NSR_HE_LSI are not modeled. LSI are for special interrupts
reserved for HW bringup and the POOL bit is used when signaling a
group of VPs. This is not currently implemented in Linux but it is in
pHyp.
The most important special commands on the HV TIMA page are added to
let the core manage interrupts : acking and changing the CPU priority.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-Id: <20190306085032.15744-10-clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>