Drop the old SysBus init function and use instance_init
Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Tested-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
surface_bits_per_pixel() always returns 32
so, removing other dead code which is
based on DEPTH !== 32
Signed-off-by: Pooja Dhannawat <dhannawatpooja1@gmail.com>
Message-id: 1459260142-9144-1-git-send-email-dhannawatpooja1@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Removing support for DEPTH != 32 from blizzard template header
and file that includes it, as macro DEPTH == 32 only used.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Pooja Dhannawat <dhannawatpooja1@gmail.com>
Message-id: 1458971873-2768-1-git-send-email-dhannawatpooja1@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The semantics of the list visit are somewhat baroque, with the
following pseudocode when FooList is used:
start()
for (prev = head; cur = next(prev); prev = &cur) {
visit(&cur->value)
}
Note that these semantics (advance before visit) requires that
the first call to next() return the list head, while all other
calls return the next element of the list; that is, every visitor
implementation is required to track extra state to decide whether
to return the input as-is, or to advance. It also requires an
argument of 'GenericList **' to next(), solely because the first
iteration might need to modify the caller's GenericList head, so
that all other calls have to do a layer of dereferencing.
Thankfully, we only have two uses of list visits in the entire
code base: one in spapr_drc (which completely avoids
visit_next_list(), feeding in integers from a different source
than uint8List), and one in qapi-visit.py. That is, all other
list visitors are generated in qapi-visit.c, and share the same
paradigm based on a qapi FooList type, so we can refactor how
lists are laid out with minimal churn among clients.
We can greatly simplify things by hoisting the special case
into the start() routine, and flipping the order in the loop
to visit before advance:
start(head)
for (tail = *head; tail; tail = next(tail)) {
visit(&tail->value)
}
With the simpler semantics, visitors have less state to track,
the argument to next() is reduced to 'GenericList *', and it
also becomes obvious whether an input visitor is allocating a
FooList during visit_start_list() (rather than the old way of
not knowing if an allocation happened until the first
visit_next_list()). As a minor drawback, we now allocate in
two functions instead of one, and have to pass the size to
both functions (unless we were to tweak the input visitors to
cache the size to start_list for reuse during next_list, but
that defeats the goal of less visitor state).
The signature of visit_start_list() is chosen to match
visit_start_struct(), with the new parameters after 'name'.
The spapr_drc case is a virtual visit, done by passing NULL for
list, similarly to how NULL is passed to visit_start_struct()
when a qapi type is not used in those visits. It was easy to
provide these semantics for qmp-output and dealloc visitors,
and a bit harder for qmp-input (several prerequisite patches
refactored things to make this patch straightforward). But it
turned out that the string and opts visitors munge enough other
state during visit_next_list() to make it easier to just
document and require a GenericList visit for now; an assertion
will remind us to adjust things if we need the semantics in the
future.
Several pre-requisite cleanup patches made the reshuffling of
the various visitors easier; particularly the qmp input visitor.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1461879932-9020-24-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
As mentioned in previous patches, we want to call visit_end_struct()
functions unconditionally, so that visitors can release resources
tied up since the matching visit_start_struct() without also having
to worry about error priority if more than one error occurs.
Even though error_propagate() can be safely used to ignore a second
error during cleanup caused by a first error, it is simpler if the
cleanup cannot set an error. So, split out the error checking
portion (basically, input visitors checking for unvisited keys) into
a new function visit_check_struct(), which can be safely skipped if
any earlier errors are encountered, and leave the cleanup portion
(which never fails, but must be called unconditionally if
visit_start_struct() succeeded) in visit_end_struct().
Generated code in qapi-visit.c has diffs resembling:
|@@ -59,10 +59,12 @@ void visit_type_ACPIOSTInfo(Visitor *v,
| goto out_obj;
| }
| visit_type_ACPIOSTInfo_members(v, obj, &err);
|- error_propagate(errp, err);
|- err = NULL;
|+ if (err) {
|+ goto out_obj;
|+ }
|+ visit_check_struct(v, &err);
| out_obj:
|- visit_end_struct(v, &err);
|+ visit_end_struct(v);
| out:
and in qapi-event.c:
@@ -47,7 +47,10 @@ void qapi_event_send_acpi_device_ost(ACP
| goto out;
| }
| visit_type_q_obj_ACPI_DEVICE_OST_arg_members(v, ¶m, &err);
|- visit_end_struct(v, err ? NULL : &err);
|+ if (!err) {
|+ visit_check_struct(v, &err);
|+ }
|+ visit_end_struct(v);
| if (err) {
| goto out;
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1461879932-9020-20-git-send-email-eblake@redhat.com>
[Conflict with a doc fixup resolved]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Now that the QMP output visitor supports an explicit null
output, we should utilize it to make it easier to diagnose
the difference between a missing fdt ('null') vs. a
present-but-empty one ('{}').
(Note that this reverts the behavior of commit ab8bf1d, taking
us back to the behavior of commit 6c2f9a1 [which in turn
stemmed from a crash fix in 1d10b44]; but that this time,
the change is intentional and not an accidental side-effect.)
Signed-off-by: Eric Blake <eblake@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <1461879932-9020-17-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
This is a hack to support compilation with Mingw-w64 which provides
a libusb-1.0 package, but no poll.h.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Message-id: 1458630800-10088-1-git-send-email-sw@weilnetz.de
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
If an application uses libmtp on the guest system,
it will complain with the warning message:
LIBMTP WARNING: VendorExtensionID: ffffffff
LIBMTP WARNING: VendorExtensionDesc: (null)
LIBMTP WARNING: this typically means the device is PTP (i.e. a camera) but
not a MTP device at all. Trying to continue anyway.
This is because libmtp expects a MTP Vendor Extension ID of 0x00000006 and a
MTP Version of 0x0064. These numbers are taken from Microsoft's MTP Vendor
Extension Identification Message page and are what most physical devices
show.
Signed-off-by: Isaac Lozano <109lozanoi@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1460892593-5908-1-git-send-email-109lozanoi@gmail.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
This patch is a rough fix to a memory corruption we are observing when
running VMs with xhci USB controller and OVMF firmware.
Specifically, on the following call chain
xhci_reset
xhci_disable_slot
xhci_disable_ep
xhci_set_ep_state
QEMU overwrites guest memory using stale guest addresses.
This doesn't happen when the guest (firmware) driver sets up xhci for
the first time as there are no slots configured yet. However when the
firmware hands over the control to the OS some slots and endpoints are
already set up with their context in the guest RAM. Now the OS' driver
resets the controller again and xhci_set_ep_state then reads and writes
that memory which is now owned by the OS.
As a quick fix, skip calling xhci_set_ep_state in xhci_disable_ep if the
device context base address array pointer is zero (indicating we're in
the HC reset and no DMA is possible).
Cc: qemu-stable@nongnu.org
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-id: 1462384435-1034-1-git-send-email-rkagan@virtuozzo.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Call vbe_update_vgaregs() when the guest touches GFX, SEQ or CRT
registers, to make sure the vga registers will always have the
values needed by vbe mode. This makes sure the sanity checks
applied by vbe_fixup_regs() are effective.
Without this guests can muck with shift_control, can turn on planar
vga modes or text mode emulation while VBE is active, making qemu
take code paths meant for CGA compatibility, but with the very
large display widths and heigts settable using VBE registers.
Which is good for one or another buffer overflow. Not that
critical as they typically read overflows happening somewhere
in the display code. So guests can DoS by crashing qemu with a
segfault, but it is probably not possible to break out of the VM.
Fixes: CVE-2016-3712
Reported-by: Zuozhi Fzz <zuozhi.fzz@alibaba-inc.com>
Reported-by: P J P <ppandit@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Call the new vbe_update_vgaregs() function on vbe configuration
changes, to make sure vga registers are up-to-date.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
When enabling vbe mode qemu will setup a bunch of vga registers to make
sure the vga emulation operates in correct mode for a linear
framebuffer. Move that code to a separate function so we can call it
from other places too.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
vga allows banked access to video memory using the window at 0xa00000
and it supports a different access modes with different address
calculations.
The VBE bochs extentions support banked access too, using the
VBE_DISPI_INDEX_BANK register. The code tries to take the different
address calculations into account and applies different limits to
VBE_DISPI_INDEX_BANK depending on the current access mode.
Which is probably effective in stopping misprogramming by accident.
But from a security point of view completely useless as an attacker
can easily change access modes after setting the bank register.
Drop the bogus check, add range checks to vga_mem_{readb,writeb}
instead.
Fixes: CVE-2016-3710
Reported-by: Qinghao Tang <luodalongde@gmail.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
'make check' fails with:
ERROR:tests/bios-tables-test.c:493:load_expected_aml:
assertion failed: (g_file_test(aml_file, G_FILE_TEST_EXISTS))
since commit:
caf50c7166
tests: pc: acpi: drop not needed 'expected SSDT' blobs
Assert happens because qemu-system-x86_64 generates
SSDT table and test looks for a corresponding expected
table to compare with.
However there is no expected SSDT blob anymore, since
QEMU souldn't generate one. As it happens BIOS is not
able to read ACPI tables from QEMU and fallbacks to
embeded legacy ACPI codepath, which generates SSDT.
That happens due to wrongly sized endiannes conversion
which makes
uint8_t BiosLinkerLoaderEntry.alloc.zone
end up with 0 due to truncation of 32 bit integer
which on host is 1 or 2.
Fix it by dropping invalid cpu_to_le32() as uint8_t
doesn't require any conversion.
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1330174
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
CPU/memory resources can be signalled en-masse via
spapr_hotplug_req_add_by_count(), and when doing so, actually change
the meaning of the 'drc' parameter passed to
spapr_hotplug_req_event() to be a count rather than an index.
f40eb92 added a hook in spapr_hotplug_req_event() to record when a
device had been 'signalled' to the guest, but that code assumes that
drc is always an index. In cases where it's a count, such as memory
hotplug, the DRC lookup will fail, leading to an assert.
Fix this by only explicitly setting the signalled state for cases where
we are doing PCI hotplug.
For other resources types, since we cannot selectively track whether a
resource has been signalled in cases where we signal attach as a count,
set the 'signalled' state to true immediately upon making the
resource available via drck->attach().
Reported-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Cc: Bharata B Rao <bharata@linux.vnet.ibm.com>
Cc: david@gibson.dropbear.id.au
Cc: qemu-ppc@nongnu.org
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
commit "5f77e06 usb: add pid check at the first of uhci_handle_td()"
moved the pid verification to the start of the uhci_handle_td function,
to simplify the error handling (we don't have to free stuff which we
didn't allocate in the first place ...).
Problem is now the check fires too often, it raises error IRQs even for
TDs which we are not going to process because they are not set active.
So, lets move down the check a bit, so it is done only for active TDs,
but still before we are going to allocate stuff to process the requested
transfer.
Reported-by: Joe Clifford <joe@thunderbug.co.uk>
Tested-by: Joe Clifford <joe@thunderbug.co.uk>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1461321893-15811-1-git-send-email-kraxel@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
QEMU currently crashes when using bad parameters for the
spapr-pci-host-bridge device:
$ qemu-system-ppc64 -device spapr-pci-host-bridge,buid=0x123,liobn=0x321,mem_win_addr=0x1,io_win_addr=0x10
Segmentation fault
The problem is that spapr_tce_find_by_liobn() might return NULL, but
the code in spapr_populate_pci_dt() does not check for this condition
and then tries to dereference this NULL pointer.
Apart from that, the return value of spapr_populate_pci_dt() also
has to be checked for all PCI buses, not only for the last one, to
make sure we catch all errors.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The effect of this change is the block layer drained section can work,
for example when mirror job is being completed.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
All callers pass "false" keeping the old semantics. The windows
implementation doesn't distinguish the flag yet. On posix, it is passed
down to the underlying aio context.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The 32-bit ARM Linux kernel booting ABI requires that r0 is 0
when calling the kernel image. A bug in commit 10b8ec73e6
meant that for boards which use the write_board_setup hook (which
means "highbank", "midway", "raspi2" and "xilinx-zynq-a9") we
were incorrectly skipping the "clear r0" instruction in the
mini-bootloader. Use the right offset in the "add lr, pc, #n"
instruction so that we return from the board-setup code to the
correct place.
Signed-off-by: Sylvain Garrigues <sylvain@sylvaingarrigues.com>
[PMM: Expanded commit message]
Cc: qemu-stable@nongnu.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
A single fix for a regression since 2.5. This should be the last ppc
pull request for 2.6.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJXFY5uAAoJEGw4ysog2bOS1MgQAIKbBZPcKZeu7k8zVik6tObc
N7T3xrzZC0zMJEB9uu8m2ULsHhk7NMs2nl951q8ofHeufYtMUVwrmvML90+09wrL
brq08o0fHxyzWLmadwyHW8YuY5rTB1rsPTfUM+nUblS8n3LdcI2C8xBR6+Zvdjfj
/4znUujytbxyncVgQR624Y0TXDFD3+EzYSnF9mEGMpXG4DLoIZpltFR1XwSf0Izz
MkUeyPuXacapXofIKtJTPwmHDjetsElJTt4u85kw4XrjVeo9vXjBfZnbRIqd6jrM
1dPz2oDYjNLU1TrpQtaXM54DXYyy+klpBbZbEBp0O43GRNWAtVDvK6XSpwHsScuE
C/7wAIoMzNuGHrUnhmpkDJuJpulJuGiY0df+8me+K52NDaPgTeW0ZF1heGnMBQ3t
7P2aSZ06Us047isGHYQpmvzf0ptLwn54i0Hh35ChXyrkCBHfA0DyRXhDbI+7GurA
42quB8NZ8JeIoCtP0EPjYn532bDa8DKCegnNR6au+pkr9Ato4cDxO02JiINT4Y94
+fMtvlWeHVGLuFVul/WRPYhzP7cRPvrcswpL/iABjCXKpfyRrg11Pe0Wvmn1JLnX
tMjPCd2K1fqwYMtJ2uCIlv4NpDofJIdKF+kgVe8/VAFlKM4SpA6F+IILOqIsZUR1
pE+nmQ6KAfIhnH2rfxy6
=+Eci
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.6-20160419' into staging
ppc patch queueu for 2016-04-19
A single fix for a regression since 2.5. This should be the last ppc
pull request for 2.6.
# gpg: Signature made Tue 19 Apr 2016 02:48:30 BST using RSA key ID 20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-2.6-20160419:
cuda: fix off-by-one error in SET_TIME command
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
cadence_uart_init() initializes an I/O memory region of size 0x1000
bytes. However in uart_write(), the 'offset' parameter (offset within
region) is divided by 4 and then used to index the array 'r' of size
CADENCE_UART_R_MAX which is much smaller: (0x48/4). If 'offset>>=2'
exceeds CADENCE_UART_R_MAX, this will cause an out-of-bounds memory
write where the offset and the value are controlled by guest.
This will corrupt QEMU memory, in most situations this causes the vm to
crash.
Fix by checking the offset against the array size.
Cc: qemu-stable@nongnu.org
Reported-by: 李强 <liqiang6-s@360.cn>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 20160418100735.GA517@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Commit "156a2e4 ehci: make idt processing more robust" tries to avoid a
DoS by the guest (create a circular iTD queue and let qemu ehci
emulation run in circles forever). Unfortunately this has two problems:
First it misses the case of siTDs, and second it reportedly breaks
FreeBSD.
So lets go for a different approach: just count the number of iTDs and
siTDs we have seen per frame and apply a limit. That should really
catch all cases now.
Reported-by: 杜少博 <dushaobo@360.cn>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
With the new framework the cuda_cmd_set_time command directly receive
the data, without the command byte. Therefore the time is stored at
in_data[0], not at in_data[1].
This fixes the "hwclock --systohc" command in a guest.
Cc: Hervé Poussineau <hpoussin@reactos.org>
Cc: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Hervé Poussineau <hpoussin@reactos.org>
[this fixes a regression introduced by e647317 "cuda: port SET_TIME
command to new framework"]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Pflash migration (e.g. q35 + EFI variable storage) fails
with the assert:
bdrv_co_do_pwritev: Assertion `!(bs->open_flags & 0x0800)' failed.
This avoids the problem by delaying the pflash update until after
the device loads complete.
Tested by:
Migrating Q35/EFI vm.
Changing efi variable content (with efiboot in the guest)
md5sum'ing the variable file before migration and after.
This is a fix that Paolo posted in the message
570244B3.4070105@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Minor fixes all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJXD58TAAoJECgfDbjSjVRpDCAH/iZXlMxl4j23qH4mqJa88HJq
UHqsuU6NGHXhsYUzGy9wQp7RTNnMlwF1GC+vsIlZzr1XPu/U/GwUZVPf1Ca0xZ0Q
ukRzd7nvAaHnUEC26AJul8CgoThmPf5ip4LqAqQvSUrrAsQ1viR49HHCtmFC2w33
iOg9ZznZM+Prlh8IGMCSF93ER9l4s7T2CvDPmlKtC5iXepU8J47V2EmPg3VjCd3B
jeQ6RIF0RtJQCvUxLW3FcUnM6bmIszqPEwmBkiOfJcvuNisNMZGavAyzzoXfMmQ9
YkrGEnDwLa5a3qMTptmxDvPzy4ksc7OzVIw0bcBnqnGmDJwGz44mBjYJ555zJi8=
=ufoF
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
tpm, vhost, virtio: fixes for 2.6
Minor fixes all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Thu 14 Apr 2016 14:45:55 BST using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
* remotes/mst/tags/for_upstream:
hw/virtio/balloon: Replace TARGET_PAGE_SIZE with BALLOON_PAGE_SIZE
tpm: Fix write to file descriptor function
tpm: acpi: remove IRQ from TPM's CRS to make Windows not see conflict
pc: acpi: tpm: add missing MMIO resource to PCI0._CRS
specs/vhost-user: spelling fix
specs/vhost-user: improve VHOST_SET_VRING_NUM documentation
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The balloon code currently calls madvise() with TARGET_PAGE_SIZE as
length parameter. Since the virtio-balloon protocol is always based
on 4k pages, no matter what the host and guest are using as page size,
this could cause problems: If TARGET_PAGE_SIZE is bigger than 4k, the
madvise call also destroys the 4k areas after the current one - which
might be wrong since the guest did not want free that area yet (in
case the guest used as smaller MMU page size than the hard-coded
TARGET_PAGE_SIZE). So to fix this issue, introduce a proper define
called BALLOON_PAGE_SIZE (which is 4096) to use this as the size
parameter for the madvise() call instead.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Fix a bug introduced in commit 46f296c while moving send_all to the
tpm_passthrough code. Fix the name of the variable used in the loop.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
IRQ 5 used by TPM conflicts with PNP0C0F IRQs,
as result Windows fails driver initialization with reason
'device cannot find enough free resources'
But if TPM._CRS.IRQ entry is commented out, Windows
seems to initialize driver without errors as it doesn't
notice possible conflict and it seems to work
probably due to a link with IRQ 5 being unused/disabled.
So temporary comment out TPM._CRS.IRQ to 'fix'
regression in TPM, with intent to fix it correctly
later i.e.:
1. pick unused IRQ as default one for TPM
2. fetch IRQ value from device model so that user
could override default one if it conflicts with
some other device.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Windows will fail initialize TMP driver with the reason:
'device cannot find enough free resources'
That happens because parent BUS doesn't describe
MMIO resources used by TPM child device.
Fix it by describing it in top-most parent bus scope PCI0.
It was 'regressed' by commit
5cb18b3d TPM2 ACPI table support
with following fixup
9e472263 acpi: add missing ssdt
which did the right thing by moving TPM to BUS
it belongs to but lacked a proper resource declaration.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
VIRTIO_INPUT_CFG_ABS_INFO was not implemented for pass-through input
devices. This patch follows the existing design and pre-fetches the
config for all absolute axes using EVIOCGABS at realize time.
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Message-id: 1460558603-18331-1-git-send-email-lprosek@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The reported maximum was wrong. The X and Y coordinates are 0-based
so if size is 8000 maximum must be 7FFF.
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Message-id: 1460128893-10244-1-git-send-email-lprosek@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
virtio-input is simple enough that it doesn't need to xfer any state.
Still we have to wire up savevm manually, so the generic pci and virtio
are saved correctly.
Additionally we need to do some post-load processing to figure whenever
the guest uses the device or not, so we can give input routing hints to
the qemu input layer using qemu_input_handler_{activate,deactivate}.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1459859501-16965-1-git-send-email-kraxel@redhat.com
The write path for pass-through devices, commonly used for controlling
keyboard LEDs via EV_LED, was not implemented. This commit adds the
necessary plumbing to connect the status virtio queue to the host evdev
file descriptor.
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Message-id: 1459511146-12060-1-git-send-email-lprosek@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
VIRTIO_INPUT_CFG_EV_BITS with subsel of EV_LED was always
returning an empty bitmap for pass-through input devices.
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Message-id: 1459418028-7473-1-git-send-email-lprosek@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
KEY_PAUSE is flat out missing. KEY_SYSRQ already has a keycode
assigned but it's not what I'm seeing on my system. The mapping
doesn't appear to have to be unique so both keycodes now map to
KEY_SYSRQ which is what the "Keyboard PrintScreen", HID usage ID
0x46, translates to.
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Message-id: 1459343240-19483-1-git-send-email-lprosek@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
"qemu -device ivshmem-{plain,doorbell}" will crash, because the device
doesn't check that the required argument is provided. (screwed up in
commit 5400c02)
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Restart of ATAPI DMA used to be unreachable, because the request to do
so wasn't indicated in bus->error_status due to the lack of spare bits, and
ide_restart_bh() would return early doing nothing.
This patch makes use of the observation that not all bit combinations were
possible in ->error_status. In particular, IDE_RETRY_READ only made sense
together with IDE_RETRY_DMA or IDE_RETRY_PIO. This allows to re-use
IDE_RETRY_READ alone as an indicator of ATAPI DMA restart request.
To makes things more uniform, ATAPI DMA gets its own value for ->dma_cmd.
As a means against confusion, macros are added to test the state of
->error_status.
The patch fixes the restart of both in-flight and pending ATAPI DMA,
following the scheme similar to that of IDE DMA.
[Including a fixup patch:
Message-id: 1460465594-15777-1-git-send-email-pbutsykin@virtuozzo.com
--js]
Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1459924806-306-4-git-send-email-den@openvz.org
Signed-off-by: John Snow <jsnow@redhat.com>
ide_atapi_dma_restart() used to just complete the DMA with an error,
under the assumption that there isn't enough information to restart it.
However, as the contents of the ->io_buffer is preserved, it looks safe to
just re-evaluate it and dispatch the ATAPI command again.
Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1459924806-306-3-git-send-email-den@openvz.org
Signed-off-by: John Snow <jsnow@redhat.com>
If the migration occurs after the IDE DMA has been set up but before it
has been initiated, the state gets lost upon save/restore. Specifically,
->dma_cb callback gets cleared, so, when the guest eventually starts bus
mastering, the DMA never completes, causing the guest to time out the
operation.
OTOH all the infrastructure is already in place to restart the DMA if
the migration happens while the DMA is in progress.
So reuse that infrastructure, by setting bus->error_status based on
->dma_cmd in pre_save if ->dma_cb callback is already set but DMAING is
clear. This will indicate the need for restart and make sure ->dma_cb
is restored in ide_restart_bh(); howeover since DMAING is clear the state
upon restore will be exactly "ready for DMA" as before the save.
Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1459924806-306-2-git-send-email-den@openvz.org
Signed-off-by: John Snow <jsnow@redhat.com>
After commit e5e7855 (blockdev: Separate BB name management), starting a
guest with PVHVM support result in this assert:
qemu-system-i386: block/block-backend.c:173: blk_delete: Assertion `!blk->name' failed.
A backtrace show that a caller is pci_piix3_xen_ide_unplug().
This patch fix it.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Message-id: 1460382666-29885-1-git-send-email-anthony.perard@citrix.com
Signed-off-by: John Snow <jsnow@redhat.com>
In commit ac0487e1 ("xenfb.c: avoid expensive loops when prod <=
out_cons"), ">=" was used. In fact, a full ring is a legit state.
Correct the test to use ">".
Reported-by: "Hao, Xudong" <xudong.hao@intel.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Tested-by: "Hao, Xudong" <xudong.hao@intel.com>
Acked-by: Anthony Perard <anthony.perard@citrix.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
When receiving packets over Stellaris ethernet controller, it
uses receive buffer of size 2048 bytes. In case the controller
accepts large(MTU) packets, it could lead to memory corruption.
Add check to avoid it.
Reported-by: Oleksandr Bazhaniuk <oleksandr.bazhaniuk@intel.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 1460095428-22698-1-git-send-email-ppandit@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Feeling a bit nervous putting the full live migration support
patch (https://patchwork.ozlabs.org/patch/606902/) in that
late in the 2.6 devel cycle as it carries some non-trivial
changes. So disable migration in case virtio-gpu is present
for now.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Add a the new qemu_create_displaysurface_pixman function, to create
a DisplaySurface backed by an existing pixman image. In that case
there is no need to create a new pixman image pointing to the same
backing storage. We can just use the existing image directly.
This does not only simplify things a bit, but most importantly it
gets the reference counting right, so the backing storage for the
pixman image wouldn't be released underneath us.
Use new function in virtio-gpu, where using it actually fixes
use-after-free crashes.
Cc: qemu-stable@nongnu.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1459499240-742-1-git-send-email-kraxel@redhat.com
Fixes all over the place. Most notably, fixes migration
for systems with pci express bridges, and random crashes
observed with virtio blk and scsi dataplane.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJXB2OKAAoJECgfDbjSjVRpcfkH/jRxiCmaC2hArYS+xK1HXJuf
FUuI6JpbrvcuxdoNNm3E3BLM+vweYJ2JsVfY47l7zFcQVnBCmdRw4JMUetpqvt5C
ekEwuokvWcSwdCT6y3mUW3euZ5iBoiFxYAvjwtM4p2Aut6urZUA3oeBSsSikGWJj
itKsdeaaGN5bzC7BJYROUIs8VuK4VfIPpRBGF20smfMjXZq2liOnW7NK5mQ8BClg
klXfLnZ39IdR+MQ37J1uo9TgY/HcdOGRK1KYezq+6nsO4jlAfGrDr99+D9bsheLJ
G1jZLDEa+oTjfoCvqVkCK3+8O2vvHQe2BeR9sCEipYY9401gQPCqq4fmhv8/+O0=
=U4yU
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pci, virtio, acpi: fixes for 2.6
Fixes all over the place. Most notably, fixes migration
for systems with pci express bridges, and random crashes
observed with virtio blk and scsi dataplane.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Fri 08 Apr 2016 08:53:46 BST using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
* remotes/mst/tags/for_upstream:
hw/pci-bridge: Add missing unref in case register-bus fails
virtio: merge virtio_queue_aio_set_host_notifier_handler with virtio_queue_set_aio
virtio-scsi: use aio handler for data plane
virtio-blk: use aio handler for data plane
virtio: add aio handler
virtio-scsi: fix disabled mode
virtio-blk: fix disabled mode
virtio: make virtio_queue_notify_vq static
tests/bios-tables-test: fix assert
virtio-balloon: reset the statistic timer to load device
Migration: Add i82801b11 migration data
Sort the fw_cfg file list
xen: piix reuse pci generic class init function
pci-testdev: fast mmio support
acpi: Add missing GCC_FMT_ATTR
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Just a single bugfix for spapr in this batch, but I want to make sure
it gets in for 2.6.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJXBzt1AAoJEGw4ysog2bOSyGQQAIL4aADwOhNoVLjtvBN3eoPQ
cP+Ps3DCK/9Z9l00cMR6/8zk5Q2Nb1FLf2Y3f3c2JVEFER8XCnsYPIyYfOZaMex4
/8DUfVueTh0RmpxhWwA4vQJtDqrilB0tUkkqgWFPE2luJcTVTUU7mig788d2yrmp
J35ncNaMcrXGy0Uh/wBlnOpfHD17ds8Sgpw02TT9QusqIjq8MWIkgat0v+h4RmRL
lzEE5N1Vp8vOvJENTEnuuKFbFTxcvhBS+A2K1y+s10k7c1CuFFJpAZY7g3T4hpqU
NZAirty5WeMlSYk9A0gQhgHWq2XSgbDWWj6tMGd5sCEQH5D6Kty0TPWnCpzSxjgu
aqGr7BqAV+NV/Rr/jGy4gvE432f1pZWUIxq271OH9H5aniCWSYFBR7w4UEaM1BPQ
I5tzkp7P1PMWIm/K5ryFVo083kU08KFXZDSbQR/vu4O+DuohPUKYid5cv4wJj/W+
GSzBwTwtp8iY2rs/nbMptSYHKYFYtd5PuALf4BoK62sF72NtWq+41X3QV8I4cIQd
hM03NyuObgnY7aygPmo9OGsvW/Dx8DKKoEO0QX+2gFa22rJ+j7RLSu7pHFW1JEXa
5VkVlTtN8L5NeeG0PdkgkChcgiqahUA6bRjekpFzdoncfsmmiPkiP5xQqK1DVKhW
SoJacddcj86QGpT1aioU
=4ZAr
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.6-20160408' into staging
ppc patch queue for 2016-04-08
Just a single bugfix for spapr in this batch, but I want to make sure
it gets in for 2.6.
# gpg: Signature made Fri 08 Apr 2016 06:02:45 BST using RSA key ID 20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-2.6-20160408:
spapr: Fix ibm,lrdr-capacity
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* Debug code bitrot from Emilio
* HPET fix from Bill
* ps2kbd fix from Hervé
* PKU fix from myself
* Coverity fixes from Gonglei
* More memory.txt update from Jiangang
* .gitignore maintenance from Changlong
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQEcBAABCAAGBQJXBtpMAAoJEL/70l94x66Dz18H/im1nR9+DII3V4Ch2DOA/xaL
gCDHOr+KOXnzLcKQytfBCRjVsXZKnd2kayxtDrFc6ebCh7+Uar5+Z21P0wzCry5a
Z/LohJXg7rdHCNaF+9yW7uVbvDjuaepnS1fO9PaMnLviLjqxbIhZixYvuYRPwXD7
oFp88+cbgONIFTe7riXaVGBjjjsyTV8DyXyyS7zlBsuaTihfSYDuYy8upgrCk3Af
k9P/VqHxoNpnR1dMPVJdjyN6UQwWYB6i1g4GOEFessRu3H7wz0zCtzUuFL8dGIEs
3VZ7pQ/1QCPr5zaMn7EXofusxeZhU+gZwDnp/bsgGSI73MnMZkTB4Ws8CPf6AXA=
=1rhX
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* NBD fixes from Alex and Eric
* Debug code bitrot from Emilio
* HPET fix from Bill
* ps2kbd fix from Hervé
* PKU fix from myself
* Coverity fixes from Gonglei
* More memory.txt update from Jiangang
* .gitignore maintenance from Changlong
# gpg: Signature made Thu 07 Apr 2016 23:08:12 BST using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
* remotes/bonzini/tags/for-upstream:
target-i386: check for PKU even for non-writable pages
tests: ignore test-logging
translate-all: add missing fold of tb_ctx into tcg_ctx
hostmem-file: fix memory leak
spapr: fix possible Negative array index read
nbd: do not hang nbd_wr_syncv if outside a coroutine and no available data
nbd: Don't kill server when client requests unknown option
nbd: Fix NBD unsupported options
qemu-nbd: Document -x option
nbd: Improve debug traces on little-endian
nbd: Avoid bitrot in TRACE() usage
nbd: Return correct error for write to read-only export
docs: fix typo in memory.txt
hw/timer: Revert "hpet: inverse polarity when pin above ISA_NUM_IRQS"
ps2kbd: default to scancode_set 2, as with KBD_CMD_RESET
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Fix off-by-one error in ITC Tag read.
Remove the switch as we just want to check if index is in valid range
rather than test against list of values.
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
ibm,lrdr-capacity has a field to describe the maximum address in bytes
and therefore, the most memory that can be allocated to this guest. We
are using maxmem for this field, but instead should use the actual RAM
address corresponding to the end of hotplug region.
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This reverts commit 0d63b2dd31.
This change was originally intended to correct the HPET behavior
in conjunction with Linux, however the behavior that it actually creates
is not compatible with the ioapic.c implementation; it used to be
compatible with KVM's own IOAPIC but it is not anymore.
Signed-off-by: Bill Paul <wpaul@windriver.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Richard Henderson <rth@twiddle.net>
CC: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <201604051558.20070.wpaul@windriver.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This line has been added in commit ef74679a81 with
other initializations. However, scancode set 0 doesn't exist (only 1, 2, 3).
This works well as long as operating system is resetting keyboard, or overwriting
the current scancode set with the one it wants.
This fixes IBM 40p firmware, which doesn't bother sending KBD_CMD_RESET or KBD_CMD_SCANCODE.
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Message-Id: <1458714100-28885-1-git-send-email-hpoussin@reactos.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The error paths after a successful qdev_create/pci_bus_new
should contain a object_unref/object_unparent.
pxb_dev_init_common() did not yet, so add it.
Signed-off-by: Wei Jiangang <weijg.fnst@cn.fujitsu.com>
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Eliminating the reentrancy is actually a nice thing that we can do
with the API that Michael proposed, so let's make it first class.
This also hides the complex assign/set_handler conventions from
callers of virtio_queue_aio_set_host_notifier_handler, which in
fact was always called with assign=true.
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In addition to handling IO in vcpu thread and in io thread, dataplane
introduces yet another mode: handling it by AioContext.
This reuses the same handler as previous modes, which triggers races as
these were not designed to be reentrant. Use a separate handler just
for aio, and disable regular handlers when dataplane is active.
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In addition to handling IO in vcpu thread and in io thread, dataplane
introduces yet another mode: handling it by AioContext.
This reuses the same handler as previous modes, which triggers races as
these were not designed to be reentrant. Use a separate handler just
for aio, and disable regular handlers when dataplane is active.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In addition to handling IO in vcpu thread and in io thread, blk dataplane
introduces yet another mode: handling it by AioContext.
Currently, this reuses the same handler as previous modes,
which triggers races as these were not designed to be reentrant.
Add instead a separate handler just for aio; this will make
it possible to disable regular handlers when dataplane is active.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add two missing checks for s->dataplane_fenced. In one case, QEMU
would skip injecting an IRQ due to a write to an uninitialized
EventNotifier's file descriptor.
In the second case, the dataplane_disabled field was used by mistake;
in fact after fixing this occurrence it is completely unused.
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We must not call virtio_blk_data_plane_notify if dataplane is
disabled: we would hit a segmentation fault in notify_guest_bh as
s->guest_notifier has not been setup and is NULL.
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
If before loading snapshot we had set the timer of statistics, then after
applying snapshot the expiry time would be irrelevant for the restored
state of the virtual clocks. A simple fix is just to restart the timer
after loading snapshot.
For the user it may look like a long delay of statistics update after switch
to the snapshot.
Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The i82801b11 bridge didn't have a vmsd and thus didn't send
any migration data, including that of its parent PCIBridge object.
The symptom being if the guest used any devices behind the bridge
the guest crashed (mostly with various interrupt related issues).
Note: This will cause migration from old qemus that used this device to
explicitly fail during migration as opposed to the guest crashing.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Suggested-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Entries are inserted in filename order instead of being
appended to the end in case sorting is enabled.
This will avoid any future issues of moving the file creation
around, it doesn't matter what order they are created now,
the will always be in filename order.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Added machine type handling for compatibility. This was
a fairly complex change, this will preserve the order of fw_cfg
for older versions no matter what order the firmware files
actually come in. A list is kept of the correct legacy order
and the entries will be inserted based upon their order in
the list. Except that some entries are ordered (in a specific
area of the list) based upon what order they appear on the
command line. Special handling is added for those entries.
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
piix3_ide_xen_class_init is identical to piix3_ide_class_init
except it's buggy as it does not set exit and does not disable
hotplug properly.
Switch to the generic one.
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Teach PCI testdev to use fast MMIO when kvm makes it available.
Before:
mmio-wildcard-eventfd:pci-mem 2271
After:
mmio-wildcard-eventfd:pci-mem 1218
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Through CP_TX_OWN and CP_RX_OWN points to the same bit, we'd better use
CP_TX_OWN for tx descriptor handling.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Currently spapr doesn't support "aborting" hotplug of PCI
devices by allowing device_del to immediately remove the
device if we haven't signalled the presence of the device
to the guest.
In the past this wasn't an issue, since we always immediately
signalled device attach and simply relied on full guest-aware
add->remove path for device removal. However, as of 788d259,
we now defer signalling for PCI functions until function 0
is attached, so now we need to deal with these "abort" operations
for cases where a user hotplugs a non-0 function, then opts to
remove it prior hotplugging function 0. Currently they'd have to
reboot before the unplug completed. PCIe multifunction hotplug
does not have this requirement however, so from a management
implementation perspective it would be good to address this within
the same release as 788d259.
We accomplish this by simply adding a 'signalled' flag to track
whether a device hotplug event has been sent to the guest. If it
hasn't, we allow immediate removal under the assumption that the
guest will not be using the device. Devices present at boot/reset
time are also assumed to be 'signalled'.
For CPU/memory/etc, signalling will still happen immediately
as part of device_add, so only PCI functions should be affected.
Cc: bharata@linux.vnet.ibm.com
Cc: david@gibson.dropbear.id.au
Cc: sbhat@linux.vnet.ibm.com
Cc: qemu-ppc@nongnu.org
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
[dwg: This fixes a regression where an incorrect hot-add of a non-zero
function can no longer be backed out until function 0 is added]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
This patch fixes the current AIL implementation for POWER8. The
interrupt vector address can be calculated directly from LPCR when the
exception is handled. The excp_prefix update becomes useless and we
can cleanup the H_SET_MODE hcall.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[clg: Removed LPES0/1 handling for HV vs. !HV
Fixed LPCR_ILE case for POWERPC_EXCP_POWER8 ]
Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
[dwg: This was written as a cleanup, but it also fixes a real bug
where setting an alternative interrupt location would not be
correctly migrated]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Wire up the CPU timer interrupts in the right order, with the
nonsecure physical timer on cntpnsirq, the hyp timer on cnthpirq,
and the secure physical timer on cntpsirq. (We did get the
virt timer right, at least.)
Reported-by: Antonio Huete Jiménez <tuxillo@quantumachine.net>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Message-id: 1458210790-6621-1-git-send-email-peter.maydell@linaro.org
Indicate that autonegotiation is complete in the MII BMSR. This fixes
networking on xtfpga platform in linux v4.5.
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 1458719789-29868-12-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 1458719789-29868-11-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Implements FSR register, it is used for busy waits.
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 1458719789-29868-10-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Adds fast read and 4bytes commands family.
This work is based on Pawel Lenkow patch from v1.
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 1458719789-29868-9-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Use the setting from the volatile cfg register to correctly
set the number of dummy cycles.
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 1458719789-29868-8-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This patch adds both volatile and non volatile configuration registers
and commands to allow modify them. It is needed for proper handling
dummy cycles. Initialization of those registers and flash state
has been included as well.
Some of this registers are used by kernel.
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Acked-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 1458719789-29868-7-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This patch adds only 4byte address mode (does not cover dummy cycles).
This mode is needed to access more than 16 MiB of flash.
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 1458719789-29868-6-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Extend address mode allows to switch flash 16 MiB banks,
allowing user to access all flash sectors.
This access mode is used by u-boot.
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 1458719789-29868-5-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Extend the width of the flags variable to support the already existing
(but unused) WR_1 flag, which is above the range of 8 bits.
This allows support of EEPROM emulation which requires the WR_1 feature.
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 1458719789-29868-4-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 1458719789-29868-3-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 1458719789-29868-2-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
There is a problem for power button that it will not work if an early
system_powerdown request happens before guest gpio driver loads.
Fix this problem by using gpio_key.
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Message-id: 1458221140-15232-3-git-send-email-zhaoshenglong@huawei.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This will be used by ARM virt machine as a power button.
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Message-id: 1458221140-15232-2-git-send-email-zhaoshenglong@huawei.com
[PMM: Use hyphen rather than underscore in type names;
add a comment briefly describing what the device does]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Changes:
* add initial MIPS CPS support
* implement ITU block
* implement MAAR
-----BEGIN PGP SIGNATURE-----
iQEcBAABAgAGBQJW+43VAAoJEFIRjjwLKdprLmoH/1iWT4WsUJDF+9KX7PpFANbQ
DT+QSDBJr6K+jCenLlqfvB30txS+NDRFzmW65J8hlawVOwhamg1X+pcQTbAYy0sm
Du3Wexye0uw5YKUmqK2oCrgLJCKm3AqsmraaITE8q1URlkrQpuOuzazlIx5UA+RW
RgF/DsPAlit8TkZMHwaVIOeXUl8vl8152fU26QvwOGAT6J3lV+lQJ+gMPGRSAWOw
dcuVGNOTV0g3+kzOWisiqZc/V0Wp2Yu5IPezEVkFjZ4iyTTpnR8gkzuTNebzoRZo
Zmws4mqaZAX6ijveevd+ueh5sR+AX+mFBunoXQVSSBvRFlqBcEEL2nrwb9wOUX4=
=Ns6n
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/lalrae/tags/mips-20160329-2' into staging
MIPS patches 2016-03-29
Changes:
* add initial MIPS CPS support
* implement ITU block
* implement MAAR
# gpg: Signature made Wed 30 Mar 2016 09:27:01 BST using RSA key ID 0B29DA6B
# gpg: Good signature from "Leon Alrae <leon.alrae@imgtec.com>"
* remotes/lalrae/tags/mips-20160329-2: (21 commits)
target-mips: add MAAR, MAARI register
target-mips: use CP0_CHECK for gen_m{f|t}hc0
hw/mips/cps: enable ITU for multithreading processors
target-mips: make ITC Configuration Tags accessible to the CPU
target-mips: check CP0 enabled for CACHE instruction also in R6
hw/mips: implement ITC Storage - Bypass View
hw/mips: implement ITC Storage - P/V Sync and Try Views
hw/mips: implement ITC Storage - Empty/Full Sync and Try Views
hw/mips: implement ITC Storage - Control View
hw/mips: implement ITC Configuration Tags and Storage Cells
target-mips: enable CM GCR in MIPS64R6-generic CPU
hw/mips_malta: add CPS to Malta board
hw/mips_malta: move CPU creation to a separate function
hw/mips_malta: remove redundant irq and clock init
hw/mips_malta: remove CPUMIPSState from the write_bootloader()
hw/mips/cps: create CPC block inside CPS
hw/mips: add initial Cluster Power Controller support
hw/mips/cps: create GCR block inside CPS
hw/mips: add initial Global Config Register support
target-mips: add CMGCRBase register
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Bypass View does not cause issuing thread to block and does not affect
any of the cells state bit.
Read from a FIFO cell returns the value of the oldest entry.
Store to a FIFO cell changes the value of the newest entry.
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
P/V Synchronized and Try Views can be used to access Semaphore cells.
Load returns current value and post-decrements the value in the cell
(until it reaches zero). Stores increment the value (until it saturates
at 0xFFFF).
P/V Synchronized View causes the issuing thread to block on read if value
is 0. P/V Try View does not block the thread, it returns 0 in this case.
Cell's Empty and Full bits are not modified.
Trap bit (i.e. Gating Storage exceptions) not implemented.
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Empty/Full Synchronized and Try views can be used to access FIFO cells.
Store to the FIFO cell pushes the value into the queue, load pops the oldest
element from the queue. Cell's Full and Empty bits are automatically updated
to reflect new state of the cell.
Empty/Full Synchronized View causes the issuing thread to block when FIFO is
empty while thread is performing a read, or FIFO is full while thread is
performing a write.
Empty/Full Try View never blocks the thread. If cell is full then write is
ignored, if cell is empty then load returns 0.
Trap bit (i.e. Gating Storage exceptions) not implemented.
Store Conditional support for E/F Try View (i.e. indicate failure if FIFO
is full) not implemented.
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Control view is used to access the ITC Storage Cell Tags. It never causes
the issuing thread to block.
Guest can empty the FIFO cell by setting Empty bit to 1.
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Implement ITC as a single object consisting of two memory regions:
1) tag_io: ITC Configuration Tags (i.e. ITCAddressMap{0,1} registers) which
are accessible by the CPU via CACHE instruction. Also adding
MemoryRegion *itc_tag to the CPUMIPSState so that CACHE instruction will
dispatch reads/writes directly.
2) storage_io: memory-mapped ITC Storage whose address space is configurable
(i.e. enabled/remapped/resized) by writing to ITCAddressMap{0,1} registers.
ITC Storage contains FIFO and Semaphore cells. Read-only FIFO bit in the
ITC cell tag indicates the type of the cell. If the ITC Storage contains
both types of cells then FIFOs are located before Semaphores.
Since issuing thread can get blocked on the access to a cell (in E/F
Synchronized and P/V Synchronized Views) each cell has a bitmap to track
which threads are currently blocked.
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
If the user specifies smp > 1 and the CPU with CM GCR support, then
create Coherent Processing System (which takes care of instantiating CPUs)
rather than CPUs directly and connect i8259 and cbus to the pins exposed by
CPS. However, there is no GIC yet, thus CPS exposes CPU's IRQ pins so use
the same pin numbers as before.
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Global smp_cpus is never zero (even if user provides -smp 0), thus clocks
and irqs are always initialized for each created CPU in the loop at the
beginning of mips_malta_init.
These two lines cause a leak of already allocated timer and irqs for the
first CPU - remove them.
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Create Cluster Power Controller and add a link to the CPC MemoryRegion
in GCR. Guest can enable / map CPC to any physical address by writing to
the memory-mapped GCR_CPC_BASE register.
Set vp-start-reset property to 1 to allow only first VP to run from reset.
Others are brought up by the guest via CPC memory-mapped registers.
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Cluster Power Controller (CPC) is responsible for power management in
multiprocessing system. It provides registers to control the power and the
clock frequency of the individual elements in the system.
This patch implements only three registers that are used to control the
power state of each VP on a single core:
* VP Run is a write-only register used to set each VP to the run state
* VP Stop is a write-only register used to set each VP to the suspend state
* VP Running is a read-only register indicating the run state of each VP
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Add initial GCR support to indicate number of VPs present in the system,
L2 bypass mode and revision number.
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
[leon.alrae@imgtec.com:
* removed GIC part,
* changed commit message,
* replaced %lx format spec. with PRIx64,
* renamed mips_gcr.{c,h} to mips_cmgcr.{c,h},
* replaced CONFIG_MIPS_GIC with CONFIG_MIPS_CPS]
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Implement generic MIPS Coherent Processing System (CPS) which in this
commit just creates VPs, but it will serve as a container also for
other components like Global Configuration Registers and Cluster Power
Controller.
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
This reverts commit 9596ef7c7b.
This workaround in order to fix endless interrupts is no
longer needed because it was superseded by the previous patch
(e1000: Fixing interrupt pace).
Signed-off-by: Sameeh Jubran <sameeh@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
This patch introduces an upper bound for number of interrupts
per second. Without this bound an interrupt storm can occur as
it has been observed on Windows 10 when disabling the device.
According to the SPEC - Intel PCI/PCI-X Family of Gigabit
Ethernet Controllers Software Developer's Manual, section
13.4.18 - the Ethernet controller guarantees a maximum
observable interrupt rate of 7813 interrupts/sec. If there is
no upper bound this could lead to an interrupt storm by e1000
(when mit_delay < 500) causing interrupts to fire at a very high
pace.
Thus if mit_delay < 500 then the delay should be set to the
minimum delay possible which is 500. This can be calculated
easily as follows:
Interval = 10^9 / (7813 * 256) = 500.
Signed-off-by: Sameeh Jubran <sameeh@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
- Use 128bit math to avoid asserts with IOMMU regions (Bandan Das)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2.0.14 (GNU/Linux)
iQIcBAABAgAGBQJW+a1UAAoJECObm247sIsiqMsP/RX+lfr2LF7MA7mTEHFIya51
K0LkICysQ5Kh/nIa7yPuPKuGJaDazpYrASq4P5dwm/5IPHk3r/eE7oue7/f1itOK
Rgv1sv2nsVkd0p9adTpMkuIAYzLPyp2enL6iuFZo8urwFDfjfAxo8Q0pFd/nxWz7
u7ft69vV40uWtHDg3TZx1EH19UJc0S2ouJeP1Q/MBLZ6FrJq2/SkgujNdX/OvnAv
0zi9mKOykc+lDEzQShXyivIDnl8NYwnEDdcfMvCf3JoV5j/SkCtLyryRoKGOV2LW
53687rNucVX5HLiFEs2hBuYhpxo5/Y7XOxAgbtRkkX1Dh12oy0u1NlV21BV5sPpQ
KfiIimtVcq/EuLhs/HLMvwT83EwIRvlXmXiJxii5vWU7Nimmx+dpBqtkrwJ0qzch
SPs+SxcCKCNLYeSPQxZ5mQTaf4uNgvBWMVm0nJtQvrWfUp/iLdDMeOx5Dx2wnoT4
8ksHkJin7j2JQnmQiCrPHgLsp47NF0cJixe10DkA9AaQBUcPaExfyr/n5qZIkbvz
gxbW0QG1bkpYPD8uNxgFIblSlQGXVTpcvVUaUW7crEH+Dw3GK4h8UcTkKH7rqHGk
9eg0NfHMQDpS/Hr8MneXuOOlUOH2s9ZR0zVPsEwR1kqV1GNEmPbAz1tiLrGLwH3r
/UBzyVFjBrdeP4gAJt4/
=DpB8
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/awilliam/tags/vfio-update-20160328.0' into staging
VFIO updates 2016-03-28
- Use 128bit math to avoid asserts with IOMMU regions (Bandan Das)
# gpg: Signature made Mon 28 Mar 2016 23:16:52 BST using RSA key ID 3BB08B22
# gpg: Good signature from "Alex Williamson <alex.williamson@redhat.com>"
# gpg: aka "Alex Williamson <alex@shazbot.org>"
# gpg: aka "Alex Williamson <alwillia@redhat.com>"
# gpg: aka "Alex Williamson <alex.l.williamson@gmail.com>"
* remotes/awilliam/tags/vfio-update-20160328.0:
vfio: convert to 128 bit arithmetic calculations when adding mem regions
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
vfio_listener_region_add for a iommu mr results in
an overflow assert since iommu memory region is initialized
with UINT64_MAX. Convert calculations to 128 bit arithmetic
for iommu memory regions and let int128_get64 assert for non iommu
regions if there's an overflow.
Suggested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Bandan Das <bsd@redhat.com>
[missed (end - 1) on 2nd trace call, move llsize closer to use]
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
* Chardev fix from Marc-André
* config.status tweak from David
* Header file tweaks from Markus, myself and Veronia (Outreachy candidate)
* get_ticks_per_sec() removal from Rutuja (Outreachy candidate)
* Coverity fix from myself
* PKE implementation from myself, based on rth's XSAVE support
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQEcBAABCAAGBQJW9ErPAAoJEL/70l94x66DJfEH/A/QkMpAhrgNdyVsahzsGrzE
wx5gHFIc1nBYxyr62w4apUb5jPB7zaXu0LA7EAWDeAe0pyP8hZzLT9kJyOEDsuJu
zwKN2QeLSNMtPbnbKN0I/YQ2za2xX1V5ruhSeOJoVslUI214hgnAURaGshhQNzuZ
2CluDT9KgL5cQifAnKs5kJrwhIYShYNQB+1eDC/7wk28dd/EH+sPALIoF+rqrSmt
Zu4Mdqd+9Ns+oKOjA6br9ULq/Hzg0aDfY82J+XLVVqfF3PXQe8rTDmuMf/7jTn+M
Un7ZOcei9oZF2/9vfAfKQpDCcgD9HvOUSbgqV/ubmkPPmN/LNJzeKj0fBhrRN+Y=
=K12D
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* Log filtering from Alex and Peter
* Chardev fix from Marc-André
* config.status tweak from David
* Header file tweaks from Markus, myself and Veronia (Outreachy candidate)
* get_ticks_per_sec() removal from Rutuja (Outreachy candidate)
* Coverity fix from myself
* PKE implementation from myself, based on rth's XSAVE support
# gpg: Signature made Thu 24 Mar 2016 20:15:11 GMT using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
* remotes/bonzini/tags/for-upstream: (28 commits)
target-i386: implement PKE for TCG
config.status: Pass extra parameters
char: translate from QIOChannel error to errno
exec: fix error handling in file_ram_alloc
cputlb: modernise the debug support
qemu-log: support simple pid substitution for logs
target-arm: dfilter support for in_asm
qemu-log: dfilter-ise exec, out_asm, op and opt_op
qemu-log: new option -dfilter to limit output
qemu-log: Improve the "exec" TB execution logging
qemu-log: Avoid function call for disabled qemu_log_mask logging
qemu-log: correct help text for -d cpu
tcg: pass down TranslationBlock to tcg_code_gen
util: move declarations out of qemu-common.h
Replaced get_tick_per_sec() by NANOSECONDS_PER_SECOND
hw: explicitly include qemu-common.h and cpu.h
include/crypto: Include qapi-types.h or qemu/bswap.h instead of qemu-common.h
isa: Move DMA_transfer_handler from qemu-common.h to hw/isa/isa.h
Move ParallelIOArg from qemu-common.h to sysemu/char.h
Move QEMU_ALIGN_*() from qemu-common.h to qemu/osdep.h
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Conflicts:
scripts/clean-includes
RX buffer pools are now enabled by default for new machine types.
For older machine types, they are still disabled to avoid breaking
migration.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
tl;dr:
This patch introduces an alternate way of handling the receive
buffers of the spapr-vlan device, resulting in much better
receive performance for the guest.
Full story:
One of our testers recently discovered that the performance of the
spapr-vlan device is very poor compared to other NICs, and that
a simple "ping -i 0.2 -s 65507 someip" in the guest can result
in more than 50% lost ping packets (especially with older guest
kernels < 3.17).
After doing some analysis, it was clear that there is a problem
with the way we handle the receive buffers in spapr_llan.c: The
ibmveth driver of the guest Linux kernel tries to add a lot of
buffers into several buffer pools (with 512, 2048 and 65536 byte
sizes by default, but it can be changed via the entries in the
/sys/devices/vio/1000/pool* directories of the guest). However,
the spapr-vlan device of QEMU only tries to squeeze all receive
buffer descriptors into one single page which has been supplied
by the guest during the H_REGISTER_LOGICAL_LAN call, without
taking care of different buffer sizes. This has two bad effects:
First, only a very limited number of buffer descriptors is accepted
at all. Second, we also hand 64k buffers to the guest even if
the 2k buffers would fit better - and this results in dropped packets
in the IP layer of the guest since too much skbuf memory is used.
Though it seems at a first glance like PAPR says that we should store
the receive buffer descriptors in the page that is supplied during
the H_REGISTER_LOGICAL_LAN call, chapter 16.4.1.2 in the LoPAPR spec
declares that "the contents of these descriptors are architecturally
opaque, none of these descriptors are manipulated by code above
the architected interfaces". That means we don't have to store
the RX buffer descriptors in this page, but can also manage the
receive buffers at the hypervisor level only. This is now what we
are doing here: Introducing proper RX buffer pools which are also
sorted by size of the buffers, so we can hand out a buffer with
the best fitting size when a packet has been received.
To avoid problems with migration from/to older version of QEMU,
the old behavior is also retained and enabled by default. The new
buffer management has to be enabled via a new "use-rx-buffer-pools"
property.
Now with the new buffer pool management enabled, the problem with
"ping -s 65507" is fixed for me, and the throughput of a simple
test with wget increases from creeping 3MB/s up to 20MB/s!
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Refactor the code a little bit by extracting the code that reads
and writes the receive buffer list page into separate functions.
There should be no functional change in this patch, this is just
a preparation for the upcoming extensions that introduce receive
buffer pools.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
And move the code adjusting the MSR mask and calling kvmppc_set_papr()
to it. This allows us to add a few more things such as disabling setting
of MSR:HV and appropriate LPCR bits which will be used when fixing
the exception model.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
[clg: removed LPCR setting ]
Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
ePAPR defines "hcall-instructions" device-tree property which contains
code to call hypercalls in ePAPR paravirtualized guests. In general
pseries guests won't use this property, instead using the PAPR defined
hypercall interface.
However, this property has been re-used to implement a hack to allow
PR KVM to run (slightly modified) guests in some situations where it
otherwise wouldn't be able to (because the system's L0 hypervisor
doesn't forward the PAPR hypercalls to the PR KVM kernel).
Hence, this property is always present in the device tree for pseries
guests. All KVM guests use it at least to read features via the
KVM_HC_FEATURES hypercall.
The property is populated by the code returned from the KVM's
KVM_PPC_GET_PVINFO ioctl; if not implemented in the KVM, QEMU supplies
code which will fail all hypercall attempts. If QEMU does not create
the property, and the guest kernel is compiled with
CONFIG_EPAPR_PARAVIRT (which is normally the case), there is exactly
the same stub at @epapr_hypercall_start already.
Rather than maintaining this fairly useless stub implementation, it
makes more sense not to create the property in the device tree in the
first place if the host kernel does not implement it.
This changes kvmppc_get_hypercall() to return 1 if the host kernel
does not implement KVM_CAP_PPC_GET_PVINFO. The caller can use it to decide
on whether to create the property or not.
This changes the pseries machine to not create the property if KVM does
not implement KVM_PPC_GET_PVINFO. In practice this means that from now
on the property will not be created if either HV KVM or TCG is used.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
[reworded commit message for clarity --dwg]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJW8FqyAAoJEDhwtADrkYZTjYcP/R1m2LcFnLTxzDjSK38nxWcw
5t/Do7nBNgXL2ZdRHfJsy7bx/9RR55k16rvzkFgW8LpUa5Ro64onRh2PfMz2p0e8
QvZRBhXTh5/y4TD61y5Y8d9xawA6Hr1oEUtwsfovI9EiXzVaLl3sLI/nleed68Rk
eAD2h8+ZcBeJ+lRK3UHEzAvqh0u+IScRMJifCxHyJuoZiylHIHVVq7x40ywg0Ejq
8wHEj/nDJZHUxbuH4sm215Lv4dK6CmIP8UzuhfY6MxAS6Jo7Zdk1zv2SjJO2DzwT
rWU4hD0+khwTz3hBR341oWxb84C5MujPwkeP7mibR46HLHCn5imQMz0W+6tj7umb
dxnwPpXzON00+56B7e4i21aXTO0IaY3AcL9QuETSAaoy3SD5BdDkt3R9XWM+jqqZ
armE5nNAv8WEN8qUYL/YpBxFDYSZ3CFgNv1enoP2pSp4DqeF/H3aP4RWu+dYqLDm
MyVhcXUkjHfTCY6NVPPBkNwSvz2vq4ft/b6t7tLN+0ZmIRsEegKxxRrI2vB6O8Ga
Gh2iKcJfMp90jwwvywfGO+DNQ8npHvhxMkioyzMHflo0QyS2ZDhlf4ubp7cXlYZ6
tj7iGXJKJQpQyJWA58k8EXR9wc2W+fgRYD/H61QTTyTUgxEo6w10KjBDTsbFwvIY
R0poHCfRR0DQ7y3GerZO
=XEMm
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/armbru/tags/pull-ivshmem-2016-03-18' into staging
ivshmem: Fixes, cleanups, device model split
# gpg: Signature made Mon 21 Mar 2016 20:33:54 GMT using RSA key ID EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg: aka "Markus Armbruster <armbru@pond.sub.org>"
* remotes/armbru/tags/pull-ivshmem-2016-03-18: (40 commits)
contrib/ivshmem-server: Print "not for production" warning
ivshmem: Require master to have ID zero
ivshmem: Drop ivshmem property x-memdev
ivshmem: Clean up after the previous commit
ivshmem: Split ivshmem-plain, ivshmem-doorbell off ivshmem
ivshmem: Replace int role_val by OnOffAuto master
qdev: New DEFINE_PROP_ON_OFF_AUTO
ivshmem: Inline check_shm_size() into its only caller
ivshmem: Simplify memory regions for BAR 2 (shared memory)
ivshmem: Implement shm=... with a memory backend
ivshmem: Tighten check of property "size"
ivshmem: Simplify how we cope with short reads from server
ivshmem: Drop the hackish test for UNIX domain chardev
ivshmem: Rely on server sending the ID right after the version
ivshmem: Propagate errors through ivshmem_recv_setup()
ivshmem: Receive shared memory synchronously in realize()
ivshmem: Plug leaks on unplug, fix peer disconnect
ivshmem: Disentangle ivshmem_read()
ivshmem: Simplify rejection of invalid peer ID from server
ivshmem: Assert interrupts are set up once
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Move declarations out of qemu-common.h for functions declared in
utils/ files: e.g. include/qemu/path.h for utils/path.c.
Move inline functions out of qemu-common.h and into new files (e.g.
include/qemu/bcd.h)
Signed-off-by: Veronia Bahaa <veroniabahaa@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch replaces get_ticks_per_sec() calls with the macro
NANOSECONDS_PER_SECOND. Also, as there are no callers, get_ticks_per_sec()
is then removed. This replacement improves the readability and
understandability of code.
For example,
timer_mod(fdctrl->result_timer,
qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + (get_ticks_per_sec() / 50));
NANOSECONDS_PER_SECOND makes it obvious that qemu_clock_get_ns
matches the unit of the expression on the right side of the plus.
Signed-off-by: Rutuja Shah <rutu.shah.26@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
DMA_transfer_handler is actually an ISA thing, and as such has no
business in qemu-common.h. Move it to hw/isa/isa.h, and rename it to
IsaDmaTransferHandler.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
qemu-common.h should only be included by .c files. Its file comment
explains why: "No header file should depend on qemu-common.h, as this
would easily lead to circular header dependencies."
hw/pci/pci.h includes qemu-common.h, but its users only need pcibus_t
and PCIHostDeviceAddress from it. Move them to hw/pci/pci.h and drop
the ill-advised include. Include hw/pci/pci.h where the moved stuff
is now missing. Except we can't in target-i386/kvm_i386.h, because
that would break the i386-linux-user compile. Add
PCIHostDeviceAddress to qemu/typedefs.h instead.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
qemu-common.h should only be included by .c files. Its file comment
explains why: "No header file should depend on qemu-common.h, as this
would easily lead to circular header dependencies."
hw/hw.h includes qemu-common.h, but its users generally need only
hw_error() and qemu/module.h from it. Move the former to hw/hw.h,
include the latter there, and drop the ill-advised include.
hw/misc/cbus.c now misses hw_error(), so include hw/hw.h there.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Manually drop redundant includes that scripts/clean-includes misses,
e.g. because they're hidden in generator programs, or they use the
wrong kind of delimiter.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Re-run scripts/clean-includes to apply the previous commit's
corrections and updates. Besides redundant qemu/typedefs.h, this only
finds a redundant config-host.h include in ui/egl-helpers.c. No idea
how that escaped the previous runs.
Some manual whitespace trimming around dropped includes squashed in.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit 57cb38b included qapi/error.h into qemu/osdep.h to get the
Error typedef. Since then, we've moved to include qemu/osdep.h
everywhere. Its file comment explains: "To avoid getting into
possible circular include dependencies, this file should not include
any other QEMU headers, with the exceptions of config-host.h,
compiler.h, os-posix.h and os-win32.h, all of which are doing a
similar job to this file and are under similar constraints."
qapi/error.h doesn't do a similar job, and it doesn't adhere to
similar constraints: it includes qapi-types.h. That's in excess of
100KiB of crap most .c files don't actually need.
Add the typedef to qemu/typedefs.h, and include that instead of
qapi/error.h. Include qapi/error.h in .c files that need it and don't
get it now. Include qapi-types.h in qom/object.h for uint16List.
Update scripts/clean-includes accordingly. Update it further to match
reality: replace config.h by config-target.h, add sysemu/os-posix.h,
sysemu/os-win32.h. Update the list of includes in the qemu/osdep.h
comment quoted above similarly.
This reduces the number of objects depending on qapi/error.h from "all
of them" to less than a third. Unfortunately, the number depending on
qapi-types.h shrinks only a little. More work is needed for that one.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
[Fix compilation without the spice devel packages. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Migration with ivshmem needs to be carefully orchestrated to work.
Exactly one peer (the "master") migrates to the destination, all other
peers need to unplug (and disconnect), migrate, plug back (and
reconnect). This is sort of documented in qemu-doc.
If peers connect on the destination before migration completes, the
shared memory can get messed up. This isn't documented anywhere. Fix
that in qemu-doc.
To avoid messing up register IVPosition on migration, the server must
assign the same ID on source and destination. ivshmem-spec.txt leaves
ID assignment unspecified, however.
Amend ivshmem-spec.txt to require the first client to receive ID zero.
The example ivshmem-server complies: it always assigns the first
unused ID.
For a bit of additional safety, enforce ID zero for the master. This
does nothing when we're not using a server, because the ID is zero for
all peers then.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-40-git-send-email-armbru@redhat.com>
Move code to more sensible places. Use the opportunity to reorder and
document IVShmemState members.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-38-git-send-email-armbru@redhat.com>
ivshmem can be configured with and without interrupt capability
(a.k.a. "doorbell"). The two configurations have largely disjoint
options, which makes for a confusing (and badly checked) user
interface. Moreover, the device can't tell the guest whether its
doorbell is enabled.
Create two new device models ivshmem-plain and ivshmem-doorbell, and
deprecate the old one.
Changes from ivshmem:
* PCI revision is 1 instead of 0. The new revision is fully backwards
compatible for guests. Guests may elect to require at least
revision 1 to make sure they're not exposed to the funny "no shared
memory, yet" state.
* Property "role" replaced by "master". role=master becomes
master=on, role=peer becomes master=off. Default is off instead of
auto.
* Property "use64" is gone. The new devices always have 64 bit BARs.
Changes from ivshmem to ivshmem-plain:
* The Interrupt Pin register in PCI config space is zero (does not use
an interrupt pin) instead of one (uses INTA).
* Property "x-memdev" is renamed to "memdev".
* Properties "shm" and "size" are gone. Use property "memdev"
instead.
* Property "msi" is gone. The new device can't have MSI-X capability.
It can't interrupt anyway.
* Properties "ioeventfd" and "vectors" are gone. They're meaningless
without interrupts anyway.
Changes from ivshmem to ivshmem-doorbell:
* Property "msi" is gone. The new device always has MSI-X capability.
* Property "ioeventfd" defaults to on instead of off.
* Property "size" is gone. The new device can only map all the shared
memory received from the server.
Guests can easily find out whether the device is configured for
interrupts by checking for MSI-X capability.
Note: some code added in sub-optimal places to make the diff easier to
review. The next commit will move it to more sensible places.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-37-git-send-email-armbru@redhat.com>
In preparation of making it a qdev property.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-36-git-send-email-armbru@redhat.com>
ivshmem_realize() puts the shared memory region in a container region.
Used to be necessary to permit delayed mapping of the shared memory.
However, we recently moved to synchronous mapping, in "ivshmem:
Receive shared memory synchronously in realize()" and the commit
following it. The container is redundant since then. Drop it.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1458066895-20632-33-git-send-email-armbru@redhat.com>
ivshmem has its very own code to create and map shared memory.
Replace that with an implicitly created memory backend. Reduces the
number of ways we create BAR 2 from three to two.
The memory-backend-file is currently available only with CONFIG_LINUX,
so this adds a second Linuxism to ivshmem (the other one is eventfd).
Should we ever need to make it portable to systems where
memory-backend-file can't be made to serve, we could create a
memory-backend-shmem that allocates memory with shm_open().
Bonus fix: shared memory files are now created with permissions 0655
instead of 0777.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1458066895-20632-32-git-send-email-armbru@redhat.com>
If size_t is narrower than 64 bits, passing uint64_t ivshmem_size to
mmap() truncates. Reject such sizes.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-31-git-send-email-armbru@redhat.com>
Short reads from a UNIX domain sockets are exceedingly unlikely when
the other side always sends eight bytes and we always read eight
bytes. We cope with them anyway. However, the code doing that is
rather convoluted. Dumb it down radically.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-30-git-send-email-armbru@redhat.com>
The chardev must be capable of transmitting SCM_RIGHTS ancillary
messages. We check it by comparing CharDriverState member filename to
"unix:". That's almost as brittle as it is disgusting.
When the actual transmission all happened asynchronously, this check
was all we could do in realize(), and thus better than nothing. But
now we receive at least one SCM_RIGHTS synchronously in realize(),
it's not worth its keep anymore. Drop it.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-29-git-send-email-armbru@redhat.com>
The protocol specification (ivshmem-spec.txt, formerly
ivshmem_device_spec.txt) has always required the ID message to be sent
right at the beginning, and ivshmem-server has always complied. The
device, however, accepts it out of order. If an interrupt setup
arrived before it, though, it would be misinterpreted as connect
notification. Fix the latent bug by relying on the spec and
ivshmem-server's actual behavior.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-28-git-send-email-armbru@redhat.com>
This kills off the funny state described in the previous commit.
Simplify ivshmem_io_read() accordingly, and update documentation.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1458066895-20632-27-git-send-email-armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
When configured for interrupts (property "chardev" given), we receive
the shared memory from an ivshmem server. We do so asynchronously
after realize() completes, by setting up callbacks with
qemu_chr_add_handlers().
Keeping server I/O out of realize() that way avoids delays due to a
slow server. This is probably relevant only for hot plug.
However, this funny "no shared memory, yet" state of the device also
causes a raft of issues that are hard or impossible to work around:
* The guest is exposed to this state: when we enter and leave it its
shared memory contents is apruptly replaced, and device register
IVPosition changes.
This is a known issue. We document that guests should not access
the shared memory after device initialization until the IVPosition
register becomes non-negative.
For cold plug, the funny state is unlikely to be visible in
practice, because we normally receive the shared memory long before
the guest gets around to mess with the device.
For hot plug, the timing is tighter, but the relative slowness of
PCI device configuration has a good chance to hide the funny state.
In either case, guests complying with the documented procedure are
safe.
* Migration becomes racy.
If migration completes before the shared memory setup completes on
the source, shared memory contents is silently lost. Fortunately,
migration is rather unlikely to win this race.
If the shared memory's ramblock arrives at the destination before
shared memory setup completes, migration fails.
There is no known way for a management application to wait for
shared memory setup to complete.
All you can do is retry failed migration. You can improve your
chances by leaving more time between running the destination QEMU
and the migrate command.
To mitigate silent memory loss, you need to ensure the server
initializes shared memory exactly the same on source and
destination.
These issues are entirely undocumented so far.
I'd expect the server to be almost always fast enough to hide these
issues. But then rare catastrophic races are in a way the worst kind.
This is way more trouble than I'm willing to take from any device.
Kill the funny state by receiving shared memory synchronously in
realize(). If your hot plug hangs, go kill your ivshmem server.
For easier review, this commit only makes the receive synchronous, it
doesn't add the necessary error propagation. Without that, the funny
state persists. The next commit will do that, and kill it off for
real.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-26-git-send-email-armbru@redhat.com>
close_peer_eventfds() cleans up three things: ioeventfd triggers if
they exist, eventfds, and the array to store them.
Commit 98609cd (v1.2.0) fixed it not to clean up ioeventfd triggers
when they don't exist (property ioeventfd=off, which is the default).
Unfortunately, the fix also made it skip cleanup of the eventfds and
the array then. This is a memory and file descriptor leak on unplug.
Additionally, the reset of nb_eventfds is skipped. Doesn't matter on
unplug. On peer disconnect, however, this permanently wedges the
interrupt vectors used for that peer's ID. The eventfds stay behind,
but aren't connected to a peer anymore. When the ID gets recycled for
a new peer, the new peer's eventfds get assigned to vectors after the
old ones. Commonly, the device's number of vectors matches the
server's, so the new ones get dropped with a "Too many eventfd
received" message. Interrupts either don't work (common case) or go
to the wrong vector.
Fix by narrowing the conditional to just the ioeventfd trigger
cleanup.
While there, move the "invalid" peer check to the only caller where it
can actually happen, and tighten it to reject own ID.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-25-git-send-email-armbru@redhat.com>
ivshmem_read() processes server messages. These are 64 bit signed
integers. -1 is shared memory setup, 16 bit unsigned is a peer ID,
anything else is invalid.
ivshmem_read() rejects invalid negative messages right away, silently.
Invalid positive messages get rejected only in resize_peers(), and
ivshmem_read() then prints the rather cryptic message "failed to
resize peers array".
Extend the first check to cover all invalid messages, make it report
"server sent invalid message", and drop the second check.
Now resize_peers() can't fail anymore; simplify.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-23-git-send-email-armbru@redhat.com>
An interrupt is set up when the interrupt's file descriptor is
received. Each message applies to the next interrupt vector.
Therefore, each vector cannot be set up more than once.
ivshmem_add_kvm_msi_virq() half-heartedly tries not to rely on this by
doing nothing then, but that's not going to recover from this error
should it become possible in the future. watch_vector_notifier()
doesn't even try.
Simply assert what is the case, so we get alerted if we ever screw it
up.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-22-git-send-email-armbru@redhat.com>
The ivshmem device can either use MSI-X or legacy INTx for interrupts.
With MSI-X enabled, peer interrupt events trigger an MSI as they
should. But software can still raise INTx via interrupt status and
mask register in BAR 0. This is explicitly prohibited by PCI Local
Bus Specification Revision 3.0, section 6.8.3.3:
While enabled for MSI or MSI-X operation, a function is prohibited
from using its INTx# pin (if implemented) to request service (MSI,
MSI-X, and INTx# are mutually exclusive).
Fix the device model to leave INTx alone when using MSI-X.
Document that we claim to use INTx in config space even when we don't.
Unlike other devices, ivshmem does *not* use INTx when configured for
MSI-X and MSI-X isn't enabled by software.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1458066895-20632-21-git-send-email-armbru@redhat.com>
There are three predicates related to MSI-X:
* ivshmem_has_feature(s, IVSHMEM_MSI) is true unless the non-MSI-X
variant of the device is selected with msi=off.
* msix_present() is true when the device has the PCI capability MSI-X.
It's initially false, and becomes true during successful realize of
the MSI-X variant of the device. Thus, it's the same as
ivshmem_has_feature(s, IVSHMEM_MSI) for realized devices.
* msix_enabled() is true when msix_present() is true and guest software
has enabled MSI-X.
Code that differs between the non-MSI-X and the MSI-X variant of the
device needs to be guarded by ivshmem_has_feature(s, IVSHMEM_MSI) or
by msix_present(), except the latter works only for realized devices.
Code that depends on whether MSI-X is in use needs to be guarded with
msix_enabled().
Code review led me to two minor messes:
* ivshmem_vector_notify() calls msix_notify() even when
!msix_enabled(), unlike most other MSI-X-capable devices. As far as
I can tell, msix_notify() does nothing when !msix_enabled(). Add
the guard anyway.
* Most callers of ivshmem_use_msix() guard it with
ivshmem_has_feature(s, IVSHMEM_MSI). Not necessary, because
ivshmem_use_msix() does nothing when !msix_present(). That's
ivshmem's only use of msix_present(), though. Guard it
consistently, and drop the now redundant msix_present() check.
While there, rename ivshmem_use_msix() to ivshmem_msix_vector_use().
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1458066895-20632-20-git-send-email-armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
If pci_ivshmem_realize() fails after it created its migration blocker,
the blocker is left in place. Fix that by creating it last.
Likewise, if it fails after it called fifo8_create(), it leaks fifo
memory. Fix that the same way.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-18-git-send-email-armbru@redhat.com>
We reuse errp after passing it host_memory_backend_get_memory(). If
both host_memory_backend_get_memory() and the reuse set an error, the
reuse will fail the assertion in error_setv(). Fortunately,
host_memory_backend_get_memory() can't fail.
Pass it &error_abort to make our assumption explicit, and to get the
assertion failure in the right place should it become invalid.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-17-git-send-email-armbru@redhat.com>
Yes, the chardev is commonly useless after we read a bad version from
it, but destroying it is inappropriate anyway: the user created it, so
the user should be able to hold on to it as long as he likes. We
don't destroy it on other errors. Screwed up in commit 5105b1d.
Stop reading instead.
Also note QEMU's behavior in ivshmem-spec.txt.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-16-git-send-email-armbru@redhat.com>
IVShmemState member eventfd_chr is useless since commit 9940c32. Drop
it.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-14-git-send-email-armbru@redhat.com>
USB Ehci emulation supports host controller capability registers.
But its mmio '.write' function was missing, which lead to a null
pointer dereference issue. Add a do nothing 'ehci_caps_write'
definition to avoid it; Do nothing because capability registers
are Read Only(RO).
Reported-by: Zuozhi Fzz <zuozhi.fzz@alibaba-inc.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 1454072434-16045-1-git-send-email-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
inotify_init1 usage was guarded by a check for linux but does not
exist on older distributions like CentOS 5 resulting in build
failures.
Signed-off-by: Matthew Fortune <matthew.fortune@imgtec.com>
Message-id: 6D39441BF12EF246A7ABCE6654B023536BB85D4A@hhmail02.hh.imgtec.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
All the callers for xhci_dma_write_u32s() are using mostly 5 * uint32_t
in len. To avoid unbound stack warning for the function, make it
statically allocated, and assert when it's not big enough in the
future.
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-id: 1457661106-9569-1-git-send-email-peterx@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Mingw-w64 does not provide sys/ioctl.h and Linux builds don't need it,
so remove that include statement.
ERROR is defined by wingdi.h (included via windows.h). Undefine it before
it is redefined to avoid a compiler warning / error.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Message-id: 1458159439-32322-1-git-send-email-sw@weilnetz.de
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Simple unions were carrying a special case that hid their 'data'
QMP member from the resulting C struct, via the hack method
QAPISchemaObjectTypeVariant.simple_union_type(). But by using
the work we started by unboxing flat union and alternate
branches, coupled with the ability to visit the members of an
implicit type, we can now expose the simple union's implicit
type in qapi-types.h:
| struct q_obj_ImageInfoSpecificQCow2_wrapper {
| ImageInfoSpecificQCow2 *data;
| };
|
| struct q_obj_ImageInfoSpecificVmdk_wrapper {
| ImageInfoSpecificVmdk *data;
| };
...
| struct ImageInfoSpecific {
| ImageInfoSpecificKind type;
| union { /* union tag is @type */
| void *data;
|- ImageInfoSpecificQCow2 *qcow2;
|- ImageInfoSpecificVmdk *vmdk;
|+ q_obj_ImageInfoSpecificQCow2_wrapper qcow2;
|+ q_obj_ImageInfoSpecificVmdk_wrapper vmdk;
| } u;
| };
Doing this removes asymmetry between QAPI's QMP side and its
C side (both sides now expose 'data'), and means that the
treatment of a simple union as sugar for a flat union is now
equivalent in both languages (previously the two approaches used
a different layer of dereferencing, where the simple union could
be converted to a flat union with equivalent C layout but
different {} on the wire, or to an equivalent QMP wire form
but with different C representation). Using the implicit type
also lets us get rid of the simple_union_type() hack.
Of course, now all clients of simple unions have to adjust from
using su->u.member to using su->u.member.data; while this touches
a number of files in the tree, some earlier cleanup patches
helped minimize the change to the initialization of a temporary
variable rather than every single member access. The generated
qapi-visit.c code is also affected by the layout change:
|@@ -7393,10 +7393,10 @@ void visit_type_ImageInfoSpecific_member
| }
| switch (obj->type) {
| case IMAGE_INFO_SPECIFIC_KIND_QCOW2:
|- visit_type_ImageInfoSpecificQCow2(v, "data", &obj->u.qcow2, &err);
|+ visit_type_q_obj_ImageInfoSpecificQCow2_wrapper_members(v, &obj->u.qcow2, &err);
| break;
| case IMAGE_INFO_SPECIFIC_KIND_VMDK:
|- visit_type_ImageInfoSpecificVmdk(v, "data", &obj->u.vmdk, &err);
|+ visit_type_q_obj_ImageInfoSpecificVmdk_wrapper_members(v, &obj->u.vmdk, &err);
| break;
| default:
| abort();
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1458254921-17042-13-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Before this patch, blk_new() automatically assigned a name to the new
BlockBackend and considered it referenced by the monitor. This patch
removes the implicit monitor_add_blk() call from blk_new() (and
consequently the monitor_remove_blk() call from blk_delete(), too) and
thus blk_new() (and related functions) no longer take a BB name
argument.
In fact, there is only a single point where blk_new()/blk_new_open() is
called and the new BB is monitor-owned, and that is in blockdev_init().
Besides thus relieving us from having to invent names for all of the BBs
we use in qemu-img, this fixes a bug where qemu cannot create a new
image if there already is a monitor-owned BB named "image".
If a BB and its BDS tree are created in a single operation, as of this
patch the BDS tree will be created before the BB is given a name
(whereas it was the other way around before). This results in minor
change to the output of iotest 087, whose reference output is amended
accordingly.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The only remaining users of machine_init() only call
qemu_add_opts(). Rename machine_init() to opts_init() and move it
closer to the qemu_add_opts() calls on vl.c.
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Change all machine_init() users that simply call type_register*()
to use type_init().
Cc: Evgeny Voevodin <e.voevodin@samsung.com>
Cc: Maksim Kozlov <m.kozlov@samsung.com>
Cc: Igor Mitsyanko <i.mitsyanko@gmail.com>
Cc: Dmitry Solodkiy <d.solodkiy@samsung.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Andrzej Zaborowski <balrogg@gmail.com>
Cc: Michael Walle <michael@walle.cc>
Cc: "Hervé Poussineau" <hpoussin@reactos.org>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Leon Alrae <leon.alrae@imgtec.com>
Cc: Alexander Graf <agraf@suse.de>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Blue Swirl <blauwirbel@gmail.com>
Cc: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Acked-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The SD card object is not a SysBusDevice, so don't create it with
qdev_create() if we're not assigning it to a specific bus; use
object_new() instead.
This was causing 'info qtree' to segfault on boards with SD cards,
because qdev_create(NULL, TYPE_FOO) puts the created object on the
system bus, and then we may try to run functions like sysbus_dev_print()
on it, which fail when casting the object to SysBusDevice.
(This is the same mistake that we made with the NAND device
and fixed in commit 6749695eaaf346c1.)
Reported-by: xiaoqiang.zhao <zxq_yx_007@163.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: xiaoqiang.zhao <zxq_yx_007@163.com>
Message-id: 1458061009-7733-1-git-send-email-peter.maydell@linaro.org
At present, all DMA transfers complete inline (so a looping descriptor
queue will lock up the device). We also do not model pause/abort,
arbitrarion/priority, or debug features.
Signed-off-by: Grégory ESTRADE <gregory.estrade@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Message-id: 1457467526-8840-6-git-send-email-Andrew.Baumann@microsoft.com
[AB: implement 2D mode, cleanup/refactoring for upstream submission]
Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The property channel driver now interfaces with the framebuffer device
to query and set framebuffer parameters. As a result of this, the "get
ARM RAM size" query now correctly returns the video RAM base address
(not total RAM size), and the ram-size property is no longer relevant
here.
Signed-off-by: Grégory ESTRADE <gregory.estrade@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Message-id: 1457467526-8840-5-git-send-email-Andrew.Baumann@microsoft.com
[AB: cleanup/refactoring for upstream submission]
Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The framebuffer occupies the upper portion of memory (64MiB by
default), but it can only be controlled/configured via a system
mailbox or property channel (to be added by a subsequent patch).
Signed-off-by: Grégory ESTRADE <gregory.estrade@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Message-id: 1457467526-8840-4-git-send-email-Andrew.Baumann@microsoft.com
[AB: added Windows (BGR) support and cleanup/refactoring for upstream submission]
Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
At present only the core UART functions (data path for tx/rx) are
implemented, which is enough for UEFI to boot. The following
features/registers are unimplemented:
* Line/modem control
* Scratch register
* Extra control
* Baudrate
* SPI interfaces
Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1457467526-8840-3-git-send-email-Andrew.Baumann@microsoft.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1457467526-8840-2-git-send-email-Andrew.Baumann@microsoft.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The new machine is a thin layer over the AST2400 ARM926-based SoC[1].
Between the minimal machine and the current SoC implementation there is
enough functionality to boot an aspeed_defconfig Linux kernel to
userspace. Nothing yet is specific to the Palmetto's BMC (other than
using an AST2400 SoC), but creating specific machine types is preferable
to a generic machine that doesn't match any particular hardware.
[1] http://www.aspeedtech.com/products.php?fPath=20&rId=376
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Message-id: 1458096317-25223-5-git-send-email-andrew@aj.id.au
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
While the ASPEED AST2400 SoC[1] has a broad range of capabilities this
implementation is minimal, comprising an ARM926 processor, ASPEED VIC
and timer devices, and a 8250 UART.
[1] http://www.aspeedtech.com/products.php?fPath=20&rId=376
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Message-id: 1458096317-25223-4-git-send-email-andrew@aj.id.au
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Implement a basic ASPEED VIC device model for the AST2400 SoC[1], with
enough functionality to boot an aspeed_defconfig Linux kernel. The model
implements the 'new' (revised) register set: While the hardware exposes
both the new and legacy register sets, accesses to the model's legacy
register set will not be serviced (however the access will be logged).
[1] http://www.aspeedtech.com/products.php?fPath=20&rId=376
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Message-id: 1458096317-25223-3-git-send-email-andrew@aj.id.au
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Implement basic ASPEED timer functionality for the AST2400 SoC[1]: Up to
8 timers can independently be configured, enabled, reset and disabled.
Some hardware features are not implemented, namely clock value matching
and pulse generation, but the implementation is enough to boot the Linux
kernel configured with aspeed_defconfig.
[1] http://www.aspeedtech.com/products.php?fPath=20&rId=376
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Message-id: 1458096317-25223-2-git-send-email-andrew@aj.id.au
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
EPIT, GPT and other i.MX timers are using "abstract" clocks among which
a CLK_IPG_HIGH clock.
On i.MX25 and i.MX31 CLK_IPG and CLK_IPG_HIGH are mapped to the same clock
but on other SOC like i.MX6 they are mapped to distinct clocks.
This patch add the CLK_IPG_HIGH to prepare for SOC where these 2 clocks are
different.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Message-id: 224bf650194760284cb40630e985867e1373276a.1456868959.git.jcd@tribudubois.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Most clocks supported by the CCM are useless to the qemu framework.
Only clocks related to timers (EPIT, GPT, PWM, WATCHDOG, ...) are usefull
to QEMU code.
Therefore this patch removes clock computation handling for all clocks but:
* CLK_NONE,
* CLK_IPG,
* CLK_32k
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Message-id: 9e7222efb349801032e60c0f6b0fbad0e5dcf648.1456868959.git.jcd@tribudubois.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This way all CCM clock defines/enums are named CLK_XXX
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Message-id: 8537df765c1713625c7a8b9aca4c7ca60b42e0c0.1456868959.git.jcd@tribudubois.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
GPT timer need to rollover when it reaches 0xffffffff.
It also need to reset to 0 when in "restart mode" and crossing the
compare 1 register.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Message-id: 6e2b36117a249a78bf822dd59a390368f407136e.1456868959.git.jcd@tribudubois.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This patch moves the common class initialization code from
"virt-2.6" to the new abstract class. An empty property is added to
"virt-2.6" machine. In the meanwhile, related funtions are renamed
to "virt_2_6_*" for consistency.
Signed-off-by: Wei Huang <wei@redhat.com>
Message-id: 1457717778-17727-3-git-send-email-wei@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In preparation for future ARM virt machine types, this patch creates
an abstract type for all ARM machines. The current machine type in
QEMU (i.e. "virt") is renamed to "virt-2.6", whose naming scheme is
similar to other architectures. For the purpose of backward compatibility,
"virt" is converted to an alias, pointing to "virt-2.6". With this patch,
"qemu -M ?" lists the following virtual machine types along with others:
virt QEMU 2.6 ARM Virtual Machine (alias of virt-2.6)
virt-2.6 QEMU 2.6 ARM Virtual Machine
Signed-off-by: Wei Huang <wei@redhat.com>
Message-id: 1457717778-17727-2-git-send-email-wei@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Accumulated patches for target-ppc, pseries machine type and related
devices. As we are now in soft freeze, these are mostly fixes.
* Fix KVM migration for several SPRs that qemu didn't handle
* Clean up handling of SDR1, which allows a fix to the gdbstub
* Fix a race in spapr_rng
* Fix a bug with multifunction hotplug
The exception is the 7 patches to allow EEH on spapr-pci-host-bridge
devices (rather than the special and poorly designed
spapr-vfio-pci-host-bridge device). I believe these are low risk of
breaking non-EEH cases, and EEH cases were little used in practice
previously (since libvirt did not support the special device amongst
other things). It did have a draft posted before the soft freeze,
removes a very ugly VFIO interface, and removes device we'd like to
deprecate sooner rather than later. So, I'm hoping we can squeeze
these in during the soft freeze.
This includes two patches to the VFIO code, which Alex Williamson has
indicated he's ok with coming through my tree.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJW6Ol0AAoJEGw4ysog2bOSGs8QAMOnP0WTA7NXB5l5OBqM+pGI
cud7SnDr/GazPqvh1/1Enc+5M77gomdT6VHqlZGvgU23Iduil6a9mHeE89QY/b8B
wNM+mPvQH8TIp8Z9/GyayhgsK65LKa904Mw9C3vGh3Ecx9tAKm55IxDqZir15m2U
D9EHJQKkR4K6H5UyHr4eK8ACCWQdwn32VByEQ8hBV3wVszWR0+AKgCV30bH38c9/
rmAGr3VKtMUquGfVyMtvShoRmwHkLSL+Waxdqkfff6csCdAYH40N9CRCLQBdo08o
opd8dLbjCZlnwDoDIbw92i5P7oMysrIhCqOVqkiEGgUwkOIkR21SO0/DZdTWxkan
baFcTAYc8mcWc9fXUGHBcCBrU+ChRLI9h94x+BK8PzFDHY8SPGC3V+7lzzcooZHT
dmvPIY/JCdUVFYFQSYOznr3nni1L8M6Ol7OKyRrtr1dssCETLfI9fMCUxURybLUy
iolnJ8QGdcoSp620ewy4i33AKR+Y1Baby5AMK1iDGlwDlo/S7zqgIXVOyjU3ARZ6
yf6DZO1/0iLTu8nODqPd25GWuCB3GUmB5P6naZu+rEyvchBSTL5f6LOiXOWoso5f
Nk3oQ9GrlmIuOPemlIiO/yHbsb7GO9WYtl6fhpRp4CRdEvCj/i9xegZ+xX53VHWr
6kOHD8SHpF4qS+POWU5l
=7P3U
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.6-20160316' into staging
ppc patch queue for 2016-03-16
Accumulated patches for target-ppc, pseries machine type and related
devices. As we are now in soft freeze, these are mostly fixes.
* Fix KVM migration for several SPRs that qemu didn't handle
* Clean up handling of SDR1, which allows a fix to the gdbstub
* Fix a race in spapr_rng
* Fix a bug with multifunction hotplug
The exception is the 7 patches to allow EEH on spapr-pci-host-bridge
devices (rather than the special and poorly designed
spapr-vfio-pci-host-bridge device). I believe these are low risk of
breaking non-EEH cases, and EEH cases were little used in practice
previously (since libvirt did not support the special device amongst
other things). It did have a draft posted before the soft freeze,
removes a very ugly VFIO interface, and removes device we'd like to
deprecate sooner rather than later. So, I'm hoping we can squeeze
these in during the soft freeze.
This includes two patches to the VFIO code, which Alex Williamson has
indicated he's ok with coming through my tree.
# gpg: Signature made Wed 16 Mar 2016 05:04:52 GMT using RSA key ID 20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-2.6-20160316:
vfio: Eliminate vfio_container_ioctl()
spapr_pci: Remove finish_realize hook
spapr_pci: (Mostly) remove spapr-pci-vfio-host-bridge
spapr_pci: Allow EEH on spapr-pci-host-bridge
spapr_pci: Eliminate class callbacks
spapr_pci: Switch to vfio_eeh_as_op() interface
vfio: Start improving VFIO/EEH interface
spapr_rng: fix race with main loop
target-ppc: Eliminate kvmppc_kern_htab global
target-ppc: Add helpers for updating a CPU's SDR1 and external HPT
target-ppc: Split out SREGS get/put functions
spapr_pci: fix multifunction hotplug
target-ppc: Add PVR for POWER8NVL processor
ppc: Add a few more P8 PMU SPRs
ppc: Fix migration of the TAR SPR
ppc: Define the PSPB register on POWER8
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
vfio_container_ioctl() was a bad interface that bypassed abstraction
boundaries, had semantics that sat uneasily with its name, and was unsafe
in many realistic circumstances. Now that spapr-pci-vfio-host-bridge has
been folded into spapr-pci-host-bridge, there are no more users, so remove
it.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Now that spapr-pci-vfio-host-bridge is reduced to just a stub, there is
only one implementation of the finish_realize hook in sPAPRPHBClass. So,
we can fold that implementation into its (single) caller, and remove the
hook. That's the last thing left in sPAPRPHBClass, so that can go away as
well.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Now that the regular spapr-pci-host-bridge can handle EEH, there are only
two things that spapr-pci-vfio-host-bridge does differently:
1. automatically sizes its DMA window to match the host IOMMU
2. checks if the attached VFIO container is backed by the
VFIO_SPAPR_TCE_IOMMU type on the host
(1) is not particularly useful, since the default window used by the
regular host bridge will work with the host IOMMU configuration on all
current systems anyway.
Plus, automatically changing guest visible configuration (such as the DMA
window) based on host settings is generally a bad idea. It's not
definitively broken, since spapr-pci-vfio-host-bridge is only supposed to
support VFIO devices which can't be migrated anyway, but still.
(2) is not really useful, because if a guest tries to configure EEH on a
different host IOMMU, the first call will fail and that will be that.
It's possible there are scripts or tools out there which expect
spapr-pci-vfio-host-bridge, so we don't remove it entirely. This patch
reduces it to just a stub for backwards compatibility.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Now that the EEH code is independent of the special
spapr-vfio-pci-host-bridge device, we can allow it on all spapr PCI
host bridges instead. We do this by changing spapr_phb_eeh_available()
to be based on the vfio_eeh_as_ok() call instead of the host bridge class.
Because the value of vfio_eeh_as_ok() can change with devices being
hotplugged or unplugged, this can potentially lead to some strange edge
cases where the guest starts using EEH, then it starts failing because
of a change in status.
However, it's not really any worse than the current situation. Cases that
would have worked previously will still work (i.e. VFIO devices from at
most one VFIO IOMMU group per vPHB), it's just that it's no longer
necessary to use spapr-vfio-pci-host-bridge with the groupid pre-specified.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
The EEH operations in the spapr-vfio-pci-host-bridge no longer rely on the
special groupid field in sPAPRPHBVFIOState. So we can simplify, removing
the class specific callbacks with direct calls based on a simple
spapr_phb_eeh_enabled() helper. For now we implement that in terms of
a boolean in the class, but we'll continue to clean that up later.
On its own this is a rather strange way of doing things, but it's a useful
intermediate step to further cleanups.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
This switches all EEH on VFIO operations in spapr_pci_vfio.c from the
broken vfio_container_ioctl() interface to the new vfio_as_eeh_op()
interface.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
At present the code handling IBM's Enhanced Error Handling (EEH) interface
on VFIO devices operates by bypassing the usual VFIO logic with
vfio_container_ioctl(). That's a poorly designed interface with unclear
semantics about exactly what can be operated on.
In particular it operates on a single vfio container internally (hence the
name), but takes an address space and group id, from which it deduces the
container in a rather roundabout way. groupids are something that code
outside vfio shouldn't even be aware of.
This patch creates new interfaces for EEH operations. Internally we
have vfio_eeh_container_op() which takes a VFIOContainer object
directly. For external use we have vfio_eeh_as_ok() which determines
if an AddressSpace is usable for EEH (at present this means it has a
single container with exactly one group attached), and vfio_eeh_as_op()
which will perform an operation on an AddressSpace in the unambiguous case,
and otherwise returns an error.
This interface still isn't great, but it's enough of an improvement to
allow a number of cleanups in other places.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Since commit "60253ed1e6ec rng: add request queue support to rng-random",
the use of a spapr_rng device may hang vCPU threads.
The following path is taken without holding the lock to the main loop mutex:
h_random()
rng_backend_request_entropy()
rng_random_request_entropy()
qemu_set_fd_handler()
The consequence is that entropy_available() may be called before the vCPU
thread could even queue the request: depending on the scheduling, it may
happen that entropy_available() does not call random_recv()->qemu_sem_post().
The vCPU thread will then sleep forever in h_random()->qemu_sem_wait().
This could not happen before 60253ed1e6 because entropy_available() used
to call random_recv() unconditionally.
This patch ensures the lock is held to avoid the race.
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Cédric Le Goater <clg@fr.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
fa48b43 "target-ppc: Remove hack for ppc_hash64_load_hpte*() with HV KVM"
purports to remove a hack in the handling of hash page tables (HPTs)
managed by KVM instead of qemu. However, it actually went in the wrong
direction.
That patch requires anything looking for an external HPT (that is one not
managed by the guest itself) to check both env->external_htab (for a qemu
managed HPT) and kvmppc_kern_htab (for a KVM managed HPT). That's a
problem because kvmppc_kern_htab is local to mmu-hash64.c, but some places
which need to check for an external HPT are outside that, such as
kvm_arch_get_registers(). The latter was subtly broken by the earlier
patch such that gdbstub can no longer access memory.
Basically a KVM managed HPT is much more like a qemu managed HPT than it is
like a guest managed HPT, so the original "hack" was actually on the right
track.
This partially reverts fa48b43, so we again mark a KVM managed external HPT
by putting a special but non-NULL value in env->external_htab. It then
goes further, using that marker to eliminate the kvmppc_kern_htab global
entirely. The ppc_hash64_set_external_hpt() helper function is extended
to set that marker if passed a NULL value (if you're setting an external
HPT, but don't have an actual HPT to set, the assumption is that it must
be a KVM managed HPT).
This also has some flow-on changes to the HPT access helpers, required by
the above changes.
Reported-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Tested-by: Greg Kurz <gkurz@linux.vnet.ibm.com>