Commit d0ed8076cb converted the PCI config access to the memory
API, but also inadvertantly changed it to accept unaligned writes,
and corrupt the index register in the process. This causes a regression
booting NetBSD.
Fix by ignoring unaligned or non-dword writes.
https://bugs.launchpad.net/qemu/+bug/897771
Reported-by: Andreas Gustafsson <gson@gson.org>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Slot present bit is cleared apparently for each device. Hotplug and non
hotplug devices should not mix normally, and we only set the bit when we
add a device so it should all work out, but it's more robust to
explicitly account for more than one device per slot.
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The PCI hotplug eject register has always returned 0, so let's redefine
it as a hotplug feature register. The existing model of using separate
up & down read-only registers and an eject via write to this register
becomes the base implementation. As we make use of new interfaces we'll
set bits here to allow the BIOS and AML implementation to optimize for
the platform implementation.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Clarify this register as read-only and remove write code. No
change in existing behavior.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
As Michael Tsirkin demonstrated, current PCI hotplug is vulnerable
to a few races. The first is a race with other hotplug operations
because we clear the up & down registers at each event. If a new
event comes before the last is processed, up/down is cleared and
the event is lost.
To fix this for the down register, we create a life cycle for
the event request that starts with the hot unplug request in
piix4_device_hotplug() and ends when the device is ejected.
This allows us to mask and clear individual bits, preserving them
against races. For the up register, we have no clear end point
for when the event is finished. We could modify the BIOS to
acknowledge the bit and clear it, but this creates BIOS compatibiliy
issues without offering a complete solution. Instead we note that
gratuitous ACPI device checks are not harmful, which allows us to
issue a device check for every slot. We know which slots are present
and we know which slots are hotpluggable, so we can easily reduce
this to a more manageable set for the guest.
The other race Michael noted was that an unplug request followed
by reset may also lose the eject notification, which may also
result in the eject request being lost which a subsequent add
or remove. Once we're in reset, the device is unused and we can
flush the queue of device removals ourselves. Previously if a
device_del was issued to a guest without ACPI PCI hotplug support,
it was necessary to shutdown the guest to recover the device.
With this, a guest reboot is sufficient.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The write side of these registers is never used and actually can't be
used as defined because any read/modify/write sequence from the guest
potentially races with qemu. Drop the write support and define these
as read-only registers.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently the virtio balloon device, when using the virtio-pci interface
advertises itself with PCI class code MEMORY_RAM. This is wrong; the
balloon is vaguely related to memory, but is nothing like a PCI memory
device in the meaning of the class code, and this code is not required
or suggested by the virtio PCI specification.
Worse, this patch causes problems on the pseries machine, because the
firmware, seeing this class code, advertises the device as memory in the
device tree, and then a guest kernel bug causes it to see this "memory"
before the real system memory, leading to a crash in early boot.
This patch fixes the problem by removing the bogus PCI class code on the
balloon device. The backwards compatibility PC machines get new compat
properties so that they don't change.
Cc: Rusty Russell <rusty@rustcorp.com.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
ivshmem used msix but didn't call it on either reset or
config write paths. This used to partically work since
guests don't use all of msi-x configuration fields,
and reset is rarely used, but the patch 'msix: track function masked
in pci device state' broke that. Fix by adding appropriate calls.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reported-by: Cam Macdonell <cam@cs.ualberta.ca>
Tested-by: Cam Macdonell <cam@cs.ualberta.ca>
It's clear from the surrounding code that
start < end so it's enough to assert end < log_size.
However, it's better to make this explicit in case
we refactor the code again.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
When the vhost log is resized, we want to sync up to
the size of the old log. With that end address in place,
ignore regions that start after then end rather than
hitting assert.
This also addresses the following crash report:
When migrating a vm using vhost-net we hit the following assertion:
qemu-kvm: /usr/src/packages/BUILD/qemu-kvm-0.15.1/hw/vhost.c:30:
vhost_dev_sync_region: Assertion `start / (0x1000 * (8 *
sizeof(vhost_log_chunk_t))) < dev->log_size' failed.
The cases which the end < start check is intended to catch, such as
for vga video memory, will also likely trigger the assertion.
Reorder the code to handle this correctly.
Reported-by: Josh Durgin <josh.durgin@dreamhost.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Make it easier to add compat properties, by
adding macros for properties duplicated across
machine types.
Note: there could be bugs in compat properties,
this patch does not attempt to address them,
the code is bug for bug identical to the original.
Tested by: generated a preprocessed file, sorted and
compared to sorted original.
Lightly tested on x86_64.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
There is a typo in i440FX init code. This is causing problems when
somebody wants to access the 64bit PCI range.
Signed-off-by: Alexey Korolev <alexey.korolev@endace.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* commit 'ff71f2e8cacefae99179993204172bc65e4303df': (21 commits)
rtl8139: do the network/host communication only in normal operating mode
rtl8139: correctly check the opmode
net: move compute_mcast_idx() to net.h
rtl8139: support byte read to TxStatus registers
rtl8139: remove unused marco
rtl8139: limit transmission buffer size in c+ mode
pci_regs: Add PCI_EXP_TYPE_PCIE_BRIDGE
virtio-net: add DATA_VALID flag
pci_bridge: upper 32 bit are long registers
pci: fix bridge IO/BASE
pcie: drop functionality moved to core
pci: set memory type for memory behind the bridge
pci: add standard bridge device
slotid: add slot id capability
shpc: standard hot plug controller
pci_bridge: user-friendly default bus name
pci: make another unused extern function static
pci: don't export an internal function
pci_regs: Fix value of PCI_EXP_TYPE_RC_EC.
pci: Do not check if a bus exist in pci_parse_devaddr.
...
* kwolf/for-anthony: (46 commits)
qed: remove incoming live migration blocker
qed: honor BDRV_O_INCOMING for incoming live migration
migration: clear BDRV_O_INCOMING flags on end of incoming live migration
qed: add bdrv_invalidate_cache to be called after incoming live migration
blockdev: open images with BDRV_O_INCOMING on incoming live migration
block: add a function to clear incoming live migration flags
block: Add new BDRV_O_INCOMING flag to notice incoming live migration
block stream: close unused files and update ->backing_hd
qemu-iotests: Fix call syntax for qemu-io
qemu-iotests: Fix call syntax for qemu-img
qemu-iotests: Test unknown qcow2 header extensions
qemu-iotests: qcow2.py
sheepdog: fix send req helpers
sheepdog: implement SD_OP_FLUSH_VDI operation
block: bdrv_append() fixes
qed: track dirty flag status
qemu-img: add dirty flag status
qed: image fragmentation statistics
qemu-img: add image fragmentation statistics
block: document job API
...
This FIXME has already been actioned. Deleted comment.
Signed-off-by: Peter A. G. Crosthwaite <peter.crosthwaite@petalogix.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
It currently uses qerror_report(), but next commit will convert
the drive_del command to the QAPI and this requires using
error_set().
One particularity of qerror_report() is that it knows when it's
running on monitor context or command-line context and prints the
error message accordingly. error_set() doesn't do this, so we
have to be careful not to drop error messages.
qdev_unplug() has three kinds of usages:
1. It's called when hot adding a device fails, to undo anything
that has been done before hitting the error
2. It's called by function monitor functions like device_del(),
to unplug a device
3. It's used by xen_platform.c in a way that doesn't _seem_ to
be in monitor context
Only item 2 can print an error message to the user, this commit
maintains that.
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
The official spelling is QEMU.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Andreas Färber <afaerber@suse.de>
[blauwirbel@gmail.com: fixed comment style in hw/sun4m.c]
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Currently dma_bdrv_io() takes a 'to_dev' boolean parameter to
determine the direction of DMA it is emulating. We already have a
DMADirection enum designed specifically to encode DMA directions.
This patch uses it for dma_bdrv_io() as well. This involves removing
the DMADirection definition from the #ifdef it was inside, but since that
only existed to protect the definition of dma_addr_t from places where
config.h is not included, there wasn't any reason for it to be there in
the first place.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Storage interfaces like virtio-blk can be configured with block size
information so that the guest can take advantage of efficient I/O
request sizes.
According to the SCSI Block Commands (SBC) standard a device's block
size is "almost always greater than one byte and may be a multiple of
512 bytes". QEMU currently has a 512 byte minimum block size because
the block layer functions work at that granularity. Furthermore, the
block size should be a power of 2 because QEMU calculates bitmasks from
the value.
Introduce a "blocksize" property type so devices can enforce these
constraints on block size values. If the constraints are relaxed in the
future then this property can be updated.
Introduce the new PropertyValueNotPowerOf2 QError so QMP clients know
exactly why a block size value was rejected.
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Allow the user to specify a disk's World Wide Name.
Linux guests can address disks by their unique World Wide Name number
(e.g. /dev/disk/by-id/wwn-0x5001517959123522). This patch adds support
for assigning a World Wide Name number to a virtual IDE disk.
Cc: kwolf@redhat.com
Signed-off-by: Floris Bos <dev@noc-ps.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
strncpy may not null-terminate the destination string.
Cc: kwolf@redhat.com
Signed-off-by: Floris Bos <dev@noc-ps.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Allow the user to override the default disk model name "QEMU HARDDISK".
Some Linux distributions use the /dev/disk/by-id/scsi-SATA_name-of-disk-
model_serial addressing scheme when refering to partitions in /etc/fstab
and elsewhere. This causes problems when starting a disk image taken from
an existing physical server under qemu, because when running under qemu
name-of-disk-model is always "QEMU HARDDISK".
This patch introduces a model=s option which in combination with the
existing serial=s option can be used to fake the disk the operating
system was previously on, allowing the OS to boot properly.
Cc: kwolf@redhat.com
Signed-off-by: Floris Bos <dev@noc-ps.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
And remove several block_int.h inclusions that should not be there.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The vector interrupt has higher priority than interrupt_level_n.
Also check only interrupt_level_n concurency when TL > 0, the traps of
other types may be nested.
Signed-off-by: Artyom Tarasenko <atar4qemu@gmail.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Don't produce stray irq 5, don't overwrite ivec_data if still busy with
processing of the previous interrupt.
Signed-off-by: Artyom Tarasenko <atar4qemu@gmail.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
A strong limitation of QOM right now is that unconverted ports
(e.g. all...) do not give a canonical path to devices that are
part of the board. This in turn makes it impossible to replace
PROP_PTR with a QOM link for example.
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
We want the composition tree to to be in order by the time we call
qdev_init, so that a single set of the toplevel realize property can
propagate all the way down the composition tree.
This is not the case so far. Unfortunately, this is incompatible
with calling qdev_init in the constructor wrappers for devices,
so for now we need to unattach some devices that are created through
those wrappers. This will be fixed by removing qdev_init and instead
setting the toplevel realize property after machine init.
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This is QOM "mkdir -p". It is useful when referring to
container objects such as "/machine".
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
We never actually clear the TEMT (transmit sending register empty) flag when
populating the TSR. We set the flag, but since it's never cleared, setting it
is sort of pointless..
I found this with a unit test case.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
I'm not sure if the retry logic has ever worked when not using FIFO mode. I
found this while writing a test case although code inspection confirms it is
definitely broken.
The TSR retry logic will never actually happen because it is guarded by an
'if (s->tsr_rety > 0)' but this is the only place that can ever make the
variable greater than zero. That effectively makes the retry logic an 'if (0)'.
I believe this is a typo and the intention was >= 0. Once this is fixed though,
I see double transmits with my test case. This is because in the non FIFO
case, serial_xmit may get invoked while LSR.THRE is still high because the
character was processed but the retransmit timer was still active.
We can handle this by simply checking for LSR.THRE and returning early. It's
possible that the FIFO paths also need some attention.
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This bug existed since the first commit. Fortunately, the affected
registers have no functionality in qemu. This will only prevent the
following warning:
milkymist_vgafb: write access to unknown register 0x00000034
Signed-off-by: Michael Walle <michael@walle.cc>
The new version introduces the following new registers:
- SoC clock frequency: read-only of system clock used on the SoC
- debug scratchpad: 8 bit scratchpad register
- debug write lock: write once register, without any function on QEMU
Signed-off-by: Michael Walle <michael@walle.cc>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Since /i440fx/piix3 is being removed from the composition tree, the
IO-APIC is placed under /i440fx. This is wrong and should be changed
as soon as the /i440fx/piix3 path is put back.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This lets the user specify the desired semantics. By default, the RTC
will follow adjustments from the host's NTP client, and will remain in
sync when the virtual machine is stopped. The previous behavior, which
provides determinism with both icount and qtest, remains available with
"-rtc clock=vm".
pl031 supports migration, so we need to convert the time base from
rtc_clock to vm_clock and back for backwards compatibility. (The
rtc_clock may not be synchronized on the two machines, especially with
savevm/loadvm, so the conversion is needed anyway. And since any time
base will do, why not pick the one base that is backwards compatible).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This lets the user specify the desired semantics. By default, the RTC
will follow adjustments from the host's NTP client. "-rtc clock=vm" will
improve determinism with both icount and qtest. Finally, the previous
behavior is available with "-rtc clock=rt".
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The output of the pulse generator needs to be deterministic when
running in -icount mode, and to remain constant whenever the VM is
stopped. So the right clock to use is vm_clock.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* sstabellini/disk_io:
xen_disk: when using AIO flush after the operation is completed
xen_disk: open disk with BDRV_O_NOCACHE | BDRV_O_CACHE_WB | BDRV_O_NATIVE_AIO
We need to detach the blkdev from the BlockDriverState before calling
bdrv_delete.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
The first console has a different location compared to other PV devices
(console, rather than device/console/0) and doesn't obey the xenstore
state protocol. We already special case the first console in con_init
and con_initialise, we should also do it in con_disconnect.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Commit 45efb16124 optimized a bit too
much. We can skip the vga_invalidate_display() in case no console
switch happened because we don't need a full redraw then. We can *not*
skip vga_hw_update() though, because the screen content will be stale
then in case nobody else calls vga_hw_update().
Trigger: vga textmode with vnc display and no client connected.
Reported-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Alon Levy <alevy@redhat.com>
Tested-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
* sstabellini/saverestore-8:
xen: do not allocate RAM during INMIGRATE runstate
xen mapcache: check if memory region has moved.
xen: record physmap changes to xenstore
Set runstate to INMIGRATE earlier
Introduce "xen-save-devices-state"
cirrus_vga: do not reset videoram
Conflicts:
qapi-schema.json
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>