This patch can be applied to both qemu-xen and qemu and adds support
for empty write barriers to xen_disk.
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Acked-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
An old version of this patch was applied to master, so this contains the
differences between v1 and v2.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Currently device hotplug removal code is tied to device removal via
ACPI. All pci devices that are removable via device_del() require the
guest to respond to the request. In some cases the guest may not
respond leaving the device still accessible to the guest. The management
layer doesn't currently have a reliable way to revoke access to host
resource in the presence of an uncooperative guest.
This patch implements a new monitor command, drive_del, which
provides an explicit command to revoke access to a host block device.
drive_del first quiesces the block device (qemu_aio_flush;
bdrv_flush() and bdrv_close()). This prevents further IO from being
submitted against the host device. Finally, drive_del cleans up
pointers between the drive object (host resource) and the device
object (guest resource).
Signed-off-by: Ryan Harper <ryanh@us.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
SCSI read/write requests should not be re-issued before the current
fragment of I/O completes. There are asserts in scsi-disk.c that guard
this constraint but they trigger on SPARC Linux 2.4. It turns out that
the asserts are too early in the code path and don't allow for read
requests to terminate.
Only the read assert needs to be moved but move the write assert too for
consistency.
Reported-by: Nigel Horne <njh@bandsman.co.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Rename the members of target_ucontext so that they don't conflict
with possible host macros for ucontext members. This has already
been done for the other targets.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Edgar E. Iglesias <edgar@axis.com>
Errors should be logged using error_report() so they go to the
appropriate monitor.
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
As pointed out by avi the vgabios update is guest-visible and thus has
migration implications.
One change is that the vga has a valid pci rom bar now. We already have
a pci bus property to enable/disable the rom bar and we'll load the bios
via fw_cfg as fallback for the no-rom-bar case. So we just have to add
compat properties to handle this case.
A second change is that the magic bochs lfb @ 0xe0000000 is gone. When
live-migrating a guest from a older qemu version it might be using the
lfb though, so we have to keep it for the old machine types. The patch
enables the bochs lfb in case we don't have the pci rom bar enabled
(i.e. we are in 0.13+older compat mode).
This patch depends on these patches which add (and use) the pc-0.13
machine type:
http://patchwork.ozlabs.org/patch/70797/http://patchwork.ozlabs.org/patch/70798/
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Cc: avi@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
While not explicitly stated in the spec, it was observed on real systems
that enabling loopback testing on the pcnet controller disables
reception of external frames. And some legacy software relies on it, so
provide this behavior.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
The current ioport callbacks are not type-safe, in that they accept an "opaque"
pointer as an argument whose type must match the argument to the registration
function; this is not checked by the compiler.
This patch adds an alternative that is type-safe. Instead of an opaque
argument, both registation and the callback use a new IOPort type. The
callback then uses container_of() to access its main structures.
Currently the old and new methods exist side by side; once the old way is gone,
we can also save a bunch of memory since the new method requires one pointer
per ioport instead of 6.
Acked-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
VM state change notifications are invoked from vm_start()/vm_stop().
Trace these state changes so we can reason about the state of the VM
from trace output.
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patch enables MSI-X for virtfs-9p-pci. It also adds a
compat property to pc-0.13 which turns it of there to stay
compatible to 0.13-stable.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
fprintf_function adds format checking with GCC_FMT_ATTR.
Cc: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Neither DECLARE_SPRINTF nor BAD_SPRINTF are needed for QEMU.
QEMU won't support systems with missing or bad declarations
for sprintf. The unused code was detected while looking for
functions with missing format checking. Instead of adding
GCC_FMT_ATTR, the unused code was removed.
Cc: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Stefan Weil <weil@mail.berlios.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
We have an OS which writes to port 0x400 when probing for special hardware.
This causes an exit of the VM. With SeaBIOS this port isn't used anyway.
Signed-off-by: Alexander Graf <agraf@suse.de>
Reviewed-By: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Bernhard Kohl <bernhard.kohl@nsn.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
signrom.sh has multiple bugs:
- the last byte is considered when calculating the existing checksum, but not
when computing the correction
- apprently the 'expr' expression overflows and produces incorrect results with
larger roms
- if the checksum happened to be zero, we calculated the correction byte to be
256
Instead of rewriting this in half a line of python, this patch fixes the bugs.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Block migration can submit multiple AIO reads for the same sector/chunk, but
completion of such reads can happen out of order:
migration guest
- get_dirty(N)
- aio_read(N)
- clear_dirty(N)
write(N)
set_dirty(N)
- get_dirty(N)
- aio_read(N)
If the first aio_read completes after the second, stale data will be
migrated to the destination.
Fix by not allowing multiple AIOs inflight for the same sector.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Sectors are marked dirty in the bitmap on AIO submission. This is wrong
since data has not reached storage.
Set a given sector as dirty in the dirty bitmap on AIO completion, so that
reading a sector marked as dirty is guaranteed to return uptodate data.
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Otherwise upper 32 bits of bitmap entries are not correctly calculated.
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This introduces generation of a qemu.stp/qemu-system-XXX.stp
files which provides tapsets with friendly names for static
probes & their arguments. Instead of
probe process("qemu").mark("qemu_malloc") {
printf("Malloc %d %p\n", $arg1, $arg2);
}
It is now possible todo
probe qemu.system.i386.qemu_malloc {
printf("Malloc %d %p\n", size, ptr);
}
There is one tapset defined per target arch, for both
user and system emulators.
* Makefile.target: Generate stp files for each target
* tracetool: Support for generating systemtap tapsets
* configure: Check for whether systemtap is available
with the DTrace backend
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This introduces a new tracing backend that targets the SystemTAP
implementation of DTrace userspace tracing. The core functionality
should be applicable and standard across any DTrace implementation
on Solaris, OS-X, *BSD, but the Makefile rules will likely need
some small additional changes to cope with OS specific build
requirements.
This backend builds a little differently from the other tracing
backends. Specifically there is no 'trace.c' file, because the
'dtrace' command line tool generates a '.o' file directly from
the dtrace probe definition file. The probe definition is usually
named with a '.d' extension but QEMU uses '.d' files for its
external makefile dependancy tracking, so this uses '.dtrace' as
the extension for the probe definition file.
The 'tracetool' program gains the ability to generate a trace.h
file for DTrace, and also to generate the trace.d file containing
the dtrace probe definition.
Example usage of a dtrace probe in systemtap looks like:
probe process("qemu").mark("qemu_malloc") {
printf("Malloc %d %p\n", $arg1, $arg2);
}
* .gitignore: Ignore trace-dtrace.*
* Makefile: Extra rules for generating DTrace files
* Makefile.obj: Don't build trace.o for DTrace, use
trace-dtrace.o generated by 'dtrace' instead
* tracetool: Support for generating DTrace data files
Reviewed-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
We can't let the compiler define the alignment for qemu_cfg data.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Fix a makefile error that meant that qemu would not compile if
the source and object directories were the same.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Since commit 4bed983730 an .fd_read()
handler that deletes its IOHandler is exposed to .fd_write() being
called on the deleted IOHandler.
This patch fixes deletion so that .fd_read() and .fd_write() are never
called on an IOHandler that is marked for deletion.
Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Some devices seem to choke on receiving a USB_REQ_GET_CONFIGURATION ctrl msg
(witnessed with a digital picture frame usb id 1908:1320).
When usb_fs_type == USB_FS_SYS, the active configuration can be read directly
from sysfs, which allows using this device through qemu's usb redirection.
More in general it seems a good idea to not send needless control msg's to
devices, esp. as the code in question is called every time a set_interface
is done. Which happens multiple times during virtual machine startup, and
when device drivers are activating the usb device.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
The next patch in this series introduces multiple ways to get the
configuration dependent upon usb_fs_type, it is cleaner to put this
into its own function.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This allows us to recreate the sysfspath used during scanning later
(which will be used in a later patch in this series).
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patch adds missing braces around if/else statements that call
macros which are likely to result in errors if the macro is
changed. It also makes the code comply better with CODING_STYLE.
Signed-off-by: Jes Sorensen <Jes.Sorensen@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This introduces generation of a qemu.stp/qemu-system-XXX.stp
files which provides tapsets with friendly names for static
probes & their arguments. Instead of
probe process("qemu").mark("qemu_malloc") {
printf("Malloc %d %p\n", $arg1, $arg2);
}
It is now possible todo
probe qemu.system.i386.qemu_malloc {
printf("Malloc %d %p\n", size, ptr);
}
There is one tapset defined per target arch.
* Makefile: Generate a qemu.stp file for systemtap
* tracetool: Support for generating systemtap tapsets
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This introduces a new tracing backend that targets the SystemTAP
implementation of DTrace userspace tracing. The core functionality
should be applicable and standard across any DTrace implementation
on Solaris, OS-X, *BSD, but the Makefile rules will likely need
some small additional changes to cope with OS specific build
requirements.
This backend builds a little differently from the other tracing
backends. Specifically there is no 'trace.c' file, because the
'dtrace' command line tool generates a '.o' file directly from
the dtrace probe definition file. The probe definition is usually
named with a '.d' extension but QEMU uses '.d' files for its
external makefile dependancy tracking, so this uses '.dtrace' as
the extension for the probe definition file.
The 'tracetool' program gains the ability to generate a trace.h
file for DTrace, and also to generate the trace.d file containing
the dtrace probe definition.
Example usage of a dtrace probe in systemtap looks like:
probe process("qemu").mark("qemu_malloc") {
printf("Malloc %d %p\n", $arg1, $arg2);
}
* .gitignore: Ignore trace-dtrace.*
* Makefile: Extra rules for generating DTrace files
* Makefile.obj: Don't build trace.o for DTrace, use
trace-dtrace.o generated by 'dtrace' instead
* tracetool: Support for generating DTrace data files
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
A via -kernel supplied x86_64 ELF image is being started in 32bit mode.
Detect and exit if a 64bit image has been supplied.
Signed-off-by: Adam Lackorzynski <adam@os.inf.tu-dresden.de>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
local_apics are allocated sequentially and never removed, so
we can stop any iterations that go to MAX_APICS as soon as we
hit the first NULL. Looking at a small guest running a virtio-net
workload with oprofile, this drops apic_get_delivery_bitmask()
from #3 in the profile to down in the noise.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patch fixes hot unplug of cold plugged devices
(those present at system start), which got broken by
5beb8ad503 .
Signed-off-by: Isaku Yamahata <yamahata@valinux.co.jp>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Cam Macdonell <cam@cs.ualberta.ca>
Tested-by: Cam Macdonell <cam@cs.ualberta.ca>
Reported-by: Cam Macdonell <cam@cs.ualberta.ca>.
pcibus_dev_print() was erroneously retrieving the device bus
number from the secondary bus number offset of the device
instead of the bridge above the device. This ends of landing
in the 2nd byte of the 3rd BAR for devices, which thankfully
is usually zero.
Note: pcibus_get_dev_path() copied this code,
inheriting the same bug. pcibus_get_dev_path() is used for
ramblock naming, so changing it can effect migration. However,
I've only seen this byte be non-zero for an assigned device,
which can't migrate anyway, so hopefully we won't run into
any issues.
This patch does not touch pcibus_get_dev_path, as
bus number is guest assigned for nested buses,
so using it for migration is broken anyway.
Fix it properly later.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
When adding the length to the pseudo header, we're not properly
accounting for overflow.
From: Mark Wu <dwu@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
virtio-net expects set_offload to succeed after
peer cleanup.
Since we don't have an open fd anymore, make it so.
Fixes warning about the failure of offload setting.
Reported-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Frontends calling tap_get_vhost_net get an invalid pointer after the
peer backend has been deleted. Jason Wang <jasowang@redhat.com> reports
this leading to a crash in ack_features when we remove the vhost-net
bakend of a virtio nic.
The fix is simply to clear the backend pointer.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>