Version 3 ram_load depends on ram_addrs, which are not stable. Version 4
was introduced in 0.13 (and RHEL 6), so this means live migration from 0.12
and earlier to 1.1 or later will not work.
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
ram_addr is (a) unstable (b) going away. Sort by idstr instead.
Commit b2e0a138e initially introduced the sorting for the purpose
of improving debuggability. After this patch, the order is still
stable, but perhaps less usable by a human.
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Avoid using ram_addr_t, instead use (MemoryRegion *, offset) pairs.
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
As a step in moving live migration from RAMBlocks to MemoryRegions,
store the MemoryRegion in a RAMBlock.
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avi Kivity <avi@redhat.com>
Currently creating a memory region automatically registers it for
live migration. This differs from other state (which is enumerated
in a VMStateDescription structure) and ties the live migration code
into the memory core.
Decouple the two by introducing a separate API, vmstate_register_ram(),
for registering a RAM block for migration. Currently the same
implementation is reused, but later it can be moved into a separate list,
and registrations can be moved to VMStateDescription blocks.
Signed-off-by: Avi Kivity <avi@redhat.com>
Changed From V1:
Use DPRINTF instead of fprintf,because it is not an error.
When testing ipod on QEMU by He Jie Xu<xuhj@linux.vnet.ibm.com>,qemu made a assertion.
We found that the ipod with 2 configurations,and the usb-linux did not parse the descriptor correctly.
The descr_len returned is the total length of the all configurations,not one configuration.
The older version will through the other configurations instead of skip,continue parsing the descriptor of interfaces/endpoints in other configurations,then went wrong.
This patch will put the configuration descriptor parse in loop outside and dispel the other configurations not requested.
Signed-off-by: Cao,Bing Bu <mars@linux.vnet.ibm.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Juha Riihimäki <juha.riihimaki@nokia.com>
[Riku Voipio: Fixes and restructuring patchset]
Signed-off-by: Riku Voipio <riku.voipio@iki.fi>
[Peter Maydell: More fixes and cleanups for upstream submission]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add a post-load hook which invalidates the display. In particular, if we
don't do this and the display size we've just reloaded is larger than
the default then we will segfault trying to read off the end of the buffer.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The add_del/running_cpu code and env->halted are tracking stopped cpus.
Sleeping cpus (idle and enabled for interrupts) are waiting inside the
kernel.
No interrupt besides the restart can move a cpu from stopped to
operational. This is already handled over there. So lets just remove
the bogus wakup from the common interrupt delivery, otherwise any
interrupt will wake up a cpu, even if this cpu is stopped (Thus leading
to strange hangs on sigp restart)
This fixes
echo 0 > /sys/devices/system/cpu/cpu0/online
echo 1 > /sys/devices/system/cpu/cpu0/online
in the guest
Signed-off-by: Christian Borntraeger<borntraeger@de.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Newer gcc versions (or glibc?) also generate code that tries to EXECUTE
the TR opcode. Implement it so that we don't break valid guests.
Reported-by: Andreas Faerber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
All architectures can now use drive_add on the monitor. This of course
does not mean that there is hotplug support for the specific platform,
so in order to actually make use of the new drives you still need to
have a hotplug capable device.
Signed-off-by: Alexander Graf <agraf@suse.de>
The monitor command for hotplugging is in i386 specific code. This is just
plain wrong, as S390 just learned how to do hotplugging too and needs to
get drives for that.
So let's add a generic copy to generic code that handles drive_add in a
way that doesn't have pci dependencies. All pci specific code can then
be handled in a pci specific function.
Signed-off-by: Alexander Graf <agraf@suse.de>
---
v1 -> v2:
- align generic drive_add to pci specific one
- rework to split between generic and pci code
v2 -> v3:
- remove comment
All guest targets could potentially implement hotplugging. With the next
patches in this set I will also reflect this in the monitor interface.
So let's always compile it in. It shouldn't hurt.
Signed-off-by: Alexander Graf <agraf@suse.de>
I just submitted a few patches that enable the s390 virtio bus to receive
a hotplug add event. This patch implements the qemu side of it, so that new
hotplug events can be submitted to the guest.
Signed-off-by: Alexander Graf <agraf@suse.de>
---
v1 -> v2:
- make s390 virtio hoplug code emulate-capable
* qemu-kvm/memory/page_desc: (22 commits)
Remove cpu_get_physical_page_desc()
sparc: avoid cpu_get_physical_page_desc()
virtio-balloon: avoid cpu_get_physical_page_desc()
vhost: avoid cpu_get_physical_page_desc()
kvm: avoid cpu_get_physical_page_desc()
memory: remove CPUPhysMemoryClient
xen: convert to MemoryListener API
memory: temporarily add memory_region_get_ram_addr()
xen, vga: add API for registering the framebuffer
vhost: convert to MemoryListener API
kvm: convert to MemoryListener API
kvm: switch kvm slots to use host virtual address instead of ram_addr_t
memory: add API for observing updates to the physical memory map
memory: replace cpu_physical_sync_dirty_bitmap() with a memory API
framebuffer: drop use of cpu_physical_sync_dirty_bitmap()
loader: remove calls to cpu_get_physical_page_desc()
framebuffer: drop use of cpu_get_physical_page_desc()
memory: introduce memory_region_find()
memory: add memory_region_is_logging()
memory: add memory_region_is_rom()
...
This core is found on chips such as p4080, p3041, p2040, and p5020.
More needs to be done to make this viable for TCG (such as missing SPRs
and instructions), but this suffices to get KVM running with appropriate
kernel support.
Signed-off-by: Varun Sethi <Varun.Sethi@freescale.com>
[scottwood@freescale.com: tweak some flags]
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Check that devices on the spapr vio bus aren't given duplicate
addresses. Currently we will not run with duplicate devices, the
fdt code will spot it, but the error reporting is not great. With
this patch we can report the error nicely in terms of the device
names given by the user.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
There is a device tree property "/chosen/linux,stdout-path" which indicates
which device should be used as stdout - ie. "the console".
Currently we don't specify anything, which means both firmware and Linux
choose something arbitrarily. Use the routine we added in the last patch
to pick a default vty and specify it as stdout.
Currently SLOF doesn't use the property, but we are hoping to update it
to do so.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
In vty_lookup() we have a special case for supporting early debug in
the kernel. This accepts reg == 0 as a special case to mean "any vty".
We implement this by searching the vtys on the bus and returning the
first we find. This means that the vty we chose depends on the order
the vtys are specified on the QEMU command line - because that determines
the order of the vtys on the bus.
We'd rather the command line order was irrelevant, so instead return
the vty with the lowest reg value. This is still a guess as to what the
user really means, but it is at least stable WRT command line ordering.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
[agraf] fix braces
Although in theory the device tree has no inherent ordering, in practice
the order of nodes in the device tree does effect the order that devices
are detected by software.
Currently the ordering is determined by the order the devices appear on
the QEMU command line. Although that does give the user control over the
ordering, it is fragile, especially when the user does not generate the
command line manually - eg. when using libvirt etc.
So order the device tree based on the reg value, ie. the address of on
the VIO bus of the devices. This gives us a sane and stable ordering.
Signed-off-by: Michael Ellerman <michael@ellerman.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
[agraf] add braces
Add NUMA specific properties to guest's device tree to boot a multi-node
guests. This patch adds the following properties:
ibm,associativity
ibm,architecture-vec-5
ibm,associativity-reference-points
With this, it becomes possible to use -numa option on pseries targets.
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
For forgotten historical reasons, PAPR hypercalls for specific virtual IO
devices (oh which there are quite a number) are registered via a callback
in the VIOsPAPRDeviceInfo structure.
This is kind of ugly, so this patch instead registers hypercalls from
device_init() functions for each device type. This works just as well,
and is cleaner.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
When guest reset, we need to halt secondary cpus until guest kick them.
This already works for tcg. The patch add the support for kvm.
Signed-off-by: Liu Yu <yu.liu@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
[agraf: remove in-kernel irqchip code]
When trying to create a screen dump without having any VGA adapter
inside the guest, QEMU segfaults.
This is because it's trying to switch back to the "previous" screen
it was on before dumping the VGA screen. Unfortunately, in my case
there simply is no previous screen so it accesses a NULL pointer.
Fix it by checking if previous_active_console is actually available.
This is 1.0 material.
Signed-off-by: Alexander Graf <agraf@suse.de>
When run with a PPC Book3S (server) CPU Currently 'info tlb' in the
qemu monitor reports "dump_mmu: unimplemented". However, during
bringup work, it can be quite handy to have the SLB entries, which are
available in the CPUPPCState. This patch adds an implementation of
info tlb for book3s, which dumps the SLB.
Signed-off-by: Nishanth Aravamudan <nacc@us.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
when I tried qemu with -virtio-console pty the guest hangs and attaching
on /dev/pts/<x> does not return anything if the attachment is too late.
This results in pty_chr_write() returning 0, which causes the port to
get throttled. This results in the guest getting frozen as the
guest->host virtio_console writes don't return until the host releases
the vq element back to the guest.
For the virtio-serial use case we don't want to lose data but for the
console case we better drop data instead of "killing" the guest
console. If we get chardev->frontend notification and a better behaving
virtio-console we can revert this fix.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Make's multiple output syntax
x.c x.h: x.template
gen < x.template
actually invokes the command once for x.c and once for x.h (with differing $@
in each invocation). During a parallel build, the two commands may be invoked
in parallel; this opens up a race, where the second invocation trashes a file
supposedly produced during the first, and now in use by a dependent command.
The various qapi code generators are susceptible to this; fix by making them
generate just one file per invocation.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
* aneesh/for-upstream:
scripts/analyse-9p-simpletrace.py: Add symbolic names for 9p operations.
hw/9pfs: iattr_valid flags are kernel internal flags map them to 9p values.
hw/9pfs: Use the correct signed type for different variables
hw/9pfs: replace iovec manipulation with QEMUIOVector
* bonzini/nbd-for-anthony: (26 commits)
nbd: add myself as maintainer
qemu-nbd: throttle requests
qemu-nbd: asynchronous operation
qemu-nbd: add client pointer to NBDRequest
qemu-nbd: move client handling to nbd.c
qemu-nbd: use common main loop
link the main loop and its dependencies into the tools
qemu-nbd: introduce NBDRequest
qemu-nbd: introduce NBDExport
qemu-nbd: introduce nbd_do_receive_request
qemu-nbd: more robust handling of invalid requests
qemu-nbd: introduce nbd_do_send_reply
qemu-nbd: simplify nbd_trip
move corking functions to osdep.c
qemu-nbd: remove data_size argument to nbd_trip
qemu-nbd: remove offset argument to nbd_trip
Update ioctl order in nbd_init() to detect EBUSY
nbd: add support for NBD_CMD_TRIM
nbd: add support for NBD_CMD_FLUSH
nbd: add support for NBD_CMD_FLAG_FUA
...
qemu-kvm passes numa/SRAT topology information for smp_cpus to SeaBIOS. However
SeaBIOS always expects to setup max_cpus number of SRAT cpu entries
(MaxCountCPUs variable in build_srat function of Seabios). When qemu-kvm runs
with smp_cpus != max_cpus (e.g. -smp 2,maxcpus=4), Seabios will mistakenly use
memory SRAT info for setting up CPU SRAT entries for the offline CPUs. Wrong
SRAT memory entries are also created. This breaks NUMA in a guest.
Fix by setting up SRAT info for max_cpus in qemu-kvm.
Signed-off-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
The latter was already commented out, the former is redundant as well.
We always get the latest changes after return from the guest via
kvm_arch_post_run.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Keep a per-VCPU xsave buffer for kvm_put/get_xsave instead of
continuously allocating and freeing it on state sync.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Field 0 (FCW+FSW) and 1 (FTW+FOP) were hard-coded so far.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Limiting the number of in-flight requests is implemented very simply
with a can_read callback. It does not require a semaphore, unlike the
client side in block/nbd.c, because we can throttle directly the creation
of coroutines. The client side can have a coroutine created at any time
when an I/O request is made.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Using coroutines enable asynchronous operation on both the network and
the block side. Network can be owned by two coroutines at the same time,
one writing and one reading. On the send side, mutual exclusion is
guaranteed by a CoMutex. On the receive side, mutual exclusion is
guaranteed because new coroutines immediately start receiving data,
and no new coroutines are created as long as the previous one is receiving.
Between receive and send, qemu-nbd can have an arbitrary number of
in-flight block transfers. Throttling is implemented by the next
patch.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
By attaching a client to an NBDRequest, we can avoid passing around the
socket descriptor and data buffer.
Also, we can now manage the reference count for the client in
nbd_request_get/put request instead of having to do it ourselved in
nbd_read. This simplifies things when coroutines are used.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch sets up the fd handler in nbd.c instead of qemu-nbd.c. It
introduces NBDClient, which wraps the arguments to nbd_trip in a single
structure, so that we can add a notifier to it. This way, qemu-nbd can
know about disconnections.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Using a single main loop for sockets will help yielding from the socket
coroutine back to the main loop, and later reentering it.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>