All of universal-obj-y, user-obj-y (right now unused) and common-obj-y can
be unified into common-obj-y if we take care of defining CONFIG_SOFTMMU
and CONFIG_USER_ONLY in the toplevel makefile. This is similar to how
we define symbols for hardware components.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
It is also needed if !CONFIG_SOFTMMU, unlike everything that surrounds it.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
I had missed the introduction of the gcov-files-* variables.
Cc: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
The compatible string is changed to fsl,mpic on all e500 platforms, to
advertise the existence of BRR1. This matches what the device tree will
have on real hardware.
With MPIC v4.2 max_cpu can be increased from 15 to 32.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
MPIC+0xa0 is IACK for the current CPU. MPIC+0x200a0 is IACK for CPU 0.
This fix allows EPR to work with an SMP target.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Besides the new value in the version register, this provides:
- ILR support, which includes:
- IDR becoming a pure CPU bitmap, allowing 32 CPUs
- machine check output support (though other parts of QEMU need to
be fixed for it to do something other than immediately reboot the
guest)
- dummy error interrupt support (EISR0/EIMR0 read as zero)
- actually all FSL MPICs get all summary registers returning zero for now,
which includes EISR0/EIMR0
Various refactoring is done to support these changes and to ease
new functionality (e.g. a more flexible way of declaring regions).
Just as the code was already not a full implementation of MPIC v2.0,
this is not a full implementation of MPIC v4.2 -- e.g. it still has only
one bank of MSIs.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
The timer memory range begins at 0x10f0, so that address 0x1120 shows
up as 0x30, 0x1130 shows up as 0x40, etc. However, the address
decoding (other than TFRR) is not adjusted for this, causing the
wrong registers to be accessed.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
openpic_update_irq() was checking idr rather than destmask, treating
it as if it were a simple bitmap of cpus. Changed to use destmask.
IPI delivery was removing bits directly from .idr, without calling
write_IRQreg_idr so that the change could be conveyed to destmask.
Changed to use destmask directly.
Save/restore destmask when serializing, as due to the IPI change it
cannot be reproduced from idr.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Currently, if VIO devices for pseries don't have addresses explicitly
allocated, they get automatically numbered from 0x1000. This is in the
same general range that libvirt will typically assign VIO device addresses.
That means that if there is a device libvirt doesn't know about, and it
gets an address assigned before the libvirt assigned devices are processed,
we can end up with an address conflict (qemu will abort with an error).
While the real solution is to teach libvirt about the other devices, so it
can correctly manage the whole allocation, this patch reduces the interim
inconvenience by moving qemu allocations to a range that libvirt is less
likely to conflict with.
Because the guest gets the device addresses through the device tree, these
addresses are truly arbitrary and can be changed without breaking guests.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Multiple - even many - PCI host bridges (i.e. PCI domains) are very
common on real PAPR compliant hardware. For reasons related to the
PAPR specified IOMMU interfaces, PCI device assignment with VFIO will
generally require at least two (virtual) PHBs and possibly more
depending on which devices are assigned.
At the moment the qemu PAPR PCI code will not deal with this well,
leaving several crucial parameters of PHBs other than the default one
uninitialized. This patch reworks the code to allow this.
Every PHB needs a unique BUID (Bus Unit Identifier, the id used for
the PAPR PCI related interfaces) and a unique LIOBN (Logical IO Bus
Number, the id used for the PAPR IOMMU related interfaces). In
addition they need windows in CPU real address space to access PCI
memory space, PCI IO space and MSIs. Properties are added to the PCI
host bridge qdevice to allow configuration of all these.
To simplify configuration of multiple PHBs for common cases, a
convenience "index" property is also added. This can be set instead
of the low-level properties, and will generate suitable values for the
other parameters, different for each index value.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Currently the target-ppc tcg code only supports a single thread. You can
specify more, but they're treated identically to multiple cores. On KVM
we obviously can't support more threads than the hardware; if more are
specified it will cause strange and cryptic errors.
This patch clarifies the situation by giving a simple meaningful error if
more threads are specified than we can support.
Signed-off-by: Mike Qiu <qiudayu@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Replace the global adb_bus with a CUDA-internal one, accessed using
regular qdev child bus accessor.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
They were not qdev'ified before. Derive ADBDevice from DeviceState and
convert reset callbacks to DeviceClass::reset, ADBDevice::opaque pointer
to ADBDevice subtypes for mouse and keyboard and adb_{kbd,mouse}_init()
to regular qdev functions.
Fixing Coding Style issues and splitting keyboard and mouse off into
their own files is left for a later point in time.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
It was not a qbus before, turn it into a first-class bus and initialize
it properly from CUDA. Leave it a global variable as long as devices are
not QOM'ified yet.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
It was not qdev'ified before. Turn it into a SysBusDevice and embed it
in MacIO.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
It was not qdev'ified before. Turn it into a SysBusDevice.
Embed them into the MacIO devices.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
It was not qdev'ified before. Turn it into a SysBusDevice and
initialize it via static properties.
Prepare Old World specific MacIO state and embed the NVRAM state there.
Drop macio_nvram_setup_bar() in favor of sysbus_mmio_map() or
direct use of Memory API.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
The state data field is accessed in uint8_t quantities, so switch from
uint32_t argument and return value to uint8_t.
Fix debug format specifiers while at it.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
Let the machines create two different types. This prepares to move
knowledge about sub-devices from the machines into the devices.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
This turns macio_bar_setup() into an implementation detail of the qdev
initfn, to be removed step by step.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
Move bar MemoryRegion initialization to an instance_init.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
Add comments to help static analysers detect that these cases are
intentional, and clean up some whitespace in the environment of these
comments.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
The qmp monitor command to mirror a disk was passing -1 for size
along with the disk's backing file. This size of the resulting disk
is the size of the backing file, which is incorrect if the disk
has been resized. Therefore we should always pass in the size of
the current disk.
Signed-off-by: Vishvananda Ishaya <vishvananda@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Jason tested these patches by migrating Windows 7 and Fedora 17 guests
(while under I/O) on both piix with ahci attached and on q35 (which has
a built-in AHCI controller).
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The size of an int depends on the host, so in order to be able to
migrate these fields, make them either int32_t or bool, depending on the
use.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
'dma_status' and 'dma_cb' are written to, but never read.
Remove these fields in preparation for AHCI migration bits.
Signed-off-by: Jason Baron <jbaron@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
hbitmap_iter_init causes an out-of-bounds access when the "first"
argument is or greater than or equal to the size of the bitmap.
Forbid this with an assertion, and remove the failing testcase.
Reported-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
On a zero-sized disk we need to break out of the job successfully
before bdrv_dirty_iter_init is called, otherwise you will get an
assertion failure with the next patch.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
vdi_open did not check for a bad signature.
This check was only in vdi_probe.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
vdi_open returned -1 in case of any error, but it should return an
error code (negative value of errno or -EMEDIUMTYPE).
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The signature is a 32 bit value and needs up to 8 hex digits for printing.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This improves error reports for bochs, cow, qcow, qcow2, qed and vmdk
when a file with the wrong format is selected.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The block drivers need a special error code for "wrong format".
From the available error codes EMEDIUMTYPE fits best.
It is not available on all platforms, so a definition in
qemu-common.h and a specific error report are needed.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Yet another optimization is to extend the mirroring iteration to include more
adjacent dirty blocks. This limits the number of I/O operations and makes
mirroring efficient even with a small granularity. Most of the infrastructure
is already in place; we only need to put a loop around the computation of
the origin and sector count of the iteration.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
With AIO support in place, we can start copying more than one chunk
in parallel. This patch introduces the required infrastructure for
this: the buffer is split into multiple granularity-sized chunks,
and there is a free list to access them.
Because of copy-on-write, a single operation may already require
multiple chunks to be available on the free list.
In addition, two different iterations on the HBitmap may want to
copy the same cluster. We avoid this by keeping a bitmap of in-flight
I/O operations, and blocking until the previous iteration completes.
This should be a pretty rare occurrence, though; as long as there is
no overlap the next iteration can start before the previous one finishes.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This makes sense when the next commit starts using the extra buffer space
to perform many I/O operations asynchronously.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
There is really no change in the behavior of the job here, since
there is still a maximum of one in-flight I/O operation between
the source and the target. However, this patch already introduces
the AIO callbacks (which are unmodified in the next patch)
and some of the logic to count in-flight operations and only
complete the job when there is none.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The desired granularity may be very different depending on the kind of
operation (e.g. continuous replication vs. collapse-to-raw) and whether
the VM is expected to perform lots of I/O while mirroring is in progress.
Allow the user to customize it, while providing a sane default so that
in general there will be no extra allocated space in the target compared
to the source.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When mirroring runs, the backing files for the target may not yet be
ready. However, this means that a copy-on-write operation on the target
would fill the missing sectors with zeros. Copy-on-write only happens
if the granularity of the dirty bitmap is smaller than the cluster size
(and only for clusters that are allocated in the source after the job
has started copying). So far, the granularity was fixed to 1MB; to avoid
the problem we detected the situation and required the backing files to
be available in that case only.
However, we want to lower the granularity for efficiency, so we need
a better solution. The solution is to always copy a whole cluster the
first time it is touched. The code keeps a bitmap of clusters that
have already been allocated by the mirroring job, and only does "manual"
copy-on-write if the chunk being copied is zero in the bitmap.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This is needed in the following patch.
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This actually uses the dirty bitmap in the block layer, and converts
mirroring to use an HBitmapIter.
Reviewed-by: Laszlo Ersek <lersek@redhat.com> (except block/mirror.c parts)
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
HBitmaps provides an array of bits. The bits are stored as usual in an
array of unsigned longs, but HBitmap is also optimized to provide fast
iteration over set bits; going from one bit to the next is O(logB n)
worst case, with B = sizeof(long) * CHAR_BIT: the result is low enough
that the number of levels is in fact fixed.
In order to do this, it stacks multiple bitmaps with progressively coarser
granularity; in all levels except the last, bit N is set iff the N-th
unsigned long is nonzero in the immediately next level. When iteration
completes on the last level it can examine the 2nd-last level to quickly
skip entire words, and even do so recursively to skip blocks of 64 words or
powers thereof (32 on 32-bit machines).
Given an index in the bitmap, it can be split in group of bits like
this (for the 64-bit case):
bits 0-57 => word in the last bitmap | bits 58-63 => bit in the word
bits 0-51 => word in the 2nd-last bitmap | bits 52-57 => bit in the word
bits 0-45 => word in the 3rd-last bitmap | bits 46-51 => bit in the word
So it is easy to move up simply by shifting the index right by
log2(BITS_PER_LONG) bits. To move down, you shift the index left
similarly, and add the word index within the group. Iteration uses
ffs (find first set bit) to find the next word to examine; this
operation can be done in constant time in most current architectures.
Setting or clearing a range of m bits on all levels, the work to perform
is O(m + m/W + m/W^2 + ...), which is O(m) like on a regular bitmap.
When iterating on a bitmap, each bit (on any level) is only visited
once. Hence, The total cost of visiting a bitmap with m bits in it is
the number of bits that are set in all bitmaps. Unless the bitmap is
extremely sparse, this is also O(m + m/W + m/W^2 + ...), so the amortized
cost of advancing from one bit to the next is usually constant.
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We can provide fast versions based on the other functions defined
by host-utils.h. Some care is required on glibc, which provides
ffsl already.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>