Move bar MemoryRegion initialization to an instance_init.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
Add comments to help static analysers detect that these cases are
intentional, and clean up some whitespace in the environment of these
comments.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Jason tested these patches by migrating Windows 7 and Fedora 17 guests
(while under I/O) on both piix with ahci attached and on q35 (which has
a built-in AHCI controller).
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Jason Baron <jbaron@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The size of an int depends on the host, so in order to be able to
migrate these fields, make them either int32_t or bool, depending on the
use.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
'dma_status' and 'dma_cb' are written to, but never read.
Remove these fields in preparation for AHCI migration bits.
Signed-off-by: Jason Baron <jbaron@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The statistics are now available through device properties via a
polling mechanism. First a client has to enable polling, then it
can query available stats.
Polling is enabled by setting an update interval (in seconds)
to a property named guest-stats-polling-interval, like this:
{ "execute": "qom-set",
"arguments": { "path": "/machine/peripheral-anon/device[1]",
"property": "guest-stats-polling-interval", "value": 4 } }
Then the available stats can be retrieved by querying the
guest-stats property. The returned object is a dict containing
all available stats. Example:
{ "execute": "qom-get",
"arguments": { "path": "/machine/peripheral-anon/device[1]",
"property": "guest-stats" } }
{
"return": {
"stats": {
"stat-swap-out": 0,
"stat-free-memory": 844943360,
"stat-minor-faults": 219028,
"stat-major-faults": 235,
"stat-total-memory": 1044406272,
"stat-swap-in": 0
},
"last-update": 1358529861
}
}
Please, check the next commit for full documentation.
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Next commit will re-enable balloon stats with a different interface, but
this old code conflicts with it. Let's drop it.
It's important to note that the QMP and HMP interfaces are also dropped
by this commit. That shouldn't be a problem though, because:
1. All QMP fields are optional
2. This feature has always been disabled
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
This reverts commit 67c5322d70:
I'm not sure if the retry logic has ever worked when not using FIFO mode. I
found this while writing a test case although code inspection confirms it is
definitely broken.
The TSR retry logic will never actually happen because it is guarded by an
'if (s->tsr_rety > 0)' but this is the only place that can ever make the
variable greater than zero. That effectively makes the retry logic an 'if (0)
I believe this is a typo and the intention was >= 0. Once this is fixed thoug
I see double transmits with my test case. This is because in the non FIFO
case, serial_xmit may get invoked while LSR.THRE is still high because the
character was processed but the retransmit timer was still active.
We can handle this by simply checking for LSR.THRE and returning early. It's
possible that the FIFO paths also need some attention.
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Even if the previous logic was never worked, new logic breaks stuff -
namely,
qemu -enable-kvm -nographic -kernel /boot/vmlinuz-$(uname -r) -append console=ttyS0 -serial pty
the above command will cause the virtual machine to stuck at startup
using 100% CPU till one connects to the pty and sends any char to it.
Note this is rather typical invocation for various headless virtual
machines by libvirt.
So revert this change for now, till a better solution will be found.
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This is a trivial patch to harmonize the coding style on
hw/etraxfs_eth.c. This is in preparation to split off the bitbang mdio
code into a separate file.
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Paul Brook <paul@codesourcery.com>
Cc: Edgar E. Iglesias <edgar.iglesias@gmail.com>
Cc: Anthony Liguori <aliguori@us.ibm.com>
Cc: Andreas Färber <afaerber@suse.de>
Signed-off-by: Grant Likely <grant.likely@secretlab.ca>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@gmail.com>
# By Peter Lieven (3) and others
# Via Paolo Bonzini
* bonzini/scsi-next:
scsi: Drop useless null test in scsi_unit_attention()
lsi: use qbus_reset_all to reset SCSI bus
scsi: fix segfault with 0-byte disk
iscsi: add support for iSCSI NOPs [v2]
iscsi: partly avoid iovec linearization in iscsi_aio_writev
iscsi: add iscsi_create support
req was created by scsi_req_alloc(), which initializes req->dev to a
value it dereferences. req->dev isn't changed anywhere else.
Therefore, req->dev can't be null.
Drop the useless null test; it spooks Coverity.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
When a 0-sized disk is found, READ CAPACITY will return a
LUN NOT READY error. However, because it returns -1 instead
of zero, the HBA will call scsi_req_continue. This will
typically cause a segmentation fault or an assertion failure.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Basically the same as usb-storage, but without automatic scsi
device setup. Also features support for up to 16 LUNs.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
This is a simpler solution to 869981, where migration breaks since qxl's
rom bar size has changed. Instead of ignoring fields in QXLRom, which is what has
actually changed, we remove some of the modes, a mechanism already
accounted for by the guest. The modes left allow for portrait and
landscape only modes, corresponding to orientations 0 and 1.
Orientations 2 and 3 are dropped.
Added assert so that rom size will fit the future QXLRom increases via
spice-protocol changes.
This patch has been tested with 6.1.0.10015. With the newer 6.1.0.10016
there are problems with both "(flipped)" modes prior to the patch, and
the patch loses the ability to set "Portrait" modes. But this is a
separate bug to be fixed in the driver, and besides the patch doesn't
affect the new arbitrary mode setting functionality.
Signed-off-by: Alon Levy <alevy@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
This reverts commit a1cbfd554e.
Test isn't useless. scsi_req_enqueue() may finish the request (will
actually happen for requests which don't trigger any I/O such as
INQUIRY), then call usb_msd_command_complete() which in turn will
set s->req to NULL after unref'ing it.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Replace by SYS_BUS_DEVICE() QOM cast macro using a scripted conversion.
Avoids the old macro creeping into new code.
Resolve a Coding Style warning in openpic code.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Cc: Anthony Liguori <anthony@codemonkey.ws>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
A virtio-s390-bus is created during the init. So one VirtIODevice can be
connected on the virtio-s390-device through this bus.
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This add the virtio-s390-bus which extends virtio-bus. So one VirtIODevice can
be connected on this bus.
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Create the virtio-pci device which is abstract. This transport device will
create a virtio-pci-bus, so one VirtIODevice can be connected.
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Introduce virtio-pci-bus, which extends virtio-bus. It is used with virtio-pci
transport device.
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Create the virtio-device which is abstract. All the virtio-device can extend
this class. It also add some functions to virtio-bus.
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Introduce virtio-bus. Refactored transport device will create a bus which
extends virtio-bus.
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Add a max_dev field to BusClass to specify the maximum amount of devices allowed
on the bus (has no effect if max_dev=0)
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
STATUS_TIMEOUT is defined in winnt.h:
CC hw/tpci200.o
hw/tpci200.c:34:0:
warning: "STATUS_TIMEOUT" redefined [enabled by default]
/usr/lib/gcc/x86_64-w64-mingw32/4.6/../../../../x86_64-w64-mingw32/include/winnt.h:1036:0:
note: this is the location of the previous definition
Use STATUS_TIME instead of STATUS_TIMEOUT as suggested by Alberto Garcia.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
-acpitable {file|data}=file reads the content of file, but it is
in binary form, so the file should be opened usin O_BINARY flag.
On *nix it is a no-op, but on windows and other weird platform
it is really needed.
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
defineition -> definition
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Andreas F=E4rber <afaerber@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
# By Kevin Wolf (4) and others
# Via Stefan Hajnoczi
* stefanha/block:
dataplane: support viostor virtio-pci status bit setting
dataplane: avoid reentrancy during virtio_blk_data_plane_stop()
win32-aio: use iov utility functions instead of open-coding them
win32-aio: Fix memory leak
win32-aio: Fix vectored reads
aio: Fix return value of aio_poll()
ide: Remove wrong assertion
block: fix null-pointer bug on error case in block commit
84f2d0ea added an argument to function usb_host_info.
The stub function must match the declaration in usb.h.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Code mixes uint32_t, int and size_t. Very unlikely to go wrong in
practice, but clean it up anyway.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Many callers pass size_t, which gets silently truncated to uint32_t.
Harmless, because all practical sizes are well below 4GiB. Clean it
up anyway. Size overflow now fails assertions.
Bonus: saves a whole bunch of silly casts.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
No caller is checking the value, so all errors get ignored, usually
silently. assert() instead.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
* 'ppc-for-upstream' of git://repo.or.cz/qemu/agraf:
PPC: KVM: Add support for EPR with KVM
openpic: export e500 epr enable into a ppc.c function
Update Linux kernel headers
PPC: e500: Change in-memory order of load blobs
PPC: Provide zero SVR for -cpu e500mc and e5500
PPC: E500: Calculate loading blob offsets properly
openpic: set mixed mode as supported
openpic: unify gcr mode mask updates
openpic: move gcr write into a function
Allow virtio machines to register for different diag500 function
codes and convert s390-virtio to use it.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
This enables qemu -cpu help to return a list of supported CPU models
on s390 and also to query for cpu definitions in the monitor.
Initially only cpu model = host is returned. This needs to be reworked
into a full-fledged CPU model handling later on.
This change is needed to allow libvirt exploiters (like OpenStack)
to specify a CPU model.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
[agraf: fix s390x-linux-user, adjust header locations]
Signed-off-by: Alexander Graf <agraf@suse.de>
Lets move the code to setup IPL for external kernel
or via the zipl rom into a separate file. This allows to
- define a reboot handler, setting up the PSW appropriately
- enhance the boot code to IPL disks that contain a bootmap that
was created with zipl under LPAR or z/VM (future patch)
- reuse that code for several machines (e.g. virtio-ccw and virtio-s390)
- allow different machines to provide different defaults
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
[agraf: symbolify initial psw, adjust header file location, fix for QOM]
Signed-off-by: Alexander Graf <agraf@suse.de>
Enabling and disabling the EPR capability (mpic_proxy) is a system
wide operation. As such, it belongs into the ppc.c file, since that's
where PPC specific machine wide logic happens.
Signed-off-by: Alexander Graf <agraf@suse.de>
Today, we load
<kernel> <initrd> <dtb>
into memory in that order. However, Linux has a bug where it can only
handle the dtb if it's within the first 64MB of where <kernel> starts.
So instead, let's change the order to
<kernel> <dtb> <initrd>
making Linux happy.
Signed-off-by: Alexander Graf <agraf@suse.de>
We have 3 blobs we need to load when booting the system:
- kernel
- initrd
- dtb
We place them in physical memory in that order. At least we should.
This patch fixes the location calculation up to take any module into
account, fixing the dtb offset along the way.
Reported-by: Stuart Yoder <stuart.yoder@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
The Raven MPIC implementation supports the "Mixed" mode to work with
an i8259. While we don't implement mixed mode, we should mark it as
a supported mode in the mode bitmap.
Signed-off-by: Alexander Graf <agraf@suse.de>
The mode mask already masks out bits we don't care about, so the
actual handling code can stay intact regardless.
Signed-off-by: Alexander Graf <agraf@suse.de>
The GCR register contains too much functionality to be covered inside
of the register switch statement. Move it out into a separate function.
Signed-off-by: Alexander Graf <agraf@suse.de>
The viostor virtio-blk driver for Windows does not use the
VIRTIO_CONFIG_S_DRIVER bit. It only sets the VIRTIO_CONFIG_S_DRIVER_OK
bit.
The viostor driver refreshes the virtio-pci status byte sometimes while
the guest is running. We misinterpret 0x4 (VIRTIO_CONFIG_S_DRIVER_OK)
as an indication that virtio-blk-data-plane should be stopped since 0x2
(VIRTIO_CONFIG_S_DRIVER) is missing. The result is that the device
becomes unresponsive.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
When dataplane is stopping, the s->vdev->binding->set_host_notifier(...,
false) call can invoke the virtqueue handler if an ioeventfd
notification is pending. This causes hw/virtio-blk.c to invoke
virtio_blk_data_plane_start() before virtio_blk_data_plane_stop()
returns!
The result is that we try to restart dataplane while trying to stop it
and the following assertion is raised:
msix_set_mask_notifier: Assertion `!dev->msix_mask_notifier' failed.
Although the code was intended to prevent this scenario, the s->started
boolean isn't enough. Add s->stopping so that we can postpone clearing
s->started until we've completely stopped dataplane.
This way, virtqueue handler calls during virtio_blk_data_plane_stop()
are ignored. When dataplane is legitimately started again later we
already self-kick ourselves to resume processing.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
# By Wenchao Xia
# Via Luiz Capitulino
* luiz/queue/qmp:
HMP: add sub command table to info
HMP: move define of mon_cmds
HMP: add infrastructure for sub command
HMP: delete info handler
HMP: add QDict to info callback handler
Order of arguments of kvm_virtio_pci_irqfd_release
got mixed up in all calls.
As a result users see assertions during cleanup.
Reported-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Tested-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Add a documentation section "Methods" and discuss among others how to
handle overriding virtual methods.
Clarify DeviceClass::realize documentation and refer to the above.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patch change all info call back function to take
additional QDict * parameter, which allow those command
take parameter. Now it is set to NULL at default case.
Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
The Bus Master IDE Active bit (BM_STATUS_DMAING) is not only set when
the request is still in flight, but also when it has completed and the
size of the physical memory regions in the PRDT was larger than the
transfer size.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This obsoletes tmp105_set() and allows for better error handling.
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Introduce TYPE_ constant and cast macro.
Move the state struct to the new header to allow for future embedding.
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
An early length postincrement in the TMP105's I2C TX path led to
transfers of more than one byte to place the second byte in the third
byte's place within the buffer and the third byte to get discarded.
Fix this by explictly incrementing the length after the checks but
before the callback is called, which again checks the length.
Adjust the Coding Style while at it.
Signed-off-by: Alex Horn <alex.horn@cs.ox.ac.uk>
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Allows value sharing with qtest.
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
scsi_req_new() never returns null, and scsi_req_enqueue() dereferences
the pointer, so checking for null is useless.
Spotted by Coverity.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
g_strdup_printf already handles OOM errors, so some error handling in
QEMU code can be removed.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Without this default q35/ppc405 based machines would no longer boot
after commit e4ada29e90
Signed-off-by: Knut Omang <knut.omang@oracle.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Introduce the QOM realizefn suggested by Anthony.
Detailed documentation is supplied in the qdev header.
For now this implements a default DeviceClass::realize callback that
just wraps DeviceClass::init, which it deprecates.
Once all devices have been converted to DeviceClass::realize,
DeviceClass::init is to be removed.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Cc: Anthony Liguori <anthony@codemonkey.ws>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Whether the device was initialized or not is QOM-level information and
currently unused. Drop it from device. This leaves the boolean state of
whether or not DeviceClass::init was called or not, a.k.a. "realized".
Suggested-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patch removes the default boot order for pseries machine. This allows
the machine to handle a NULL boot order in case no -boot option is provided.
Thus it helps SLOF firmware to verify if boot order is specified in command
line or not. If no boot order is provided SLOF tries to boot from the
device set in the nvram.
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Avik Sil <aviksil@linux.vnet.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This patch makes default boot order machine specific instead of
set globally. The default boot order can be set per machine in
QEMUMachine boot_order. This also allows a machine to receive a
NULL boot order when -boot isn't used and take an appropriate action
accordingly. This helps machine boots from the devices as set in
guest's non-volatile memory location in case no boot order is
provided by the user.
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Avik Sil <aviksil@linux.vnet.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
* afaerber/memory-ioport:
acpi_piix4: Do not use old_portio-style callbacks
xen_platform: Do not use old_portio-style callbacks
hw/dma.c: Fix conversion of ioport_register* to MemoryRegion
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
* stefanha/block:
block: Fix how mirror_run() frees its buffer
win32-aio: Fix how win32_aio_process_completion() frees buffer
scsi-disk: qemu_vfree(NULL) is fine, simplify
w32: Make qemu_vfree() accept NULL like the POSIX implementation
sheepdog: clean up sd_aio_setup()
sheepdog: multiplex the rw FD to flush cache
block: clear dirty bitmap when discarding
ide: issue discard asynchronously but serialize the pieces
ide: fix TRIM with empty range entry
block: make discard asynchronous
raw: support discard on block devices
raw-posix: remember whether discard failed
raw-posix: support discard on more filesystems
block: fix initialization in bdrv_io_limits_enable()
qcow2: Fix segfault on zero-length write
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
* afaerber/qom-cpu:
target-i386: Use switch in check_hw_breakpoints()
target-i386: Avoid goto in hw_breakpoint_insert()
target-i386: Introduce hw_{local,global}_breakpoint_enabled()
target-i386: Define DR7 bit field constants
target-i386: Move kvm_check_features_against_host() check to realize time
target-i386: cpu_x86_register() consolidate freeing resources
target-i386: Move setting defaults out of cpu_x86_parse_featurestr()
target-i386: check/enforce: Check all feature words
target-i386/cpu.c: Add feature name array for ext4_features
target-i386: kvm_check_features_against_host(): Use feature_word_info
target-i386/cpu: Introduce FeatureWord typedefs
target-i386: Disable kvm_mmu by default
kvm: Add fake KVM constants to avoid #ifdefs on KVM-specific code
exec: Return CPUState from qemu_get_cpu()
xen: Simplify halting of first CPU
kvm: Pass CPUState to kvm_init_vcpu()
cpu: Move cpu_index field to CPUState
cpu: Move numa_node field to CPUState
target-mips: Clean up mips_cpu_map_tc() documentation
cpu: Move nr_{cores,threads} fields to CPUState
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
[AF: Used HWADDR_PRIx for hwaddr PIIX4_DPRINTF()]
Signed-off-by: Andreas Färber <afaerber@suse.de>
The commit 5822993368 introduced a 1-shift for
some offset in DMA emulation.
Before the previous commit, which converted ioport_register_* to
MemoryRegion, the DMA controller registered 8 ioports with the following
formula:
base + ((8 + i) << d->shift) where 0 <= i < 8
When an IO occured within a Memory Region, DMA callback receives an
offset relative to the start address. Here the start address is:
base + (8 << d->shift).
The offset should be: (i << d->shift). After the shift is reverted, the
offsets are 0..7 not 1..8.
Fixes LP#1089996.
Reported-by: Andreas Gustafsson <gson@gson.org>
Signed-off-by: Julien Grall <julien.grall@citrix.com>
Tested-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Now that discard can take a long time, make it asynchronous.
Each LBA range entry is processed separately because discard
can be an expensive operation.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
ATA-ACS-3 says "If the two byte range length is zero, then the LBA
Range Entry shall be discarded as padding." iovecs are used as if
they are linearized, so it is incorrect to discard the rest of
this iovec.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Move the declaration to qemu/cpu.h and add documentation.
The implementation still depends on CPUArchState for CPU iteration.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Note that target-alpha accesses this field from TCG, now using a
negative offset. Therefore the field is placed last in CPUState.
Pass PowerPCCPU to [kvm]ppc_fixup_cpu() to facilitate this change.
Move common parts of mips cpu_state_reset() to mips_cpu_reset().
Acked-by: Richard Henderson <rth@twiddle.net> (for alpha)
[AF: Rebased onto ppc CPU subclasses and openpic changes]
Signed-off-by: Andreas Färber <afaerber@suse.de>
To facilitate the field movements, pass MIPSCPU to malta_mips_config();
avoid that for mips_cpu_map_tc() since callers only access MIPS Thread
Contexts, inside TCG helpers.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Mingw32 headers define FAR, causing this warning:
/src/qemu/hw/pc87312.c:38:0: warning: "FAR" redefined [enabled by default]
In file included from /usr/local/lib/gcc/i686-mingw32msvc/4.7.0/../../../../i686-mingw32msvc/include/windows.h:48:0,
from /src/qemu/include/sysemu/os-win32.h:29,
from /src/qemu/include/qemu-common.h:46,
from /src/qemu/include/exec/ioport.h:27,
from /src/qemu/hw/isa.h:6,
from /src/qemu/hw/pc87312.h:28,
from /src/qemu/hw/pc87312.c:26:
/usr/local/lib/gcc/i686-mingw32msvc/4.7.0/../../../../i686-mingw32msvc/include/windef.h:34:0: note: this is the location of the previous definition
Avoid the warning by expanding the macros.
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
Acked-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
Prepare an instance_init function for the MemoryRegion init.
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
Tested-by: Hervé Poussineau <hpoussin@reactos.org>
Fix the compilation error introduced by msg new field.
CC hw/9pfs/virtio-9p.o
In file included from /home/konradf/Documents/safe/greensocs/virtio-project/x86-qemu/qemu/hw/9pfs/virtio-9p.c:17:0:
/home/konradf/Documents/safe/greensocs/virtio-project/x86-qemu/qemu/hw/virtio-pci.h:30:16: erreur: field ‘msg’ has incomplete type
make: *** [hw/9pfs/virtio-9p.o] Erreur 1
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
virtio_pci_set_guest_notifiers() now takes an additional argument to
specify the number of virtqueues to assign a guest notifier for. This
causes a build breakage for CONFIG_VIRTIO_BLK_DATA_PLANE builds:
/home/mdroth/w/qemu2.git/hw/dataplane/virtio-blk.c: In function
‘virtio_blk_data_plane_start’:
/home/mdroth/w/qemu2.git/hw/dataplane/virtio-blk.c:451:47: error: too
few arguments to function ‘s->vdev->binding->set_guest_notifiers’
/home/mdroth/w/qemu2.git/hw/dataplane/virtio-blk.c: In function
‘virtio_blk_data_plane_stop’:
/home/mdroth/w/qemu2.git/hw/dataplane/virtio-blk.c:511:5: error: too few
arguments to function ‘s->vdev->binding->set_guest_notifiers’
make[1]: *** [hw/dataplane/virtio-blk.o] Error 1
make[1]: *** Waiting for unfinished jobs....
make: *** [subdir-x86_64-softmmu] Error 2
Fix this by passing 1 as the number of virtqueues to assign notifiers
for.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Fixes the following:
/home/mdroth/w/qemu2.git/hw/virtio-pci.c: In function
‘kvm_virtio_pci_vector_unmask’:
/home/mdroth/w/qemu2.git/hw/virtio-pci.c:673:12: error: ‘ret’ may be
used uninitialized in this function [-Werror=uninitialized]
cc1: all warnings being treated as errors
make: *** [hw/virtio-pci.o] Error 1
make: *** Waiting for unfinished jobs....
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
The GE IP-Octal 232 is an IndustryPack module that implements eight
RS-232 serial ports, each one of which can be redirected to a
character device in the host.
Signed-off-by: Alberto Garcia <agarcia@igalia.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
The TPCI200 is a PCI board that supports up to 4 IndustryPack modules.
A new bus type called 'IndustryPack' has been created so any
compatible module can be attached to this board.
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alberto Garcia <agarcia@igalia.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This protocol extension reuses the same set of grant pages for all
transactions between the front/back drivers, avoiding expensive tlb
flushes, grant table lock contention and switches between userspace
and kernel space. The full description of the protocol can be found in
the public blkif.h header.
http://xenbits.xen.org/gitweb/?p=xen.git;a=blob_plain;f=xen/include/public/io/blkif.h
Speed improvement with 15 guests performing I/O is ~450%.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
On ioreq_release the full ioreq was memset to 0, loosing all the data
and memory allocations inside the QEMUIOVector, which leads to a
memory leak. Create a new function to specifically reset ioreq.
Reported-by: Maik Wessler <maik.wessler@yahoo.com>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
* kraxel/usb.76:
usb-host: Initialize dev->port the obviously safe way
usb-host: Drop superfluous null test from usb_host_auto_scan()
ehci: Assert state machine is sane w.r.t. EHCIQueue
xhci: nuke transfe5rs on detach
xhci: call xhci_detach_slot on root port detach too
xhci: create xhci_detach_slot helper function
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This further optimizes MSIX handling in virtio-pci.
Also included is pci cleanup by Paolo, and pci device
assignment fix by Alex.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQEcBAABAgAGBQJQ7ZaiAAoJECgfDbjSjVRpFhcIAJkY4VQ3i7TLnLsnEDOR+FrP
66YLEDwCSiKZ/UW7WERGN3p3tm0hAXLhPoHFqMGRPPV9pdcXI+Eb8v+u0IHVlt+7
DsQ9TIemZkpSMuUJjQbu/RF8k9JV8+X7M6CKnWahq68p0UD/vDX+OgCiGKO/l/zY
tENJhwD6M1MMzbxyzd4nCnkf3CPrHFvpPt2VAqQnkCw3wLAtR34SucBjr/dXcjuT
arPiV8dNmXHTosdKvcodAWA+0YLLE7Bhz0nLK6eTt5L/UsfdbRN8q9Xdhd5nJjji
DjKBJBfwdG5n3r96g7dlb/XdHuQjbFBq3uLmc8H2OdWOrk5PyqeoUA5fdBQxkb8=
=vKSI
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'mst/tags/for_anthony' into staging
pci,virtio
This further optimizes MSIX handling in virtio-pci.
Also included is pci cleanup by Paolo, and pci device
assignment fix by Alex.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
* mst/tags/for_anthony:
pci-assign: Enable MSIX on device to match guest
pci: use constants for devices under the 1B36 device ID, document them
ivshmem: use symbolic constant for PCI ID, add to pci-ids.txt
virtio-9p: use symbolic constant, add to pci-ids.txt
reorganize pci-ids.txt
docs: move pci-ids.txt to docs/specs/
vhost: backend masking support
vhost: set started flag while start is in progress
virtio-net: set/clear vhost_started in reverse order
virtio: backend virtqueue notifier masking
virtio-pci: cache msix messages
kvm: add stub for update msi route
msix: add api to access msix message
virtio: don't waste irqfds on control vqs
Coverity worries the strcpy() could overrun the destination. It
can't, because the source always points to usb_host_scan()'s auto
port[], which has the same size. Use pstrcpy() anyway, to hush the
checker.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Coverity points out that port is later passed to usb_host_open(),
which dereferences it. It actually can't be null: it always points to
usb_host_scan()'s auto port[]. Drop the superfluous port == NULL
test.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Coverity worries the EHCIQueue pointer could be null when we pass it
to functions that reference it. The state machine ensures it can't be
null then. Assert that, to hush the checker.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
O_DIRECT on Linux has alignment requirements on I/O buffers and
misaligned requests result in -EINVAL. The Linux virtio_blk guest
driver usually submits aligned requests so I forgot to handle misaligned
requests.
It turns out that virtio-win guest drivers submit misaligned requests.
Handle them using a bounce buffer that meets alignment requirements.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Extract code for read/write command processing into do_rdwr_cmd(). This
brings together pieces that are spread across process_request().
The real motivation is to set the stage for handling misaligned
requests, which the next patch tackles.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
interface_set_client_capabilities() copies only the first few bits,
because it falls into a Classic C trap: you can declare a parameter
uint8_t caps[58], but the resulting parameter type is uint8_t *, not
uint8_t[58]. In particular, sizeof(caps) is sizeof(uint8_t *), not
the intended sizeof(uint8_t[58]).
Harmless, because the bits aren't used, yet. Broken in commit
c10018d6. Spotted by Coverity.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The pointer arithmetic there is safe, but ugly. Coverity grouses
about it. However, the actual comparison is off by one: <= end
instead of < end. Fix by rewriting the check in a cleaner way.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The directory descent mechanism, and a less-flat tree both helped
in making some *-obj-y definitions very short. Many of these
often end up in universal-obj-y, and used to be separate only
because of libuser (which is now part of history...).
Consolidate these variables in a single one.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* 'ppc-for-upstream' of git://repo.or.cz/qemu/agraf: (31 commits)
PPC: linux-user: Calculate context pointer explicitly
target-ppc: Error out for -cpu host on unknown PVR
target-ppc: Slim conversion of model definitions to QOM subclasses
PPC: Bring EPR support closer to reality
PPC: KVM: set has-idle in guest device tree
kvm: Update kernel headers
openpic: fix CTPR and de-assertion of interrupts
openpic: move IACK to its own function
openpic: IRQ_check: search the queue a word at a time
openpic: fix sense and priority bits
openpic: add some bounds checking for IRQ numbers
openpic: use standard bitmap operations
Revert "openpic: Accelerate pending irq search"
openpic: always call IRQ_check from IRQ_get_next
openpic/fsl: critical interrupts ignore mask before v4.1
openpic: make ctpr signed
openpic: rework critical interrupt support
openpic: make register names correspond better with hw docs
ppc/booke: fix crit/mcheck/debug exceptions
openpic: lower interrupt when reading the MSI register
...
The commit c02e1eac88 broke the compilation
for i386. ULL need to be specify for uint64_t value.
Signed-off-by: Julien Grall <julien.grall@citrix.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
* stefanha/trivial-patches:
hw/pc.c: Fix converting of ioport_register* to MemoryRegion
Replace remaining gmtime, localtime by gmtime_r, localtime_r
savevm: Remove MinGW specific code which is no longer needed
qga/channel-posix.c: Explicitly include string.h
configure: Fix comment (copy+paste bug)
readline: avoid memcpy() of overlapping regions
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
* afaerber-or/prep-up:
prep: Use pc87312 device instead of collection of random ISA devices
prep: Add pc87312 Super I/O emulation
prep: Include devices for ppc64 as well
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
The commit 258711 introduced MemoryRegion to replace ioport_region*
for ioport 80h and F0h.
A MemoryRegion needs to have both read and write callback otherwise a segfault
will occur when an access is made.
The previous behaviour of this both ioport is to return 0xffffffffffffffff.
So keep this behaviour.
Reported-by: Adam Lackorzynski <adam@os.inf.tu-dresden.de>
Signed-off-by: Julien Grall <julien.grall@citrix.com>
Tested-by: Adam Lackorzynski <adam@os.inf.tu-dresden.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This allows removing of MinGW specific code and improves
reentrancy for POSIX hosts.
[Removed unused ret variable in qemu_get_timedate() to fix warning:
vl.c: In function ‘qemu_get_timedate’:
vl.c:451:16: error: variable ‘ret’ set but not used [-Werror=unused-but-set-variable]
-- Stefan Hajnoczi]
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Starting with release 1.4 we have a fully functional q35 machine type,
i.e. "qemu -M q35" JustWorks[tm]. Update machine type names to reflect
that:
* pc-1.4 becomes pc-i440fx-1.4
* q35-next becomes pc-q35-1.4
The pc-1.3 (+older) names are maintained for compatibility reasons.
For the same reason the "pc" and "q35" aliases are kept. pc-piix-1.4
continues to be the default machine type, again for compatibility
reasons.
Also updated the description (shown by "qemu -M ?") with host bridge
name, south bridge name and chipset release year.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
When the device is reset, the SCSI bus should also be reset so
that in-flight I/O is cancelled.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Since 39bffca203 (qdev: register all
types natively through QEMU Object Model), TypeInfo as used in
the common, non-iterative pattern is no longer amended with information
and should therefore be const.
Fix the documented QOM examples:
sed -i 's/static TypeInfo/static const TypeInfo/g' include/qom/object.h
Since frequently the wrong examples are being copied by contributors of
new devices, fix all types in the tree:
sed -i 's/^static TypeInfo/static const TypeInfo/g' */*.c
sed -i 's/^static TypeInfo/static const TypeInfo/g' */*/*.c
This also avoids to piggy-back these changes onto real functional
changes or other refactorings.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
When a guest enables MSIX on a device we evaluate the MSIX vector
table, typically find no unmasked vectors and don't switch the device
to MSIX mode. This generally works fine and the device will be
switched once the guest enables and therefore unmasks a vector.
Unfortunately some drivers enable MSIX, then use interfaces to send
commands between VF & PF or PF & firmware that act based on the host
state of the device. These therefore may break when MSIX is managed
lazily. This change re-enables the previous test used to enable MSIX
(see qemu-kvm a6b402c9), which basically guesses whether a vector
will be used based on the data field of the vector table.
Cc: qemu-stable@nongnu.org
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
VFIO_PCI_NUM_REGIONS and VFIO_PCI_NUM_IRQS should never have been
used in this manner as it locks a specific kernel implementation.
Future features may introduce new regions or interrupt entries
(VGA may add legacy ranges, AER might add an IRQ for error
signalling). Fix this before it gets us into trouble.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Cc: qemu-stable@nongnu.org
Guests typically enable MSI-X with all of the vectors in the MSI-X
vector table masked. Only when the vector is enabled does the vector
get unmasked, resulting in a vector_use callback. These two points,
enable and unmask, correspond to pci_enable_msix() and request_irq()
for Linux guests. Some drivers rely on VF/PF or PF/fw communication
channels that expect the physical state of the device to match the
guest visible state of the device. They don't appreciate lazily
enabling MSI-X on the physical device.
To solve this, enable MSI-X with a single vector when the MSI-X
capability is enabled and immediate disable the vector. This leaves
the physical device in exactly the same state between host and guest.
Furthermore, the brief gap where we enable vector 0, it fires into
userspace, not KVM, so the guest doesn't get spurious interrupts.
Ideally we could call VFIO_DEVICE_SET_IRQS with the right parameters
to enable MSI-X with zero vectors, but this will currently return an
error as the Linux MSI-X interfaces do not allow it.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Cc: qemu-stable@nongnu.org
Commit 667d22d1ae (qdev: move bus removal
to object_unparent) made the assumption that at unparenting time
parent_bus is not NULL. This assumption is unjustified since
object_unparent() may well be called directly after object_initialize(),
without any qdev_set_parent_bus().
This did not cause any issues yet because qdev_[try_]create() does call
qdev_set_parent_bus(), falling back to SysBus if unsupplied.
While at it, ensure that this new function uses the device_ prefix and
make the name more neutral in light of this semantic change.
Reported-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Tested-by: Igor Mammedov <imammedo@redhat.com>
The code depends on some functions from qemu-option.o, so add
qemu-option.o to universal-obj-y to make sure it's included.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Due to disagreement on a name that is generic enough for hw/pci/pci.h,
the symbolic constants are placed in the .c files.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* kraxel/usb.75: (32 commits)
uhci: stop using portio lists
usbredir: Add support for buffered bulk input (v2)
exynos4210: Add EHCI support
usb/ehci: Add SysBus EHCI device for Exynos4210
usb/ehci: Move capsbase and opregbase into SysBus EHCI class
usb/ehci: Clean up SysBus and PCI EHCI split
xhci: call set-address with dummy usbpacket
usb-redir: Add debugging to bufpq save / restore
usbredir: Add usbredir_init_endpoints() helper
usbredir: Verify we have 32 bits bulk length cap when redirecting to xhci
usbredir: Add ep_stopped USBDevice method
usbredir: Add USBEP2I and I2USBEP helper macros
usbredir: Add an usbredir_stop_ep helper function
usb: Add an usb_device_ep_stopped USBDevice method
usb: Fix usb_ep_find_packet_by_id
hid: Change idle handling to use a timer
uhci: Maximize how many frames we catch up when behind
uhci: Limit amount of frames processed in one go
uhci: Add a QH_VALID define
uhci: Fix pending interrupts getting lost on migration
...
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Buffered bulk mode is intended for bulk *input* endpoints, where the data is
of a streaming nature (not part of a command-response protocol). These
endpoints' input buffer may overflow if data is not read quickly enough.
So in buffered bulk mode the usb-host takes care of the submitting and
re-submitting of bulk transfers.
Buffered bulk mode is necessary for reliable operation with the bulk in
endpoints of usb to serial convertors. Unfortunatelty buffered bulk input
mode will only work with certain devices, therefor this patch also adds a
usb-id table to enable it for devices which need it, while leaving the
bulk ep handling for other devices unmodified.
Note that the bumping of the required usbredir from 0.5.3 to 0.6 does
not mean that we will now need a newer usbredir release then qemu-1.3,
.pc files reporting 0.5.3 have only ever existed in usbredir builds directly
from git, so qemu-1.3 needs the 0.6 release too.
Changes in v2:
-Split of quirk handling into quirks.c
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Support backend guest notifier masking in vhost-net:
create eventfd at device init, when masked,
make vhost use that as eventfd instead of
sending an interrupt.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This makes it possible to use started flag for sanity checking
of callbacks that happen during start/stop.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
As vhost started is cleared last thing on stop,
set it first things on start. This makes it
possible to use vhost_started while start is in
progress which is used by follow-up patches.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
some backends (notably vhost) can mask events
at their source in a way that is more efficient
than masking through kvm.
Specifically
- masking in kvm uses rcu write side so it has high latency
- in kvm on unmask we always send an interrupt
masking at source does not have these issues.
Add such support in virtio.h and use in virtio-pci.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Some guests mask a vector then unmask without changing it.
Store vectors to avoid kvm system calls in this case.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Pass nvqs to set_guest_notifiers. This makes it possible to
save on irqfds by not allocating one for the control vq
for virtio-net.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We already used to support the external proxy facility of FSL MPICs,
but only implemented it halfway correctly.
This patch adds support for
* dynamic enablement of the EPR facility
* interrupt acknowledgement only when the interrupt is delivered
This way the implementation now is closer to real hardware.
Signed-off-by: Alexander Graf <agraf@suse.de>
On e500mc, the platform doesn't provide a way for the CPU to go idle.
To still not uselessly burn CPU time, expose an idle hypercall to the guest
if kvm supports it.
Signed-off-by: Stuart Yoder <stuart.yoder@freescale.com>
[agraf: adjust for current code base, add patch description, fix non-kvm case]
Signed-off-by: Alexander Graf <agraf@suse.de>
Properly implement level-triggered interrupts by withdrawing an
interrupt from the raised queue if the interrupt source de-asserts.
Also withdraw from the raised queue if the interrupt becomes masked.
When CTPR is written, check whether we need to raise or lower the
interrupt output.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Besides making the code cleaner, we will need a separate way to access
IACK in order to implement EPR (external proxy) interrupt delivery.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Search the queue more efficiently by first looking for a non-zero word,
and then using the common bit-searching function to find the bit within
the word. It would be even nicer if bitops_ffsl() could be hooked up
to the compiler intrinsic so that bit-searching instructions could be
used, but that's another matter.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Previously, the sense and priority bits were masked off when writing
to IVPR, and all interrupts were treated as edge-triggered (despite
the existence of code for handling level-triggered interrupts).
Polarity is implemented only as storage. We don't simulate the
bad effects that you'd get on real hardware if you set this incorrectly,
but at least the guest sees the right thing when it reads back the register.
Sense now controls level/edge on FSL external interrupts (and all
interrupts on non-FSL MPIC). FSL internal interrupts do not have a sense
bit (reads as zero), but are level. FSL timers and IPIs do not have
sense or polarity bits (read as zero), and are edge-triggered. To
accommodate FSL internal interrupts, QEMU's internal notion of whether an
interrupt is level-triggered is separated from the IVPR bit.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
The two checks with abort() guard against potential QEMU-internal
problems, but the EOI check stops the guest from causing updates to queue
position -1 and other havoc if it writes EOI with no interrupt in
service.
Signed-off-by: Scott Wood <scottwood@freescale.com>
[agraf: remove hunk in code that didn't get applied yet]
Signed-off-by: Alexander Graf <agraf@suse.de>
Besides the private implementation being redundant, namespace collisions
prevented the use of other things in bitops.h.
Serialization does get a bit more awkward, unfortunately, since the
standard bitmap operations are "unsigned long" rather than "uint32_t",
though in exchange we will get faster queue lookups on 64-bit hosts once
we search a word at a time.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
This reverts commit a9bd83f4c65de0058659ede009fa1a241f379edd.
This counting approach is not robust against setting a bit that
was already set, or clearing a bit that was already clear. Perhaps
that is considered a bug, but besides the lack of any documentation
for that restriction, it's a pretty unpleasant way for the problem
to manifest itself.
It could be made more robust by testing the current value of the
bit before changing the count, but a later patch speeds up IRQ_check
in all cases, not just when there's nothing pending. Hopefully that
should be adequate to address performance concerns.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Previously the code relied on the queue's "next" field getting
set to -1 sometime between an update to the bitmap, and the next
call to IRQ_get_next. Sometimes this happened after the update.
Sometimes it happened before the check. Sometimes it didn't happen
at all.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Other priorities are signed, so avoid comparisons between
signed and unsigned.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Critical interrupts on FSL MPIC are not supposed to pay
attention to priority, IACK, EOI, etc. On the currently modeled
version it's not supposed to pay attention to the mask bit either.
Also reorganize to make it easier to implement newer FSL MPIC models,
which encode interrupt level information differently and support
mcheck as well as crit, and to reduce problems for later patches
in this set.
Still missing is the ability to lower the CINT signal to the core,
as IACK/EOI is not used. This will come with general IRQ-source-driven
lowering in the next patch.
New state is added which is not serialized, but instead is recomputed
in openpic_load() by calling the appropriate write_IRQreg function.
This should have the side effect of causing the IRQ outputs to be
raised appropriately on load, which was missing.
The serialization format is altered by swapping ivpr and idr (we'd like
IDR to be restored before we run the IVPR logic), and moving interrupts
to the end (so that other state has been restored by the time we run the
IDR/IVPR logic. Serialization for this driver is not yet in a state
where backwards compatibility is reasonable (assuming it works at all),
and the current serialization format was not built for extensibility.
Signed-off-by: Scott Wood <scottwood@freescale.com>
[agraf: fix for current code state]
Signed-off-by: Alexander Graf <agraf@suse.de>
The base openpic specification doesn't provide abbreviated register
names, so it's somewhat understandable that the QEMU code made up
its own, except that most of the names that QEMU used didn't correspond
to the terminology used by any implementation I could find.
In some cases, like PCTP, the phrase "processor current task priority"
could be found in the openpic spec when describing the concept, but
the register itself was labelled "current task priority register"
and every implementation seems to use either CTPR or the full phrase.
In other cases, individual implementations disagree on what to call
the register. The implementations I have documentation for are
Freescale, Raven (MCP750), and IBM. The Raven docs tend to not use
abbreviations at all. The IBM MPIC isn't implemented in QEMU. Thus,
where there's disagreement I chose to use the Freescale abbreviations.
Signed-off-by: Scott Wood <scottwood@freescale.com>
[agraf: rebase on current state of the code]
Signed-off-by: Alexander Graf <agraf@suse.de>
This will stop things from breaking once it's properly treated as a
level-triggered interrupt. Note that it's the MPIC's MSI cascade
interrupts that are level-triggered; the individual MSIs are
edge-triggered.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Fix various format errors when debug prints are enabled. Also
cause error checking to happen even when debug prints are not
enabled, and consistently use 0x for hex output.
Signed-off-by: Scott Wood <scottwood@freescale.com>
[agraf: adjust for more recent code base, prettify DPRINTF macro]
Signed-off-by: Alexander Graf <agraf@suse.de>
This patch install the timer reset handler. This will be called when
the guest is reset.
Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
[agraf: adjust for QOM'ification]
Signed-off-by: Alexander Graf <agraf@suse.de>
This patch fixes the following coding style violations:
- structs have to be typedef and be CamelCase
- if()s are always surrounded by curly braces
Signed-off-by: Alexander Graf <agraf@suse.de>
If we access a register via the QEMU memory inspection commands (e.g.
"xp") rather than from guest code, we won't have a CPU context.
Gracefully fail to access the register in that case, rather than
crashing.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
"opp->nb_irqs-1" would have been a minor coding style error,
but putting in one space but not the other makes it look
confusingly like a numeric literal "-1".
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
It's in the address range that normally contains a magic redirection
to the CPU-specific region of the curretn CPU, but it isn't actually
a per-CPU register. On real hardware BRR1 shows up only at 0x40000,
not at 0x60000 or other non-magic per-CPU areas. Plus, this makes
it possible to read the register on the QEMU command line with "xp".
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Previously only the spurious vector was sized appropriately
to the openpic model.
Also, instances of "IPVP_VECTOR(opp->spve)" were replace with
just "opp->spve", as opp->spve is already just a vector and not
an IVPR.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
I could not find this register in any spec (FSL, IBM, or OpenPIC)
and the code doesn't do anything with it but initialize, save,
or restore it.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Deefine symbolic names for some register bits, and use some that
have already been defined.
Also convert some register values from hex to decimal when it improves
readability.
IPVP_PRIORITY_MASK is corrected from (0x1F << 16) to (0xF << 16), in
conjunction with making wider use of the symbolic name. I looked at
Freescale and IBM MPIC docs and at the base OpenPIC spec, and all three
had priority as 4 bits rather than 5. Plus, the magic nubmer that is
being replaced with symbolic values treated the field as 4 bits wide.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Add EHCI USB host controller to exynos4210.
Signed-off-by: Liming Wang <walimisdev@gmail.com>
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
Reviewed-by: Igor Mitsyanko <i.mitsyanko@samsung.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
It uses a different capsbase and opregbase than the Xilinx device.
Signed-off-by: Liming Wang <walimisdev@gmail.com>
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
Cc: Igor Mitsyanko <i.mitsyanko@samsung.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
This allows specific derived models to use different values.
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
SysBus EHCI was introduced in a hurry before 1.3 Soft Freeze.
To use QOM casts in place of DO_UPCAST() / FROM_SYSBUS(), we need an
identifying type. Introduce generic abstract base types for PCI and
SysBus EHCI to allow multiple types to access the shared fields.
While at it, move the state structs being amended with macros to the
header file so that they can be embedded.
The VMSTATE_PCI_DEVICE() macro does not play nice with the QOM
parent_obj naming convention, so defer that cleanup.
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Due to the way devices are addressed with xhci (done by hardware, not
the guest os) there is no packet when invoking the set-address control
request. Create a dummy packet in that case to avoid null pointer
dereferences.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The xhci-hcd may submit bulk transfers > 65535 bytes even when not using
bulk-in pipeling, so usbredir can only be used in combination with an xhci
hcd if the client has the 32 bits bulk length capability.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
To ensure that interrupt receiving is properly stopped when the guest is
no longer interested in an interrupt endpoint.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Some usb devices (host or network redirection) can benefit from knowing when
the guest stops using an endpoint. Redirection may involve submitting packets
independently from the guest (in combination with a fifo buffer between the
redirection code and the guest), to ensure that buffers of the real usb device
are timely emptied. This is done for example for isoc traffic and for interrupt
input endpoints. But when the (re)submission of packets is done by the device
code, then how does it know when to stop this?
For isoc endpoints this is handled by detecting a set interface (change alt
setting) command, which works well for isoc endpoints. But for interrupt
endpoints currently the redirection code never stops receiving data from
the device, which is less then ideal.
However the controller emulation is aware when a guest looses interest, as
then the qh for the endpoint gets unlinked (ehci, ohci, uhci) or the endpoint
is explicitly stopped (xhci). This patch adds a new ep_stopped USBDevice
method and modifies the hcd code to call this on queue unlink / ep stop.
This makes it possible for the redirection code to properly stop receiving
interrupt input (*) data when the guest no longer has interest in it.
*) And in the future also buffered bulk input.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
usb_ep_find_packet_by_id mistakenly only checks the first packet and if that
is not a match, keeps trying the first packet! This patch fixes this.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
This leads to cleaner code in usb-hid, and removes up to a 1000 calls / sec to
qemu_get_clock_ns(vm_clock) if idle-time is set to its default value of 0.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
If somehow we've gotten behind a lot, simply skip ahead, like the ehci code
does.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Before this patch uhci would process an unlimited amount of frames when
behind on schedule, by setting the timer to a time already past, causing the
timer subsys to immediately recall the frame_timer function gain.
This would cause invalid cancellations of bulk queues when the catching up
processed more then 32 frames at a moment when the bulk qh was temporarily
unlinked (which the Linux uhci driver does).
This patch fixes this by processing maximum 16 frames in one go, and always
setting the timer one ms later, making the code behave more like the ehci
code.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Rather then using the magic 32 value in various places.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Re-arrange how we process frames / increase frnum / report pending interrupts,
to avoid a 1 ms delay in interrupt reporting to the guest. This increases
the packet throughput for cases where the guest submits a single packet,
then waits for its completion then re-submits from 500 pkts / sec to
1000 pkts / sec. This impacts for example the use of redirected / virtual
usb to serial convertors.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
ehci_raise_irq(s, USBSTS_PCD), gets applied immediately so there is no need
to call commit_irq after it.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
I tried lowering the time between raising an interrupt and rescanning the
async schedule to see if the guest has queued a new transfer before, but
that did not have any positive effect. I now believe the cause for this is
that lowering this time made it more likely to hit the 1 ms interrupt
threshold penalty for the next packet, as described in my
"ehci: Use uframe precision for interrupt threshold checking" commit.
Now that we do interrupt threshold handling with uframe precision, futher
lowering this time from .5 to .25 ms gives an extra 15% improvement in speed
(MB/s) reading from a simple USB-2.0 thumb-drive.
While at it also properly set the int_req_by_async flag for short packet
completions.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Before this patch, the following could happen:
1) Transfer completes, raises interrupt
2) .5 ms later we check if the guest has queued up any new transfers
3) We find and execute a new transfer
4) .2 ms later the new transfer completes
5) We re-run our frame_timer to write back the completion, but less then
1 ms has passed since our last run, so frindex is not changed, so the
interrupt threshold code delays the interrupt
6) 1 ms from the re-run our frame-timer runs again and finally delivers
the interrupt
This leads to unnecessary large delays of interrupts, this code fixes this
by changing frindex to uframe precision and using that for interrupt threshold
control, making the interrupt fire at step 5 for guest which have low interrupt
threshold settings (like Linux).
Note that the guest still sees the frindex move in steps of 8 for migration
compatibility.
This boosts Linux read speed of a simple cheap USB thumb drive by 6 %.
Changes in v2:
-Make the guest see frindex move in steps of 8 by modifying ehci_opreg_read,
rather then using a shadow variable
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
ehci_fill_queue assumes that there is a one on one relationship between an ep
and a qh, this patch adds a check to ensure this.
Note I don't expect this to ever trigger, this is just something I noticed
the guest might do while working on other stuff. The only way this check can
trigger is if a guest mixes in and out qtd-s in a single qh for a non
control ep.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Remove the short-circuiting of fetchqtd in fetchqh, so that the
qtd gets properly verified before completing the transaction.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
This is not allowed, except for clearing active on cancellation, so don't
warn when the new token does not have its active bit set.
This unifies the cancellation path for modified qtd-s, and prepares
ehci_verify_qtd to be used ad an extra check inside
ehci_writeback_async_complete_packet().
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Also drop the warning printf, which was there mainly because this was an
untested code path (as the previous bug fixes to it show), but that no
longer is the case now :)
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
A device reset does not affect the link state, only set_link does.
Signed-off-by: Amos Kong <akong@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Commit b9d03e352c added link
auto-negotiation emulation, it would always set link up by
callback function. Problem exists if original link status
was down, link status should not be changed in auto-negotiation.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Amos Kong <akong@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Discard packets longer than 16384 when !SBP to match the hardware behavior.
Signed-off-by: Michael Contreras <michael@inetric.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
pc-testdev.c cannot be compiled with MinGW (and other non POSIX hosts):
CC i386-softmmu/hw/i386/../pc-testdev.o
qemu/hw/i386/../pc-testdev.c:38:22: warning: sys/mman.h: file not found
qemu/hw/i386/../pc-testdev.c: In function ‘test_flush_page’:
qemu/hw/i386/../pc-testdev.c:103: warning: implicit declaration of function ‘mprotect’
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
This typically reduces the size from 512 bytes to 128 bytes.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
sys/mman.h is not needed (tested on Linux) and unavailable for MinGW,
so remove it.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
pc_fw_add_pflash_drv() ignores qemu_find_file() failure, and happily
creates a drive without a medium.
When pc_system_flash_init() asks for its size, bdrv_getlength() fails
with -ENOMEDIUM, which isn't checked either. It fails relatively
cleanly only because -ENOMEDIUM isn't a multiple of 4096:
$ qemu-system-x86_64 -S -vnc :0 -bios nonexistant
qemu: PC system firmware (pflash) must be a multiple of 0x1000
[Exit 1 ]
Fix by handling the qemu_find_file() failure.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Prehistoric leftover, zap it. We poweroff via acpi these days.
And having a port (0x501,0x502) where any random guest write will make
qemu exit -- with no way to turn it off -- is a bad joke anyway.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Add a test device which supports the kvmctl ioports,
so one can run the KVM unittest suite.
Intended Usage:
qemu-system-x86_64 -nographic \
-device pc-testdev \
-device isa-debug-exit,iobase=0xf4,iosize=0x04 \
-kernel /path/to/kvm/unittests/msr.flat
Where msr.flat is one of the KVM unittests, present on a
separate repo,
git://git.kernel.org/pub/scm/virt/kvm/kvm-unit-tests.git
[ kraxel: more memory api + qom fixes ]
CC: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Lucas Meneghel Rodrigues <lmr@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
When present it makes qemu exit on any write.
Mapped to port 0x501 by default.
Without this patch Anthony doesn't allow me to
remove the bochs bios debug ports because his
test suite uses this.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The hw/dataplane/vring.c code includes linux/virtio_ring.h. Ensure that
we use linux-headers/ instead of the system-wide headers, which may be
out-of-date on older distros.
This resolves the following build error on Debian 6:
CC hw/dataplane/vring.o
cc1: warnings being treated as errors
hw/dataplane/vring.c: In function 'vring_enable_notification':
hw/dataplane/vring.c:71: error: implicit declaration of function 'vring_avail_event'
hw/dataplane/vring.c:71: error: nested extern declaration of 'vring_avail_event'
hw/dataplane/vring.c:71: error: lvalue required as left operand of assignment
Note that we now build dataplane/ for each target instead of only once.
There is no way around this since linux-headers/ is only available for
per-target objects - and it's how virtio, vfio, kvm, and friends are
built.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Currently, all unknown requests are treated as VIRTIO_BLK_T_IN
Signed-off-by: Alexey Zaytsev <alexey.zaytsev@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The virtio-blk-data-plane feature is easy to integrate into
hw/virtio-blk.c. The data plane can be started and stopped similar to
vhost-net.
Users can take advantage of the virtio-blk-data-plane feature using the
new -device virtio-blk-pci,x-data-plane=on property.
The x-data-plane name was chosen because at this stage the feature is
experimental and likely to see changes in the future.
If the VM configuration does not support virtio-blk-data-plane an error
message is printed. Although we could fall back to regular virtio-blk,
I prefer the explicit approach since it prompts the user to fix their
configuration if they want the performance benefit of
virtio-blk-data-plane.
Limitations:
* Only format=raw is supported
* Live migration is not supported
* Block jobs, hot unplug, and other operations fail with -EBUSY
* I/O throttling limits are ignored
* Only Linux hosts are supported due to Linux AIO usage
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
virtio-blk-data-plane is a subset implementation of virtio-blk. It only
handles read, write, and flush requests. It does this using a dedicated
thread that executes an epoll(2)-based event loop and processes I/O
using Linux AIO.
This approach performs very well but can be used for raw image files
only. The number of IOPS achieved has been reported to be several times
higher than the existing virtio-blk implementation.
Eventually it should be possible to unify virtio-blk-data-plane with the
main body of QEMU code once the block layer and hardware emulation is
able to run outside the global mutex.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Two slightly different versions of a patch to conditionally set
VIRTIO_BLK_F_CONFIG_WCE through the "config-wce" qdev property have been
applied (ea776abca and eec7f96c2). David Gibson
<david@gibson.dropbear.id.au> noticed that the "config-wce"
property is broken as a result and fixed it recently.
The fix sets the host_features VIRTIO_BLK_F_CONFIG_WCE bit from a qdev
property. Unfortunately, the virtio device then has no chance to test
for the presence of the feature bit during virtio_blk_init().
Therefore, reinstate the VirtIOBlkConf->config_wce flag. Drop the
duplicate qdev property to set the host_features bit. The
VirtIOBlkConf->config_wce flag will be used by virtio-blk-data-plane in
a later patch.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The IOQueue has a pool of iocb structs and a function to add new
read/write requests. Multiple requests can be added before calling the
submit function to actually tell the host kernel to begin I/O. This
allows callers to batch requests and submit them in one go.
The actual I/O is performed using Linux AIO.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Outside the safety of the global mutex we need to poll on file
descriptors. I found epoll(2) is a convenient way to do that, although
other options could replace this module in the future (such as an
AioContext-based loop or glib's GMainLoop).
One important feature of this small event loop implementation is that
the loop can be terminated in a thread-safe way. This allows QEMU to
stop the data plane thread cleanly.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The virtio-blk-data-plane cannot access memory using the usual QEMU
functions since it executes outside the global mutex and the memory APIs
are this time are not thread-safe.
This patch introduces a virtqueue module based on the kernel's vhost
vring code. The trick is that we map guest memory ahead of time and
access it cheaply outside the global mutex.
Once the hardware emulation code can execute outside the global mutex it
will be possible to drop this code.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The data plane thread needs to map guest physical addresses to host
pointers. Normally this is done with cpu_physical_memory_map() but the
function assumes the global mutex is held. The data plane thread does
not touch the global mutex and therefore needs a thread-safe memory
mapping mechanism.
Hostmem registers a MemoryListener similar to how vhost collects and
pushes memory region information into the kernel. There is a
fine-grained lock on the regions list which is held during lookup and
when installing a new regions list.
When the physical memory map changes the MemoryListener callbacks are
invoked. They build up a new list of memory regions which is finally
installed when the list has been completed.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This optimizes MSIX handling in virtio-pci.
Also included is pci express capability bugfix.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQEcBAABAgAGBQJQ2tD2AAoJECgfDbjSjVRpUNcIAKN2c+3iiutUWFBBII2TWppc
QAQ4Q5HK7gCtAnwNrlQMAIXcUzHBd5s6BW74BaFBZYymf/tqe4CsvmIH15qQyvm0
McdJAba3FLk0+TELG/Fmf4+faM/kr3gl5Cve3YJC69NHpcq3gi8V4696sP8cGfUt
atA+NR8AITBJDmQlcq6Vwfp+t+B1MY9D9SROT/BmfO+/kY3krkhlPL2pdcoinBa2
zKJLz+jE0tjz7kZ99bmbb2uzKImvtFwxCVZjhD0UINjDOWd9k6ao2pWQIEftv56z
zwz/L8TKCFdM2350XXPg99f4WbrvBqmg3Slb4vrsIYEuAWvArI8sUSYG3rC4fS4=
=8Jun
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'mst/tags/for_anthony' into staging
pci,virtio
This optimizes MSIX handling in virtio-pci.
Also included is pci express capability bugfix.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
* mst/tags/for_anthony:
virtio-pci: don't poll masked vectors
msix: expose access to masked/pending state
msi: add API to get notified about pending bit poll
pcie: Fix bug in pcie_ext_cap_set_next
virtio: make bindings typesafe
There are several ARM and MIPS boards which are manufactured with
either Intel (pflash_cfi01.c) or AMD (pflash_cfi02.c) flash memory.
The Linux kernel supports both and first probes for AMD flash which
resulted in one or two warnings from the Intel flash emulation:
pflash_write: Unimplemented flash cmd sequence (offset 0000000000000000, wcycle 0x0 cmd 0x0 value 0xf000f0)
pflash_write: Unimplemented flash cmd sequence (offset 0000000000000000, wcycle 0x0 cmd 0x0 value 0xf0)
These warnings confuse users, so suppress them.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
* 'qom-cpu' of git://repo.or.cz/qemu/afaerber:
MAINTAINERS: Include X86CPU in CPU maintenance area
cpu: Move kvm_run into CPUState
cpu: Move kvm_state field into CPUState
ppc_booke: Pass PowerPCCPU to ppc_booke_timers_init()
ppc4xx_devs: Return PowerPCCPU from ppc4xx_init()
ppc_booke: Pass PowerPCCPU to {decr,fit,wdt} timer callbacks
ppc: Pass PowerPCCPU to [h]decr timer callbacks
ppc: Pass PowerPCCPU to [h]decr callbacks
ppc: Pass PowerPCCPU to ppc_set_irq()
kvm: Pass CPUState to kvm_vcpu_ioctl()
kvm: Pass CPUState to kvm_arch_*
cpu: Move kvm_fd into CPUState
qdev-properties.c: Separate core from the code used only by qemu-system-*
qdev: Coding style fixes
cpu: Introduce CPUListState struct
target-alpha: Add support for -cpu ?
target-alpha: Turn CPU definitions into subclasses
target-alpha: Avoid leaking the alarm timer over reset
alpha: Pass AlphaCPU array to Typhoon
target-alpha: Let cpu_alpha_init() return AlphaCPU
At the moment, when irqfd is in use but a vector is masked,
qemu will poll it and handle vector masks in userspace.
Since almost no one ever looks at the pending bits,
it is better to defer this until pending bits
are actually read.
Implement this optimization using the new poll notifier.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Upper 16 bits of the PCIe Extended Capability Header was truncated during update,
also breaking pcie_add_capability.
Signed-off-by: Knut Omang <knut.omang@oracle.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Enable 64 bits bar emulation.
Test pass with the current seabios which already support 64bit pci bars.
Signed-off-by: Xudong Hao <xudong.hao@intel.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
* Define enum for TMP105 registers
* Move tmp105_set() from I2C to TMP105 header
* Document units and range of temperature as preconditions
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alex Horn <alex.horn@cs.ox.ac.uk>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Move bindings from opaque to DeviceState.
This gives us better type safety with no performance cost.
Add macros to make future QOM work easier.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* bonzini/header-dirs: (45 commits)
janitor: move remaining public headers to include/
hw: move executable format header files to hw/
fpu: move public header file to include/fpu
softmmu: move remaining include files to include/ subdirectories
softmmu: move include files to include/sysemu/
misc: move include files to include/qemu/
qom: move include files to include/qom/
migration: move include files to include/migration/
monitor: move include files to include/monitor/
exec: move include files to include/exec/
block: move include files to include/block/
qapi: move include files to include/qobject/
janitor: add guards to headers
qapi: make struct Visitor opaque
qapi: remove qapi/qapi-types-core.h
qapi: move inclusions of qemu-common.h from headers to .c files
ui: move files to ui/ and include/ui/
qemu-ga: move qemu-ga files to qga/
net: reorganize headers
net: move net.c to net/
...
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
This separates the qdev properties code in two parts:
- qdev-properties.c, that contains most of the qdev properties code;
- qdev-properties-system.c for code specific for qemu-system-*,
containing:
- Property types: drive, chr, netdev, vlan, that depend on code that
won't be included on *-user
- qemu_add_globals(), that depends on qemu-config.o.
This change should help on two things:
- Allowing DeviceState to be used by *-user without pulling
dependencies that are specific for qemu-system-*;
- Writing qdev unit tests without pulling too many dependencies.
The copyright/license of qdev-properties.c isn't explicitly stated at
the file, so add a simple copyright/license header pointing to the
commit ID of the original file.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>