Besides making the code cleaner, we will need a separate way to access
IACK in order to implement EPR (external proxy) interrupt delivery.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Search the queue more efficiently by first looking for a non-zero word,
and then using the common bit-searching function to find the bit within
the word. It would be even nicer if bitops_ffsl() could be hooked up
to the compiler intrinsic so that bit-searching instructions could be
used, but that's another matter.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Previously, the sense and priority bits were masked off when writing
to IVPR, and all interrupts were treated as edge-triggered (despite
the existence of code for handling level-triggered interrupts).
Polarity is implemented only as storage. We don't simulate the
bad effects that you'd get on real hardware if you set this incorrectly,
but at least the guest sees the right thing when it reads back the register.
Sense now controls level/edge on FSL external interrupts (and all
interrupts on non-FSL MPIC). FSL internal interrupts do not have a sense
bit (reads as zero), but are level. FSL timers and IPIs do not have
sense or polarity bits (read as zero), and are edge-triggered. To
accommodate FSL internal interrupts, QEMU's internal notion of whether an
interrupt is level-triggered is separated from the IVPR bit.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
The two checks with abort() guard against potential QEMU-internal
problems, but the EOI check stops the guest from causing updates to queue
position -1 and other havoc if it writes EOI with no interrupt in
service.
Signed-off-by: Scott Wood <scottwood@freescale.com>
[agraf: remove hunk in code that didn't get applied yet]
Signed-off-by: Alexander Graf <agraf@suse.de>
Besides the private implementation being redundant, namespace collisions
prevented the use of other things in bitops.h.
Serialization does get a bit more awkward, unfortunately, since the
standard bitmap operations are "unsigned long" rather than "uint32_t",
though in exchange we will get faster queue lookups on 64-bit hosts once
we search a word at a time.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
This reverts commit a9bd83f4c65de0058659ede009fa1a241f379edd.
This counting approach is not robust against setting a bit that
was already set, or clearing a bit that was already clear. Perhaps
that is considered a bug, but besides the lack of any documentation
for that restriction, it's a pretty unpleasant way for the problem
to manifest itself.
It could be made more robust by testing the current value of the
bit before changing the count, but a later patch speeds up IRQ_check
in all cases, not just when there's nothing pending. Hopefully that
should be adequate to address performance concerns.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Previously the code relied on the queue's "next" field getting
set to -1 sometime between an update to the bitmap, and the next
call to IRQ_get_next. Sometimes this happened after the update.
Sometimes it happened before the check. Sometimes it didn't happen
at all.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Other priorities are signed, so avoid comparisons between
signed and unsigned.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Critical interrupts on FSL MPIC are not supposed to pay
attention to priority, IACK, EOI, etc. On the currently modeled
version it's not supposed to pay attention to the mask bit either.
Also reorganize to make it easier to implement newer FSL MPIC models,
which encode interrupt level information differently and support
mcheck as well as crit, and to reduce problems for later patches
in this set.
Still missing is the ability to lower the CINT signal to the core,
as IACK/EOI is not used. This will come with general IRQ-source-driven
lowering in the next patch.
New state is added which is not serialized, but instead is recomputed
in openpic_load() by calling the appropriate write_IRQreg function.
This should have the side effect of causing the IRQ outputs to be
raised appropriately on load, which was missing.
The serialization format is altered by swapping ivpr and idr (we'd like
IDR to be restored before we run the IVPR logic), and moving interrupts
to the end (so that other state has been restored by the time we run the
IDR/IVPR logic. Serialization for this driver is not yet in a state
where backwards compatibility is reasonable (assuming it works at all),
and the current serialization format was not built for extensibility.
Signed-off-by: Scott Wood <scottwood@freescale.com>
[agraf: fix for current code state]
Signed-off-by: Alexander Graf <agraf@suse.de>
The base openpic specification doesn't provide abbreviated register
names, so it's somewhat understandable that the QEMU code made up
its own, except that most of the names that QEMU used didn't correspond
to the terminology used by any implementation I could find.
In some cases, like PCTP, the phrase "processor current task priority"
could be found in the openpic spec when describing the concept, but
the register itself was labelled "current task priority register"
and every implementation seems to use either CTPR or the full phrase.
In other cases, individual implementations disagree on what to call
the register. The implementations I have documentation for are
Freescale, Raven (MCP750), and IBM. The Raven docs tend to not use
abbreviations at all. The IBM MPIC isn't implemented in QEMU. Thus,
where there's disagreement I chose to use the Freescale abbreviations.
Signed-off-by: Scott Wood <scottwood@freescale.com>
[agraf: rebase on current state of the code]
Signed-off-by: Alexander Graf <agraf@suse.de>
This will stop things from breaking once it's properly treated as a
level-triggered interrupt. Note that it's the MPIC's MSI cascade
interrupts that are level-triggered; the individual MSIs are
edge-triggered.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Fix various format errors when debug prints are enabled. Also
cause error checking to happen even when debug prints are not
enabled, and consistently use 0x for hex output.
Signed-off-by: Scott Wood <scottwood@freescale.com>
[agraf: adjust for more recent code base, prettify DPRINTF macro]
Signed-off-by: Alexander Graf <agraf@suse.de>
This patch install the timer reset handler. This will be called when
the guest is reset.
Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
[agraf: adjust for QOM'ification]
Signed-off-by: Alexander Graf <agraf@suse.de>
This patch fixes the following coding style violations:
- structs have to be typedef and be CamelCase
- if()s are always surrounded by curly braces
Signed-off-by: Alexander Graf <agraf@suse.de>
If we access a register via the QEMU memory inspection commands (e.g.
"xp") rather than from guest code, we won't have a CPU context.
Gracefully fail to access the register in that case, rather than
crashing.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
"opp->nb_irqs-1" would have been a minor coding style error,
but putting in one space but not the other makes it look
confusingly like a numeric literal "-1".
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
It's in the address range that normally contains a magic redirection
to the CPU-specific region of the curretn CPU, but it isn't actually
a per-CPU register. On real hardware BRR1 shows up only at 0x40000,
not at 0x60000 or other non-magic per-CPU areas. Plus, this makes
it possible to read the register on the QEMU command line with "xp".
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Previously only the spurious vector was sized appropriately
to the openpic model.
Also, instances of "IPVP_VECTOR(opp->spve)" were replace with
just "opp->spve", as opp->spve is already just a vector and not
an IVPR.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
I could not find this register in any spec (FSL, IBM, or OpenPIC)
and the code doesn't do anything with it but initialize, save,
or restore it.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Deefine symbolic names for some register bits, and use some that
have already been defined.
Also convert some register values from hex to decimal when it improves
readability.
IPVP_PRIORITY_MASK is corrected from (0x1F << 16) to (0xF << 16), in
conjunction with making wider use of the symbolic name. I looked at
Freescale and IBM MPIC docs and at the base OpenPIC spec, and all three
had priority as 4 bits rather than 5. Plus, the magic nubmer that is
being replaced with symbolic values treated the field as 4 bits wide.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
pc-testdev.c cannot be compiled with MinGW (and other non POSIX hosts):
CC i386-softmmu/hw/i386/../pc-testdev.o
qemu/hw/i386/../pc-testdev.c:38:22: warning: sys/mman.h: file not found
qemu/hw/i386/../pc-testdev.c: In function ‘test_flush_page’:
qemu/hw/i386/../pc-testdev.c:103: warning: implicit declaration of function ‘mprotect’
...
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
This typically reduces the size from 512 bytes to 128 bytes.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
sys/mman.h is not needed (tested on Linux) and unavailable for MinGW,
so remove it.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
pc_fw_add_pflash_drv() ignores qemu_find_file() failure, and happily
creates a drive without a medium.
When pc_system_flash_init() asks for its size, bdrv_getlength() fails
with -ENOMEDIUM, which isn't checked either. It fails relatively
cleanly only because -ENOMEDIUM isn't a multiple of 4096:
$ qemu-system-x86_64 -S -vnc :0 -bios nonexistant
qemu: PC system firmware (pflash) must be a multiple of 0x1000
[Exit 1 ]
Fix by handling the qemu_find_file() failure.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Prehistoric leftover, zap it. We poweroff via acpi these days.
And having a port (0x501,0x502) where any random guest write will make
qemu exit -- with no way to turn it off -- is a bad joke anyway.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Add a test device which supports the kvmctl ioports,
so one can run the KVM unittest suite.
Intended Usage:
qemu-system-x86_64 -nographic \
-device pc-testdev \
-device isa-debug-exit,iobase=0xf4,iosize=0x04 \
-kernel /path/to/kvm/unittests/msr.flat
Where msr.flat is one of the KVM unittests, present on a
separate repo,
git://git.kernel.org/pub/scm/virt/kvm/kvm-unit-tests.git
[ kraxel: more memory api + qom fixes ]
CC: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Lucas Meneghel Rodrigues <lmr@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
When present it makes qemu exit on any write.
Mapped to port 0x501 by default.
Without this patch Anthony doesn't allow me to
remove the bochs bios debug ports because his
test suite uses this.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The hw/dataplane/vring.c code includes linux/virtio_ring.h. Ensure that
we use linux-headers/ instead of the system-wide headers, which may be
out-of-date on older distros.
This resolves the following build error on Debian 6:
CC hw/dataplane/vring.o
cc1: warnings being treated as errors
hw/dataplane/vring.c: In function 'vring_enable_notification':
hw/dataplane/vring.c:71: error: implicit declaration of function 'vring_avail_event'
hw/dataplane/vring.c:71: error: nested extern declaration of 'vring_avail_event'
hw/dataplane/vring.c:71: error: lvalue required as left operand of assignment
Note that we now build dataplane/ for each target instead of only once.
There is no way around this since linux-headers/ is only available for
per-target objects - and it's how virtio, vfio, kvm, and friends are
built.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Currently, all unknown requests are treated as VIRTIO_BLK_T_IN
Signed-off-by: Alexey Zaytsev <alexey.zaytsev@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The virtio-blk-data-plane feature is easy to integrate into
hw/virtio-blk.c. The data plane can be started and stopped similar to
vhost-net.
Users can take advantage of the virtio-blk-data-plane feature using the
new -device virtio-blk-pci,x-data-plane=on property.
The x-data-plane name was chosen because at this stage the feature is
experimental and likely to see changes in the future.
If the VM configuration does not support virtio-blk-data-plane an error
message is printed. Although we could fall back to regular virtio-blk,
I prefer the explicit approach since it prompts the user to fix their
configuration if they want the performance benefit of
virtio-blk-data-plane.
Limitations:
* Only format=raw is supported
* Live migration is not supported
* Block jobs, hot unplug, and other operations fail with -EBUSY
* I/O throttling limits are ignored
* Only Linux hosts are supported due to Linux AIO usage
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
virtio-blk-data-plane is a subset implementation of virtio-blk. It only
handles read, write, and flush requests. It does this using a dedicated
thread that executes an epoll(2)-based event loop and processes I/O
using Linux AIO.
This approach performs very well but can be used for raw image files
only. The number of IOPS achieved has been reported to be several times
higher than the existing virtio-blk implementation.
Eventually it should be possible to unify virtio-blk-data-plane with the
main body of QEMU code once the block layer and hardware emulation is
able to run outside the global mutex.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Two slightly different versions of a patch to conditionally set
VIRTIO_BLK_F_CONFIG_WCE through the "config-wce" qdev property have been
applied (ea776abca and eec7f96c2). David Gibson
<david@gibson.dropbear.id.au> noticed that the "config-wce"
property is broken as a result and fixed it recently.
The fix sets the host_features VIRTIO_BLK_F_CONFIG_WCE bit from a qdev
property. Unfortunately, the virtio device then has no chance to test
for the presence of the feature bit during virtio_blk_init().
Therefore, reinstate the VirtIOBlkConf->config_wce flag. Drop the
duplicate qdev property to set the host_features bit. The
VirtIOBlkConf->config_wce flag will be used by virtio-blk-data-plane in
a later patch.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The IOQueue has a pool of iocb structs and a function to add new
read/write requests. Multiple requests can be added before calling the
submit function to actually tell the host kernel to begin I/O. This
allows callers to batch requests and submit them in one go.
The actual I/O is performed using Linux AIO.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Outside the safety of the global mutex we need to poll on file
descriptors. I found epoll(2) is a convenient way to do that, although
other options could replace this module in the future (such as an
AioContext-based loop or glib's GMainLoop).
One important feature of this small event loop implementation is that
the loop can be terminated in a thread-safe way. This allows QEMU to
stop the data plane thread cleanly.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The virtio-blk-data-plane cannot access memory using the usual QEMU
functions since it executes outside the global mutex and the memory APIs
are this time are not thread-safe.
This patch introduces a virtqueue module based on the kernel's vhost
vring code. The trick is that we map guest memory ahead of time and
access it cheaply outside the global mutex.
Once the hardware emulation code can execute outside the global mutex it
will be possible to drop this code.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The data plane thread needs to map guest physical addresses to host
pointers. Normally this is done with cpu_physical_memory_map() but the
function assumes the global mutex is held. The data plane thread does
not touch the global mutex and therefore needs a thread-safe memory
mapping mechanism.
Hostmem registers a MemoryListener similar to how vhost collects and
pushes memory region information into the kernel. There is a
fine-grained lock on the regions list which is held during lookup and
when installing a new regions list.
When the physical memory map changes the MemoryListener callbacks are
invoked. They build up a new list of memory regions which is finally
installed when the list has been completed.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This optimizes MSIX handling in virtio-pci.
Also included is pci express capability bugfix.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.12 (GNU/Linux)
iQEcBAABAgAGBQJQ2tD2AAoJECgfDbjSjVRpUNcIAKN2c+3iiutUWFBBII2TWppc
QAQ4Q5HK7gCtAnwNrlQMAIXcUzHBd5s6BW74BaFBZYymf/tqe4CsvmIH15qQyvm0
McdJAba3FLk0+TELG/Fmf4+faM/kr3gl5Cve3YJC69NHpcq3gi8V4696sP8cGfUt
atA+NR8AITBJDmQlcq6Vwfp+t+B1MY9D9SROT/BmfO+/kY3krkhlPL2pdcoinBa2
zKJLz+jE0tjz7kZ99bmbb2uzKImvtFwxCVZjhD0UINjDOWd9k6ao2pWQIEftv56z
zwz/L8TKCFdM2350XXPg99f4WbrvBqmg3Slb4vrsIYEuAWvArI8sUSYG3rC4fS4=
=8Jun
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'mst/tags/for_anthony' into staging
pci,virtio
This optimizes MSIX handling in virtio-pci.
Also included is pci express capability bugfix.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
* mst/tags/for_anthony:
virtio-pci: don't poll masked vectors
msix: expose access to masked/pending state
msi: add API to get notified about pending bit poll
pcie: Fix bug in pcie_ext_cap_set_next
virtio: make bindings typesafe
There are several ARM and MIPS boards which are manufactured with
either Intel (pflash_cfi01.c) or AMD (pflash_cfi02.c) flash memory.
The Linux kernel supports both and first probes for AMD flash which
resulted in one or two warnings from the Intel flash emulation:
pflash_write: Unimplemented flash cmd sequence (offset 0000000000000000, wcycle 0x0 cmd 0x0 value 0xf000f0)
pflash_write: Unimplemented flash cmd sequence (offset 0000000000000000, wcycle 0x0 cmd 0x0 value 0xf0)
These warnings confuse users, so suppress them.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>