The DS1338 datasheet documents that the current time is captured into
the secondary registers when the register pointer wraps round to zero
as well as at a START condition. Implement this.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Correct several deficiencies in the handling of the register pointer:
* it should wrap around after 0x3f, not 0xff
* guard against the caller handing us an out of range pointer
(on h/w this can never happen, because only a 7 bit value is
transferred over the I2C bus)
* there was confusion over whether nvram[] holds only the 56 bytes
of guest-accessible NVRAM, or also the secondary registers
which hold the value of the clock captured at the start of a
multibyte read. Correct to consistently be the latter, by fixing
the array size and the offset used for NVRAM writes.
* ds1338_send was attempting to use 'data' as both the data and
the register offset simultaneously, which meant that writes to
any register were broken; fix to use the register pointer.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
s->cpu_enabled is an array, so s->cpu_enabled ? "En" : "Dis" returns
"En" always. We should use s->cpu_enabled[cpu] here.
Signed-off-by: Evgeny Voevodin <e.voevodin@samsung.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
A missing call to qemu_set_irq() when reading the IRQ register
required SW to write to the IRQ register to acknowledge an
interrupt. With this patch the behavior is fixed:
- Reading the interrupt register clears it and updates the timers
interrupt status
- Writes to the interrupt register are ignored
Signed-off-by: Soren Brinkmann <soren.brinkmann@xilinx.com>
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
According to the spec we must raise an interrupt when one is requested
even for non active tds.
Linux depends on this, for bulk transfers it runs an inactivity timer
to work around a bug in early uhci revisions, when we take longer then
200 ms to process a packet, this timer goes of, and as part of the
handling Linux then unlinks the qh, and relinks it after the frindex
has increased by atleast 1, the problem is Linux only checks for the
frindex increases on an interrupt, and we don't send that, causing
the qh to go inactive for more then 32 frames, at which point we
consider the packet cancelled.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Instead simple disconnect the device like host redirection does on
migration.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Added helper function to automatically connect SPI slaves based on the QOM child
nodes of a device. A SSI master device can call this routine to automatically
hook-up all child nodes to its SPI bus.
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Acked-by: Peter Maydell <peter.maydell@linaro.org>
Added the two SPI controllers to the zynq machine model. Attached two SPI flash
devices to each controller.
Signed-off-by: Peter A. G. Crosthwaite <peter.crosthwaite@petalogix.com>
Acked-by: Peter Maydell <peter.maydell@linaro.org>
Added device model for the Xilinx Zynq SPI controller (SPIPS).
Signed-off-by: Peter A. G. Crosthwaite <peter.crosthwaite@petalogix.com>
Acked-by: Peter Maydell <peter.maydell@linaro.org>
Added SPI controller to the reference design, with two n25q128 spi-flashes
connected.
Signed-off-by: Peter A. G. Crosthwaite <peter.crosthwaite@petalogix.com>
Acked-by: Peter Maydell <peter.maydell@linaro.org>
Removed the explicit SSI mux and wired the CS line directly up to the SSI
devices.
Signed-off-by: Peter A. G. Crosthwaite <peter.crosthwaite@petalogix.com>
Allow multiple qdev_init_gpio_in() calls for the one device. The first call will
define GPIOs 0-N-1, the next GPIOs N- ... . Allows different GPIOs to be handled
with different handlers. Needed when two levels of the QOM class heirachy both
define GPIO functionality, as a single GPIO handler with an index selecter is
not possible.
Signed-off-by: Peter A. G. Crosthwaite <peter.crosthwaite@petalogix.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Slave creation function that can be used to create an SSI slave without
qdev_init() being called. This give machine models a chance to set properties.
Signed-off-by: Peter A. G. Crosthwaite <peter.crosthwaite@petalogix.com>
Acked-by: Peter Maydell <peter.maydell@linaro.org>
Added default CS behaviour for SSI slaves. SSI devices can set a property
to enable CS behaviour which will create a GPIO on the device which is the
CS. Tristating of the bus on SSI transfers is implemented.
Signed-off-by: Peter A. G. Crosthwaite <peter.crosthwaite@petalogix.com>
Acked-by: Peter Maydell <peter.maydell@linaro.org>
Removed assertion that only one device is attached to the SSI bus.
When multiple devices are attached, all slaves have their transfer function
called for transfers. Each device is responsible for knowing whether or not its
CS is active, and if not returning 0. The returned data is the logical or of
all responses from the (mulitple) devices.
Signed-off-by: Peter A. G. Crosthwaite <peter.crosthwaite@petalogix.com>
Acked-by: Peter Maydell <peter.maydell@linaro.org>
DO_UPCAST is supposed to translate from the first member of a struct to
that struct, not from arbitrary ones. And it (usually) breaks the build
when neglecting this rule. Use container_of to fix the build breakage
and likely also the runtime behavior.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
aw: runtime behavior is actually the same, but clearly misuse of DO_UPCAST
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Blue Swirl reports that Clang doesn't like the structure we define to
avoid dynamic allocation for a number of calls to VFIO_DEVICE_SET_IRQS.
Adding an element after a variable sized type is a GNU extension.
Switch back to dynamic allocation, which really isn't a problem since
this is only done on interrupt setup changes.
Cc: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Take what we've learned from pci-assign and apply it to vfio-pci.
On reset, disable previous interrupt config, perform a device
reset if available, re-enable INTx, and disable memory regions on
the device to prevent continuing DMA.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
This was a misinterpretation of the spec, hardware doesn't get to
specify how many were actually enabled through this field.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
We try to do lazy initialization of MSIX since we don't actually need
to setup anything until MSIX vectors start getting used. This leads
to problems if MSIX is enabled, but never used (we can end up trying
to re-enable INTx while it's still enabled). We also run into
problems trying to expand our reset function to tear down interrupts
as we can then get vector release notifications after we've released
data structures. By making explicit initialization and teardown we
can avoid both of these problems and behave more similar to bare
metal.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Occasionally we get regions added that overlap with existing mappings.
These always seems to be in the VGA ROM range. VFIO returns EBUSY
for these mapping attempts. We can try a little harder and assume
that the latest mapping is correct by removing any overlapping ranges
and retrying the original request.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
We can't afford the overhead of switching out and back into mmap mode
around each interrupt, but we can do it lazily via a timer. On INTx
interrupt, disable the mmap'd memory regions and set a timer. On
every interrupt, push the timer out. If the timer expires and the
interrupt is no longer pending, switch back to mmap mode.
This has the benefit that things like graphics cards, which rarely or
never, fire an interrupt don't need manual user intervention to add
the x-intx=off parameter. They'll just remain in mmap mode until they
trigger an interrupt, and if they don't continue to regularly fire
interrupts, they'll switch back.
The default timeout is tuned for network cards so that a ping is just
enough to keep them in non-mmap mode, where they have much better
latency. It is tunable with an experimental option,
x-intx-mmap-timeout-ms. A value of 0 keeps the device in non-mmap
mode after the first interrupt.
It's possible we could look at the class code of devices and come up
with reasonable per-class defaults based on expected interrupt
frequency and latency. None of this is used for MSI interrupts and
also won't be used if we can bypass through KVM.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
nc.link_down could not be migrated, this patch updates link_down in
virtio_post_load() to keep it coincident with real link status.
Signed-off-by: Amos Kong <akong@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
This patch introduced e1000_post_load(), it will be called in the end of
migration. nc.link_down could not be migrated, this patch updates
link_down in e1000_post_load() to keep it coincident with real link
status.
Signed-off-by: Amos Kong <akong@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
Add a link status chang callback and change the link status bit in BMSR
& MSR accordingly. Tested in Linux/Windows guests.
The link status bit of MediaStatus is infered from BasicModeStatus,
they are inverse.
nc.link_down could not be migrated, this patch updates link_down in
rtl8139_post_load() to keep it coincident with real link status.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Amos Kong <akong@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
With the next qemu version (1.3) we are going to bump the qxl device
revision to 4. The new features available require a recent spice-server
version, so raise up the bar. Otherwise we would end up with different
qxl revisions depending on the spice-server version installed, which
would be a major PITA when it comes to compat properties.
Clear out a big bunch of #ifdefs which are not needed any more.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
As suggested by Paolo Bonzini, to avoid possible integer overflow issues.
Signed-off-by: Alon Levy <alevy@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
This prevents a segfault later on when the device reset handler
tries to access a NULL ssd.worker since interface_attach_worker has
not been called.
Signed-off-by: Alon Levy <alevy@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Enables QXL_IO_FLUSH_SURFACES_ASYNC and QXL_IO_FLUSH_RELEASE
which are part of the qxl rev3 feature set.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
This cleans up two additions of almost the same code in commits
511b13e2c9 and ccc2960d65. While at it, make error paths
consistent (always use 'break' instead of 'return').
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Cc: Dunrong Huang <riegamaths@gmail.com>
Cc: Alon Levy <alevy@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Don't try to be clever and skip displaysurface reinitialization in case
the size hasn't changed. Other parameters might have changed
nevertheless, for example depth or stride, resulting in rendering being
broken then.
Trigger: boot linux guest with vesafb, start X11, make sure both vesafb
and X11 use the display same resolution. Then watch X11 screen being
upside down.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
* 'trivial-patches' of git://github.com/stefanha/qemu:
versatilepb: Use symbolic indices for ARM PIC
qdev: kill bogus comment
qemu-barrier: Fix compiler version check for future gcc versions
hw: Add missing 'static' attribute for QEMUMachine
cleanup useless return sentence
qemu-sockets: Fix compiler warning (regression for MinGW)
vnc: Fix spelling (hellmen -> hellman) in comment
slirp: Fix spelling in comment (enought -> enough, insure -> ensure)
tcg/arm: Use tcg_out_mov_reg rather than inline equivalent code
cpu: Add missing 'static' attribute to qemu_global_mutex
configure: Support empty target list (--target-list=)
hw: Fix return value check for bdrv_read, bdrv_write
* 'ppc-for-upstream' of git://repo.or.cz/qemu/agraf: (35 commits)
PPC: KVM: Fix BAT put
PPC: e500: Only expose even TLB sizes in initial TLB
ppc/pseries: Reset VPA registration on CPU reset
pseries: Don't test for MSR_PR for hypercalls under KVM
PPC: e500: calculate initrd_base like dt_base
PPC: e500: increase DTC_LOAD_PAD
device tree: simplify dumpdtb code
fdt: move dumpdtb interpretation code to device_tree.c
target-ppc: Remove unused power_mode field from cpu state
pseries: Set hash table size based on RAM size
pseries: Remove unnecessary locking from PAPR hash table hcalls
ppc405_uc: Fix buffer overflow
target-ppc: KVM: Fix some kernel version edge cases for kvmppc_reset_htab()
pseries: Fix semantics of RTAS int-on, int-off and set-xive functions
pseries: Rework implementation of TCE bypass
pseries: Remove never used flags field from spapr vio devices
pseries: Remove XICS irq type enum type
pseries: Remove C bitfields from xics code
pseries: Small cleanup to H_CEDE implementation
pseries: Fix XICS reset
...
Now that all machines call isa_vga_init() or pci_vga_init(), some unused
code can be removed.
Cc: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
The CONFIG_SPICE is now tested in vl.c and thus not needed anymore.
Cc: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
As a bonus it allows new vga card types (including none).
Acked-by: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Keep the case to prevent some vga card to be selected.
Cc: Alexander Graf <agraf@suse.de>
Cc: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
As a bonus it allows new vga card types (including none).
Acked-by: Andreas Färber <andreas.faerber@web.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
This remove the fallback to std-vga in case, as availability of the
requested vga device is now tested in vl.c, and returns an error message
to the user.
Acked-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
This function create a ISA VGA device according to the value of
vga_interface_type. It returns a ISADevice (and not a DeviceState).
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
This function create a PCI VGA device according to the value of
vga_interface_type. It returns a PCIDevice (and not a DeviceState).
Cc: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
This better explains what is this function about. Adjust all callers.
Cc: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
This better explains what is this function about. Adjust all callers.
Cc: Alexander Graf <agraf@suse.de>
Cc: Andreas Färber <andreas.faerber@web.de>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Anthony Liguori <aliguori@us.ibm.com>
Acked-by: Richard Henderson <rth@twiddle.net>
Acked-by: Blue Swirl <blauwirbel@gmail.com>
Acked-by: Andreas Färber <andreas.faerber@web.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
The entries for libhw* are no longer needed in .gitignore.
There is also no longer a difference between common-obj-y and
hw-obj-y, so one of those two macros is sufficient.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
It is more readable, and all other code does it like that, too.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
When the DeviceInfo code was removed, the comment describing
qdev_subclass_init() was left in the code by mistake. Remove it.
Cc: qemu-trivial@nongnu.org
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
It was missing for leon3 and mips_fulong2e.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
This patch cleans up return sentences in the end of void functions.
Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Amos Kong <akong@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
Implement the century byte in the RTC emulation, and test that it works.
This leads to some annoying compatibility code because we need to treat
a value of 2000 for the base_year property as "use the century byte
properly" (which would be a value of 0).
The century byte will now be always-zero, rather than always-20,
for the MIPS Magnum machine whose base_year is 1980. Commit 42fc73a
(Support epoch of 1980 in RTC emulation for MIPS Magnum, 2009-01-24)
correctly said:
With an epoch of 1980 and a year of 2009, one could argue that [the
century byte] should hold either 0, 1, 19 or 20. NT 3.50 on MIPS
does not read the century byte.
so I picked the simplest and most sensible implementation which is to
return 0 for 1980-2079, 1 for 2080-2179 and so on.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
QEMU's attempt to implement the century byte cover two possible places
for the byte. A common one on modern chipsets is 0x32, but QEMU also
stores the value in 0x37 (apparently for IBM PS/2 compatibility---it's
only been 25 years). To simplify the implementation of the century
byte, store it only at 0x32 but remap transparently 0x37 to 0x32 when
reading and writing from CMOS.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Adjust all uses s/strzcpy/strncpy/ and mark these uses
of strncpy as "ok".
Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Don't use strncpy when the source string is known to fit
in the destination buffer. Use equivalent memcpy.
We could even use strcpy, here, but some static analyzers
warn about that, so don't add new uses.
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
In all of these cases, the uses of strncpy were unnecessary, since
at each point of use we know that the NUL-terminated source bytes
fit in the destination buffer. Use memcpy in place of strncpy.
Acked-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
In bt_hci_name_req a failed snprintf could return len larger than
sizeof(params.name), which means the following memset call would
have a "length" value of (size_t)-1, -2, etc... Sounds scary.
But currently, one can deduce that there is no problem:
strlen(slave->lmp_name) is guaranteed to be smaller than
CHANGE_LOCAL_NAME_CP_SIZE, which is the same as sizeof(params.name),
so this cannot happen. Regardless, there is no justification for
using snprintf+memset. Use pstrcpy instead.
Also, in bt_hci_event_complete_read_local_name, use pstrcpy in place
of unwarranted strncpy.
Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Actually do what the comment says, using pstrcpy NUL-terminate:
strncpy does not always do that.
Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
v9fs_add_dir_node and qemu_v9fs_synth_add_file used strncpy
to form node->name, which requires NUL-termination, but
strncpy does not ensure NUL-termination.
Use pstrcpy, which does.
Acked-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Use g_strdup rather than strdup, because the sole caller
(qdev_get_fw_dev_path_helper) assumes it gets non-NULL, and dereferences
it. Besides, in that caller, the allocated buffer is already freed with
g_free, so it's better to allocate with a matching g_strdup.
In one case, (scsi-bus.c) it was trivial, so I replaced an snprintf+
g_strdup combination with an equivalent g_strdup_printf use.
Signed-off-by: Jim Meyering <meyering@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Those functions return -errno in case of an error.
The old code would typically only detect EPERM (1) errors.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@gmail.com>
* sstabellini/xen-2012-10-03:
xen: Set the vram dirty when an error occur.
exec, memory: Call to xen_modified_memory.
exec: Introduce helper to set dirty flags.
xen: Introduce xen_modified_memory.
QMP, Introduce xen-set-global-dirty-log command.
qemu/xen: Add 64 bits big bar support on qemu
xen: Fix, no unplug of pt device by platform device.
* kwolf/for-anthony: (30 commits)
qemu-iotests: add tests for streaming error handling
qemu-iotests: map underscore to dash in QMP argument names
blkdebug: process all set_state rules in the old state
stream: add on-error argument
block: introduce block job error
iostatus: reorganize io error code
iostatus: change is_read to a bool
iostatus: move BlockdevOnError declaration to QAPI
iostatus: rename BlockErrorAction, BlockQMPEventAction
qemu-iotests: add test for pausing a streaming operation
qmp: add block-job-pause and block-job-resume
block: add support for job pause/resume
qmp: add 'busy' member to BlockJobInfo
block: add block_job_query
block: move job APIs to separate files
block: fix documentation of block_job_cancel_sync
qerror/block: introduce QERR_BLOCK_JOB_NOT_ACTIVE
qemu-iotests: add initial tests for live block commit
QAPI: add command for live block commit, 'block-commit'
block: helper function, to find the base image of a chain
...
* qmp/queue/qmp:
block: live snapshot documentation tweaks
input: index_from_key(): drop unused code
qmp: qmp_send_key(): accept key codes in hex
input: qmp_send_key(): simplify
hmp: dump-guest-memory: hardcode protocol argument to "file:"
qmp: dump-guest-memory: don't spin if non-blocking fd would block
qmp: dump-guest-memory: improve schema doc (again)
qapi: convert add_client
monitor: add Error * argument to monitor_get_fd
pci-assign: use monitor_handle_fd_param
qapi: add "unix" to the set of reserved words
qapi: do not protect enum values from namespace pollution
Add qemu-ga-client script
Support settimeout in QEMUMonitorProtocol
Make negotiation optional in QEMUMonitorProtocol
* mst/tags/for_anthony:
virtio-serial-bus: let chardev know the exact number of bytes requested
virtio: Introduce virtqueue_get_avail_bytes()
virtio: use unsigned int for counting bytes in vq
iov: add const annotation
virtio-net: fix used len for tx
virtio: don't mark unaccessed memory as dirty
* kraxel/usb.66:
usb: Fix usb_packet_map() in the presence of IOMMUs
usb-redir: Adjust pkg-config check for usbredirparser .pc file rename (v2)
ehci: Fix interrupt packet MULT handling
xhci: create a memory region for each port
xhci: route string & usb hub support
xhci: tweak limits
compat: turn off msi/msix on xhci for old machine types
add pc-1.3 machine type
Conflicts:
hw/pc_piix.c
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
The hassle and compile time overhead of maintaining both 32-bit and 64-bit
capable source isn't worth the tiny performance advantage which is seen on
a minority of configurations. Switch to compiling libhw only once, with
target_phys_addr_t unconditionally typedefed to uint64_t.
Signed-off-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
When booting our e500 machine, we automatically generate a big TLB entry
in TLB1 that covers all of the code we need to run in there until the guest
can handle its TLB on its own.
However, e500v2 can only handle MAS1.0 sizes. However, we keep our TLB
information in MAS2.0 layout, which means we have twice as many TLB sizes
to choose from. That also means we can run into a situation where we try
to add a TLB size that could not fit into the MAS1.0 size bits.
Fix it by making sure we always have the lower bit set to 0. That way we
are always guaranteed to have MAS1.0 compatible TLB size information.
Signed-off-by: Alexander Graf <agraf@suse.de>
PAPR hypercalls should only be invoked from the guest kernel, not guest
user programs, that is, with MSR[PR]=0. Currently we check this in
spapr_hypercall, returning H_PRIVILEGE if MSR[PR]=1.
However, under KVM the state of MSR[PR] is already checked by the host
kernel before passing the hypercall to qemu, making this check redundant.
Worse, however, we don't generally synchronize KVM and qemu state on the
hypercall path, meaning that qemu could incorrectly reject a hypercall
because it has a stale MSR value.
This patch fixes the problem by moving the privilege test exclusively to
the TCG hypercall path.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
CC: qemu-stable@nongnu.org
Signed-off-by: Alexander Graf <agraf@suse.de>
While investigating dtb pad issues, I noticed that initrd_base wasn't taking
loadaddr into account the way dt_base was. This seems wrong.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
An allowance of 5 MiB for BSS is not enough for Linux kernels with certain
debug options enabled (not sure exactly which one caused it, but I'd guess
lockdep). The kernel I ran into this with had a BSS of around 6.4 MB.
Unfortunately, uImage does not give us enough information to determine the
actual BSS size. Increase the allowance to 18 MiB to give us plenty of
room. Eventually this should be more intelligent, possibly packing
initrd+dtb at the end of guest RAM.
Signed-off-by: Scott Wood <scottwood@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Currently the pseries machine code always attempts to set the size of the
guests's hash page table to 16MB. However, because of the way the POWER
MMU works, a suitable hash page table size should really depend on memory
size. 16MB will be excessive for guests with <1GB and RAM, and may not be
enough for guests with >2GB of RAM (depending on guest page size and
other factors).
The usual given rule of thumb is that the hash table should be 1/64 of
the size of memory, but in fact the Linux guests we are aiming at don't
really need that much. This patch, therefore, changes the hash table
allocation code to aim for 1/128 of the size of RAM (rounding up). When
using KVM, this size may still be adjusted by the host kernel if it is
unable to allocate a suitable (contiguous) table.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
In the paravirtualized environment provided by PAPR, there is a standard
locking scheme so that hypercalls updating the hash page table from
different guest threads don't corrupt the haah table state. We implement
this HVLOCK bit in out page table hypercalls. However, it is not necessary
in our case, since the hypercalls all run in the qemu environment under the
big qemu lock.
Therefore, this patch removes the locking code. This has the additional
advantage of freeing up a hash PTE bit which will be useful for migration
support.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Report from smatch:
ppc405_uc.c:209 dcr_read_pob(12) error: buffer overflow 'pob->besr' 2 <= 2
ppc405_uc.c:232 dcr_write_pob(12) error: buffer overflow 'pob->besr' 2 <= 2
The old code reads and writes besr[POB0_BESR1 - POB0_BESR0] or besr[2]
which is one too much.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
Currently the ibm,int-on and ibm,int-off RTAS functions are implemented as
no-ops. This is because when implemented as specified in PAPR they caused
Linux (which calls both int-on/off and set-xive) to end up with interrupts
masked when they should not be. Since Linux's set-xive calls make the
int-on/off calls redundant, making them nops worked around the problem.
In fact, the problem was caused because there was a subtle bug in set-xive,
PAPR specifies that as well as updating the current priority, it also needs
to update the saved priority used by int-on/off. With this bug fixed the
problem goes away. This patch implements this more correct fix.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
On the pseries machine the IOMMU (aka TCE tables) is always active for all
PCI and VIO devices. Mostly to simplify the SLOF firmware, we implement an
extension which allows the IOMMU to be temporarily disabled for certain
devices.
Currently this is implemented by setting the device's DMAContext pointer to
NULL (thus reverting to qemu's default no-IOMMU DMA behaviour), then
replacing it when bypass mode is disabled.
This approach causes a bunch of complications though. It complexifies the
management of the DMAContext lifetimes, it's problematic for savevm/loadvm,
and it means that while bypass is active we have nowhere to store the
device's LIOBN (Logical IO Bus Number, used to identify DMA address
spaces). At present we regenerate the LIOBN from other address information
but this restricts how we can allocate LIOBNs.
This patch gives up on this approach, replacing it with the much simpler
one of having a 'bypass' boolean flag in the TCE state structure.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
The general device state structure for PAPR VIO emulated devices includes a
'flags' field which was never used. This patch removes it.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
Currently the XICS interrupt controller emulation uses a custom enum to
specify whether a given interrupt is level-sensitive or message-triggered.
This enum makes life awkward for saving the state, and isn't particularly
useful since there are only two possibilities. This patch replaces the
enum with a simple bool.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
The XICS interrupt controller emulation uses some C bitfield variables in
its internal state structure. This makes like awkward for saving the state
because we don't have easy VMSTATE helpers for bitfields.
This patch removes the bitfields, instead using explicit bit masking in a
single status variable.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
The H_CEDE hypercall implementation for the pseries machine doesn't trigger
quite the right path in the main cpu exec loop. We should set exit_request
to pop up one extra level and recheck state, and we should set the
exception_index to EXCP_HLT (H_CEDE is roughly equivalent to the hlt
instruction on x86).
In practice, this doesn't really matter except for KVM, and KVM implements
H_CEDE internally so we never hit this code path. But we might as well
get it right, just in case it matters some day.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
The XICS interrupt controller used on the pseries machine currently has no
reset handler. We can get away with this under some circumstances, but
it's not correct, and can cause failures if the XICS happens to be in the
wrong state at the time of reset.
This patch adds a hook to properly reset the XICS state.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
The emulated PCI host bridge on the pseries machine incorporates an IOMMU
(PAPR TCE table). Currently the mappings in this IOMMU are not cleared
when we reset the system. This patch fixes this bug. To do this it adds
a new reset function to the IOMMU emulation code. The VIO devices already
reset their TCE tables, but they do so by destroying and re-creating their
DMA context. This doesn't work for the PCI host bridge, because the
infrastructure for PCI IOMMUs has already copied/cached the DMA pointer
context into the subordinate PCI device structures.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
When we reset the system, the reset method for VIO bus devices resets
the state of their request queue (if present) as it should. However
it was not resetting the state of their TCE table (DMA translation) if
present. It was also not resetting the state of the per-device signal
mask set with H_VIO_SIGNAL. This patch corrects both bugs, and also
removes some small code duplication in the reset paths.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
This adds support for then new "reset htab" ioctl which allows qemu
to properly cleanup the MMU hash table when the guest is reset. With
the corresponding kernel support, reset of a guest now works properly.
This also paves the way for indicating a different size hash table
to the kernel and for the kernel to be able to impose limits on
the requested size.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>