The fast path of cpu_physical_memory_sync_dirty_bitmap() directly
manipulates the dirty bitmap. Use atomic_xchg() to make the
test-and-clear atomic.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <1417519399-3166-7-git-send-email-stefanha@redhat.com>
[Only do xchg on nonzero words. - Paolo]
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The cpu_physical_memory_reset_dirty() function is sometimes used
together with cpu_physical_memory_get_dirty(). This is not atomic since
two separate accesses to the dirty memory bitmap are made.
Turn cpu_physical_memory_reset_dirty() and
cpu_physical_memory_clear_dirty_range_type() into the atomic
cpu_physical_memory_test_and_clear_dirty().
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <1417519399-3166-6-git-send-email-stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The dirty memory bitmap is managed by ram_addr.h and copied to
migration_bitmap[] periodically during live migration.
Move the code to sync the bitmap to ram_addr.h where related code lives.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <1417519399-3166-5-git-send-email-stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use set_bit_atomic() and bitmap_set_atomic() so that multiple threads
can dirty memory without race conditions.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <1417519399-3166-4-git-send-email-stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The new bitmap_test_and_clear_atomic() function clears a range and
returns whether or not the bits were set.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <1417519399-3166-3-git-send-email-stefanha@redhat.com>
[Test before xchg; then a full barrier is needed at the end just like
in the previous patch. The barrier can be avoided if we did at least
one xchg. - Paolo]
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use atomic_or() for atomic bitmaps where several threads may set bits at
the same time. This avoids the race condition between threads loading
an element, bitwise ORing, and then storing the element.
When setting all bits in a word we can avoid atomic ops and instead just
use an smp_mb() at the end.
Most bitmap users don't need atomicity so introduce new functions.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <1417519399-3166-2-git-send-email-stefanha@redhat.com>
[Avoid barrier in the single word case, use full barrier instead of write.
- Paolo]
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
cpu_physical_memory_set_dirty_lebitmap unconditionally syncs the
DIRTY_MEMORY_CODE bitmap. This however is unused unless TCG is
enabled.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Most of the time, not all bitmaps have to be marked as dirty;
do not do anything if the interesting ones are already dirty.
Previously, any clean bitmap would have cause all the bitmaps to be
marked dirty.
In fact, unless running TCG most of the time bitmap operations need
not be done at all, because memory_region_is_logging returns zero.
In this case, skip the call to cpu_physical_memory_range_includes_clean
altogether as well.
With this patch, cpu_physical_memory_set_dirty_range is called
unconditionally, so there need not be anymore a separate call to
xen_modified_memory.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
While it is obvious that cpu_physical_memory_get_dirty returns true even if
a single page is dirty, the same is not true for cpu_physical_memory_get_clean;
one would expect that it returns true only if all the pages are clean, but
it actually looks for even one clean page. (By contrast, the caller of that
function, cpu_physical_memory_range_includes_clean, has a good name).
To clarify, rename the function to cpu_physical_memory_all_dirty and return
true if _all_ the pages are dirty. This is the opposite of the previous
meaning, because "all are 1" is the same as "not (any is 0)", so we have to
modify cpu_physical_memory_range_includes_clean as well.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This cuts in half the cost of bitmap operations (which will become more
expensive when made atomic) during migration on non-VRAM regions.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These days modification of the TLB is done in notdirty_mem_write,
so the virtual address and env pointer as unnecessary.
The new name of the function, tlb_unprotect_code, is consistent with
tlb_protect_code.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Remove them from the sundry exec-all.h header, since they are only used by
the TCG runtime in exec.c and user-exec.c.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Invoke xen_modified_memory from cpu_physical_memory_set_dirty_range_nocode;
it is akin to DIRTY_MEMORY_MIGRATION, so set it together with that bitmap.
The remaining call from invalidate_and_set_dirty's "else" branch will go
away soon.
Second, fix the second argument to the function in the
cpu_physical_memory_set_dirty_lebitmap call site. That function is only used
by KVM, but it is better to be clean anyway.
Acked-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
DIRTY_MEMORY_CODE is only needed for TCG. By adding it directly to
mr->dirty_log_mask, we avoid testing for TCG everywhere a region is
checked for the enabled/disabled state of dirty logging.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
dpy_gfx_update_dirty expects DIRTY_MEMORY_VGA logging to be always on,
but that will not be the case soon. Because it computes the memory
region on the fly for every update (with memory_region_find), it cannot
enable/disable logging by itself.
We could always treat updates as invalidations if dirty logging is
not enabled, assuming that the board will enable logging on the
RAM region that includes the framebuffer.
However, the function is unused, so just drop it.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When the dirty log mask will also cover other bits than DIRTY_MEMORY_VGA,
some listeners may be interested in the overall zero/non-zero value of
the dirty log mask; others may be interested in the value of single bits.
For this reason, always call log_start/log_stop if bits have respectively
appeared or disappeared, and pass the old and new values of the dirty log
mask so that listeners can distinguish the kinds of change.
For example, KVM checks if dirty logging used to be completely disabled
(in log_start) or is now completely disabled (in log_stop). On the
other hand, Xen has to check manually if DIRTY_MEMORY_VGA changed,
since that is the only bit it cares about.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
For now memory regions only track DIRTY_MEMORY_VGA individually, but
this will change soon. To support this, split memory_region_is_logging
in two functions: one that returns a given bit from dirty_log_mask,
and one that returns the entire mask. memory_region_is_logging gets an
extra parameter so that the compiler flags misuse.
While VGA-specific users (including the Xen listener!) will want to keep
checking that bit, KVM and vhost check for "any bit except migration"
(because migration is handled via the global start/stop listener
callbacks).
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
DIRTY_MEMORY_MIGRATION is triggered by memory_global_dirty_log_start
and memory_global_dirty_log_stop, so it cannot be used with
memory_region_set_log.
Specify this in the documentation and assert it.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This includes pxb support by Marcel, as well as multiple enhancements all over
the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJVcC2WAAoJECgfDbjSjVRpwgcH/3mvFP3UmvmXyzf8mYtQ/1fR
ikvdTGHl2DR7TMQszNeCJn/p6NgH3oXRbXh39xM1xl9D2/dsZH9o1cUyFE04K9LK
am0cTmlty1OEyFN8BX1TtpngUxa5mpRA/+NYuWbh1FoTp6RoEPM6P+L1zLqtXYn1
REF++ehrsQI2Az2pibf4nul8bwuTWJLJeMS6TcCVCRGoaHsCESiVMu2sQrzEbWEW
E8ZWaXaiycLxLkW0/oU8BmZyrAk1PHdHwgbMUINV0kV5E2u+ZU+3KY79ezC2FyHW
NV7G9Rhh/5H828/cB6UP4CPZ4AYIYmg02iz5XBGKbd8WS9oPrJVK7EoqfU3oZfc=
=5AmP
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pc, acpi, virtio, tpm
This includes pxb support by Marcel, as well as multiple enhancements all over
the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Thu Jun 4 11:51:02 2015 BST using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
* remotes/mst/tags/for_upstream: (28 commits)
vhost: logs sharing
hw/acpi: piix4_pm_init(): take fw_cfg object no more
hw/acpi: move "etc/system-states" fw_cfg file from PIIX4 to core
hw/acpi: acpi_pm1_cnt_init(): take "disable_s3" and "disable_s4"
pc-dimm: don't assert if pc-dimm alignment != hotpluggable mem range size
docs: Add PXB documentation
apci: fix PXB behaviour if used with unsupported BIOS
hw/pxb: add numa_node parameter
hw/pci: add support for NUMA nodes
hw/pxb: add map_irq func
hw/pci: inform bios if the system has extra pci root buses
hw/pci: introduce PCI Expander Bridge (PXB)
hw/pci: removed 'rootbus nr is 0' assumption from qmp_pci_query
hw/acpi: remove from root bus 0 the crs resources used by other buses.
hw/acpi: add _CRS method for extra root busses
hw/apci: add _PRT method for extra PCI root busses
hw/acpi: add support for i440fx 'snooping' root busses
hw/pci: extend PCI config access to support devices behind PXB
hw/i386: query only for q35/pc when looking for pci host bridge
hw/pci: made pci_bus_num a PCIBusClass method
...
Conflicts:
hw/i386/pc_piix.c
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQEcBAABAgAGBQJVbvwjAAoJEL7lnXSkw9fbaFwIAIh6PN5v6fvuEjnPX5ijHZC2
7iJoFd0I2cYrxgLe4xONFX9qzV5vgdEAJfXCljVCKAmzu5RK7G0ZSW81sJ3t6Mp8
kA8buJeyTp2UcTlDrC3qji8ScEIj+g8I9tKGflNVI2uDAVumMBPqnJNSFhbaqYlu
SEq+4y/D3J6xPzr7NhyHliG0NmxJrIn6QCtux5djj3xO4KXfp1j2YQCPKhYjkRlW
wHfqeD7x9LX6875FX3csgfPsYIycW0WYtba2adTe0vbTsclOY0CU3ho8HPeXgHE6
WQj6KYGT8Fo0zmK8UV0Jmok7+hZoxXXInf6vY+sSY58oe71FgdxNwLvIC6N0eQc=
=AALk
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/mjt/tags/pull-trivial-patches-2015-06-03' into staging
trivial patches for 2015-06-03
# gpg: Signature made Wed Jun 3 14:07:47 2015 BST using RSA key ID A4C3D7DB
# gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>"
# gpg: aka "Michael Tokarev <mjt@corpit.ru>"
# gpg: aka "Michael Tokarev <mjt@debian.org>"
* remotes/mjt/tags/pull-trivial-patches-2015-06-03: (30 commits)
configure: postfix --extra-cflags to QEMU_CFLAGS
cadence_gem: Fix Rx buffer size field mask
slirp: use less predictable directory name in /tmp for smb config (CVE-2015-4037)
translate-all: delete prototype for non-existent function
Add -incoming help text
hw/display/tc6393xb.c: Fix misusing qemu_allocate_irqs for single irq
hw/arm/nseries.c: Fix misusing qemu_allocate_irqs for single irq
hw/alpha/typhoon.c: Fix misusing qemu_allocate_irqs for single irq
hw/unicore32/puv3.c: Fix misusing qemu_allocate_irqs for single irq
hw/lm32/milkymist.c: Fix misusing qemu_allocate_irqs for single irq
hw/lm32/lm32_boards.c: Fix misusing qemu_allocate_irqs for single irq
hw/ppc/prep.c: Fix misusing qemu_allocate_irqs for single irq
hw/sparc/sun4m.c: Fix misusing qemu_allocate_irqs for single irq
hw/timer/arm_timer.c: Fix misusing qemu_allocate_irqs for single irq
hw/isa/i82378.c: Fix misusing qemu_allocate_irqs for single irq
hw/isa/lpc_ich9.c: Fix misusing qemu_allocate_irqs for single irq
hw/i386/pc: Fix misusing qemu_allocate_irqs for single irq
hw/intc/exynos4210_gic.c: Fix memory leak by adjusting order
hw/arm/omap_sx1.c: Fix memory leak spotted by valgrind
hw/ppc/e500.c: Fix memory leak
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Currently we allocate one vhost log per vhost device. This is sub
optimal when:
- Guest has several device with vhost as backend
- Guest has multiqueue devices
In the above cases, we can avoid the memory allocation by sharing a
single vhost log among all the vhost devices. This is done through:
- Introducing a new vhost_log structure with refcnt inside.
- Using a global pointer to vhost_log structure that will be used. And
introduce helper to get the log with expected log size and helper to
- drop the refcnt to the old log.
- Each vhost device still keep track of a pointer to the log that was
used.
With above, if no resize happens, all vhost device will share a single
vhost log. During resize, a new vhost_log structure will be allocated
and made for the global pointer. And each vhost devices will drop the
refcnt to the old log.
Tested by doing scp during migration for a 2 queues virtio-net-pci.
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This PIIX4 init function has no more reason to receive a pointer to the
FwCfg object. Remove the parameter from the prototype, and update callers.
As a result, the pc_init1() function no longer needs to save the return
value of pc_memory_init() and xen_load_linux(), which makes it more
similar to pc_q35_init().
The return type & value of pc_memory_init() and xen_load_linux() are not
changed themselves; maybe we'll need their return values sometime later.
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1204696
Cc: Amit Shah <amit.shah@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Leon Alrae <leon.alrae@imgtec.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
This patch only modifies the function prototype and updates all chipset
code that calls acpi_pm1_cnt_init() to pass in their own disable_s3 and
disable_s4 settings. vt82c686 is assumed to be fixed "S3 and S4 enabled".
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1204696
Cc: Amit Shah <amit.shah@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Leon Alrae <leon.alrae@imgtec.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
* more EL2 preparation patches
* revert a no-longer-necessary workaround for old glib versions
* add GICv2m support to virt board (MSI support)
* pl061: fix wrong calculation of GPIOMIS register
* support MSI via irqfd
* remove a confusing v8_ prefix from some variable names
* add dynamic sysbus device support to the virt board
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABCAAGBQJVbdouAAoJEDwlJe0UNgzeQSYP/iXkogCzrNAqkcAGNzzVox5o
54HFUMBFC05hnOBQxxUcGwXDetidYIEFuSt2uUooZapWYzL6ProIDXcgYMoVJTzL
XaYUzuV3M3pxl8A7DtKRv39e9HTDL4EmT+dd8mNuRHfbhmvlFlWw3TzO/DHxl29e
iOIrNDaLAPjI4QyFR0k5kHTmijTm13Sd5/Un/8m6bjKtrKst8k2HmqemtsjcrVUK
+/9k+60+uTYJb4xKKcCY7w0zbbJGvlW9216bf3ccfAvGAbaGDxH+hRn0E1xd2BoR
JmXofWYL55tT8cQxO7ZjCDMzhiJsQ/hFlo1ds5DkdcYuaYXUHiRB62mleSb2zc+T
kcLFWCBQp/YWULpngZrnu5bzKUN0BwFtTOoMMv5WGR/N4hAcj6rgIIEaGeRsGrhV
XrGeLmk25IwrIvn4Nwr0Ve70g6rdL5NauVYq21Bx2GLK18NEXXsUR1Z0X38WSVrN
HXBNFHFECf0S1CNp8KVcyyfE+XZx2Cb5jFpS2jiy648KoXHgHYZUjCqJd4JzRAQB
dEjoNKA6yod72UkgpoeaOTHsF9razOpqG+ymJzsVTiDBpg/eE6ZT/0jCBZ+92NqN
qf2IUwubQH9jAFcxDuzoHDx+XCycdkYVEnoBGtBcS2QfaJd4dwfJkFAOOHR70XkH
Kvj419eJjO0uhItsEODA
=s7uF
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20150602' into staging
target-arm queue:
* more EL2 preparation patches
* revert a no-longer-necessary workaround for old glib versions
* add GICv2m support to virt board (MSI support)
* pl061: fix wrong calculation of GPIOMIS register
* support MSI via irqfd
* remove a confusing v8_ prefix from some variable names
* add dynamic sysbus device support to the virt board
# gpg: Signature made Tue Jun 2 17:30:38 2015 BST using RSA key ID 14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
* remotes/pmaydell/tags/pull-target-arm-20150602: (22 commits)
hw/arm/virt: change indentation in a15memmap
hw/arm/virt: add dynamic sysbus device support
hw/arm/boot: arm_load_kernel implemented as a machine init done notifier
hw/arm/sysbus-fdt: helpers for platform bus nodes addition
target-arm: Remove v8_ prefix from names of non-v8-specific cpreg arrays
arm_gicv2m: set kvm_gsi_direct_mapping and kvm_msi_via_irqfd_allowed
kvm: introduce kvm_arch_msi_data_to_gsi
pl061: fix wrong calculation of GPIOMIS register
target-arm: Add the GICv2m to the virt board
target-arm: Extend the gic node properties
arm_gicv2m: Add GICv2m widget to support MSIs
target-arm: Add GIC phandle to VirtBoardInfo
Revert "target-arm: Avoid g_hash_table_get_keys()"
target-arm: Add TLBI_VAE2{IS}
target-arm: Add TLBI_ALLE2
target-arm: Add TLBI_ALLE1{IS}
target-arm: Add TTBR0_EL2
target-arm: Add TPIDR_EL2
target-arm: Add SCTLR_EL2
target-arm: Add TCR_EL2
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
At 8k per TLB (for 64-bit host or target), 8 or more modes
make the TLBs bigger than 64k, and some RISC TCG backends do
not like that. On the affected hosts, cut the TLB size in
half---there is still a measurable speedup on PPC with the
next patch.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1424436345-37924-3-git-send-email-pbonzini@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Alexander Graf <agraf@suse.de>
Machines types can have different requirement for default ram
size. Introduce a member in the machine class and set the current
default_ram_size to 128MB.
For QEMUMachine types override the value during the registration of
the machine and for MachineClass introduce the generic class init
setting the default_ram_size.
Add helpers [K,M,G,T,P,E]_BYTE for better readability and easy usage
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
We need to work with PCI BARs to generate OF properties
during PCI hotplug for sPAPR guests.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
This option enables/disables PCI hotplug for a particular PHB.
Also add machine compatibility code to disable it by default for machine
types prior to pseries-2.4.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
[agraf: move commas for compat fields]
Signed-off-by: Alexander Graf <agraf@suse.de>
This function handles generation of ibm,drc-* array device tree
properties to describe DRC topology to guests. This will by used
by the guest to direct RTAS calls to manage any dynamic resources
we associate with a particular DR Connector as part of
hotplug/unplug.
Since general management of boot-time device trees are handled
outside of sPAPRDRConnector, we insert these values blindly given
an FDT and offset. A mask of sPAPRDRConnector types is given to
instruct us on what types of connectors entries should be generated
for, since descriptions for different connectors may live in
different parts of the device tree.
Based on code originally written by Nathan Fontenot.
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
We don't actually rely on this interface to surface hotplug events, and
instead rely on the similar-but-interrupt-driven check-exception RTAS
interface used for EPOW events. However, the existence of this interface
is needed to ensure guest kernels initialize the event-reporting
interfaces which will in turn be used by userspace tools to handle these
events, so we implement this interface here.
Since events surfaced by this call are mutually exclusive to those
surfaced via check-exception, we also update the RTAS event queue code
to accept a boolean to mark/filter for events accordingly.
Events of this sort are not currently generated by QEMU, but the interface
has been tested by surfacing hotplug events via event-scan in place
of check-exception.
Signed-off-by: Tyrel Datwyler <tyreld@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
This extends the data structures currently used to report EPOW events to
guests via the check-exception RTAS interfaces to also include event types
for hotplug/unplug events.
This is currently undocumented and being finalized for inclusion in PAPR
specification, but we implement this here as an extension for guest
userspace tools to implement (existing guest kernels simply log these
events via a sysfs interface that's read by rtas_errd, and current
versions of rtas_errd/powerpc-utils already support the use of this
mechanism for initiating hotplug operations).
We also add support for queues of pending RTAS events, since in the
case of hotplug there's chance for multiple events being in-flight
at any point in time.
Signed-off-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
This interface is used to fetch an OF device-tree nodes that describes a
newly-attached device to guest. It is called multiple times to walk the
device-tree node and fetch individual properties into a 'workarea'/buffer
provided by the guest.
The device-tree is generated by QEMU and passed to an sPAPRDRConnector during
the initial hotplug operation, and the state of these RTAS calls is tracked by
the sPAPRDRConnector. When the last of these properties is successfully
fetched, we report as special return value to the guest and transition
the device to a 'configured' state on the QEMU/DRC side.
See docs/specs/ppc-spapr-hotplug.txt for a complete description of
this interface.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
This is similar to the existing rtas_st_buffer(), but for cases
where the guest is not expecting a length-encoded byte array.
Namely, for calls where a "work area" buffer is used to pass
around arbitrary fields/data.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
This interface allows a guest to control various platform/device
sensors. Initially, we only implement support necessary to control
sensors that are required for hotplug: DR connector indicators/LEDs,
resource allocation state, and resource isolation state.
See docs/specs/ppc-spapr-hotplug.txt for a complete description of
this interface.
Signed-off-by: Mike Day <ncmike@ncultra.org>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
This device emulates a firmware abstraction used by pSeries guests to
manage hotplug/dynamic-reconfiguration of host-bridges, PCI devices,
memory, and CPUs. It is conceptually similar to an SHPC device,
complete with LED indicators to identify individual slots to physical
physical users and indicate when it is safe to remove a device. In
some cases it is also used to manage virtualized resources, such a
memory, CPUs, and physical-host bridges, which in the case of pSeries
guests are virtualized resources where the physical components are
managed by the host.
Guests communicate with these DR Connectors using RTAS calls,
generally by addressing the unique DRC index associated with a
particular connector for a particular resource. For introspection
purposes we expose this state initially as QOM properties, and
in subsequent patches will introduce the RTAS calls that make use of
it. This constitutes to the 'guest' interface.
On the QEMU side we provide an attach/detach interface to associate
or cleanup a DeviceState with a particular sPAPRDRConnector in
response to hotplug/unplug, respectively. This constitutes the
'physical' interface to the DR Connector.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
The check "liobn & 0xFFFFFFFF00000000ULL" in spapr_tce_find_by_liobn()
is completely useless since liobn is only declared as an uint32_t
parameter. Fix this by using target_ulong instead (this is what most
of the callers of this function are using, too).
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
At the moment spapr_tce_find_by_liobn() is used by H_PUT_TCE/...
handlers to find an IOMMU by LIOBN.
We are going to implement Dynamic DMA windows (DDW), new code
will go to a new file and we will use spapr_tce_find_by_liobn()
there too so let's make it public.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
This makes find_phb()/find_dev() public and changed its names
to spapr_pci_find_phb()/spapr_pci_find_dev() as they are going to
be used from other parts of QEMU such as VFIO DDW (dynamic DMA window)
or VFIO PCI error injection or VFIO EEH handling - in all these
cases there are RTAS calls which are addressed to BUID+config_addr
in IEEE1275 format.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
This is to reduce VIO noise while debugging PCI DMA.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
This gets rid of a magic constant describing the default DMA window size
for an emulated PHB.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
This introduces a macro which makes up a LIOBN from fixed prefix and
VIO device address (@reg property).
This is to keep LIOBN macros rendering consistent - the same macro for
PCI has been added by the previous patch.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
We are going to have multiple DMA windows per PHB and we want them to
migrate so we need a predictable way of assigning LIOBNs.
This introduces a macro which makes up a LIOBN from fixed prefix,
PHB index (unique PHB id) and window number.
This introduces a SPAPR_PCI_DMA_WINDOW_NUM() to know the window number
from LIOBN. It is used to distinguish the default 32bit windows from
dynamic windows and avoid picking default DMA window properties from
a wrong TCE table.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
PCI root buses can be attached to a specific NUMA node.
PCI buses are not attached by default to a NUMA node.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
PXB is a "light-weight" host bridge whose purpose is to enable
the main host bridge to support multiple PCI root buses
for pc machines.
As oposed to PCI-2-PCI bridge's secondary bus, PXB's bus
is a primary bus and can be associated with a NUMA node
(different from the main host bridge) allowing the guest OS
to recognize the proximity of a pass-through device to
other resources as RAM and CPUs.
The PXB is composed from:
- A primary PCI bus (can be associated with a NUMA node)
Acts like a normal pci bus and from the functionality point
of view is an "expansion" of the bus behind the
main host bridge.
- A pci-2-pci bridge behind the primary PCI bus where the actual
devices will be attached.
- A host-bridge PCI device
Situated on the bus behind the main host bridge, allows
the BIOS to configure the bus number and IO/mem resources.
It does not have its own config/data register for configuration
cycles, this being handled by the main host bridge.
- A host-bridge sysbus to comply with QEMU current design.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Refactoring it as a method of PCIBusClass will allow
different implementations for subclasses.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Refactoring it as a method of PCIBusClass will allow
different implementations for subclasses.
Removed the assumption that the root bus does not
have a parent device because is specific only
to the default class implementation.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Add a new API named acpi_send_gpe_event() to send hotplug SCI.
This API can be used by pci, cpu and memory hotplug.
This patch is rebased on master.
Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Commit "019a3ed virtio: make features 64bit wide" missed a few changes,
as I've noticed while trying to rebase the virtio-1 branch to latest
master. This patch adds them.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>