qemu/hw/misc/ivshmem.c

1139 lines
31 KiB
C
Raw Normal View History

/*
* Inter-VM Shared Memory PCI device.
*
* Author:
* Cam Macdonell <cam@cs.ualberta.ca>
*
* Based On: cirrus_vga.c
* Copyright (c) 2004 Fabrice Bellard
* Copyright (c) 2004 Makoto Suzuki (suzu)
*
* and rtl8139.c
* Copyright (c) 2006 Igor Kovalenko
*
* This code is licensed under the GNU GPL v2.
*
* Contributions after 2012-01-13 are licensed under the terms of the
* GNU GPL, version 2 or (at your option) any later version.
*/
#include "qemu/osdep.h"
#include "qemu/units.h"
2016-03-14 11:01:28 +03:00
#include "qapi/error.h"
#include "qemu/cutils.h"
#include "hw/pci/pci.h"
#include "hw/qdev-properties.h"
#include "hw/qdev-properties-system.h"
#include "hw/pci/msi.h"
#include "hw/pci/msix.h"
#include "sysemu/kvm.h"
#include "migration/blocker.h"
#include "migration/vmstate.h"
#include "qemu/error-report.h"
#include "qemu/event_notifier.h"
#include "qemu/module.h"
#include "qom/object_interfaces.h"
#include "chardev/char-fe.h"
#include "sysemu/hostmem.h"
#include "qapi/visitor.h"
#include "hw/misc/ivshmem.h"
#include "qom/object.h"
#define PCI_VENDOR_ID_IVSHMEM PCI_VENDOR_ID_REDHAT_QUMRANET
#define PCI_DEVICE_ID_IVSHMEM 0x1110
#define IVSHMEM_MAX_PEERS UINT16_MAX
#define IVSHMEM_IOEVENTFD 0
#define IVSHMEM_MSI 1
#define IVSHMEM_REG_BAR_SIZE 0x100
#define IVSHMEM_DEBUG 0
#define IVSHMEM_DPRINTF(fmt, ...) \
do { \
if (IVSHMEM_DEBUG) { \
printf("IVSHMEM: " fmt, ## __VA_ARGS__); \
} \
} while (0)
ivshmem: Split ivshmem-plain, ivshmem-doorbell off ivshmem ivshmem can be configured with and without interrupt capability (a.k.a. "doorbell"). The two configurations have largely disjoint options, which makes for a confusing (and badly checked) user interface. Moreover, the device can't tell the guest whether its doorbell is enabled. Create two new device models ivshmem-plain and ivshmem-doorbell, and deprecate the old one. Changes from ivshmem: * PCI revision is 1 instead of 0. The new revision is fully backwards compatible for guests. Guests may elect to require at least revision 1 to make sure they're not exposed to the funny "no shared memory, yet" state. * Property "role" replaced by "master". role=master becomes master=on, role=peer becomes master=off. Default is off instead of auto. * Property "use64" is gone. The new devices always have 64 bit BARs. Changes from ivshmem to ivshmem-plain: * The Interrupt Pin register in PCI config space is zero (does not use an interrupt pin) instead of one (uses INTA). * Property "x-memdev" is renamed to "memdev". * Properties "shm" and "size" are gone. Use property "memdev" instead. * Property "msi" is gone. The new device can't have MSI-X capability. It can't interrupt anyway. * Properties "ioeventfd" and "vectors" are gone. They're meaningless without interrupts anyway. Changes from ivshmem to ivshmem-doorbell: * Property "msi" is gone. The new device always has MSI-X capability. * Property "ioeventfd" defaults to on instead of off. * Property "size" is gone. The new device can only map all the shared memory received from the server. Guests can easily find out whether the device is configured for interrupts by checking for MSI-X capability. Note: some code added in sub-optimal places to make the diff easier to review. The next commit will move it to more sensible places. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-37-git-send-email-armbru@redhat.com>
2016-03-15 21:34:51 +03:00
#define TYPE_IVSHMEM_COMMON "ivshmem-common"
typedef struct IVShmemState IVShmemState;
DECLARE_INSTANCE_CHECKER(IVShmemState, IVSHMEM_COMMON,
TYPE_IVSHMEM_COMMON)
ivshmem: Split ivshmem-plain, ivshmem-doorbell off ivshmem ivshmem can be configured with and without interrupt capability (a.k.a. "doorbell"). The two configurations have largely disjoint options, which makes for a confusing (and badly checked) user interface. Moreover, the device can't tell the guest whether its doorbell is enabled. Create two new device models ivshmem-plain and ivshmem-doorbell, and deprecate the old one. Changes from ivshmem: * PCI revision is 1 instead of 0. The new revision is fully backwards compatible for guests. Guests may elect to require at least revision 1 to make sure they're not exposed to the funny "no shared memory, yet" state. * Property "role" replaced by "master". role=master becomes master=on, role=peer becomes master=off. Default is off instead of auto. * Property "use64" is gone. The new devices always have 64 bit BARs. Changes from ivshmem to ivshmem-plain: * The Interrupt Pin register in PCI config space is zero (does not use an interrupt pin) instead of one (uses INTA). * Property "x-memdev" is renamed to "memdev". * Properties "shm" and "size" are gone. Use property "memdev" instead. * Property "msi" is gone. The new device can't have MSI-X capability. It can't interrupt anyway. * Properties "ioeventfd" and "vectors" are gone. They're meaningless without interrupts anyway. Changes from ivshmem to ivshmem-doorbell: * Property "msi" is gone. The new device always has MSI-X capability. * Property "ioeventfd" defaults to on instead of off. * Property "size" is gone. The new device can only map all the shared memory received from the server. Guests can easily find out whether the device is configured for interrupts by checking for MSI-X capability. Note: some code added in sub-optimal places to make the diff easier to review. The next commit will move it to more sensible places. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-37-git-send-email-armbru@redhat.com>
2016-03-15 21:34:51 +03:00
#define TYPE_IVSHMEM_PLAIN "ivshmem-plain"
DECLARE_INSTANCE_CHECKER(IVShmemState, IVSHMEM_PLAIN,
TYPE_IVSHMEM_PLAIN)
ivshmem: Split ivshmem-plain, ivshmem-doorbell off ivshmem ivshmem can be configured with and without interrupt capability (a.k.a. "doorbell"). The two configurations have largely disjoint options, which makes for a confusing (and badly checked) user interface. Moreover, the device can't tell the guest whether its doorbell is enabled. Create two new device models ivshmem-plain and ivshmem-doorbell, and deprecate the old one. Changes from ivshmem: * PCI revision is 1 instead of 0. The new revision is fully backwards compatible for guests. Guests may elect to require at least revision 1 to make sure they're not exposed to the funny "no shared memory, yet" state. * Property "role" replaced by "master". role=master becomes master=on, role=peer becomes master=off. Default is off instead of auto. * Property "use64" is gone. The new devices always have 64 bit BARs. Changes from ivshmem to ivshmem-plain: * The Interrupt Pin register in PCI config space is zero (does not use an interrupt pin) instead of one (uses INTA). * Property "x-memdev" is renamed to "memdev". * Properties "shm" and "size" are gone. Use property "memdev" instead. * Property "msi" is gone. The new device can't have MSI-X capability. It can't interrupt anyway. * Properties "ioeventfd" and "vectors" are gone. They're meaningless without interrupts anyway. Changes from ivshmem to ivshmem-doorbell: * Property "msi" is gone. The new device always has MSI-X capability. * Property "ioeventfd" defaults to on instead of off. * Property "size" is gone. The new device can only map all the shared memory received from the server. Guests can easily find out whether the device is configured for interrupts by checking for MSI-X capability. Note: some code added in sub-optimal places to make the diff easier to review. The next commit will move it to more sensible places. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-37-git-send-email-armbru@redhat.com>
2016-03-15 21:34:51 +03:00
#define TYPE_IVSHMEM_DOORBELL "ivshmem-doorbell"
DECLARE_INSTANCE_CHECKER(IVShmemState, IVSHMEM_DOORBELL,
TYPE_IVSHMEM_DOORBELL)
ivshmem: Split ivshmem-plain, ivshmem-doorbell off ivshmem ivshmem can be configured with and without interrupt capability (a.k.a. "doorbell"). The two configurations have largely disjoint options, which makes for a confusing (and badly checked) user interface. Moreover, the device can't tell the guest whether its doorbell is enabled. Create two new device models ivshmem-plain and ivshmem-doorbell, and deprecate the old one. Changes from ivshmem: * PCI revision is 1 instead of 0. The new revision is fully backwards compatible for guests. Guests may elect to require at least revision 1 to make sure they're not exposed to the funny "no shared memory, yet" state. * Property "role" replaced by "master". role=master becomes master=on, role=peer becomes master=off. Default is off instead of auto. * Property "use64" is gone. The new devices always have 64 bit BARs. Changes from ivshmem to ivshmem-plain: * The Interrupt Pin register in PCI config space is zero (does not use an interrupt pin) instead of one (uses INTA). * Property "x-memdev" is renamed to "memdev". * Properties "shm" and "size" are gone. Use property "memdev" instead. * Property "msi" is gone. The new device can't have MSI-X capability. It can't interrupt anyway. * Properties "ioeventfd" and "vectors" are gone. They're meaningless without interrupts anyway. Changes from ivshmem to ivshmem-doorbell: * Property "msi" is gone. The new device always has MSI-X capability. * Property "ioeventfd" defaults to on instead of off. * Property "size" is gone. The new device can only map all the shared memory received from the server. Guests can easily find out whether the device is configured for interrupts by checking for MSI-X capability. Note: some code added in sub-optimal places to make the diff easier to review. The next commit will move it to more sensible places. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-37-git-send-email-armbru@redhat.com>
2016-03-15 21:34:51 +03:00
#define TYPE_IVSHMEM "ivshmem"
DECLARE_INSTANCE_CHECKER(IVShmemState, IVSHMEM,
TYPE_IVSHMEM)
typedef struct Peer {
int nb_eventfds;
EventNotifier *eventfds;
} Peer;
typedef struct MSIVector {
PCIDevice *pdev;
int virq;
bool unmasked;
} MSIVector;
struct IVShmemState {
/*< private >*/
PCIDevice parent_obj;
/*< public >*/
uint32_t features;
/* exactly one of these two may be set */
HostMemoryBackend *hostmem; /* with interrupts */
CharBackend server_chr; /* without interrupts */
/* registers */
uint32_t intrmask;
uint32_t intrstatus;
int vm_id;
/* BARs */
MemoryRegion ivshmem_mmio; /* BAR 0 (registers) */
MemoryRegion *ivshmem_bar2; /* BAR 2 (shared memory) */
MemoryRegion server_bar2; /* used with server_chr */
/* interrupt support */
Peer *peers;
int nb_peers; /* space in @peers[] */
uint32_t vectors;
MSIVector *msi_vectors;
uint64_t msg_buf; /* buffer for receiving server messages */
int msg_buffered_bytes; /* #bytes in @msg_buf */
/* migration stuff */
OnOffAuto master;
Error *migration_blocker;
};
/* registers for the Inter-VM shared memory device */
enum ivshmem_registers {
INTRMASK = 0,
INTRSTATUS = 4,
IVPOSITION = 8,
DOORBELL = 12,
};
static inline uint32_t ivshmem_has_feature(IVShmemState *ivs,
unsigned int feature) {
return (ivs->features & (1 << feature));
}
static inline bool ivshmem_is_master(IVShmemState *s)
{
assert(s->master != ON_OFF_AUTO_AUTO);
return s->master == ON_OFF_AUTO_ON;
}
static void ivshmem_IntrMask_write(IVShmemState *s, uint32_t val)
{
IVSHMEM_DPRINTF("IntrMask write(w) val = 0x%04x\n", val);
s->intrmask = val;
}
static uint32_t ivshmem_IntrMask_read(IVShmemState *s)
{
uint32_t ret = s->intrmask;
IVSHMEM_DPRINTF("intrmask read(w) val = 0x%04x\n", ret);
return ret;
}
static void ivshmem_IntrStatus_write(IVShmemState *s, uint32_t val)
{
IVSHMEM_DPRINTF("IntrStatus write(w) val = 0x%04x\n", val);
s->intrstatus = val;
}
static uint32_t ivshmem_IntrStatus_read(IVShmemState *s)
{
uint32_t ret = s->intrstatus;
/* reading ISR clears all interrupts */
s->intrstatus = 0;
return ret;
}
static void ivshmem_io_write(void *opaque, hwaddr addr,
uint64_t val, unsigned size)
{
IVShmemState *s = opaque;
uint16_t dest = val >> 16;
uint16_t vector = val & 0xff;
addr &= 0xfc;
IVSHMEM_DPRINTF("writing to addr " HWADDR_FMT_plx "\n", addr);
switch (addr)
{
case INTRMASK:
ivshmem_IntrMask_write(s, val);
break;
case INTRSTATUS:
ivshmem_IntrStatus_write(s, val);
break;
case DOORBELL:
/* check that dest VM ID is reasonable */
if (dest >= s->nb_peers) {
IVSHMEM_DPRINTF("Invalid destination VM ID (%d)\n", dest);
break;
}
/* check doorbell range */
if (vector < s->peers[dest].nb_eventfds) {
IVSHMEM_DPRINTF("Notifying VM %d on vector %d\n", dest, vector);
event_notifier_set(&s->peers[dest].eventfds[vector]);
} else {
IVSHMEM_DPRINTF("Invalid destination vector %d on VM %d\n",
vector, dest);
}
break;
default:
IVSHMEM_DPRINTF("Unhandled write " HWADDR_FMT_plx "\n", addr);
}
}
static uint64_t ivshmem_io_read(void *opaque, hwaddr addr,
unsigned size)
{
IVShmemState *s = opaque;
uint32_t ret;
switch (addr)
{
case INTRMASK:
ret = ivshmem_IntrMask_read(s);
break;
case INTRSTATUS:
ret = ivshmem_IntrStatus_read(s);
break;
case IVPOSITION:
ret = s->vm_id;
break;
default:
IVSHMEM_DPRINTF("why are we reading " HWADDR_FMT_plx "\n", addr);
ret = 0;
}
return ret;
}
static const MemoryRegionOps ivshmem_mmio_ops = {
.read = ivshmem_io_read,
.write = ivshmem_io_write,
.endianness = DEVICE_LITTLE_ENDIAN,
.impl = {
.min_access_size = 4,
.max_access_size = 4,
},
};
static void ivshmem_vector_notify(void *opaque)
{
MSIVector *entry = opaque;
PCIDevice *pdev = entry->pdev;
ivshmem: Split ivshmem-plain, ivshmem-doorbell off ivshmem ivshmem can be configured with and without interrupt capability (a.k.a. "doorbell"). The two configurations have largely disjoint options, which makes for a confusing (and badly checked) user interface. Moreover, the device can't tell the guest whether its doorbell is enabled. Create two new device models ivshmem-plain and ivshmem-doorbell, and deprecate the old one. Changes from ivshmem: * PCI revision is 1 instead of 0. The new revision is fully backwards compatible for guests. Guests may elect to require at least revision 1 to make sure they're not exposed to the funny "no shared memory, yet" state. * Property "role" replaced by "master". role=master becomes master=on, role=peer becomes master=off. Default is off instead of auto. * Property "use64" is gone. The new devices always have 64 bit BARs. Changes from ivshmem to ivshmem-plain: * The Interrupt Pin register in PCI config space is zero (does not use an interrupt pin) instead of one (uses INTA). * Property "x-memdev" is renamed to "memdev". * Properties "shm" and "size" are gone. Use property "memdev" instead. * Property "msi" is gone. The new device can't have MSI-X capability. It can't interrupt anyway. * Properties "ioeventfd" and "vectors" are gone. They're meaningless without interrupts anyway. Changes from ivshmem to ivshmem-doorbell: * Property "msi" is gone. The new device always has MSI-X capability. * Property "ioeventfd" defaults to on instead of off. * Property "size" is gone. The new device can only map all the shared memory received from the server. Guests can easily find out whether the device is configured for interrupts by checking for MSI-X capability. Note: some code added in sub-optimal places to make the diff easier to review. The next commit will move it to more sensible places. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-37-git-send-email-armbru@redhat.com>
2016-03-15 21:34:51 +03:00
IVShmemState *s = IVSHMEM_COMMON(pdev);
int vector = entry - s->msi_vectors;
EventNotifier *n = &s->peers[s->vm_id].eventfds[vector];
if (!event_notifier_test_and_clear(n)) {
return;
}
IVSHMEM_DPRINTF("interrupt on vector %p %d\n", pdev, vector);
if (ivshmem_has_feature(s, IVSHMEM_MSI)) {
ivshmem: Clean up MSI-X conditions There are three predicates related to MSI-X: * ivshmem_has_feature(s, IVSHMEM_MSI) is true unless the non-MSI-X variant of the device is selected with msi=off. * msix_present() is true when the device has the PCI capability MSI-X. It's initially false, and becomes true during successful realize of the MSI-X variant of the device. Thus, it's the same as ivshmem_has_feature(s, IVSHMEM_MSI) for realized devices. * msix_enabled() is true when msix_present() is true and guest software has enabled MSI-X. Code that differs between the non-MSI-X and the MSI-X variant of the device needs to be guarded by ivshmem_has_feature(s, IVSHMEM_MSI) or by msix_present(), except the latter works only for realized devices. Code that depends on whether MSI-X is in use needs to be guarded with msix_enabled(). Code review led me to two minor messes: * ivshmem_vector_notify() calls msix_notify() even when !msix_enabled(), unlike most other MSI-X-capable devices. As far as I can tell, msix_notify() does nothing when !msix_enabled(). Add the guard anyway. * Most callers of ivshmem_use_msix() guard it with ivshmem_has_feature(s, IVSHMEM_MSI). Not necessary, because ivshmem_use_msix() does nothing when !msix_present(). That's ivshmem's only use of msix_present(), though. Guard it consistently, and drop the now redundant msix_present() check. While there, rename ivshmem_use_msix() to ivshmem_msix_vector_use(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1458066895-20632-20-git-send-email-armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2016-03-15 21:34:34 +03:00
if (msix_enabled(pdev)) {
msix_notify(pdev, vector);
}
} else {
ivshmem_IntrStatus_write(s, 1);
}
}
static int ivshmem_vector_unmask(PCIDevice *dev, unsigned vector,
MSIMessage msg)
{
ivshmem: Split ivshmem-plain, ivshmem-doorbell off ivshmem ivshmem can be configured with and without interrupt capability (a.k.a. "doorbell"). The two configurations have largely disjoint options, which makes for a confusing (and badly checked) user interface. Moreover, the device can't tell the guest whether its doorbell is enabled. Create two new device models ivshmem-plain and ivshmem-doorbell, and deprecate the old one. Changes from ivshmem: * PCI revision is 1 instead of 0. The new revision is fully backwards compatible for guests. Guests may elect to require at least revision 1 to make sure they're not exposed to the funny "no shared memory, yet" state. * Property "role" replaced by "master". role=master becomes master=on, role=peer becomes master=off. Default is off instead of auto. * Property "use64" is gone. The new devices always have 64 bit BARs. Changes from ivshmem to ivshmem-plain: * The Interrupt Pin register in PCI config space is zero (does not use an interrupt pin) instead of one (uses INTA). * Property "x-memdev" is renamed to "memdev". * Properties "shm" and "size" are gone. Use property "memdev" instead. * Property "msi" is gone. The new device can't have MSI-X capability. It can't interrupt anyway. * Properties "ioeventfd" and "vectors" are gone. They're meaningless without interrupts anyway. Changes from ivshmem to ivshmem-doorbell: * Property "msi" is gone. The new device always has MSI-X capability. * Property "ioeventfd" defaults to on instead of off. * Property "size" is gone. The new device can only map all the shared memory received from the server. Guests can easily find out whether the device is configured for interrupts by checking for MSI-X capability. Note: some code added in sub-optimal places to make the diff easier to review. The next commit will move it to more sensible places. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-37-git-send-email-armbru@redhat.com>
2016-03-15 21:34:51 +03:00
IVShmemState *s = IVSHMEM_COMMON(dev);
EventNotifier *n = &s->peers[s->vm_id].eventfds[vector];
MSIVector *v = &s->msi_vectors[vector];
int ret;
IVSHMEM_DPRINTF("vector unmask %p %d\n", dev, vector);
if (!v->pdev) {
error_report("ivshmem: vector %d route does not exist", vector);
return -EINVAL;
}
assert(!v->unmasked);
ret = kvm_irqchip_update_msi_route(kvm_state, v->virq, msg, dev);
if (ret < 0) {
return ret;
}
kvm_irqchip_commit_routes(kvm_state);
ret = kvm_irqchip_add_irqfd_notifier_gsi(kvm_state, n, NULL, v->virq);
if (ret < 0) {
return ret;
}
v->unmasked = true;
return 0;
}
static void ivshmem_vector_mask(PCIDevice *dev, unsigned vector)
{
ivshmem: Split ivshmem-plain, ivshmem-doorbell off ivshmem ivshmem can be configured with and without interrupt capability (a.k.a. "doorbell"). The two configurations have largely disjoint options, which makes for a confusing (and badly checked) user interface. Moreover, the device can't tell the guest whether its doorbell is enabled. Create two new device models ivshmem-plain and ivshmem-doorbell, and deprecate the old one. Changes from ivshmem: * PCI revision is 1 instead of 0. The new revision is fully backwards compatible for guests. Guests may elect to require at least revision 1 to make sure they're not exposed to the funny "no shared memory, yet" state. * Property "role" replaced by "master". role=master becomes master=on, role=peer becomes master=off. Default is off instead of auto. * Property "use64" is gone. The new devices always have 64 bit BARs. Changes from ivshmem to ivshmem-plain: * The Interrupt Pin register in PCI config space is zero (does not use an interrupt pin) instead of one (uses INTA). * Property "x-memdev" is renamed to "memdev". * Properties "shm" and "size" are gone. Use property "memdev" instead. * Property "msi" is gone. The new device can't have MSI-X capability. It can't interrupt anyway. * Properties "ioeventfd" and "vectors" are gone. They're meaningless without interrupts anyway. Changes from ivshmem to ivshmem-doorbell: * Property "msi" is gone. The new device always has MSI-X capability. * Property "ioeventfd" defaults to on instead of off. * Property "size" is gone. The new device can only map all the shared memory received from the server. Guests can easily find out whether the device is configured for interrupts by checking for MSI-X capability. Note: some code added in sub-optimal places to make the diff easier to review. The next commit will move it to more sensible places. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-37-git-send-email-armbru@redhat.com>
2016-03-15 21:34:51 +03:00
IVShmemState *s = IVSHMEM_COMMON(dev);
EventNotifier *n = &s->peers[s->vm_id].eventfds[vector];
MSIVector *v = &s->msi_vectors[vector];
int ret;
IVSHMEM_DPRINTF("vector mask %p %d\n", dev, vector);
if (!v->pdev) {
error_report("ivshmem: vector %d route does not exist", vector);
return;
}
assert(v->unmasked);
ret = kvm_irqchip_remove_irqfd_notifier_gsi(kvm_state, n, v->virq);
if (ret < 0) {
error_report("remove_irqfd_notifier_gsi failed");
return;
}
v->unmasked = false;
}
static void ivshmem_vector_poll(PCIDevice *dev,
unsigned int vector_start,
unsigned int vector_end)
{
ivshmem: Split ivshmem-plain, ivshmem-doorbell off ivshmem ivshmem can be configured with and without interrupt capability (a.k.a. "doorbell"). The two configurations have largely disjoint options, which makes for a confusing (and badly checked) user interface. Moreover, the device can't tell the guest whether its doorbell is enabled. Create two new device models ivshmem-plain and ivshmem-doorbell, and deprecate the old one. Changes from ivshmem: * PCI revision is 1 instead of 0. The new revision is fully backwards compatible for guests. Guests may elect to require at least revision 1 to make sure they're not exposed to the funny "no shared memory, yet" state. * Property "role" replaced by "master". role=master becomes master=on, role=peer becomes master=off. Default is off instead of auto. * Property "use64" is gone. The new devices always have 64 bit BARs. Changes from ivshmem to ivshmem-plain: * The Interrupt Pin register in PCI config space is zero (does not use an interrupt pin) instead of one (uses INTA). * Property "x-memdev" is renamed to "memdev". * Properties "shm" and "size" are gone. Use property "memdev" instead. * Property "msi" is gone. The new device can't have MSI-X capability. It can't interrupt anyway. * Properties "ioeventfd" and "vectors" are gone. They're meaningless without interrupts anyway. Changes from ivshmem to ivshmem-doorbell: * Property "msi" is gone. The new device always has MSI-X capability. * Property "ioeventfd" defaults to on instead of off. * Property "size" is gone. The new device can only map all the shared memory received from the server. Guests can easily find out whether the device is configured for interrupts by checking for MSI-X capability. Note: some code added in sub-optimal places to make the diff easier to review. The next commit will move it to more sensible places. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-37-git-send-email-armbru@redhat.com>
2016-03-15 21:34:51 +03:00
IVShmemState *s = IVSHMEM_COMMON(dev);
unsigned int vector;
IVSHMEM_DPRINTF("vector poll %p %d-%d\n", dev, vector_start, vector_end);
vector_end = MIN(vector_end, s->vectors);
for (vector = vector_start; vector < vector_end; vector++) {
EventNotifier *notifier = &s->peers[s->vm_id].eventfds[vector];
if (!msix_is_masked(dev, vector)) {
continue;
}
if (event_notifier_test_and_clear(notifier)) {
msix_set_pending(dev, vector);
}
}
}
static void watch_vector_notifier(IVShmemState *s, EventNotifier *n,
int vector)
{
int eventfd = event_notifier_get_fd(n);
assert(!s->msi_vectors[vector].pdev);
s->msi_vectors[vector].pdev = PCI_DEVICE(s);
qemu_set_fd_handler(eventfd, ivshmem_vector_notify,
NULL, &s->msi_vectors[vector]);
}
static void ivshmem_add_eventfd(IVShmemState *s, int posn, int i)
{
memory_region_add_eventfd(&s->ivshmem_mmio,
DOORBELL,
4,
true,
(posn << 16) | i,
&s->peers[posn].eventfds[i]);
}
static void ivshmem_del_eventfd(IVShmemState *s, int posn, int i)
{
memory_region_del_eventfd(&s->ivshmem_mmio,
DOORBELL,
4,
true,
(posn << 16) | i,
&s->peers[posn].eventfds[i]);
}
static void close_peer_eventfds(IVShmemState *s, int posn)
{
int i, n;
assert(posn >= 0 && posn < s->nb_peers);
n = s->peers[posn].nb_eventfds;
if (ivshmem_has_feature(s, IVSHMEM_IOEVENTFD)) {
memory_region_transaction_begin();
for (i = 0; i < n; i++) {
ivshmem_del_eventfd(s, posn, i);
}
memory_region_transaction_commit();
}
for (i = 0; i < n; i++) {
event_notifier_cleanup(&s->peers[posn].eventfds[i]);
}
g_free(s->peers[posn].eventfds);
s->peers[posn].nb_eventfds = 0;
}
static void resize_peers(IVShmemState *s, int nb_peers)
{
int old_nb_peers = s->nb_peers;
int i;
assert(nb_peers > old_nb_peers);
IVSHMEM_DPRINTF("bumping storage to %d peers\n", nb_peers);
s->peers = g_renew(Peer, s->peers, nb_peers);
s->nb_peers = nb_peers;
for (i = old_nb_peers; i < nb_peers; i++) {
s->peers[i].eventfds = g_new0(EventNotifier, s->vectors);
s->peers[i].nb_eventfds = 0;
}
}
static void ivshmem_add_kvm_msi_virq(IVShmemState *s, int vector,
Error **errp)
{
PCIDevice *pdev = PCI_DEVICE(s);
KVMRouteChange c;
int ret;
IVSHMEM_DPRINTF("ivshmem_add_kvm_msi_virq vector:%d\n", vector);
assert(!s->msi_vectors[vector].pdev);
c = kvm_irqchip_begin_route_changes(kvm_state);
ret = kvm_irqchip_add_msi_route(&c, vector, pdev);
if (ret < 0) {
error_setg(errp, "kvm_irqchip_add_msi_route failed");
return;
}
kvm_irqchip_commit_route_changes(&c);
s->msi_vectors[vector].virq = ret;
s->msi_vectors[vector].pdev = pdev;
}
static void setup_interrupt(IVShmemState *s, int vector, Error **errp)
{
EventNotifier *n = &s->peers[s->vm_id].eventfds[vector];
bool with_irqfd = kvm_msi_via_irqfd_enabled() &&
ivshmem_has_feature(s, IVSHMEM_MSI);
PCIDevice *pdev = PCI_DEVICE(s);
Error *err = NULL;
IVSHMEM_DPRINTF("setting up interrupt for vector: %d\n", vector);
if (!with_irqfd) {
IVSHMEM_DPRINTF("with eventfd\n");
watch_vector_notifier(s, n, vector);
} else if (msix_enabled(pdev)) {
IVSHMEM_DPRINTF("with irqfd\n");
ivshmem_add_kvm_msi_virq(s, vector, &err);
if (err) {
error_propagate(errp, err);
return;
}
if (!msix_is_masked(pdev, vector)) {
kvm_irqchip_add_irqfd_notifier_gsi(kvm_state, n, NULL,
s->msi_vectors[vector].virq);
/* TODO handle error */
}
} else {
/* it will be delayed until msix is enabled, in write_config */
IVSHMEM_DPRINTF("with irqfd, delayed until msix enabled\n");
}
}
static void process_msg_shmem(IVShmemState *s, int fd, Error **errp)
{
Error *local_err = NULL;
struct stat buf;
ivshmem: Split ivshmem-plain, ivshmem-doorbell off ivshmem ivshmem can be configured with and without interrupt capability (a.k.a. "doorbell"). The two configurations have largely disjoint options, which makes for a confusing (and badly checked) user interface. Moreover, the device can't tell the guest whether its doorbell is enabled. Create two new device models ivshmem-plain and ivshmem-doorbell, and deprecate the old one. Changes from ivshmem: * PCI revision is 1 instead of 0. The new revision is fully backwards compatible for guests. Guests may elect to require at least revision 1 to make sure they're not exposed to the funny "no shared memory, yet" state. * Property "role" replaced by "master". role=master becomes master=on, role=peer becomes master=off. Default is off instead of auto. * Property "use64" is gone. The new devices always have 64 bit BARs. Changes from ivshmem to ivshmem-plain: * The Interrupt Pin register in PCI config space is zero (does not use an interrupt pin) instead of one (uses INTA). * Property "x-memdev" is renamed to "memdev". * Properties "shm" and "size" are gone. Use property "memdev" instead. * Property "msi" is gone. The new device can't have MSI-X capability. It can't interrupt anyway. * Properties "ioeventfd" and "vectors" are gone. They're meaningless without interrupts anyway. Changes from ivshmem to ivshmem-doorbell: * Property "msi" is gone. The new device always has MSI-X capability. * Property "ioeventfd" defaults to on instead of off. * Property "size" is gone. The new device can only map all the shared memory received from the server. Guests can easily find out whether the device is configured for interrupts by checking for MSI-X capability. Note: some code added in sub-optimal places to make the diff easier to review. The next commit will move it to more sensible places. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-37-git-send-email-armbru@redhat.com>
2016-03-15 21:34:51 +03:00
size_t size;
if (s->ivshmem_bar2) {
error_setg(errp, "server sent unexpected shared memory message");
close(fd);
return;
}
if (fstat(fd, &buf) < 0) {
error_setg_errno(errp, errno,
"can't determine size of shared memory sent by server");
close(fd);
return;
}
ivshmem: Split ivshmem-plain, ivshmem-doorbell off ivshmem ivshmem can be configured with and without interrupt capability (a.k.a. "doorbell"). The two configurations have largely disjoint options, which makes for a confusing (and badly checked) user interface. Moreover, the device can't tell the guest whether its doorbell is enabled. Create two new device models ivshmem-plain and ivshmem-doorbell, and deprecate the old one. Changes from ivshmem: * PCI revision is 1 instead of 0. The new revision is fully backwards compatible for guests. Guests may elect to require at least revision 1 to make sure they're not exposed to the funny "no shared memory, yet" state. * Property "role" replaced by "master". role=master becomes master=on, role=peer becomes master=off. Default is off instead of auto. * Property "use64" is gone. The new devices always have 64 bit BARs. Changes from ivshmem to ivshmem-plain: * The Interrupt Pin register in PCI config space is zero (does not use an interrupt pin) instead of one (uses INTA). * Property "x-memdev" is renamed to "memdev". * Properties "shm" and "size" are gone. Use property "memdev" instead. * Property "msi" is gone. The new device can't have MSI-X capability. It can't interrupt anyway. * Properties "ioeventfd" and "vectors" are gone. They're meaningless without interrupts anyway. Changes from ivshmem to ivshmem-doorbell: * Property "msi" is gone. The new device always has MSI-X capability. * Property "ioeventfd" defaults to on instead of off. * Property "size" is gone. The new device can only map all the shared memory received from the server. Guests can easily find out whether the device is configured for interrupts by checking for MSI-X capability. Note: some code added in sub-optimal places to make the diff easier to review. The next commit will move it to more sensible places. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-37-git-send-email-armbru@redhat.com>
2016-03-15 21:34:51 +03:00
size = buf.st_size;
/* mmap the region and map into the BAR2 */
memory_region_init_ram_from_fd(&s->server_bar2, OBJECT(s), "ivshmem.bar2",
size, RAM_SHARED, fd, 0, &local_err);
if (local_err) {
error_propagate(errp, local_err);
return;
}
s->ivshmem_bar2 = &s->server_bar2;
}
static void process_msg_disconnect(IVShmemState *s, uint16_t posn,
Error **errp)
{
IVSHMEM_DPRINTF("posn %d has gone away\n", posn);
if (posn >= s->nb_peers || posn == s->vm_id) {
error_setg(errp, "invalid peer %d", posn);
return;
}
close_peer_eventfds(s, posn);
}
static void process_msg_connect(IVShmemState *s, uint16_t posn, int fd,
Error **errp)
{
Peer *peer = &s->peers[posn];
int vector;
/*
* The N-th connect message for this peer comes with the file
* descriptor for vector N-1. Count messages to find the vector.
*/
if (peer->nb_eventfds >= s->vectors) {
error_setg(errp, "Too many eventfd received, device has %d vectors",
s->vectors);
close(fd);
return;
}
vector = peer->nb_eventfds++;
IVSHMEM_DPRINTF("eventfds[%d][%d] = %d\n", posn, vector, fd);
event_notifier_init_fd(&peer->eventfds[vector], fd);
g_unix_set_fd_nonblocking(fd, true, NULL); /* msix/irqfd poll non block */
if (posn == s->vm_id) {
setup_interrupt(s, vector, errp);
/* TODO do we need to handle the error? */
}
if (ivshmem_has_feature(s, IVSHMEM_IOEVENTFD)) {
ivshmem_add_eventfd(s, posn, vector);
}
}
static void process_msg(IVShmemState *s, int64_t msg, int fd, Error **errp)
{
IVSHMEM_DPRINTF("posn is %" PRId64 ", fd is %d\n", msg, fd);
if (msg < -1 || msg > IVSHMEM_MAX_PEERS) {
error_setg(errp, "server sent invalid message %" PRId64, msg);
close(fd);
return;
}
if (msg == -1) {
process_msg_shmem(s, fd, errp);
return;
}
if (msg >= s->nb_peers) {
resize_peers(s, msg + 1);
}
if (fd >= 0) {
process_msg_connect(s, msg, fd, errp);
} else {
process_msg_disconnect(s, msg, errp);
}
}
static int ivshmem_can_receive(void *opaque)
{
IVShmemState *s = opaque;
assert(s->msg_buffered_bytes < sizeof(s->msg_buf));
return sizeof(s->msg_buf) - s->msg_buffered_bytes;
}
static void ivshmem_read(void *opaque, const uint8_t *buf, int size)
{
IVShmemState *s = opaque;
Error *err = NULL;
int fd;
int64_t msg;
assert(size >= 0 && s->msg_buffered_bytes + size <= sizeof(s->msg_buf));
memcpy((unsigned char *)&s->msg_buf + s->msg_buffered_bytes, buf, size);
s->msg_buffered_bytes += size;
if (s->msg_buffered_bytes < sizeof(s->msg_buf)) {
return;
}
msg = le64_to_cpu(s->msg_buf);
s->msg_buffered_bytes = 0;
fd = qemu_chr_fe_get_msgfd(&s->server_chr);
process_msg(s, msg, fd, &err);
if (err) {
error_report_err(err);
}
}
static int64_t ivshmem_recv_msg(IVShmemState *s, int *pfd, Error **errp)
{
ivshmem: Receive shared memory synchronously in realize() When configured for interrupts (property "chardev" given), we receive the shared memory from an ivshmem server. We do so asynchronously after realize() completes, by setting up callbacks with qemu_chr_add_handlers(). Keeping server I/O out of realize() that way avoids delays due to a slow server. This is probably relevant only for hot plug. However, this funny "no shared memory, yet" state of the device also causes a raft of issues that are hard or impossible to work around: * The guest is exposed to this state: when we enter and leave it its shared memory contents is apruptly replaced, and device register IVPosition changes. This is a known issue. We document that guests should not access the shared memory after device initialization until the IVPosition register becomes non-negative. For cold plug, the funny state is unlikely to be visible in practice, because we normally receive the shared memory long before the guest gets around to mess with the device. For hot plug, the timing is tighter, but the relative slowness of PCI device configuration has a good chance to hide the funny state. In either case, guests complying with the documented procedure are safe. * Migration becomes racy. If migration completes before the shared memory setup completes on the source, shared memory contents is silently lost. Fortunately, migration is rather unlikely to win this race. If the shared memory's ramblock arrives at the destination before shared memory setup completes, migration fails. There is no known way for a management application to wait for shared memory setup to complete. All you can do is retry failed migration. You can improve your chances by leaving more time between running the destination QEMU and the migrate command. To mitigate silent memory loss, you need to ensure the server initializes shared memory exactly the same on source and destination. These issues are entirely undocumented so far. I'd expect the server to be almost always fast enough to hide these issues. But then rare catastrophic races are in a way the worst kind. This is way more trouble than I'm willing to take from any device. Kill the funny state by receiving shared memory synchronously in realize(). If your hot plug hangs, go kill your ivshmem server. For easier review, this commit only makes the receive synchronous, it doesn't add the necessary error propagation. Without that, the funny state persists. The next commit will do that, and kill it off for real. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-26-git-send-email-armbru@redhat.com>
2016-03-15 21:34:40 +03:00
int64_t msg;
int n, ret;
n = 0;
do {
ret = qemu_chr_fe_read_all(&s->server_chr, (uint8_t *)&msg + n,
sizeof(msg) - n);
if (ret < 0) {
if (ret == -EINTR) {
continue;
}
error_setg_errno(errp, -ret, "read from server failed");
ivshmem: Receive shared memory synchronously in realize() When configured for interrupts (property "chardev" given), we receive the shared memory from an ivshmem server. We do so asynchronously after realize() completes, by setting up callbacks with qemu_chr_add_handlers(). Keeping server I/O out of realize() that way avoids delays due to a slow server. This is probably relevant only for hot plug. However, this funny "no shared memory, yet" state of the device also causes a raft of issues that are hard or impossible to work around: * The guest is exposed to this state: when we enter and leave it its shared memory contents is apruptly replaced, and device register IVPosition changes. This is a known issue. We document that guests should not access the shared memory after device initialization until the IVPosition register becomes non-negative. For cold plug, the funny state is unlikely to be visible in practice, because we normally receive the shared memory long before the guest gets around to mess with the device. For hot plug, the timing is tighter, but the relative slowness of PCI device configuration has a good chance to hide the funny state. In either case, guests complying with the documented procedure are safe. * Migration becomes racy. If migration completes before the shared memory setup completes on the source, shared memory contents is silently lost. Fortunately, migration is rather unlikely to win this race. If the shared memory's ramblock arrives at the destination before shared memory setup completes, migration fails. There is no known way for a management application to wait for shared memory setup to complete. All you can do is retry failed migration. You can improve your chances by leaving more time between running the destination QEMU and the migrate command. To mitigate silent memory loss, you need to ensure the server initializes shared memory exactly the same on source and destination. These issues are entirely undocumented so far. I'd expect the server to be almost always fast enough to hide these issues. But then rare catastrophic races are in a way the worst kind. This is way more trouble than I'm willing to take from any device. Kill the funny state by receiving shared memory synchronously in realize(). If your hot plug hangs, go kill your ivshmem server. For easier review, this commit only makes the receive synchronous, it doesn't add the necessary error propagation. Without that, the funny state persists. The next commit will do that, and kill it off for real. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-26-git-send-email-armbru@redhat.com>
2016-03-15 21:34:40 +03:00
return INT64_MIN;
}
n += ret;
} while (n < sizeof(msg));
*pfd = qemu_chr_fe_get_msgfd(&s->server_chr);
return le64_to_cpu(msg);
ivshmem: Receive shared memory synchronously in realize() When configured for interrupts (property "chardev" given), we receive the shared memory from an ivshmem server. We do so asynchronously after realize() completes, by setting up callbacks with qemu_chr_add_handlers(). Keeping server I/O out of realize() that way avoids delays due to a slow server. This is probably relevant only for hot plug. However, this funny "no shared memory, yet" state of the device also causes a raft of issues that are hard or impossible to work around: * The guest is exposed to this state: when we enter and leave it its shared memory contents is apruptly replaced, and device register IVPosition changes. This is a known issue. We document that guests should not access the shared memory after device initialization until the IVPosition register becomes non-negative. For cold plug, the funny state is unlikely to be visible in practice, because we normally receive the shared memory long before the guest gets around to mess with the device. For hot plug, the timing is tighter, but the relative slowness of PCI device configuration has a good chance to hide the funny state. In either case, guests complying with the documented procedure are safe. * Migration becomes racy. If migration completes before the shared memory setup completes on the source, shared memory contents is silently lost. Fortunately, migration is rather unlikely to win this race. If the shared memory's ramblock arrives at the destination before shared memory setup completes, migration fails. There is no known way for a management application to wait for shared memory setup to complete. All you can do is retry failed migration. You can improve your chances by leaving more time between running the destination QEMU and the migrate command. To mitigate silent memory loss, you need to ensure the server initializes shared memory exactly the same on source and destination. These issues are entirely undocumented so far. I'd expect the server to be almost always fast enough to hide these issues. But then rare catastrophic races are in a way the worst kind. This is way more trouble than I'm willing to take from any device. Kill the funny state by receiving shared memory synchronously in realize(). If your hot plug hangs, go kill your ivshmem server. For easier review, this commit only makes the receive synchronous, it doesn't add the necessary error propagation. Without that, the funny state persists. The next commit will do that, and kill it off for real. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-26-git-send-email-armbru@redhat.com>
2016-03-15 21:34:40 +03:00
}
static void ivshmem_recv_setup(IVShmemState *s, Error **errp)
ivshmem: Receive shared memory synchronously in realize() When configured for interrupts (property "chardev" given), we receive the shared memory from an ivshmem server. We do so asynchronously after realize() completes, by setting up callbacks with qemu_chr_add_handlers(). Keeping server I/O out of realize() that way avoids delays due to a slow server. This is probably relevant only for hot plug. However, this funny "no shared memory, yet" state of the device also causes a raft of issues that are hard or impossible to work around: * The guest is exposed to this state: when we enter and leave it its shared memory contents is apruptly replaced, and device register IVPosition changes. This is a known issue. We document that guests should not access the shared memory after device initialization until the IVPosition register becomes non-negative. For cold plug, the funny state is unlikely to be visible in practice, because we normally receive the shared memory long before the guest gets around to mess with the device. For hot plug, the timing is tighter, but the relative slowness of PCI device configuration has a good chance to hide the funny state. In either case, guests complying with the documented procedure are safe. * Migration becomes racy. If migration completes before the shared memory setup completes on the source, shared memory contents is silently lost. Fortunately, migration is rather unlikely to win this race. If the shared memory's ramblock arrives at the destination before shared memory setup completes, migration fails. There is no known way for a management application to wait for shared memory setup to complete. All you can do is retry failed migration. You can improve your chances by leaving more time between running the destination QEMU and the migrate command. To mitigate silent memory loss, you need to ensure the server initializes shared memory exactly the same on source and destination. These issues are entirely undocumented so far. I'd expect the server to be almost always fast enough to hide these issues. But then rare catastrophic races are in a way the worst kind. This is way more trouble than I'm willing to take from any device. Kill the funny state by receiving shared memory synchronously in realize(). If your hot plug hangs, go kill your ivshmem server. For easier review, this commit only makes the receive synchronous, it doesn't add the necessary error propagation. Without that, the funny state persists. The next commit will do that, and kill it off for real. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-26-git-send-email-armbru@redhat.com>
2016-03-15 21:34:40 +03:00
{
Error *err = NULL;
ivshmem: Receive shared memory synchronously in realize() When configured for interrupts (property "chardev" given), we receive the shared memory from an ivshmem server. We do so asynchronously after realize() completes, by setting up callbacks with qemu_chr_add_handlers(). Keeping server I/O out of realize() that way avoids delays due to a slow server. This is probably relevant only for hot plug. However, this funny "no shared memory, yet" state of the device also causes a raft of issues that are hard or impossible to work around: * The guest is exposed to this state: when we enter and leave it its shared memory contents is apruptly replaced, and device register IVPosition changes. This is a known issue. We document that guests should not access the shared memory after device initialization until the IVPosition register becomes non-negative. For cold plug, the funny state is unlikely to be visible in practice, because we normally receive the shared memory long before the guest gets around to mess with the device. For hot plug, the timing is tighter, but the relative slowness of PCI device configuration has a good chance to hide the funny state. In either case, guests complying with the documented procedure are safe. * Migration becomes racy. If migration completes before the shared memory setup completes on the source, shared memory contents is silently lost. Fortunately, migration is rather unlikely to win this race. If the shared memory's ramblock arrives at the destination before shared memory setup completes, migration fails. There is no known way for a management application to wait for shared memory setup to complete. All you can do is retry failed migration. You can improve your chances by leaving more time between running the destination QEMU and the migrate command. To mitigate silent memory loss, you need to ensure the server initializes shared memory exactly the same on source and destination. These issues are entirely undocumented so far. I'd expect the server to be almost always fast enough to hide these issues. But then rare catastrophic races are in a way the worst kind. This is way more trouble than I'm willing to take from any device. Kill the funny state by receiving shared memory synchronously in realize(). If your hot plug hangs, go kill your ivshmem server. For easier review, this commit only makes the receive synchronous, it doesn't add the necessary error propagation. Without that, the funny state persists. The next commit will do that, and kill it off for real. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-26-git-send-email-armbru@redhat.com>
2016-03-15 21:34:40 +03:00
int64_t msg;
int fd;
msg = ivshmem_recv_msg(s, &fd, &err);
if (err) {
error_propagate(errp, err);
return;
}
if (msg != IVSHMEM_PROTOCOL_VERSION) {
error_setg(errp, "server sent version %" PRId64 ", expecting %d",
msg, IVSHMEM_PROTOCOL_VERSION);
return;
}
if (fd != -1) {
error_setg(errp, "server sent invalid version message");
return;
}
/*
* ivshmem-server sends the remaining initial messages in a fixed
* order, but the device has always accepted them in any order.
* Stay as compatible as practical, just in case people use
* servers that behave differently.
*/
/*
* ivshmem_device_spec.txt has always required the ID message
* right here, and ivshmem-server has always complied. However,
* older versions of the device accepted it out of order, but
* broke when an interrupt setup message arrived before it.
*/
msg = ivshmem_recv_msg(s, &fd, &err);
if (err) {
error_propagate(errp, err);
return;
}
if (fd != -1 || msg < 0 || msg > IVSHMEM_MAX_PEERS) {
error_setg(errp, "server sent invalid ID message");
return;
}
s->vm_id = msg;
ivshmem: Receive shared memory synchronously in realize() When configured for interrupts (property "chardev" given), we receive the shared memory from an ivshmem server. We do so asynchronously after realize() completes, by setting up callbacks with qemu_chr_add_handlers(). Keeping server I/O out of realize() that way avoids delays due to a slow server. This is probably relevant only for hot plug. However, this funny "no shared memory, yet" state of the device also causes a raft of issues that are hard or impossible to work around: * The guest is exposed to this state: when we enter and leave it its shared memory contents is apruptly replaced, and device register IVPosition changes. This is a known issue. We document that guests should not access the shared memory after device initialization until the IVPosition register becomes non-negative. For cold plug, the funny state is unlikely to be visible in practice, because we normally receive the shared memory long before the guest gets around to mess with the device. For hot plug, the timing is tighter, but the relative slowness of PCI device configuration has a good chance to hide the funny state. In either case, guests complying with the documented procedure are safe. * Migration becomes racy. If migration completes before the shared memory setup completes on the source, shared memory contents is silently lost. Fortunately, migration is rather unlikely to win this race. If the shared memory's ramblock arrives at the destination before shared memory setup completes, migration fails. There is no known way for a management application to wait for shared memory setup to complete. All you can do is retry failed migration. You can improve your chances by leaving more time between running the destination QEMU and the migrate command. To mitigate silent memory loss, you need to ensure the server initializes shared memory exactly the same on source and destination. These issues are entirely undocumented so far. I'd expect the server to be almost always fast enough to hide these issues. But then rare catastrophic races are in a way the worst kind. This is way more trouble than I'm willing to take from any device. Kill the funny state by receiving shared memory synchronously in realize(). If your hot plug hangs, go kill your ivshmem server. For easier review, this commit only makes the receive synchronous, it doesn't add the necessary error propagation. Without that, the funny state persists. The next commit will do that, and kill it off for real. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-26-git-send-email-armbru@redhat.com>
2016-03-15 21:34:40 +03:00
/*
* Receive more messages until we got shared memory.
*/
do {
msg = ivshmem_recv_msg(s, &fd, &err);
if (err) {
error_propagate(errp, err);
return;
}
process_msg(s, msg, fd, &err);
if (err) {
error_propagate(errp, err);
return;
}
ivshmem: Receive shared memory synchronously in realize() When configured for interrupts (property "chardev" given), we receive the shared memory from an ivshmem server. We do so asynchronously after realize() completes, by setting up callbacks with qemu_chr_add_handlers(). Keeping server I/O out of realize() that way avoids delays due to a slow server. This is probably relevant only for hot plug. However, this funny "no shared memory, yet" state of the device also causes a raft of issues that are hard or impossible to work around: * The guest is exposed to this state: when we enter and leave it its shared memory contents is apruptly replaced, and device register IVPosition changes. This is a known issue. We document that guests should not access the shared memory after device initialization until the IVPosition register becomes non-negative. For cold plug, the funny state is unlikely to be visible in practice, because we normally receive the shared memory long before the guest gets around to mess with the device. For hot plug, the timing is tighter, but the relative slowness of PCI device configuration has a good chance to hide the funny state. In either case, guests complying with the documented procedure are safe. * Migration becomes racy. If migration completes before the shared memory setup completes on the source, shared memory contents is silently lost. Fortunately, migration is rather unlikely to win this race. If the shared memory's ramblock arrives at the destination before shared memory setup completes, migration fails. There is no known way for a management application to wait for shared memory setup to complete. All you can do is retry failed migration. You can improve your chances by leaving more time between running the destination QEMU and the migrate command. To mitigate silent memory loss, you need to ensure the server initializes shared memory exactly the same on source and destination. These issues are entirely undocumented so far. I'd expect the server to be almost always fast enough to hide these issues. But then rare catastrophic races are in a way the worst kind. This is way more trouble than I'm willing to take from any device. Kill the funny state by receiving shared memory synchronously in realize(). If your hot plug hangs, go kill your ivshmem server. For easier review, this commit only makes the receive synchronous, it doesn't add the necessary error propagation. Without that, the funny state persists. The next commit will do that, and kill it off for real. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-26-git-send-email-armbru@redhat.com>
2016-03-15 21:34:40 +03:00
} while (msg != -1);
/*
* This function must either map the shared memory or fail. The
* loop above ensures that: it terminates normally only after it
* successfully processed the server's shared memory message.
* Assert that actually mapped the shared memory:
*/
assert(s->ivshmem_bar2);
}
/* Select the MSI-X vectors used by device.
* ivshmem maps events to vectors statically, so
* we just enable all vectors on init and after reset. */
ivshmem: Clean up MSI-X conditions There are three predicates related to MSI-X: * ivshmem_has_feature(s, IVSHMEM_MSI) is true unless the non-MSI-X variant of the device is selected with msi=off. * msix_present() is true when the device has the PCI capability MSI-X. It's initially false, and becomes true during successful realize of the MSI-X variant of the device. Thus, it's the same as ivshmem_has_feature(s, IVSHMEM_MSI) for realized devices. * msix_enabled() is true when msix_present() is true and guest software has enabled MSI-X. Code that differs between the non-MSI-X and the MSI-X variant of the device needs to be guarded by ivshmem_has_feature(s, IVSHMEM_MSI) or by msix_present(), except the latter works only for realized devices. Code that depends on whether MSI-X is in use needs to be guarded with msix_enabled(). Code review led me to two minor messes: * ivshmem_vector_notify() calls msix_notify() even when !msix_enabled(), unlike most other MSI-X-capable devices. As far as I can tell, msix_notify() does nothing when !msix_enabled(). Add the guard anyway. * Most callers of ivshmem_use_msix() guard it with ivshmem_has_feature(s, IVSHMEM_MSI). Not necessary, because ivshmem_use_msix() does nothing when !msix_present(). That's ivshmem's only use of msix_present(), though. Guard it consistently, and drop the now redundant msix_present() check. While there, rename ivshmem_use_msix() to ivshmem_msix_vector_use(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1458066895-20632-20-git-send-email-armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2016-03-15 21:34:34 +03:00
static void ivshmem_msix_vector_use(IVShmemState *s)
{
PCIDevice *d = PCI_DEVICE(s);
int i;
for (i = 0; i < s->vectors; i++) {
msix_vector_use(d, i);
}
}
static void ivshmem_disable_irqfd(IVShmemState *s);
static void ivshmem_reset(DeviceState *d)
{
ivshmem: Split ivshmem-plain, ivshmem-doorbell off ivshmem ivshmem can be configured with and without interrupt capability (a.k.a. "doorbell"). The two configurations have largely disjoint options, which makes for a confusing (and badly checked) user interface. Moreover, the device can't tell the guest whether its doorbell is enabled. Create two new device models ivshmem-plain and ivshmem-doorbell, and deprecate the old one. Changes from ivshmem: * PCI revision is 1 instead of 0. The new revision is fully backwards compatible for guests. Guests may elect to require at least revision 1 to make sure they're not exposed to the funny "no shared memory, yet" state. * Property "role" replaced by "master". role=master becomes master=on, role=peer becomes master=off. Default is off instead of auto. * Property "use64" is gone. The new devices always have 64 bit BARs. Changes from ivshmem to ivshmem-plain: * The Interrupt Pin register in PCI config space is zero (does not use an interrupt pin) instead of one (uses INTA). * Property "x-memdev" is renamed to "memdev". * Properties "shm" and "size" are gone. Use property "memdev" instead. * Property "msi" is gone. The new device can't have MSI-X capability. It can't interrupt anyway. * Properties "ioeventfd" and "vectors" are gone. They're meaningless without interrupts anyway. Changes from ivshmem to ivshmem-doorbell: * Property "msi" is gone. The new device always has MSI-X capability. * Property "ioeventfd" defaults to on instead of off. * Property "size" is gone. The new device can only map all the shared memory received from the server. Guests can easily find out whether the device is configured for interrupts by checking for MSI-X capability. Note: some code added in sub-optimal places to make the diff easier to review. The next commit will move it to more sensible places. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-37-git-send-email-armbru@redhat.com>
2016-03-15 21:34:51 +03:00
IVShmemState *s = IVSHMEM_COMMON(d);
ivshmem_disable_irqfd(s);
s->intrstatus = 0;
s->intrmask = 0;
ivshmem: Clean up MSI-X conditions There are three predicates related to MSI-X: * ivshmem_has_feature(s, IVSHMEM_MSI) is true unless the non-MSI-X variant of the device is selected with msi=off. * msix_present() is true when the device has the PCI capability MSI-X. It's initially false, and becomes true during successful realize of the MSI-X variant of the device. Thus, it's the same as ivshmem_has_feature(s, IVSHMEM_MSI) for realized devices. * msix_enabled() is true when msix_present() is true and guest software has enabled MSI-X. Code that differs between the non-MSI-X and the MSI-X variant of the device needs to be guarded by ivshmem_has_feature(s, IVSHMEM_MSI) or by msix_present(), except the latter works only for realized devices. Code that depends on whether MSI-X is in use needs to be guarded with msix_enabled(). Code review led me to two minor messes: * ivshmem_vector_notify() calls msix_notify() even when !msix_enabled(), unlike most other MSI-X-capable devices. As far as I can tell, msix_notify() does nothing when !msix_enabled(). Add the guard anyway. * Most callers of ivshmem_use_msix() guard it with ivshmem_has_feature(s, IVSHMEM_MSI). Not necessary, because ivshmem_use_msix() does nothing when !msix_present(). That's ivshmem's only use of msix_present(), though. Guard it consistently, and drop the now redundant msix_present() check. While there, rename ivshmem_use_msix() to ivshmem_msix_vector_use(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1458066895-20632-20-git-send-email-armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2016-03-15 21:34:34 +03:00
if (ivshmem_has_feature(s, IVSHMEM_MSI)) {
ivshmem_msix_vector_use(s);
}
}
static int ivshmem_setup_interrupts(IVShmemState *s, Error **errp)
{
/* allocate QEMU callback data for receiving interrupts */
s->msi_vectors = g_new0(MSIVector, s->vectors);
if (ivshmem_has_feature(s, IVSHMEM_MSI)) {
if (msix_init_exclusive_bar(PCI_DEVICE(s), s->vectors, 1, errp)) {
return -1;
}
IVSHMEM_DPRINTF("msix initialized (%d vectors)\n", s->vectors);
ivshmem: Clean up MSI-X conditions There are three predicates related to MSI-X: * ivshmem_has_feature(s, IVSHMEM_MSI) is true unless the non-MSI-X variant of the device is selected with msi=off. * msix_present() is true when the device has the PCI capability MSI-X. It's initially false, and becomes true during successful realize of the MSI-X variant of the device. Thus, it's the same as ivshmem_has_feature(s, IVSHMEM_MSI) for realized devices. * msix_enabled() is true when msix_present() is true and guest software has enabled MSI-X. Code that differs between the non-MSI-X and the MSI-X variant of the device needs to be guarded by ivshmem_has_feature(s, IVSHMEM_MSI) or by msix_present(), except the latter works only for realized devices. Code that depends on whether MSI-X is in use needs to be guarded with msix_enabled(). Code review led me to two minor messes: * ivshmem_vector_notify() calls msix_notify() even when !msix_enabled(), unlike most other MSI-X-capable devices. As far as I can tell, msix_notify() does nothing when !msix_enabled(). Add the guard anyway. * Most callers of ivshmem_use_msix() guard it with ivshmem_has_feature(s, IVSHMEM_MSI). Not necessary, because ivshmem_use_msix() does nothing when !msix_present(). That's ivshmem's only use of msix_present(), though. Guard it consistently, and drop the now redundant msix_present() check. While there, rename ivshmem_use_msix() to ivshmem_msix_vector_use(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1458066895-20632-20-git-send-email-armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2016-03-15 21:34:34 +03:00
ivshmem_msix_vector_use(s);
}
return 0;
}
static void ivshmem_remove_kvm_msi_virq(IVShmemState *s, int vector)
{
IVSHMEM_DPRINTF("ivshmem_remove_kvm_msi_virq vector:%d\n", vector);
if (s->msi_vectors[vector].pdev == NULL) {
return;
}
/* it was cleaned when masked in the frontend. */
kvm_irqchip_release_virq(kvm_state, s->msi_vectors[vector].virq);
s->msi_vectors[vector].pdev = NULL;
}
static void ivshmem_enable_irqfd(IVShmemState *s)
{
PCIDevice *pdev = PCI_DEVICE(s);
int i;
for (i = 0; i < s->peers[s->vm_id].nb_eventfds; i++) {
Error *err = NULL;
ivshmem_add_kvm_msi_virq(s, i, &err);
if (err) {
error_report_err(err);
goto undo;
}
}
if (msix_set_vector_notifiers(pdev,
ivshmem_vector_unmask,
ivshmem_vector_mask,
ivshmem_vector_poll)) {
error_report("ivshmem: msix_set_vector_notifiers failed");
goto undo;
}
return;
undo:
while (--i >= 0) {
ivshmem_remove_kvm_msi_virq(s, i);
}
}
static void ivshmem_disable_irqfd(IVShmemState *s)
{
PCIDevice *pdev = PCI_DEVICE(s);
int i;
if (!pdev->msix_vector_use_notifier) {
return;
}
msix_unset_vector_notifiers(pdev);
for (i = 0; i < s->peers[s->vm_id].nb_eventfds; i++) {
/*
* MSI-X is already disabled here so msix_unset_vector_notifiers()
* didn't call our release notifier. Do it now to keep our masks and
* unmasks balanced.
*/
if (s->msi_vectors[i].unmasked) {
ivshmem_vector_mask(pdev, i);
}
ivshmem_remove_kvm_msi_virq(s, i);
}
}
static void ivshmem_write_config(PCIDevice *pdev, uint32_t address,
uint32_t val, int len)
{
ivshmem: Split ivshmem-plain, ivshmem-doorbell off ivshmem ivshmem can be configured with and without interrupt capability (a.k.a. "doorbell"). The two configurations have largely disjoint options, which makes for a confusing (and badly checked) user interface. Moreover, the device can't tell the guest whether its doorbell is enabled. Create two new device models ivshmem-plain and ivshmem-doorbell, and deprecate the old one. Changes from ivshmem: * PCI revision is 1 instead of 0. The new revision is fully backwards compatible for guests. Guests may elect to require at least revision 1 to make sure they're not exposed to the funny "no shared memory, yet" state. * Property "role" replaced by "master". role=master becomes master=on, role=peer becomes master=off. Default is off instead of auto. * Property "use64" is gone. The new devices always have 64 bit BARs. Changes from ivshmem to ivshmem-plain: * The Interrupt Pin register in PCI config space is zero (does not use an interrupt pin) instead of one (uses INTA). * Property "x-memdev" is renamed to "memdev". * Properties "shm" and "size" are gone. Use property "memdev" instead. * Property "msi" is gone. The new device can't have MSI-X capability. It can't interrupt anyway. * Properties "ioeventfd" and "vectors" are gone. They're meaningless without interrupts anyway. Changes from ivshmem to ivshmem-doorbell: * Property "msi" is gone. The new device always has MSI-X capability. * Property "ioeventfd" defaults to on instead of off. * Property "size" is gone. The new device can only map all the shared memory received from the server. Guests can easily find out whether the device is configured for interrupts by checking for MSI-X capability. Note: some code added in sub-optimal places to make the diff easier to review. The next commit will move it to more sensible places. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-37-git-send-email-armbru@redhat.com>
2016-03-15 21:34:51 +03:00
IVShmemState *s = IVSHMEM_COMMON(pdev);
int is_enabled, was_enabled = msix_enabled(pdev);
pci_default_write_config(pdev, address, val, len);
is_enabled = msix_enabled(pdev);
if (kvm_msi_via_irqfd_enabled()) {
if (!was_enabled && is_enabled) {
ivshmem_enable_irqfd(s);
} else if (was_enabled && !is_enabled) {
ivshmem_disable_irqfd(s);
}
}
}
ivshmem: Split ivshmem-plain, ivshmem-doorbell off ivshmem ivshmem can be configured with and without interrupt capability (a.k.a. "doorbell"). The two configurations have largely disjoint options, which makes for a confusing (and badly checked) user interface. Moreover, the device can't tell the guest whether its doorbell is enabled. Create two new device models ivshmem-plain and ivshmem-doorbell, and deprecate the old one. Changes from ivshmem: * PCI revision is 1 instead of 0. The new revision is fully backwards compatible for guests. Guests may elect to require at least revision 1 to make sure they're not exposed to the funny "no shared memory, yet" state. * Property "role" replaced by "master". role=master becomes master=on, role=peer becomes master=off. Default is off instead of auto. * Property "use64" is gone. The new devices always have 64 bit BARs. Changes from ivshmem to ivshmem-plain: * The Interrupt Pin register in PCI config space is zero (does not use an interrupt pin) instead of one (uses INTA). * Property "x-memdev" is renamed to "memdev". * Properties "shm" and "size" are gone. Use property "memdev" instead. * Property "msi" is gone. The new device can't have MSI-X capability. It can't interrupt anyway. * Properties "ioeventfd" and "vectors" are gone. They're meaningless without interrupts anyway. Changes from ivshmem to ivshmem-doorbell: * Property "msi" is gone. The new device always has MSI-X capability. * Property "ioeventfd" defaults to on instead of off. * Property "size" is gone. The new device can only map all the shared memory received from the server. Guests can easily find out whether the device is configured for interrupts by checking for MSI-X capability. Note: some code added in sub-optimal places to make the diff easier to review. The next commit will move it to more sensible places. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-37-git-send-email-armbru@redhat.com>
2016-03-15 21:34:51 +03:00
static void ivshmem_common_realize(PCIDevice *dev, Error **errp)
{
ivshmem: Split ivshmem-plain, ivshmem-doorbell off ivshmem ivshmem can be configured with and without interrupt capability (a.k.a. "doorbell"). The two configurations have largely disjoint options, which makes for a confusing (and badly checked) user interface. Moreover, the device can't tell the guest whether its doorbell is enabled. Create two new device models ivshmem-plain and ivshmem-doorbell, and deprecate the old one. Changes from ivshmem: * PCI revision is 1 instead of 0. The new revision is fully backwards compatible for guests. Guests may elect to require at least revision 1 to make sure they're not exposed to the funny "no shared memory, yet" state. * Property "role" replaced by "master". role=master becomes master=on, role=peer becomes master=off. Default is off instead of auto. * Property "use64" is gone. The new devices always have 64 bit BARs. Changes from ivshmem to ivshmem-plain: * The Interrupt Pin register in PCI config space is zero (does not use an interrupt pin) instead of one (uses INTA). * Property "x-memdev" is renamed to "memdev". * Properties "shm" and "size" are gone. Use property "memdev" instead. * Property "msi" is gone. The new device can't have MSI-X capability. It can't interrupt anyway. * Properties "ioeventfd" and "vectors" are gone. They're meaningless without interrupts anyway. Changes from ivshmem to ivshmem-doorbell: * Property "msi" is gone. The new device always has MSI-X capability. * Property "ioeventfd" defaults to on instead of off. * Property "size" is gone. The new device can only map all the shared memory received from the server. Guests can easily find out whether the device is configured for interrupts by checking for MSI-X capability. Note: some code added in sub-optimal places to make the diff easier to review. The next commit will move it to more sensible places. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-37-git-send-email-armbru@redhat.com>
2016-03-15 21:34:51 +03:00
IVShmemState *s = IVSHMEM_COMMON(dev);
Error *err = NULL;
uint8_t *pci_conf;
/* IRQFD requires MSI */
if (ivshmem_has_feature(s, IVSHMEM_IOEVENTFD) &&
!ivshmem_has_feature(s, IVSHMEM_MSI)) {
error_setg(errp, "ioeventfd/irqfd requires MSI");
return;
}
pci_conf = dev->config;
pci_conf[PCI_COMMAND] = PCI_COMMAND_IO | PCI_COMMAND_MEMORY;
memory_region_init_io(&s->ivshmem_mmio, OBJECT(s), &ivshmem_mmio_ops, s,
"ivshmem-mmio", IVSHMEM_REG_BAR_SIZE);
/* region for registers*/
pci_register_bar(dev, 0, PCI_BASE_ADDRESS_SPACE_MEMORY,
&s->ivshmem_mmio);
if (s->hostmem != NULL) {
IVSHMEM_DPRINTF("using hostmem\n");
s->ivshmem_bar2 = host_memory_backend_get_memory(s->hostmem);
host_memory_backend_set_mapped(s->hostmem, true);
} else {
Chardev *chr = qemu_chr_fe_get_driver(&s->server_chr);
assert(chr);
IVSHMEM_DPRINTF("using shared memory server (socket = %s)\n",
chr->filename);
/* we allocate enough space for 16 peers and grow as needed */
resize_peers(s, 16);
ivshmem: Receive shared memory synchronously in realize() When configured for interrupts (property "chardev" given), we receive the shared memory from an ivshmem server. We do so asynchronously after realize() completes, by setting up callbacks with qemu_chr_add_handlers(). Keeping server I/O out of realize() that way avoids delays due to a slow server. This is probably relevant only for hot plug. However, this funny "no shared memory, yet" state of the device also causes a raft of issues that are hard or impossible to work around: * The guest is exposed to this state: when we enter and leave it its shared memory contents is apruptly replaced, and device register IVPosition changes. This is a known issue. We document that guests should not access the shared memory after device initialization until the IVPosition register becomes non-negative. For cold plug, the funny state is unlikely to be visible in practice, because we normally receive the shared memory long before the guest gets around to mess with the device. For hot plug, the timing is tighter, but the relative slowness of PCI device configuration has a good chance to hide the funny state. In either case, guests complying with the documented procedure are safe. * Migration becomes racy. If migration completes before the shared memory setup completes on the source, shared memory contents is silently lost. Fortunately, migration is rather unlikely to win this race. If the shared memory's ramblock arrives at the destination before shared memory setup completes, migration fails. There is no known way for a management application to wait for shared memory setup to complete. All you can do is retry failed migration. You can improve your chances by leaving more time between running the destination QEMU and the migrate command. To mitigate silent memory loss, you need to ensure the server initializes shared memory exactly the same on source and destination. These issues are entirely undocumented so far. I'd expect the server to be almost always fast enough to hide these issues. But then rare catastrophic races are in a way the worst kind. This is way more trouble than I'm willing to take from any device. Kill the funny state by receiving shared memory synchronously in realize(). If your hot plug hangs, go kill your ivshmem server. For easier review, this commit only makes the receive synchronous, it doesn't add the necessary error propagation. Without that, the funny state persists. The next commit will do that, and kill it off for real. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-26-git-send-email-armbru@redhat.com>
2016-03-15 21:34:40 +03:00
/*
* Receive setup messages from server synchronously.
* Older versions did it asynchronously, but that creates a
* number of entertaining race conditions.
*/
ivshmem_recv_setup(s, &err);
if (err) {
error_propagate(errp, err);
return;
ivshmem: Receive shared memory synchronously in realize() When configured for interrupts (property "chardev" given), we receive the shared memory from an ivshmem server. We do so asynchronously after realize() completes, by setting up callbacks with qemu_chr_add_handlers(). Keeping server I/O out of realize() that way avoids delays due to a slow server. This is probably relevant only for hot plug. However, this funny "no shared memory, yet" state of the device also causes a raft of issues that are hard or impossible to work around: * The guest is exposed to this state: when we enter and leave it its shared memory contents is apruptly replaced, and device register IVPosition changes. This is a known issue. We document that guests should not access the shared memory after device initialization until the IVPosition register becomes non-negative. For cold plug, the funny state is unlikely to be visible in practice, because we normally receive the shared memory long before the guest gets around to mess with the device. For hot plug, the timing is tighter, but the relative slowness of PCI device configuration has a good chance to hide the funny state. In either case, guests complying with the documented procedure are safe. * Migration becomes racy. If migration completes before the shared memory setup completes on the source, shared memory contents is silently lost. Fortunately, migration is rather unlikely to win this race. If the shared memory's ramblock arrives at the destination before shared memory setup completes, migration fails. There is no known way for a management application to wait for shared memory setup to complete. All you can do is retry failed migration. You can improve your chances by leaving more time between running the destination QEMU and the migrate command. To mitigate silent memory loss, you need to ensure the server initializes shared memory exactly the same on source and destination. These issues are entirely undocumented so far. I'd expect the server to be almost always fast enough to hide these issues. But then rare catastrophic races are in a way the worst kind. This is way more trouble than I'm willing to take from any device. Kill the funny state by receiving shared memory synchronously in realize(). If your hot plug hangs, go kill your ivshmem server. For easier review, this commit only makes the receive synchronous, it doesn't add the necessary error propagation. Without that, the funny state persists. The next commit will do that, and kill it off for real. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-26-git-send-email-armbru@redhat.com>
2016-03-15 21:34:40 +03:00
}
if (s->master == ON_OFF_AUTO_ON && s->vm_id != 0) {
error_setg(errp,
"master must connect to the server before any peers");
return;
}
qemu_chr_fe_set_handlers(&s->server_chr, ivshmem_can_receive,
ivshmem_read, NULL, NULL, s, NULL, true);
if (ivshmem_setup_interrupts(s, errp) < 0) {
error_prepend(errp, "Failed to initialize interrupts: ");
ivshmem: Receive shared memory synchronously in realize() When configured for interrupts (property "chardev" given), we receive the shared memory from an ivshmem server. We do so asynchronously after realize() completes, by setting up callbacks with qemu_chr_add_handlers(). Keeping server I/O out of realize() that way avoids delays due to a slow server. This is probably relevant only for hot plug. However, this funny "no shared memory, yet" state of the device also causes a raft of issues that are hard or impossible to work around: * The guest is exposed to this state: when we enter and leave it its shared memory contents is apruptly replaced, and device register IVPosition changes. This is a known issue. We document that guests should not access the shared memory after device initialization until the IVPosition register becomes non-negative. For cold plug, the funny state is unlikely to be visible in practice, because we normally receive the shared memory long before the guest gets around to mess with the device. For hot plug, the timing is tighter, but the relative slowness of PCI device configuration has a good chance to hide the funny state. In either case, guests complying with the documented procedure are safe. * Migration becomes racy. If migration completes before the shared memory setup completes on the source, shared memory contents is silently lost. Fortunately, migration is rather unlikely to win this race. If the shared memory's ramblock arrives at the destination before shared memory setup completes, migration fails. There is no known way for a management application to wait for shared memory setup to complete. All you can do is retry failed migration. You can improve your chances by leaving more time between running the destination QEMU and the migrate command. To mitigate silent memory loss, you need to ensure the server initializes shared memory exactly the same on source and destination. These issues are entirely undocumented so far. I'd expect the server to be almost always fast enough to hide these issues. But then rare catastrophic races are in a way the worst kind. This is way more trouble than I'm willing to take from any device. Kill the funny state by receiving shared memory synchronously in realize(). If your hot plug hangs, go kill your ivshmem server. For easier review, this commit only makes the receive synchronous, it doesn't add the necessary error propagation. Without that, the funny state persists. The next commit will do that, and kill it off for real. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-26-git-send-email-armbru@redhat.com>
2016-03-15 21:34:40 +03:00
return;
}
}
if (s->master == ON_OFF_AUTO_AUTO) {
s->master = s->vm_id == 0 ? ON_OFF_AUTO_ON : ON_OFF_AUTO_OFF;
}
if (!ivshmem_is_master(s)) {
error_setg(&s->migration_blocker,
"Migration is disabled when using feature 'peer mode' in device 'ivshmem'");
if (migrate_add_blocker(s->migration_blocker, errp) < 0) {
error_free(s->migration_blocker);
return;
}
}
vmstate_register_ram(s->ivshmem_bar2, DEVICE(s));
pci_register_bar(PCI_DEVICE(s), 2,
PCI_BASE_ADDRESS_SPACE_MEMORY |
PCI_BASE_ADDRESS_MEM_PREFETCH |
PCI_BASE_ADDRESS_MEM_TYPE_64,
s->ivshmem_bar2);
}
ivshmem: Split ivshmem-plain, ivshmem-doorbell off ivshmem ivshmem can be configured with and without interrupt capability (a.k.a. "doorbell"). The two configurations have largely disjoint options, which makes for a confusing (and badly checked) user interface. Moreover, the device can't tell the guest whether its doorbell is enabled. Create two new device models ivshmem-plain and ivshmem-doorbell, and deprecate the old one. Changes from ivshmem: * PCI revision is 1 instead of 0. The new revision is fully backwards compatible for guests. Guests may elect to require at least revision 1 to make sure they're not exposed to the funny "no shared memory, yet" state. * Property "role" replaced by "master". role=master becomes master=on, role=peer becomes master=off. Default is off instead of auto. * Property "use64" is gone. The new devices always have 64 bit BARs. Changes from ivshmem to ivshmem-plain: * The Interrupt Pin register in PCI config space is zero (does not use an interrupt pin) instead of one (uses INTA). * Property "x-memdev" is renamed to "memdev". * Properties "shm" and "size" are gone. Use property "memdev" instead. * Property "msi" is gone. The new device can't have MSI-X capability. It can't interrupt anyway. * Properties "ioeventfd" and "vectors" are gone. They're meaningless without interrupts anyway. Changes from ivshmem to ivshmem-doorbell: * Property "msi" is gone. The new device always has MSI-X capability. * Property "ioeventfd" defaults to on instead of off. * Property "size" is gone. The new device can only map all the shared memory received from the server. Guests can easily find out whether the device is configured for interrupts by checking for MSI-X capability. Note: some code added in sub-optimal places to make the diff easier to review. The next commit will move it to more sensible places. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-37-git-send-email-armbru@redhat.com>
2016-03-15 21:34:51 +03:00
static void ivshmem_exit(PCIDevice *dev)
{
IVShmemState *s = IVSHMEM_COMMON(dev);
int i;
if (s->migration_blocker) {
migrate_del_blocker(s->migration_blocker);
error_free(s->migration_blocker);
}
if (memory_region_is_mapped(s->ivshmem_bar2)) {
if (!s->hostmem) {
void *addr = memory_region_get_ram_ptr(s->ivshmem_bar2);
int fd;
ivshmem: Split ivshmem-plain, ivshmem-doorbell off ivshmem ivshmem can be configured with and without interrupt capability (a.k.a. "doorbell"). The two configurations have largely disjoint options, which makes for a confusing (and badly checked) user interface. Moreover, the device can't tell the guest whether its doorbell is enabled. Create two new device models ivshmem-plain and ivshmem-doorbell, and deprecate the old one. Changes from ivshmem: * PCI revision is 1 instead of 0. The new revision is fully backwards compatible for guests. Guests may elect to require at least revision 1 to make sure they're not exposed to the funny "no shared memory, yet" state. * Property "role" replaced by "master". role=master becomes master=on, role=peer becomes master=off. Default is off instead of auto. * Property "use64" is gone. The new devices always have 64 bit BARs. Changes from ivshmem to ivshmem-plain: * The Interrupt Pin register in PCI config space is zero (does not use an interrupt pin) instead of one (uses INTA). * Property "x-memdev" is renamed to "memdev". * Properties "shm" and "size" are gone. Use property "memdev" instead. * Property "msi" is gone. The new device can't have MSI-X capability. It can't interrupt anyway. * Properties "ioeventfd" and "vectors" are gone. They're meaningless without interrupts anyway. Changes from ivshmem to ivshmem-doorbell: * Property "msi" is gone. The new device always has MSI-X capability. * Property "ioeventfd" defaults to on instead of off. * Property "size" is gone. The new device can only map all the shared memory received from the server. Guests can easily find out whether the device is configured for interrupts by checking for MSI-X capability. Note: some code added in sub-optimal places to make the diff easier to review. The next commit will move it to more sensible places. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-37-git-send-email-armbru@redhat.com>
2016-03-15 21:34:51 +03:00
if (munmap(addr, memory_region_size(s->ivshmem_bar2) == -1)) {
error_report("Failed to munmap shared memory %s",
strerror(errno));
}
fd = memory_region_get_fd(s->ivshmem_bar2);
close(fd);
}
vmstate_unregister_ram(s->ivshmem_bar2, DEVICE(dev));
}
if (s->hostmem) {
host_memory_backend_set_mapped(s->hostmem, false);
}
if (s->peers) {
for (i = 0; i < s->nb_peers; i++) {
close_peer_eventfds(s, i);
}
g_free(s->peers);
}
if (ivshmem_has_feature(s, IVSHMEM_MSI)) {
msix_uninit_exclusive_bar(dev);
}
g_free(s->msi_vectors);
}
static int ivshmem_pre_load(void *opaque)
{
IVShmemState *s = opaque;
if (!ivshmem_is_master(s)) {
error_report("'peer' devices are not migratable");
return -EINVAL;
}
return 0;
}
static int ivshmem_post_load(void *opaque, int version_id)
{
IVShmemState *s = opaque;
if (ivshmem_has_feature(s, IVSHMEM_MSI)) {
ivshmem: Clean up MSI-X conditions There are three predicates related to MSI-X: * ivshmem_has_feature(s, IVSHMEM_MSI) is true unless the non-MSI-X variant of the device is selected with msi=off. * msix_present() is true when the device has the PCI capability MSI-X. It's initially false, and becomes true during successful realize of the MSI-X variant of the device. Thus, it's the same as ivshmem_has_feature(s, IVSHMEM_MSI) for realized devices. * msix_enabled() is true when msix_present() is true and guest software has enabled MSI-X. Code that differs between the non-MSI-X and the MSI-X variant of the device needs to be guarded by ivshmem_has_feature(s, IVSHMEM_MSI) or by msix_present(), except the latter works only for realized devices. Code that depends on whether MSI-X is in use needs to be guarded with msix_enabled(). Code review led me to two minor messes: * ivshmem_vector_notify() calls msix_notify() even when !msix_enabled(), unlike most other MSI-X-capable devices. As far as I can tell, msix_notify() does nothing when !msix_enabled(). Add the guard anyway. * Most callers of ivshmem_use_msix() guard it with ivshmem_has_feature(s, IVSHMEM_MSI). Not necessary, because ivshmem_use_msix() does nothing when !msix_present(). That's ivshmem's only use of msix_present(), though. Guard it consistently, and drop the now redundant msix_present() check. While there, rename ivshmem_use_msix() to ivshmem_msix_vector_use(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Message-Id: <1458066895-20632-20-git-send-email-armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2016-03-15 21:34:34 +03:00
ivshmem_msix_vector_use(s);
}
return 0;
}
ivshmem: Split ivshmem-plain, ivshmem-doorbell off ivshmem ivshmem can be configured with and without interrupt capability (a.k.a. "doorbell"). The two configurations have largely disjoint options, which makes for a confusing (and badly checked) user interface. Moreover, the device can't tell the guest whether its doorbell is enabled. Create two new device models ivshmem-plain and ivshmem-doorbell, and deprecate the old one. Changes from ivshmem: * PCI revision is 1 instead of 0. The new revision is fully backwards compatible for guests. Guests may elect to require at least revision 1 to make sure they're not exposed to the funny "no shared memory, yet" state. * Property "role" replaced by "master". role=master becomes master=on, role=peer becomes master=off. Default is off instead of auto. * Property "use64" is gone. The new devices always have 64 bit BARs. Changes from ivshmem to ivshmem-plain: * The Interrupt Pin register in PCI config space is zero (does not use an interrupt pin) instead of one (uses INTA). * Property "x-memdev" is renamed to "memdev". * Properties "shm" and "size" are gone. Use property "memdev" instead. * Property "msi" is gone. The new device can't have MSI-X capability. It can't interrupt anyway. * Properties "ioeventfd" and "vectors" are gone. They're meaningless without interrupts anyway. Changes from ivshmem to ivshmem-doorbell: * Property "msi" is gone. The new device always has MSI-X capability. * Property "ioeventfd" defaults to on instead of off. * Property "size" is gone. The new device can only map all the shared memory received from the server. Guests can easily find out whether the device is configured for interrupts by checking for MSI-X capability. Note: some code added in sub-optimal places to make the diff easier to review. The next commit will move it to more sensible places. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-37-git-send-email-armbru@redhat.com>
2016-03-15 21:34:51 +03:00
static void ivshmem_common_class_init(ObjectClass *klass, void *data)
{
DeviceClass *dc = DEVICE_CLASS(klass);
PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
ivshmem: Split ivshmem-plain, ivshmem-doorbell off ivshmem ivshmem can be configured with and without interrupt capability (a.k.a. "doorbell"). The two configurations have largely disjoint options, which makes for a confusing (and badly checked) user interface. Moreover, the device can't tell the guest whether its doorbell is enabled. Create two new device models ivshmem-plain and ivshmem-doorbell, and deprecate the old one. Changes from ivshmem: * PCI revision is 1 instead of 0. The new revision is fully backwards compatible for guests. Guests may elect to require at least revision 1 to make sure they're not exposed to the funny "no shared memory, yet" state. * Property "role" replaced by "master". role=master becomes master=on, role=peer becomes master=off. Default is off instead of auto. * Property "use64" is gone. The new devices always have 64 bit BARs. Changes from ivshmem to ivshmem-plain: * The Interrupt Pin register in PCI config space is zero (does not use an interrupt pin) instead of one (uses INTA). * Property "x-memdev" is renamed to "memdev". * Properties "shm" and "size" are gone. Use property "memdev" instead. * Property "msi" is gone. The new device can't have MSI-X capability. It can't interrupt anyway. * Properties "ioeventfd" and "vectors" are gone. They're meaningless without interrupts anyway. Changes from ivshmem to ivshmem-doorbell: * Property "msi" is gone. The new device always has MSI-X capability. * Property "ioeventfd" defaults to on instead of off. * Property "size" is gone. The new device can only map all the shared memory received from the server. Guests can easily find out whether the device is configured for interrupts by checking for MSI-X capability. Note: some code added in sub-optimal places to make the diff easier to review. The next commit will move it to more sensible places. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-37-git-send-email-armbru@redhat.com>
2016-03-15 21:34:51 +03:00
k->realize = ivshmem_common_realize;
k->exit = ivshmem_exit;
k->config_write = ivshmem_write_config;
k->vendor_id = PCI_VENDOR_ID_IVSHMEM;
k->device_id = PCI_DEVICE_ID_IVSHMEM;
k->class_id = PCI_CLASS_MEMORY_RAM;
ivshmem: Split ivshmem-plain, ivshmem-doorbell off ivshmem ivshmem can be configured with and without interrupt capability (a.k.a. "doorbell"). The two configurations have largely disjoint options, which makes for a confusing (and badly checked) user interface. Moreover, the device can't tell the guest whether its doorbell is enabled. Create two new device models ivshmem-plain and ivshmem-doorbell, and deprecate the old one. Changes from ivshmem: * PCI revision is 1 instead of 0. The new revision is fully backwards compatible for guests. Guests may elect to require at least revision 1 to make sure they're not exposed to the funny "no shared memory, yet" state. * Property "role" replaced by "master". role=master becomes master=on, role=peer becomes master=off. Default is off instead of auto. * Property "use64" is gone. The new devices always have 64 bit BARs. Changes from ivshmem to ivshmem-plain: * The Interrupt Pin register in PCI config space is zero (does not use an interrupt pin) instead of one (uses INTA). * Property "x-memdev" is renamed to "memdev". * Properties "shm" and "size" are gone. Use property "memdev" instead. * Property "msi" is gone. The new device can't have MSI-X capability. It can't interrupt anyway. * Properties "ioeventfd" and "vectors" are gone. They're meaningless without interrupts anyway. Changes from ivshmem to ivshmem-doorbell: * Property "msi" is gone. The new device always has MSI-X capability. * Property "ioeventfd" defaults to on instead of off. * Property "size" is gone. The new device can only map all the shared memory received from the server. Guests can easily find out whether the device is configured for interrupts by checking for MSI-X capability. Note: some code added in sub-optimal places to make the diff easier to review. The next commit will move it to more sensible places. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-37-git-send-email-armbru@redhat.com>
2016-03-15 21:34:51 +03:00
k->revision = 1;
dc->reset = ivshmem_reset;
set_bit(DEVICE_CATEGORY_MISC, dc->categories);
dc->desc = "Inter-VM shared memory";
}
static const TypeInfo ivshmem_common_info = {
.name = TYPE_IVSHMEM_COMMON,
.parent = TYPE_PCI_DEVICE,
.instance_size = sizeof(IVShmemState),
.abstract = true,
.class_init = ivshmem_common_class_init,
pci: Add INTERFACE_CONVENTIONAL_PCI_DEVICE to Conventional PCI devices Add INTERFACE_CONVENTIONAL_PCI_DEVICE to all direct subtypes of TYPE_PCI_DEVICE, except: 1) The ones that already have INTERFACE_PCIE_DEVICE set: * base-xhci * e1000e * nvme * pvscsi * vfio-pci * virtio-pci * vmxnet3 2) base-pci-bridge Not all PCI bridges are Conventional PCI devices, so INTERFACE_CONVENTIONAL_PCI_DEVICE is added only to the subtypes that are actually Conventional PCI: * dec-21154-p2p-bridge * i82801b11-bridge * pbm-bridge * pci-bridge The direct subtypes of base-pci-bridge not touched by this patch are: * xilinx-pcie-root: Already marked as PCIe-only. * pcie-pci-bridge: Already marked as PCIe-only. * pcie-port: all non-abstract subtypes of pcie-port are already marked as PCIe-only devices. 3) megasas-base Not all megasas devices are Conventional PCI devices, so the interface names are added to the subclasses registered by megasas_register_types(), according to information in the megasas_devices[] array. "megasas-gen2" already implements INTERFACE_PCIE_DEVICE, so add INTERFACE_CONVENTIONAL_PCI_DEVICE only to "megasas". Acked-by: Alberto Garcia <berto@igalia.com> Acked-by: John Snow <jsnow@redhat.com> Acked-by: Anthony PERARD <anthony.perard@citrix.com> Signed-off-by: Eduardo Habkost <ehabkost@redhat.com> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Acked-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Marcel Apfelbaum <marcel@redhat.com> Reviewed-by: Michael S. Tsirkin <mst@redhat.com> Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-09-27 22:56:34 +03:00
.interfaces = (InterfaceInfo[]) {
{ INTERFACE_CONVENTIONAL_PCI_DEVICE },
{ },
},
};
ivshmem: Split ivshmem-plain, ivshmem-doorbell off ivshmem ivshmem can be configured with and without interrupt capability (a.k.a. "doorbell"). The two configurations have largely disjoint options, which makes for a confusing (and badly checked) user interface. Moreover, the device can't tell the guest whether its doorbell is enabled. Create two new device models ivshmem-plain and ivshmem-doorbell, and deprecate the old one. Changes from ivshmem: * PCI revision is 1 instead of 0. The new revision is fully backwards compatible for guests. Guests may elect to require at least revision 1 to make sure they're not exposed to the funny "no shared memory, yet" state. * Property "role" replaced by "master". role=master becomes master=on, role=peer becomes master=off. Default is off instead of auto. * Property "use64" is gone. The new devices always have 64 bit BARs. Changes from ivshmem to ivshmem-plain: * The Interrupt Pin register in PCI config space is zero (does not use an interrupt pin) instead of one (uses INTA). * Property "x-memdev" is renamed to "memdev". * Properties "shm" and "size" are gone. Use property "memdev" instead. * Property "msi" is gone. The new device can't have MSI-X capability. It can't interrupt anyway. * Properties "ioeventfd" and "vectors" are gone. They're meaningless without interrupts anyway. Changes from ivshmem to ivshmem-doorbell: * Property "msi" is gone. The new device always has MSI-X capability. * Property "ioeventfd" defaults to on instead of off. * Property "size" is gone. The new device can only map all the shared memory received from the server. Guests can easily find out whether the device is configured for interrupts by checking for MSI-X capability. Note: some code added in sub-optimal places to make the diff easier to review. The next commit will move it to more sensible places. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-37-git-send-email-armbru@redhat.com>
2016-03-15 21:34:51 +03:00
static const VMStateDescription ivshmem_plain_vmsd = {
.name = TYPE_IVSHMEM_PLAIN,
.version_id = 0,
.minimum_version_id = 0,
.pre_load = ivshmem_pre_load,
.post_load = ivshmem_post_load,
.fields = (VMStateField[]) {
VMSTATE_PCI_DEVICE(parent_obj, IVShmemState),
VMSTATE_UINT32(intrstatus, IVShmemState),
VMSTATE_UINT32(intrmask, IVShmemState),
VMSTATE_END_OF_LIST()
},
};
static Property ivshmem_plain_properties[] = {
DEFINE_PROP_ON_OFF_AUTO("master", IVShmemState, master, ON_OFF_AUTO_OFF),
DEFINE_PROP_LINK("memdev", IVShmemState, hostmem, TYPE_MEMORY_BACKEND,
HostMemoryBackend *),
ivshmem: Split ivshmem-plain, ivshmem-doorbell off ivshmem ivshmem can be configured with and without interrupt capability (a.k.a. "doorbell"). The two configurations have largely disjoint options, which makes for a confusing (and badly checked) user interface. Moreover, the device can't tell the guest whether its doorbell is enabled. Create two new device models ivshmem-plain and ivshmem-doorbell, and deprecate the old one. Changes from ivshmem: * PCI revision is 1 instead of 0. The new revision is fully backwards compatible for guests. Guests may elect to require at least revision 1 to make sure they're not exposed to the funny "no shared memory, yet" state. * Property "role" replaced by "master". role=master becomes master=on, role=peer becomes master=off. Default is off instead of auto. * Property "use64" is gone. The new devices always have 64 bit BARs. Changes from ivshmem to ivshmem-plain: * The Interrupt Pin register in PCI config space is zero (does not use an interrupt pin) instead of one (uses INTA). * Property "x-memdev" is renamed to "memdev". * Properties "shm" and "size" are gone. Use property "memdev" instead. * Property "msi" is gone. The new device can't have MSI-X capability. It can't interrupt anyway. * Properties "ioeventfd" and "vectors" are gone. They're meaningless without interrupts anyway. Changes from ivshmem to ivshmem-doorbell: * Property "msi" is gone. The new device always has MSI-X capability. * Property "ioeventfd" defaults to on instead of off. * Property "size" is gone. The new device can only map all the shared memory received from the server. Guests can easily find out whether the device is configured for interrupts by checking for MSI-X capability. Note: some code added in sub-optimal places to make the diff easier to review. The next commit will move it to more sensible places. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-37-git-send-email-armbru@redhat.com>
2016-03-15 21:34:51 +03:00
DEFINE_PROP_END_OF_LIST(),
};
static void ivshmem_plain_realize(PCIDevice *dev, Error **errp)
{
IVShmemState *s = IVSHMEM_COMMON(dev);
if (!s->hostmem) {
error_setg(errp, "You must specify a 'memdev'");
return;
} else if (host_memory_backend_is_mapped(s->hostmem)) {
error_setg(errp, "can't use already busy memdev: %s",
object_get_canonical_path_component(OBJECT(s->hostmem)));
return;
}
ivshmem_common_realize(dev, errp);
}
ivshmem: Split ivshmem-plain, ivshmem-doorbell off ivshmem ivshmem can be configured with and without interrupt capability (a.k.a. "doorbell"). The two configurations have largely disjoint options, which makes for a confusing (and badly checked) user interface. Moreover, the device can't tell the guest whether its doorbell is enabled. Create two new device models ivshmem-plain and ivshmem-doorbell, and deprecate the old one. Changes from ivshmem: * PCI revision is 1 instead of 0. The new revision is fully backwards compatible for guests. Guests may elect to require at least revision 1 to make sure they're not exposed to the funny "no shared memory, yet" state. * Property "role" replaced by "master". role=master becomes master=on, role=peer becomes master=off. Default is off instead of auto. * Property "use64" is gone. The new devices always have 64 bit BARs. Changes from ivshmem to ivshmem-plain: * The Interrupt Pin register in PCI config space is zero (does not use an interrupt pin) instead of one (uses INTA). * Property "x-memdev" is renamed to "memdev". * Properties "shm" and "size" are gone. Use property "memdev" instead. * Property "msi" is gone. The new device can't have MSI-X capability. It can't interrupt anyway. * Properties "ioeventfd" and "vectors" are gone. They're meaningless without interrupts anyway. Changes from ivshmem to ivshmem-doorbell: * Property "msi" is gone. The new device always has MSI-X capability. * Property "ioeventfd" defaults to on instead of off. * Property "size" is gone. The new device can only map all the shared memory received from the server. Guests can easily find out whether the device is configured for interrupts by checking for MSI-X capability. Note: some code added in sub-optimal places to make the diff easier to review. The next commit will move it to more sensible places. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-37-git-send-email-armbru@redhat.com>
2016-03-15 21:34:51 +03:00
static void ivshmem_plain_class_init(ObjectClass *klass, void *data)
{
DeviceClass *dc = DEVICE_CLASS(klass);
PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
ivshmem: Split ivshmem-plain, ivshmem-doorbell off ivshmem ivshmem can be configured with and without interrupt capability (a.k.a. "doorbell"). The two configurations have largely disjoint options, which makes for a confusing (and badly checked) user interface. Moreover, the device can't tell the guest whether its doorbell is enabled. Create two new device models ivshmem-plain and ivshmem-doorbell, and deprecate the old one. Changes from ivshmem: * PCI revision is 1 instead of 0. The new revision is fully backwards compatible for guests. Guests may elect to require at least revision 1 to make sure they're not exposed to the funny "no shared memory, yet" state. * Property "role" replaced by "master". role=master becomes master=on, role=peer becomes master=off. Default is off instead of auto. * Property "use64" is gone. The new devices always have 64 bit BARs. Changes from ivshmem to ivshmem-plain: * The Interrupt Pin register in PCI config space is zero (does not use an interrupt pin) instead of one (uses INTA). * Property "x-memdev" is renamed to "memdev". * Properties "shm" and "size" are gone. Use property "memdev" instead. * Property "msi" is gone. The new device can't have MSI-X capability. It can't interrupt anyway. * Properties "ioeventfd" and "vectors" are gone. They're meaningless without interrupts anyway. Changes from ivshmem to ivshmem-doorbell: * Property "msi" is gone. The new device always has MSI-X capability. * Property "ioeventfd" defaults to on instead of off. * Property "size" is gone. The new device can only map all the shared memory received from the server. Guests can easily find out whether the device is configured for interrupts by checking for MSI-X capability. Note: some code added in sub-optimal places to make the diff easier to review. The next commit will move it to more sensible places. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-37-git-send-email-armbru@redhat.com>
2016-03-15 21:34:51 +03:00
k->realize = ivshmem_plain_realize;
device_class_set_props(dc, ivshmem_plain_properties);
ivshmem: Split ivshmem-plain, ivshmem-doorbell off ivshmem ivshmem can be configured with and without interrupt capability (a.k.a. "doorbell"). The two configurations have largely disjoint options, which makes for a confusing (and badly checked) user interface. Moreover, the device can't tell the guest whether its doorbell is enabled. Create two new device models ivshmem-plain and ivshmem-doorbell, and deprecate the old one. Changes from ivshmem: * PCI revision is 1 instead of 0. The new revision is fully backwards compatible for guests. Guests may elect to require at least revision 1 to make sure they're not exposed to the funny "no shared memory, yet" state. * Property "role" replaced by "master". role=master becomes master=on, role=peer becomes master=off. Default is off instead of auto. * Property "use64" is gone. The new devices always have 64 bit BARs. Changes from ivshmem to ivshmem-plain: * The Interrupt Pin register in PCI config space is zero (does not use an interrupt pin) instead of one (uses INTA). * Property "x-memdev" is renamed to "memdev". * Properties "shm" and "size" are gone. Use property "memdev" instead. * Property "msi" is gone. The new device can't have MSI-X capability. It can't interrupt anyway. * Properties "ioeventfd" and "vectors" are gone. They're meaningless without interrupts anyway. Changes from ivshmem to ivshmem-doorbell: * Property "msi" is gone. The new device always has MSI-X capability. * Property "ioeventfd" defaults to on instead of off. * Property "size" is gone. The new device can only map all the shared memory received from the server. Guests can easily find out whether the device is configured for interrupts by checking for MSI-X capability. Note: some code added in sub-optimal places to make the diff easier to review. The next commit will move it to more sensible places. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-37-git-send-email-armbru@redhat.com>
2016-03-15 21:34:51 +03:00
dc->vmsd = &ivshmem_plain_vmsd;
}
static const TypeInfo ivshmem_plain_info = {
.name = TYPE_IVSHMEM_PLAIN,
.parent = TYPE_IVSHMEM_COMMON,
.instance_size = sizeof(IVShmemState),
.class_init = ivshmem_plain_class_init,
};
static const VMStateDescription ivshmem_doorbell_vmsd = {
.name = TYPE_IVSHMEM_DOORBELL,
.version_id = 0,
.minimum_version_id = 0,
.pre_load = ivshmem_pre_load,
.post_load = ivshmem_post_load,
.fields = (VMStateField[]) {
VMSTATE_PCI_DEVICE(parent_obj, IVShmemState),
VMSTATE_MSIX(parent_obj, IVShmemState),
VMSTATE_UINT32(intrstatus, IVShmemState),
VMSTATE_UINT32(intrmask, IVShmemState),
VMSTATE_END_OF_LIST()
},
};
static Property ivshmem_doorbell_properties[] = {
DEFINE_PROP_CHR("chardev", IVShmemState, server_chr),
DEFINE_PROP_UINT32("vectors", IVShmemState, vectors, 1),
DEFINE_PROP_BIT("ioeventfd", IVShmemState, features, IVSHMEM_IOEVENTFD,
true),
DEFINE_PROP_ON_OFF_AUTO("master", IVShmemState, master, ON_OFF_AUTO_OFF),
DEFINE_PROP_END_OF_LIST(),
};
static void ivshmem_doorbell_init(Object *obj)
{
IVShmemState *s = IVSHMEM_DOORBELL(obj);
s->features |= (1 << IVSHMEM_MSI);
}
static void ivshmem_doorbell_realize(PCIDevice *dev, Error **errp)
{
IVShmemState *s = IVSHMEM_COMMON(dev);
if (!qemu_chr_fe_backend_connected(&s->server_chr)) {
error_setg(errp, "You must specify a 'chardev'");
return;
}
ivshmem_common_realize(dev, errp);
}
ivshmem: Split ivshmem-plain, ivshmem-doorbell off ivshmem ivshmem can be configured with and without interrupt capability (a.k.a. "doorbell"). The two configurations have largely disjoint options, which makes for a confusing (and badly checked) user interface. Moreover, the device can't tell the guest whether its doorbell is enabled. Create two new device models ivshmem-plain and ivshmem-doorbell, and deprecate the old one. Changes from ivshmem: * PCI revision is 1 instead of 0. The new revision is fully backwards compatible for guests. Guests may elect to require at least revision 1 to make sure they're not exposed to the funny "no shared memory, yet" state. * Property "role" replaced by "master". role=master becomes master=on, role=peer becomes master=off. Default is off instead of auto. * Property "use64" is gone. The new devices always have 64 bit BARs. Changes from ivshmem to ivshmem-plain: * The Interrupt Pin register in PCI config space is zero (does not use an interrupt pin) instead of one (uses INTA). * Property "x-memdev" is renamed to "memdev". * Properties "shm" and "size" are gone. Use property "memdev" instead. * Property "msi" is gone. The new device can't have MSI-X capability. It can't interrupt anyway. * Properties "ioeventfd" and "vectors" are gone. They're meaningless without interrupts anyway. Changes from ivshmem to ivshmem-doorbell: * Property "msi" is gone. The new device always has MSI-X capability. * Property "ioeventfd" defaults to on instead of off. * Property "size" is gone. The new device can only map all the shared memory received from the server. Guests can easily find out whether the device is configured for interrupts by checking for MSI-X capability. Note: some code added in sub-optimal places to make the diff easier to review. The next commit will move it to more sensible places. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-37-git-send-email-armbru@redhat.com>
2016-03-15 21:34:51 +03:00
static void ivshmem_doorbell_class_init(ObjectClass *klass, void *data)
{
DeviceClass *dc = DEVICE_CLASS(klass);
PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
ivshmem: Split ivshmem-plain, ivshmem-doorbell off ivshmem ivshmem can be configured with and without interrupt capability (a.k.a. "doorbell"). The two configurations have largely disjoint options, which makes for a confusing (and badly checked) user interface. Moreover, the device can't tell the guest whether its doorbell is enabled. Create two new device models ivshmem-plain and ivshmem-doorbell, and deprecate the old one. Changes from ivshmem: * PCI revision is 1 instead of 0. The new revision is fully backwards compatible for guests. Guests may elect to require at least revision 1 to make sure they're not exposed to the funny "no shared memory, yet" state. * Property "role" replaced by "master". role=master becomes master=on, role=peer becomes master=off. Default is off instead of auto. * Property "use64" is gone. The new devices always have 64 bit BARs. Changes from ivshmem to ivshmem-plain: * The Interrupt Pin register in PCI config space is zero (does not use an interrupt pin) instead of one (uses INTA). * Property "x-memdev" is renamed to "memdev". * Properties "shm" and "size" are gone. Use property "memdev" instead. * Property "msi" is gone. The new device can't have MSI-X capability. It can't interrupt anyway. * Properties "ioeventfd" and "vectors" are gone. They're meaningless without interrupts anyway. Changes from ivshmem to ivshmem-doorbell: * Property "msi" is gone. The new device always has MSI-X capability. * Property "ioeventfd" defaults to on instead of off. * Property "size" is gone. The new device can only map all the shared memory received from the server. Guests can easily find out whether the device is configured for interrupts by checking for MSI-X capability. Note: some code added in sub-optimal places to make the diff easier to review. The next commit will move it to more sensible places. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-37-git-send-email-armbru@redhat.com>
2016-03-15 21:34:51 +03:00
k->realize = ivshmem_doorbell_realize;
device_class_set_props(dc, ivshmem_doorbell_properties);
ivshmem: Split ivshmem-plain, ivshmem-doorbell off ivshmem ivshmem can be configured with and without interrupt capability (a.k.a. "doorbell"). The two configurations have largely disjoint options, which makes for a confusing (and badly checked) user interface. Moreover, the device can't tell the guest whether its doorbell is enabled. Create two new device models ivshmem-plain and ivshmem-doorbell, and deprecate the old one. Changes from ivshmem: * PCI revision is 1 instead of 0. The new revision is fully backwards compatible for guests. Guests may elect to require at least revision 1 to make sure they're not exposed to the funny "no shared memory, yet" state. * Property "role" replaced by "master". role=master becomes master=on, role=peer becomes master=off. Default is off instead of auto. * Property "use64" is gone. The new devices always have 64 bit BARs. Changes from ivshmem to ivshmem-plain: * The Interrupt Pin register in PCI config space is zero (does not use an interrupt pin) instead of one (uses INTA). * Property "x-memdev" is renamed to "memdev". * Properties "shm" and "size" are gone. Use property "memdev" instead. * Property "msi" is gone. The new device can't have MSI-X capability. It can't interrupt anyway. * Properties "ioeventfd" and "vectors" are gone. They're meaningless without interrupts anyway. Changes from ivshmem to ivshmem-doorbell: * Property "msi" is gone. The new device always has MSI-X capability. * Property "ioeventfd" defaults to on instead of off. * Property "size" is gone. The new device can only map all the shared memory received from the server. Guests can easily find out whether the device is configured for interrupts by checking for MSI-X capability. Note: some code added in sub-optimal places to make the diff easier to review. The next commit will move it to more sensible places. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-37-git-send-email-armbru@redhat.com>
2016-03-15 21:34:51 +03:00
dc->vmsd = &ivshmem_doorbell_vmsd;
}
static const TypeInfo ivshmem_doorbell_info = {
.name = TYPE_IVSHMEM_DOORBELL,
.parent = TYPE_IVSHMEM_COMMON,
.instance_size = sizeof(IVShmemState),
.instance_init = ivshmem_doorbell_init,
.class_init = ivshmem_doorbell_class_init,
};
static void ivshmem_register_types(void)
{
ivshmem: Split ivshmem-plain, ivshmem-doorbell off ivshmem ivshmem can be configured with and without interrupt capability (a.k.a. "doorbell"). The two configurations have largely disjoint options, which makes for a confusing (and badly checked) user interface. Moreover, the device can't tell the guest whether its doorbell is enabled. Create two new device models ivshmem-plain and ivshmem-doorbell, and deprecate the old one. Changes from ivshmem: * PCI revision is 1 instead of 0. The new revision is fully backwards compatible for guests. Guests may elect to require at least revision 1 to make sure they're not exposed to the funny "no shared memory, yet" state. * Property "role" replaced by "master". role=master becomes master=on, role=peer becomes master=off. Default is off instead of auto. * Property "use64" is gone. The new devices always have 64 bit BARs. Changes from ivshmem to ivshmem-plain: * The Interrupt Pin register in PCI config space is zero (does not use an interrupt pin) instead of one (uses INTA). * Property "x-memdev" is renamed to "memdev". * Properties "shm" and "size" are gone. Use property "memdev" instead. * Property "msi" is gone. The new device can't have MSI-X capability. It can't interrupt anyway. * Properties "ioeventfd" and "vectors" are gone. They're meaningless without interrupts anyway. Changes from ivshmem to ivshmem-doorbell: * Property "msi" is gone. The new device always has MSI-X capability. * Property "ioeventfd" defaults to on instead of off. * Property "size" is gone. The new device can only map all the shared memory received from the server. Guests can easily find out whether the device is configured for interrupts by checking for MSI-X capability. Note: some code added in sub-optimal places to make the diff easier to review. The next commit will move it to more sensible places. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com> Message-Id: <1458066895-20632-37-git-send-email-armbru@redhat.com>
2016-03-15 21:34:51 +03:00
type_register_static(&ivshmem_common_info);
type_register_static(&ivshmem_plain_info);
type_register_static(&ivshmem_doorbell_info);
}
type_init(ivshmem_register_types)