Currently macvtap based macvlan device is working in promiscuous
mode, we want to implement mac-programming over macvtap through
Libvirt for better performance.
Design:
QEMU notifies Libvirt when rx-filter config is changed in guest,
then Libvirt query the rx-filter information by a monitor command,
and sync the change to macvtap device. Related rx-filter config
of the nic contains main mac, rx-mode items and vlan table.
This patch adds a QMP event to notify management of rx-filter change,
and adds a monitor command for management to query rx-filter
information.
Test:
If we repeatedly add/remove vlan, and change macaddr of vlan
interfaces in guest by a loop script.
Result:
The events will flood the QMP client(management), management takes
too much resource to process the events.
Event_throttle API (set rate to 1 ms) can avoid the events to flood
QMP client, but it could cause an unexpected delay (~1ms), guests
guests normally expect rx-filter updates immediately.
So we use a flag for each nic to avoid events flooding, the event
is emitted once until the query command is executed. The flag
implementation could not introduce unexpected delay.
There maybe exist an uncontrollable delay if we let Libvirt do the
real change, guests normally expect rx-filter updates immediately.
But it's another separate issue, we can investigate it when the
work in Libvirt side is done.
Michael S. Tsirkin: tweaked to enable events on start
Michael S. Tsirkin: fixed not to crash when no id
Michael S. Tsirkin: fold in patch:
"additional fixes for mac-programming feature"
Amos Kong: always notify QMP client if mactable is changed
Amos Kong: return NULL list if no net client supports rx-filter query
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Amos Kong <akong@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Fix for LP#1187529: Devices on PCI bridge stop working when
live-migrated. Update bridge mappings for all PCI bridge
devices in get_pci_config_device().
Signed-off-by: Don Koch <dkoch@verizon.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The host_buses list is an odd structure - a list of pointers to PCI root
buses existing in parallel to the normal qdev tree structure. This patch
removes it, instead putting the link pointers into the PCIHostState
structure, which have a 1:1 relationship to PCIHostBus structures anyway.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
There are now no users of the domain field of PCIHostBus, so remove it
from the structure, and as a parameter from the pci_host_bus_register()
function which sets it.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently pci_find_primary_bus() searches the list of root buses for one
with domain 0. But since host buses are always registered with domain 0,
this just amounts to finding the only PCI host bus. The only remaining
users of pci_find_primary_bus() are in pci-hotplug-old.c, which implements
the old style pci_add/pci_del commands.
Therefore, this patch redefines pci_find_primary_bus() to find the only
PCI root bus, returning an error if there are multiple roots. The callers
in pci-hotplug-old.c are updated correspondingly, to produce sensible
error messages.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
At present, pci_nic_init() and pci_nic_init_nofail() assume that they will
only create a NIC under the primary PCI root. As we add support for
multiple PCI roots, that may no longer be the case. This patch adds a root
bus parameter to pci_nic_init() (and updates callers accordingly) to allow
the machine init code using it to specify the right PCI root for NICs
created by old-style -net nic parameters. NICs created new-style, with
-device can of course be put anywhere.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
pci_get_bus_devfn() interprets a full PCI address string to give a PCIBus *
and device/function number within that bus. Currently it assumes it is
working on an address under the primary PCI root bus. This patch extends
it to allow the caller to specify a root bus. This might seem a little odd
since the supplied address can (theoretically) include a PCI domain number.
However, attempting to use a non-zero domain number there is currently an
error, so that shouldn't really cause problems.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
pci_find_domain() is used in a number of places where we want an id for a
whole PCI domain (i.e. the subtree under a PCI root bus). The trouble is
that many platforms may support multiple independent host bridges with no
hardware supplied notion of domain number.
This patch, therefore, replaces calls to pci_find_domain() with calls to
a new pci_root_bus_path() returning a string. The new call is implemented
in terms of a new callback in the host bridge class, so it can be defined
in some way that's well defined for the platform. When no callback is
available we fall back on the qbus name.
Most current uses of pci_find_domain() are for error or informational
messages, so the change in identifiers should be harmless. The exception
is pci_get_dev_path(), whose results form part of migration streams. To
maintain compatibility with old migration streams, the PIIX PCI host is
altered to always supply "0000" for this path, which matches the old domain
number (since the code didn't actually support domains other than 0).
For the pseries (spapr) PCI bridge we use a different platform-unique
identifier (pseries machines can routinely have dozens of PCI host
bridges). Theoretically that breaks migration streams, but given that we
don't yet have migration support for pseries, it doesn't matter.
Any other machines that have working migration support including PCI
devices will need to be updated to maintain migration stream compatibility.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently pci_find_domain() performs two functions - it locates the PCI
root bus above the given bus, then looks up that root bus's domain number.
This patch adds a helper function to perform the first task, finding the
root bus for a given PCI device. This is then used in pci_find_domain().
This changes pci_find_domain()'s signature slightly, taking a PCIDevice
instead of a PCIBus - since all callers passed something of the form
dev->bus, this simplifies things slightly.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
pci_find_root_bus() takes a domain parameter. Currently PCI root buses
with domain other than 0 can't be created, so this is more or less a long
winded way of retrieving the main PCI root bus. Numbered domains don't
actually properly cover the (non x86) possibilities for multiple PCI root
buses, so this patch for now enforces the domain == 0 restriction in other
places to replace pci_find_root_bus() with an explicit
pci_find_primary_bus().
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
pci_read_devaddr() is only used by the legacy functions for the old PCI
hotplug interface in pci-hotplug-old.c. So we move the function there,
and make it static.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
pci-hotplug.c and the CONFIG_PCI_HOTPLUG variable which controls its
compilation are misnamed. They're not about PCI hotplug in general, but
rather about the pci_add/pci_del interface which are now deprecated in
favour of the more general device_add/device_del interface. This patch
therefore renames them to pci-hotplug-old.c and CONFIG_PCI_HOTPLUG_OLD.
CONFIG_PCI_HOTPLUG=y was listed twice in {i386,x86_64}-softmmu.make for no
particular reason, so we clean that up too. In addition it was included in
ppc64-softmmu.mak for which the old hotplug interface was never used and is
unsuitable, so we remove that too.
Most of pci-hotplug.c was additionaly protected by #ifdef TARGET_I386. The
small piece which wasn't is only called from the pci_add and pci_del hooks
in hmp-commands.hx, which themselves were protected by #ifdef TARGET_I386.
This patch therefore also removes the #ifdef from pci-hotplug-old.c,
and changes the ifdefs in hmp-commands.hx to use CONFIG_PCI_HOTPLUG_OLD.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Avoid use of static variables: PC systems
initialize pvpanic device through pvpanic_init,
so we can simply create the fw_cfg file at that point.
This also makes it possible to skip device
creation completely if fw_cfg is not there, e.g. for xen -
so the ports it reserves are not discoverable by guests.
Also, make pvpanic_init void since callers ignore return
status anyway.
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Paul Durrant <Paul.Durrant@citrix.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Marcelo doesn't maintain kvm anymore,
Paolo is taking over the job.
Update MAINTAINERS to stop flooding Marcelo with mail.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Coverity complains about two overruns in process_tx_desc(). The
complaints are false positives, but we might as well eliminate
them. The problem is that "hdr" is defined as an unsigned int,
but then used to offset an array of size 65536, and another of
size 256 bytes. hdr will actually never be greater than 255
though, as it's assigned only once and to the value of
tp->hdr_len, which is an uint8_t. This patch simply gets rid of
hdr, replacing it with tp->hdr_len, which makes it consistent
with all other tp member use in the function.
v2:
- also cleanup coding style issues in the touched lines
Signed-off-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Make sure 1.4 calls 1.5, 1.3 calls 1.4 etc.
This way it's enough to add enough new compat hook
in a single place in piix.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Guest currently has to jump through lots of hoops to guess the PCI hole
ranges. It's fragile, and makes us change BIOS each time we add a new
chipset. Let's report the window in a ROM file, to make BIOS do exactly
what QEMU intends.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Sometimes we need to pass ranges around, add a
handy structure for this purpose.
Note: memory.c defines its own concept of AddrRange structure for
working with 128 addresses. It's necessary there for doing range math.
This is not needed for most users: struct Range is
much simpler, and is only used for passing the range around.
Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# By Michael S. Tsirkin (2) and others
# Via Paolo Bonzini
* qemu-kvm/uq/master:
kvmclock: clock should count only if vm is running
pci-assign: remove the duplicate function name in debug message
kvm: skip system call when msi route is unchanged
kvm: zero-initialize KVM_SET_GSI_ROUTING input
kvm: add detail error message when fail to add ioeventfd
Message-id: 1372841072-22265-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
kvmclock should not count while vm is paused, because:
1) if the vm is paused for long periods, timekeeping
math can overflow while converting the (large) clocksource
delta to nanoseconds.
2) Users rely on CLOCK_MONOTONIC to count run time, that is,
time which OS has been in a runnable state (see CLOCK_BOOTTIME).
Change kvmclock driver so as to save clock value when vm transitions
from runnable to stopped state, and to restore clock value from stopped
to runnable transition.
Cc: qemu-stable@nongnu.org
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
While DEBUG() already includes the function name.
Signed-off-by: Wanlong Gao <gaowanlong@cn.fujitsu.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Some guests do a large number of mask/unmask
calls which currently trigger expensive route update
system calls.
Detect that route in unchanged and skip the system call.
Reported-by: "Zhanghaoyu (A)" <haoyu.zhang@huawei.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
kvm_add_routing_entry makes an attempt to
zero-initialize any new routing entry.
However, it fails to initialize padding
within the u field of the structure
kvm_irq_routing_entry.
Other functions like kvm_irqchip_update_msi_route
also fail to initialize the padding field in
kvm_irq_routing_entry.
It's better to just make sure all input is initialized.
Once it is, we can also drop complex field by field assignment and just
do the simple *a = *b to update a route entry.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
I try to hotplug 28 * 8 multiple-function devices to guest with
old host kernel, ioeventfds in host kernel will be exhausted, then
qemu fails to allocate ioeventfds for blk/nic devices.
It's better to add detail error here.
Signed-off-by: Amos Kong <akong@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
# By Alexander Graf (12) and others
# Via Alexander Graf
* agraf/ppc-for-upstream: (32 commits)
PPC: Ignore writes to L2CR
mac-io: Add escc-legacy memory alias region
PPC: Newworld: Add second uninorth control register set
PPC: Newworld: Add uninorth token register
PPC: Add clock-frequency export for Mac machines
PPC: Introduce an alias cache for faster lookups
PPC: Fix GDB read on code area for PPC6xx
PPC: Add dump_mmu() for 6xx
target-ppc: Introduce unrealizefn for PowerPCCPU
booke_ppc: limit booke timer to max when timeout overflow
Graphics: Switch to 800x600x32 as default mode
pseries: Update MAINTAINERS information
target-ppc kvm: save cr register
pseries: Fix compiler warning (conversion of pointer to integral value)
spapr-rtas: add CPU argument to RTAS calls
target-ppc: Change default machine for 64-bit
ppc: do not register IABR SPR twice for 603e
target-ppc: Drop redundant flags assignments from CPU families
mpc8544_guts: Turn qdev initfn into instance_init
mpc8544_guts: QOM'ify
...
Message-id: 1372556709-23868-1-git-send-email-agraf@suse.de
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
# By Cornelia Huck
# Via Cornelia Huck
* cohuck/virtio-ccw-upstr:
virtio-ccw: fix build breakage on windows
Message-id: 1372669523-4039-1-git-send-email-cornelia.huck@de.ibm.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
event_notifier_get_fd() is not available on windows hosts. Fix this by
moving the calls to event_notifier_get_fd() to the kvm code.
Reported-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
The L2CR register contains a number of bits that either impose configuration
which we can't deal with or mean "something is in progress until the bit is
0 again".
Since we don't model the former and we do want to accomodate guests using the
latter semantics, let's just ignore writes to L2CR. That way guests always read
back 0 and are usually happy with that.
Signed-off-by: Alexander Graf <agraf@suse.de>
Mac OS X's debugging serial driver accesses the ESCC through a different
register layout, called "escc-legacy". This layout differs from the normal
escc register layout purely by the location of the respective registers.
This patch adds a memory alias region that takes normal escc registers and
maps them into the escc-legacy register space.
With this patch applied, a Mac OS X guest successfully emits debug output
on the serial port when run with debug parameters set, for example by running:
$ qemu-system-ppc -prom-env -'boot-args=-v debug=0x8 io=0xff serial=0x3' \
-cdrom 10.4.iso -boot d
Signed-off-by: Alexander Graf <agraf@suse.de>
Mac OS X requires a second uninorth register set to be mapped a few
bytes above the first one. Let's just expose it to make it happy.
Signed-off-by: Alexander Graf <agraf@suse.de>
Mac OS X expects the uninorth control register set to contain one
register that always reads back what it writes in. Expose that.
This is just a temporary hack. Eventually, we want to expose the
uninorth (/uni-n in device tree) as a separate QOM device.
Signed-off-by: Alexander Graf <agraf@suse.de>
Support in fwcfg has been around for exposure of the clock-frequency
CPU property. OpenBIOS reads it, we just never exposed it.
Since Mac OS X is very picky about its clock frequency values, let's
just take a known good value and always expose that.
Reported-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Alexander Graf <agraf@suse.de>
When running QEMU with "-cpu ?" we walk through every alias for every
target CPU we know about. This takes several seconds on my very fast
host system.
Let's introduce a class object cache in the alias table. Using that we
don't have to go through the tedious work of finding our target class.
Instead, we can just go directly from the alias name to the target class
pointer.
This patch brings -cpu "?" to reasonable times again.
Before:
real 0m4.716s
After:
real 0m0.025s
Signed-off-by: Alexander Graf <agraf@suse.de>
On PPC 6xx, data and code have separated TLBs. Until now QEMU was only
looking at data TLBs, which is not good when GDB wants to read code.
This patch adds a second call to get_physical_address() with an
ACCESS_CODE type of access when the first call with ACCESS_INT fails.
Signed-off-by: Fabien Chouteau <chouteau@adacore.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
"(qemu) info tlb" is a very useful tool for debugging, so I implemented
the missing 6xx version.
Signed-off-by: Fabien Chouteau <chouteau@adacore.com>
[agraf: fix printfs on hwaddr to PRI]
Signed-off-by: Alexander Graf <agraf@suse.de>
Use it to clean up the opcode table, resolving a former TODO from Jocelyn.
Also switch from malloc() to g_malloc().
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
Limit watchdog and fit timer to maximum timeout value which
qemu timer can support (INT64_MAX). This maximum timeout will be
hundreds of years, so limiting to max timeout is pretty safe.
Signed-off-by: Bharat Bhushan <bharat.bhushan@freescale.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
We have stayed at 800x600x15 as default graphics mode for the last 9 years.
If there ever was a reason to be there, surely nobody remembers it.
However, recently non-Linux PPC guests started to show bad effects on 15 bit
color mode. They do work just fine with 32 bits however.
So let's switch to 32 bit color as the default graphic mode.
Reported-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Alexander Graf <agraf@suse.de>
I'm no longer at IBM, and therefore no long actively working on the pseries
(aka sPAPR) qemu machine type. This patch removes my information in the
MAINTAINERS file.
While we're at it, I've added some extra file patterns for pseries specific
files that weren't included in the existing pattern.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
[agraf: Remove new maintainer addition]
Signed-off-by: Alexander Graf <agraf@suse.de>
This adds a missing code to save CR (condition register) via
kvm_arch_put_registers(). kvm_arch_get_registers() already has it.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
This kind of type cast must use uintptr_t or target_ulong to be portable
for hosts with sizeof(void *) != sizeof(long).
Here the value is assigned to a variable of type target_ulong.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
[agraf: fix compilation on 32bit hosts]
Signed-off-by: Alexander Graf <agraf@suse.de>
RTAS is a hypervisor provided binary blob that a guest loads and
calls into to execute certain functions. It's similar to the
vsyscall page in Linux or the short lived VMCI paravirt interface
from VMware.
The QEMU implementation of the RTAS blob is simply a passthrough
that proxies all RTAS calls to the hypervisor via an hypercall.
While we pass a CPU argument for hypercall handling in QEMU, we
don't pass it for RTAS calls. Since some RTAs calls require
making hypercalls (normally RTAS is implemented as guest code) we
have nasty hacks to allow that.
Add a CPU argument to RTAS call handling so we can more easily
invoke hypercalls just as guest code would.
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Currently, for qemu-system-ppc64, the default machine type is 'mac99'.
The mac99 machine is not being actively maintained, and represents a
bizarre hybrid of components that never actually existed as a real system.
This patch changes the default machine to 'pseries', which is actively
maintained and works well with most modern ppc64 Linux distributions as a
guest.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
[agraf: adjust commit message]
Signed-off-by: Alexander Graf <agraf@suse.de>
IABR SPR is already registered in gen_spr_603(), called from init_proc_603E().
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
Previous code has #define POWERPC_INSNS2_<family> PPC_NONE in some
places for macrofied assignment to insns_flags2 field.
PPC_NONE is defined as zero though and QOM classes are zero-initialized,
so drop any pcc->insns_flags2 = PPC_NONE; assignments.
PPC_NONE itself is still in use in translate.c.
Suggested-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
SysBus can deal with NULL SysBusDeviceClass::init since 4ce5dae.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>