Commit Graph

546 Commits

Author SHA1 Message Date
Victor Kaplansky
5669655aaf vhost-user interrupt management fixes
Since guest_mask_notifier can not be used in vhost-user mode due
to buffering implied by unix control socket, force
use_mask_notifier on virtio devices of vhost-user interfaces, and
send correct callfd to the guest at vhost start.

Using guest_notifier_mask function in vhost-user case may
break interrupt mask paradigm, because mask/unmask is not
really done when returning from guest_notifier_mask call, instead
message is posted in a unix socket, and processed later.

Add an option boolean flag 'use_mask_notifier' to disable the use
of guest_notifier_mask in virtio pci.

Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
Signed-off-by: Victor Kaplansky <victork@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-18 16:13:56 +02:00
Greg Kurz
3154d1e426 vhost-net: revert support of cross-endian vnet headers
Cross-endian is now handled by the core virtio-net code.

This patch reverts:

commit 5be7d9f1b1
	vhost-net: tell tap backend about the vnet endianness

and

commit cf0a628f6e81bfc9b7a944fa0b80c3594836df56
	net: set endianness on all backend devices

Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2016-02-16 12:05:17 +02:00
Greg Kurz
1bfa316ce7 virtio-net: use the backend cross-endian capabilities
When running a fully emulated device in cross-endian conditions, including
a virtio 1.0 device offered to a big endian guest, we need to fix the vnet
headers. This is currently handled by the virtio_net_hdr_swap() function
in the core virtio-net code but it should actually be handled by the net
backend.

With this patch, virtio-net now tries to configure the backend to do the
endian fixing when the device starts (i.e. drivers sets the CONFIG_OK bit).
If the backend cannot support the requested endiannes, we have to fallback
onto virtio_net_hdr_swap(): this is recorded in the needs_vnet_hdr_swap flag,
to be used in the TX and RX paths.

Note that we reset the backend to the default behaviour (guest native
endianness) when the device stops (i.e. device status had CONFIG_OK bit and
driver unsets it). This is needed, with the linux tap backend at least,
otherwise the guest may lose network connectivity if rebooted into a
different endianness.

The current vhost-net code also tries to configure net backends. This will
be no more needed and will be reverted in a subsequent patch.

Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2016-02-16 12:05:17 +02:00
Eric Blake
d7bce9999d qom: Swap 'name' next to visitor in ObjectPropertyAccessor
Similar to the previous patch, it's nice to have all functions
in the tree that involve a visitor and a name for conversion to
or from QAPI to consistently stick the 'name' parameter next
to the Visitor parameter.

Done by manually changing include/qom/object.h and qom/object.c,
then running this Coccinelle script and touching up the fallout
(Coccinelle insisted on adding some trailing whitespace).

    @ rule1 @
    identifier fn;
    typedef Object, Visitor, Error;
    identifier obj, v, opaque, name, errp;
    @@
     void fn
    - (Object *obj, Visitor *v, void *opaque, const char *name,
    + (Object *obj, Visitor *v, const char *name, void *opaque,
       Error **errp) { ... }

    @@
    identifier rule1.fn;
    expression obj, v, opaque, name, errp;
    @@
     fn(obj, v,
    -   opaque, name,
    +   name, opaque,
        errp)

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1454075341-13658-20-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-08 17:29:56 +01:00
Eric Blake
51e72bc1dd qapi: Swap visit_* arguments for consistent 'name' placement
JSON uses "name":value, but many of our visitor interfaces were
called with visit_type_FOO(v, &value, name, errp).  This can be
a bit confusing to have to mentally swap the parameter order to
match JSON order.  It's particularly bad for visit_start_struct(),
where the 'name' parameter is smack in the middle of the
otherwise-related group of 'obj, kind, size' parameters! It's
time to do a global swap of the parameter ordering, so that the
'name' parameter is always immediately after the Visitor argument.

Additional reason in favor of the swap: the existing include/qjson.h
prefers listing 'name' first in json_prop_*(), and I have plans to
unify that file with the qapi visitors; listing 'name' first in
qapi will minimize churn to the (admittedly few) qjson.h clients.

Later patches will then fix docs, object.h, visitor-impl.h, and
those clients to match.

Done by first patching scripts/qapi*.py by hand to make generated
files do what I want, then by running the following Coccinelle
script to affect the rest of the code base:
 $ spatch --sp-file script `git grep -l '\bvisit_' -- '**/*.[ch]'`
I then had to apply some touchups (Coccinelle insisted on TAB
indentation in visitor.h, and botched the signature of
visit_type_enum() by rewriting 'const char *const strings[]' to
the syntactically invalid 'const char*const[] strings').  The
movement of parameters is sufficient to provoke compiler errors
if any callers were missed.

    // Part 1: Swap declaration order
    @@
    type TV, TErr, TObj, T1, T2;
    identifier OBJ, ARG1, ARG2;
    @@
     void visit_start_struct
    -(TV v, TObj OBJ, T1 ARG1, const char *name, T2 ARG2, TErr errp)
    +(TV v, const char *name, TObj OBJ, T1 ARG1, T2 ARG2, TErr errp)
     { ... }

    @@
    type bool, TV, T1;
    identifier ARG1;
    @@
     bool visit_optional
    -(TV v, T1 ARG1, const char *name)
    +(TV v, const char *name, T1 ARG1)
     { ... }

    @@
    type TV, TErr, TObj, T1;
    identifier OBJ, ARG1;
    @@
     void visit_get_next_type
    -(TV v, TObj OBJ, T1 ARG1, const char *name, TErr errp)
    +(TV v, const char *name, TObj OBJ, T1 ARG1, TErr errp)
     { ... }

    @@
    type TV, TErr, TObj, T1, T2;
    identifier OBJ, ARG1, ARG2;
    @@
     void visit_type_enum
    -(TV v, TObj OBJ, T1 ARG1, T2 ARG2, const char *name, TErr errp)
    +(TV v, const char *name, TObj OBJ, T1 ARG1, T2 ARG2, TErr errp)
     { ... }

    @@
    type TV, TErr, TObj;
    identifier OBJ;
    identifier VISIT_TYPE =~ "^visit_type_";
    @@
     void VISIT_TYPE
    -(TV v, TObj OBJ, const char *name, TErr errp)
    +(TV v, const char *name, TObj OBJ, TErr errp)
     { ... }

    // Part 2: swap caller order
    @@
    expression V, NAME, OBJ, ARG1, ARG2, ERR;
    identifier VISIT_TYPE =~ "^visit_type_";
    @@
    (
    -visit_start_struct(V, OBJ, ARG1, NAME, ARG2, ERR)
    +visit_start_struct(V, NAME, OBJ, ARG1, ARG2, ERR)
    |
    -visit_optional(V, ARG1, NAME)
    +visit_optional(V, NAME, ARG1)
    |
    -visit_get_next_type(V, OBJ, ARG1, NAME, ERR)
    +visit_get_next_type(V, NAME, OBJ, ARG1, ERR)
    |
    -visit_type_enum(V, OBJ, ARG1, ARG2, NAME, ERR)
    +visit_type_enum(V, NAME, OBJ, ARG1, ARG2, ERR)
    |
    -VISIT_TYPE(V, OBJ, NAME, ERR)
    +VISIT_TYPE(V, NAME, OBJ, ERR)
    )

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1454075341-13658-19-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-08 17:29:56 +01:00
Peter Maydell
bdad0f3977 pc and misc cleanups and fixes, virtio optimizations
Included here:
 Refactoring and bugfix patches in PC/ACPI.
 New commands for ipmi.
 Virtio optimizations.
 
 Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
 -----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJWtj8KAAoJECgfDbjSjVRpBIQIAJSB9xwTcBLXwD0+8z5lqjKC
 GTtuVbHU0+Y/eO8O3llN5l+SzaRtPHo18Ele20Oz7IQc0ompANY273K6TOlyILwB
 rOhrub71uqpOKbGlxXJflroEAXb78xVK02lohSUvOzCDpwV+6CS4ZaSer7yDCYkA
 MODZj7rrEuN0RmBWqxbs1R7Mj2CeQJzlgTUNTBGCLEstoZGFOJq8FjVdG5P1q8vI
 fnI9mGJ1JsDnmcUZe/bTFfB4VreqeQ7UuGyNAMMGnvIbr0D1a+CoaMdV7/HZ+KyT
 5TIs0siVdhZei60A/Cq2OtSVCbj5QdxPBLhZfwJCp6oU4lh2U5tSvva0mh7MwJ0=
 =D/cA
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging

pc and misc cleanups and fixes, virtio optimizations

Included here:
Refactoring and bugfix patches in PC/ACPI.
New commands for ipmi.
Virtio optimizations.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Sat 06 Feb 2016 18:44:26 GMT using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"

* remotes/mst/tags/for_upstream: (45 commits)
  net: set endianness on all backend devices
  fix MSI injection on Xen
  intel_iommu: large page support
  dimm: Correct type of MemoryHotplugState->base
  pc: set the OEM fields in the RSDT and the FADT from the SLIC
  acpi: add function to extract oem_id and oem_table_id from the user's SLIC
  acpi: expose oem_id and oem_table_id in build_rsdt()
  acpi: take oem_id in build_header(), optionally
  pc: Eliminate PcGuestInfo struct
  pc: Move APIC and NUMA data from PcGuestInfo to PCMachineState
  pc: Move PcGuestInfo.fw_cfg to PCMachineState
  pc: Remove PcGuestInfo.isapc_ram_fw field
  pc: Remove RAM size fields from PcGuestInfo
  pc: Remove compat fields from PcGuestInfo
  acpi: Don't save PcGuestInfo on AcpiBuildState
  acpi: Remove guest_info parameters from functions
  pc: Simplify xen_load_linux() signature
  pc: Simplify pc_memory_init() signature
  pc: Eliminate struct PcGuestInfoState
  pc: Move PcGuestInfo declaration to top of file
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-08 11:25:31 +00:00
Laurent Vivier
a407644079 net: set endianness on all backend devices
commit 5be7d9f1b1
       vhost-net: tell tap backend about the vnet endianness

makes vhost net to set the endianness of the device, but only for
the first device.

In case of multiqueue, we have multiple devices... This patch sets the
endianness for all the devices of the interface.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
2016-02-06 20:44:10 +02:00
Paolo Bonzini
51b19ebe43 virtio: move allocation to virtqueue_pop/vring_pop
The return code of virtqueue_pop/vring_pop is unused except to check for
errors or 0.  We can thus easily move allocation inside the functions
and just return a pointer to the VirtQueueElement.

The advantage is that we will be able to allocate only the space that
is needed for the actual size of the s/g list instead of the full
VIRTQUEUE_MAX_SIZE items.  Currently VirtQueueElement takes about 48K
of memory, and this kind of allocation puts a lot of stress on malloc.
By cutting the size by two or three orders of magnitude, malloc can
use much more efficient algorithms.

The patch is pretty large, but changes to each device are testable
more or less independently.  Splitting it would mostly add churn.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-02-06 20:39:07 +02:00
Laszlo Ersek
dd793a7488 e1000: eliminate infinite loops on out-of-bounds transfer start
The start_xmit() and e1000_receive_iov() functions implement DMA transfers
iterating over a set of descriptors that the guest's e1000 driver
prepares:

- the TDLEN and RDLEN registers store the total size of the descriptor
  area,

- while the TDH and RDH registers store the offset (in whole tx / rx
  descriptors) into the area where the transfer is supposed to start.

Each time a descriptor is processed, the TDH and RDH register is bumped
(as appropriate for the transfer direction).

QEMU already contains logic to deal with bogus transfers submitted by the
guest:

- Normally, the transmit case wants to increase TDH from its initial value
  to TDT. (TDT is allowed to be numerically smaller than the initial TDH
  value; wrapping at or above TDLEN bytes to zero is normal.) The failsafe
  that QEMU currently has here is a check against reaching the original
  TDH value again -- a complete wraparound, which should never happen.

- In the receive case RDH is increased from its initial value until
  "total_size" bytes have been received; preferably in a single step, or
  in "s->rxbuf_size" byte steps, if the latter is smaller. However, null
  RX descriptors are skipped without receiving data, while RDH is
  incremented just the same. QEMU tries to prevent an infinite loop
  (processing only null RX descriptors) by detecting whether RDH assumes
  its original value during the loop. (Again, wrapping from RDLEN to 0 is
  normal.)

What both directions miss is that the guest could program TDLEN and RDLEN
so low, and the initial TDH and RDH so high, that these registers will
immediately be truncated to zero, and then never reassume their initial
values in the loop -- a full wraparound will never occur.

The condition that expresses this is:

  xdh_start >= s->mac_reg[XDLEN] / sizeof(desc)

i.e., TDH or RDH start out after the last whole rx or tx descriptor that
fits into the TDLEN or RDLEN sized area.

This condition could be checked before we enter the loops, but
pci_dma_read() / pci_dma_write() knows how to fill in buffers safely for
bogus DMA addresses, so we just extend the existing failsafes with the
above condition.

This is CVE-2016-1981.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Petr Matousek <pmatouse@redhat.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Prasad Pandit <ppandit@redhat.com>
Cc: Michael Roth <mdroth@linux.vnet.ibm.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: qemu-stable@nongnu.org
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1296044
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-02-04 14:13:11 +08:00
Michael S. Tsirkin
d7f053652f cadence_gem: fix buffer overflow
gem_transmit copies a packet from guest into an tx_packet[2048]
array on stack, with size limited by descriptor length set by guest.  If
guest is malicious and specifies a descriptor length that is too large,
and should packet size exceed array size, this results in a buffer
overflow.

Reported-by: 刘令 <liuling-it@360.cn>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-02-04 13:22:06 +08:00
Prasad J Pandit
244381ec19 net: cadence_gem: check packet size in gem_recieve
While receiving packets in 'gem_receive' routine, if Frame Check
Sequence(FCS) is enabled, it copies the packet into a local
buffer without checking its size. Add check to validate packet
length against the buffer size to avoid buffer overflow.

Reported-by: Ling Liu <liuling-it@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-02-04 13:22:06 +08:00
Peter Maydell
e8d4046559 hw/net: Clean up includes
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1453832250-766-19-git-send-email-peter.maydell@linaro.org
2016-01-29 15:07:23 +00:00
Peter Maydell
9b8bfe21be virtio: Clean up includes
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1453832250-766-15-git-send-email-peter.maydell@linaro.org
2016-01-29 15:07:23 +00:00
Peter Maydell
21cbfe5f37 xen: Clean up includes
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1453832250-766-14-git-send-email-peter.maydell@linaro.org
2016-01-29 15:07:23 +00:00
Peter Maydell
8ef94f0bc9 arm: Clean up includes
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1453832250-766-13-git-send-email-peter.maydell@linaro.org
2016-01-29 15:07:23 +00:00
Peter Maydell
0d75590d91 ppc: Clean up includes
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1453832250-766-6-git-send-email-peter.maydell@linaro.org
2016-01-29 15:07:22 +00:00
Peter Maydell
ea99dde191 lm32: Clean up includes
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1453832250-766-5-git-send-email-peter.maydell@linaro.org
2016-01-29 15:07:22 +00:00
Ian Campbell
c1345a8878 xen: Switch to libxengnttab interface for compat shims.
In Xen 4.7 we are refactoring parts libxenctrl into a number of
separate libraries which will provide backward and forward API and ABI
compatiblity.

One such library will be libxengnttab which provides access to grant
tables.

In preparation for this switch the compatibility layer in xen_common.h
(which support building with older versions of Xen) to use what will
be the new library API. This means that the gnttab shim will disappear
for versions of Xen which include libxengnttab.

To simplify things for the <= 4.0.0 support we wrap the int fd in a
malloc(sizeof int) such that the handle is always a pointer. This
leads to less typedef headaches and the need for
XC_HANDLER_INITIAL_VALUE etc for these interfaces.

Note that this patch does not add any support for actually using
libxengnttab, it just adjusts the existing shims.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2016-01-26 17:19:28 +00:00
Peter Maydell
d341d9f306 fpu: Replace uint8 typedef with uint8_t
Replace the uint8 softfloat-specific typedef with uint8_t.
This change was made with

find include hw fpu target-* -name '*.[ch]' | xargs sed -i -e 's/\buint8\b/uint8_t/g'

together with manual removal of the typedef definition and
manual fixing of more erroneous uses found via test compilation.

It turns out that the only code using this type is an accidental
use where uint8_t was intended anyway...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Acked-by: Leon Alrae <leon.alrae@imgtec.com>
Acked-by: James Hogan <james.hogan@imgtec.com>
Message-id: 1452603315-27030-7-git-send-email-peter.maydell@linaro.org
2016-01-22 15:09:21 +00:00
Peter Maydell
3a87d00910 fpu: Replace uint32 typedef with uint32_t
Replace the uint32 softfloat-specific typedef with uint32_t.
This change was made with

find include hw fpu target-* -name '*.[ch]' | xargs sed -i -e 's/\buint32\b/uint32_t/g'

together with manual removal of the typedef definition,
manual undoing of various mis-hits, and another couple of
fixes found via test compilation.

All the uses in hw/ were using the wrong type by mistake.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Acked-by: Leon Alrae <leon.alrae@imgtec.com>
Acked-by: James Hogan <james.hogan@imgtec.com>
Message-id: 1452603315-27030-5-git-send-email-peter.maydell@linaro.org
2016-01-22 15:09:21 +00:00
Markus Armbruster
5a8de107e3 etraxfs_eth: Don't use hw_error() in init() method
Device init() methods aren't supposed to call hw_error(), they should
report the error and fail cleanly.  Do that.

Cc: "Edgar E. Iglesias" <edgar.iglesias@gmail.com>
Signed-off-by: Markus Armbruster <armbru@pond.sub.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-Id: <1450370121-5768-5-git-send-email-armbru@redhat.com>
2016-01-13 11:58:58 +01:00
Dr. David Alan Gilbert
9c7ffe2664 ether/slirp: Avoid redefinition of the same constants
eth.h and slirp.h both define ETH_ALEN and ETH_P_IP
rtl8139.c and eth.h both define ETH_HLEN

Move the related constant (ETH_P_ARP) from slirp.h to eth.h, and
remove the duplicates; make slirp.h include eth.h

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-01-11 11:01:35 +08:00
Prasad J Pandit
aa7f9966df net: ne2000: fix bounds check in ioport operations
While doing ioport r/w operations, ne2000 device emulation suffers
from OOB r/w errors. Update respective array bounds check to avoid
OOB access.

Reported-by: Ling Liu <liuling-it@360.cn>
Cc: qemu-stable@nongnu.org
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-01-11 11:01:35 +08:00
Prasad J Pandit
007cd223de net: rocker: fix an incorrect array bounds check
While processing transmit(tx) descriptors in 'tx_consume' routine
the switch emulator suffers from an off-by-one error, if a
descriptor was to have more than allowed(ROCKER_TX_FRAGS_MAX=16)
fragments. Fix an incorrect bounds check to avoid it.

Reported-by: Qinghao Tang <luodalongde@gmail.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-01-11 11:01:35 +08:00
Shmulik Ladkani
7d6d347d06 vmxnet3: Introduce 'x-disable-pcie' back-compat property
Following the previous patch which changed vmxnet3 to be a pci express
device, this patch introduces a boolean property 'x-disable-pcie' whose
default is false.

Setting 'x-disable-pcie' to 'on' preserves the old 'pci device' (non
express) behavior. This allows migration to older versions.

Signed-off-by: Shmulik Ladkani <shmulik.ladkani@ravellosystems.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-01-11 11:01:35 +08:00
Shmulik Ladkani
3509866ab3 vmxnet3: Report the Device Serial Number capability
Report the DSN extended PCI capability at 0x100.
DSN value is a transformation of device MAC address, as calculated
by VMware virtual hardware.

DSN is reported only if device is pcie.

Signed-off-by: Shmulik Ladkani <shmulik.ladkani@ravellosystems.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-01-11 11:01:35 +08:00
Shmulik Ladkani
f713d4d2f1 vmxnet3: The vmxnet3 device is a PCIE endpoint
Report the 'express endpoint' capability if on a PCIE bus.

Signed-off-by: Shmulik Ladkani <shmulik.ladkani@ravellosystems.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-01-11 11:01:35 +08:00
Shmulik Ladkani
b79f17a9bc vmxnet3: coding: Introduce VMXNET3Class
Introduce a class type for vmxnet3, and the usual
DEVICE_CLASS/DEVICE_GET_CLASS macros.

No semantic change.

Signed-off-by: Shmulik Ladkani <shmulik.ladkani@ravellosystems.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-01-11 11:01:34 +08:00
Shmulik Ladkani
b22e0aef46 vmxnet3: Introduce 'x-old-msi-offsets' back-compat property
Following the previous patches, where vmxnet3's pci's msi/msix
capability offsets and msix's PBA table offsets have been changed, this
patch introduces a boolean property 'x-old-msi-offsets' to vmxnet3,
whose default is false.

Setting 'x-old-msi-offsets' to 'on' preserves the old offsets behavior,
which allows migration to older versions.

Signed-off-by: Shmulik Ladkani <shmulik.ladkani@ravellosystems.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-01-11 11:01:34 +08:00
Shmulik Ladkani
9c087a0504 vmxnet3: Change the offset of the MSIX PBA table
Place the PBA table at 0x1000, as placed by VMware virtual hardware.

Signed-off-by: Shmulik Ladkani <shmulik.ladkani@ravellosystems.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-01-11 11:01:34 +08:00
Shmulik Ladkani
f9262dae13 vmxnet3: Change offsets of msi/msix pci capabilities
Place device reported PCI capabilities at the same offsets as placed by
the VMware virtual hardware: MSI at [84], MSI-X at [9c].

Signed-off-by: Shmulik Ladkani <shmulik.ladkani@ravellosystems.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-01-11 11:01:34 +08:00
Miao Yan
c12d82ef15 net/vmxnet3: rename VMXNET3_DEVICE_VERSION to VMXNET3_UPT_REVISION
VMXNET3_DEVICE_VERSION is used as return value for accessing
UPT Revision Report and Selection register. So rename it
to VMXNET3_UPT_REVISION.

Signed-off-by: Miao Yan <yanmiaoebest@gmail.com>
Reviewed-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-01-11 11:01:34 +08:00
Miao Yan
8856be1512 net/vmxnet3: return 0 on unknown command
Return 0 on unknown command, this is what esxi (5.x+) behaves.

Signed-off-by: Miao Yan <yanmiaobest@gmail.com>
Reviewed-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-01-11 11:01:34 +08:00
Miao Yan
5ae3e91c35 net/vmxnet3: return correct value for VMXNET3_CMD_GET_DEV_EXTRA_INFO
VMXNET3_CMD_GET_DEV_EXTRA_INFO should return 0 for emulation
mode

This behavior can be observed by the following steps:

1) run a Linux distro on esxi server (5.x+)
2) modify vmxnet3 Linux driver to read the register:

  VMXNET3_WRITE_BAR1_REG(adapter, VMXNET3_REG_CMD, VMXNET3_CMD_GET_DEV_EXTRA_INFO);
  ret =  VMXNET3_READ_BAR1_REG(adapter, VMXNET3_REG_CMD);
  pr_info("vmxnet3 dev_info: 0x%x\n", ret);

The kernel log will have some like the following message:

  [ 7005.111170] vmxnet3 dev_info: 0x0

Signed-off-by: Miao Yan <yanmiaobest@gmail.com>
Reviewed-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-01-11 11:01:34 +08:00
Miao Yan
c469669ef7 net/vmxnet3: return correct value for VMXNET3_CMD_GET_DID_* command
VMXNET3_CMD_GET_DID_LO should return PCI ID of the device
and VMXNET3_CMD_GET_DID_HI should return vmxnet3 revision ID.

This behavior can be observed by the following steps:

1) run a Linux distro on esxi server (5.x+)
2) modify vmxnet3 Linux driver to read DID_HI and DID_LO:

  VMXNET3_WRITE_BAR1_REG(adapter, VMXNET3_REG_CMD, VMXNET3_CMD_GET_DID_LO);
  lo =  VMXNET3_READ_BAR1_REG(adapter, VMXNET3_REG_CMD);

  VMXNET3_WRITE_BAR1_REG(adapter, VMXNET3_REG_CMD, VMXNET3_CMD_GET_DID_HI);
  high =  VMXNET3_READ_BAR1_REG(adapter, VMXNET3_REG_CMD);
  pr_info("vmxnet3 DID lo: 0x%x, high: 0x%x\n", lo, high);

The kernel log will have something like the following message:

  [ 7005.111170] vmxnet3 DID lo: 0x7b0, high: 0x1

Signed-off-by: Miao Yan <yanmiaobest@gmail.com>
Reviewed-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-01-11 11:01:34 +08:00
Miao Yan
fde58177aa net/vmxnet3: return 1 on device activation failure
When reading device status, 0 means device is successfully
activated and 1 means error.

This behavior can be observed by the following steps:

1) run a Linux distro on esxi server (5.5+)
2) modify vmxnet3 Linux driver to give it an invalid
   address to 'adapter->shared_pa' which is the
   shared memory for guest/host communication

This will trigger device activation failure and kernel
log will have the following message:

   [ 7138.403256] vmxnet3 0000:03:00.0 eth1: Failed to activate dev: error 1

So return 1 on device activation failure instead of -1;

Signed-off-by: Miao Yan <yanmiaobest@gmail.com>
Reviewed-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-01-11 11:01:34 +08:00
P J P
aa4a3dce1c net: vmxnet3: avoid memory leakage in activate_device
Vmxnet3 device emulator does not check if the device is active
before activating it, also it did not free the transmit & receive
buffers while deactivating the device, thus resulting in memory
leakage on the host. This patch fixes both these issues to avoid
host memory leakage.

Reported-by: Qinghao Tang <luodalongde@gmail.com>
Reviewed-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-01-11 11:01:34 +08:00
Miao Yan
dd3c168471 net/vmxnet3: remove redundant VMW_SHPRN(...) definition
Macro VMW_SHPRN(...) is already defined vmxnet3_debug.h,
so remove the duplication

Signed-off-by: Miao Yan <yanmiaobest@gmail.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-01-11 11:01:34 +08:00
Miao Yan
71c2f5b9b3 net/vmxnet3: fix debug macro pattern for vmxnet3
Vmxnet3 uses the following debug macro style:

 #ifdef SOME_DEBUG
 #  define debug(...) do{ printf(...); } while (0)
 # else
 #  define debug(...) do{ } while (0)
 #endif

If SOME_DEBUG is undefined, then format string inside the
debug macro will never be checked by compiler. Code is
likely to break in the future when SOME_DEBUG is enabled
 because of lack of testing. This patch changes this
to the following:

 #define debug(...) \
  do { if (SOME_DEBUG_ENABLED) printf(...); } while (0)

Signed-off-by: Miao Yan <yanmiaobest@gmail.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-01-11 11:01:34 +08:00
Miao Yan
2e4ca7dbc1 net/vmxnet3: use %zu for size_t in printf
Use %zu specifier for size_t in printf, otherwise build would fail
on platforms where size_t is not unsigned long

Signed-off-by: Miao Yan <yanmiaobest@gmail.com>
Reviewed-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-01-11 11:01:34 +08:00
Miao Yan
938cdfefee net/vmxnet3: fix a build error when enabling debug output
Macro MAC_FMT and MAC_ARG are not defined, but used in vmxnet3_net_init().
This will cause build error when debug level is raised in
vmxnet3_debug.h (enable all VMXNET3_DEBUG_xxx).

Use VMXNET_MF and VXMNET_MA instead.

Signed-off-by: Miao Yan <yanmiaobest@gmail.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-01-11 11:01:32 +08:00
Peter Maydell
84942979de -----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
 
 iQEcBAABAgAGBQJWZZJPAAoJEO8Ells5jWIRmp0H/26aFXVEgZykkUVNbqq05r7w
 AI7podQlFOAESJHqZtR8FMaH8TAZ5GhphP4pn0PsWp54VjwcYZbdoME+dhZ4Elyc
 WDanRHIweLv/zVg6+M8oHhw5GMaxtFLoLWrf0oanbUW9IZZmmM3COz/Y31hSVrR2
 EzEJi1VZZhpMj3ibeOJns4MrugYrne8MtOdvusE/Uw2rJBTiStnWw1eTk8RmkNcg
 5un1mQZxFU2AcNzmWdmWJmjY0rCnR3HhtTdZOwjM6uZGIJ9hbsItGzqiGadBfozI
 fUtIa2HZahioe0VIzoB0snXnAuhV1jA0Uy18i04dPvgQOmiVSRjQNE2/lwQflyE=
 =Pad3
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging

# gpg: Signature made Mon 07 Dec 2015 14:06:07 GMT using RSA key ID 398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F  3562 EF04 965B 398D 6211

* remotes/jasowang/tags/net-pull-request:
  lan9118: log and ignore access to invalid registers, rather than aborting
  lan9118: fix emulation of MAC address loaded bit in E2P_CMD register
  vmxnet3: silence warning
  pcnet: fix rx buffer overflow(CVE-2015-7512)
  net: pcnet: add check to validate receive data size(CVE-2015-7504)
  e1000: fix hang of win2k12 shutdown with flood ping

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-12-07 14:18:31 +00:00
Andrew Baumann
52b4bb7383 lan9118: log and ignore access to invalid registers, rather than aborting
With this change, access to invalid/unimplemented device registers are
logged as a "guest error" rather than aborting qemu with
hw_error. This enables drivers for similar devices (e.g. SMSC 9221),
by simply ignoring the unimplemented writes. It's also closer to what
real hardware does.

Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2015-12-07 21:43:48 +08:00
Andrew Baumann
12fdd928c8 lan9118: fix emulation of MAC address loaded bit in E2P_CMD register
There appears to have been a longstanding typo in the implementation
of the "MAC address loaded" bit in the E2P_CMD (EEPROM command)
register. The code was using 0x10, but the controller spec says it
should be bit 8 (0x100).

Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2015-12-07 21:43:48 +08:00
Michael S. Tsirkin
6a9c647095 vmxnet3: silence warning
vmxnet3 always produces a warning under qtest.

This is not a user error, don't warn.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2015-12-07 21:43:48 +08:00
Jason Wang
8b98a2f071 pcnet: fix rx buffer overflow(CVE-2015-7512)
Backends could provide a packet whose length is greater than buffer
size. Check for this and truncate the packet to avoid rx buffer
overflow in this case.

Cc: Prasad J Pandit <pjp@fedoraproject.org>
Cc: qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2015-12-07 21:43:48 +08:00
Prasad J Pandit
837f21aacf net: pcnet: add check to validate receive data size(CVE-2015-7504)
In loopback mode, pcnet_receive routine appends CRC code to the
receive buffer. If the data size given is same as the buffer size,
the appended CRC code overwrites 4 bytes after s->buffer. Added a
check to avoid that.

Reported by: Qinghao Tang <luodalongde@gmail.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2015-12-07 21:43:48 +08:00
Denis V. Lunev
9596ef7c7b e1000: fix hang of win2k12 shutdown with flood ping
e1000 driver in Win2k12 is really well rotten. It 100% hangs on shutdown
of UP VM under flood ping. The guest checks card state and reinjects
itself interrupt in a loop. This is fatal for UP machine.

There is no good way to fix this misbehavior but to kludge it. The
emulation has interrupt throttling register aka ITR which limits
interrupt rate and allows the guest to proceed this phase.
There is no problem with this kludge for Linux guests - it adjust the
value of it itself.

On the other hand according to the initial research in
    commit e9845f0985
    Author: Vincenzo Maffione <v.maffione@gmail.com>
    Date:   Fri Aug 2 18:30:52 2013 +0200

    e1000: add interrupt mitigation support

    ...

    Interrupt mitigation boosts performance when the guest suffers from
    an high interrupt rate (i.e. receiving short UDP packets at high packet
    rate). For some numerical results see the following link
    http://info.iet.unipi.it/~luigi/papers/20130520-rizzo-vm.pdf

this should also boost performance a bit.

See https://bugzilla.redhat.com/show_bug.cgi?id=874406 for additional
details.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Vincenzo Maffione <v.maffione@gmail.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2015-12-07 21:43:43 +08:00
Stefan Weil
00837731d2 eepro100: Prevent two endless loops
http://lists.nongnu.org/archive/html/qemu-devel/2015-11/msg04592.html
shows an example how an endless loop in function action_command can
be achieved.

During my code review, I noticed a 2nd case which can result in an
endless loop.

Reported-by: Qinghao Tang <luodalongde@gmail.com>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2015-11-27 10:39:55 +08:00
Michael S. Tsirkin
72018d1e19 vhost-user: ignore qemu-only features
Some features (such as ctrl vq) are supported
by qemu without need to communicate with the
backend.

Drop them from the feature mask so we set them
unconditionally.

Reported-by: Victor Kaplansky <vkaplans@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-11-18 18:49:12 +02:00
Yuanhan Liu
12b8cbac3c vhost: don't send RESET_OWNER at stop
First of all, RESET_OWNER message is sent incorrectly, as it's sent
before GET_VRING_BASE. And the reset message would let the later call
get nothing correct.

And, sending SET_VRING_ENABLE at stop, which has already been done,
makes more sense than RESET_OWNER.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-11-16 12:02:54 +02:00
Peter Maydell
17e50a72a3 -----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
 
 iQEcBAABAgAGBQJWREdzAAoJEO8Ells5jWIRI18H/0CEDVwj7AJHLEpAv07hX2iS
 jfq6Osgj5hDChv43+66Clz3owog3m9NfPKWxBMvIw5c/Q1mFvNuxZcUaVOzX2dT4
 E+IwIsZxXOANIGPYtCxOhARz1zNSDxJxgYPMVuIDZ+uZVJqYeCjdduMGzgy8wt8H
 qiquUCI2sktg97AntZqzp8iWfZZIN5w6uNbf3FvgwIffWDxGRPt8wY6dlwgIpsx2
 uFd9PMwtj7lJyV9guy36FdrS7MhVTCF5/5GIerPj2nN1ByJp9vu5InzPAlmZNRSZ
 KxKcBnmkLsnT3nDN86ZS6ajDyjeEgWSVdrQS9MHDURfinADuuqjbJkhME/UhG+g=
 =vRNP
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging

# gpg: Signature made Thu 12 Nov 2015 08:01:55 GMT using RSA key ID 398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F  3562 EF04 965B 398D 6211

* remotes/jasowang/tags/net-pull-request:
  net: netmap: use error_setg() helpers in place of error_report()
  net: netmap: Fix compilation issue
  e1000: Introducing backward compatibility command line parameter
  e1000: Implementing various counters
  e1000: Fixing the packet address filtering procedure
  e1000: Fixing the received/transmitted octets' counters
  e1000: Fixing the received/transmitted packets' counters
  e1000: Trivial implementation of various MAC registers
  e1000: Introduced an array to control the access to the MAC registers
  e1000: Add support for migrating the entire MAC registers' array
  e1000: Cosmetic and alignment fixes
  slirp: Fix type casts and format strings in debug code

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-11-12 14:15:32 +00:00
Leonid Bloch
ba63ec8594 e1000: Introducing backward compatibility command line parameter
This follows the previous patches, where support for migrating the
entire MAC registers' array, and some new MAC registers were introduced.

This patch introduces the e1000-specific boolean parameter
"extra_mac_registers", which is on by default. Setting it to off will
enable migration to older versions of QEMU, but will disable the read
and write access to the new registers, that were introduced since adding
the ability to migrate the entire MAC array.

Example for usage to enable backward compatibility and to disable the
new MAC registers:

    qemu-system-x86_64 -device e1000,extra_mac_registers=off,... ...

As mentioned above, the default value is "on".

Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2015-11-12 15:26:54 +08:00
Leonid Bloch
3b27430177 e1000: Implementing various counters
This implements the following Statistic registers (various counters)
according to Intel's specs:

TSCTC  GOTCL  GOTCH  GORCL  GORCH  MPRC   BPRC   RUC    ROC
BPTC   MPTC   PTC... PRC...

PLEASE NOTE: these registers will not be active, nor will migrate, until
a compatibility flag will be set (in the next patch in this series).

Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2015-11-12 15:26:54 +08:00
Leonid Bloch
4aeea330f0 e1000: Fixing the packet address filtering procedure
Previously, if promiscuous unicast was enabled, a packet was received
straight away, even if it was a multicast or a broadcast packet. This
patch fixes that behavior, while making the filtering procedure a bit
more human-readable.

Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2015-11-12 15:26:54 +08:00
Leonid Bloch
45e9376471 e1000: Fixing the received/transmitted octets' counters
Previously, these 64-bit registers did not stick at their maximal
values when (and if) they reached them, as they should do, according to
the specs.

This patch introduces a function that takes care of such registers,
avoiding code duplication, making the relevant parts more compatible
with the QEMU coding style, while ensuring that in the unlikely case
of reaching the maximal value, the counter will stick there, as it
supposed to.

Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2015-11-12 15:26:53 +08:00
Leonid Bloch
1f67f92c4f e1000: Fixing the received/transmitted packets' counters
According to Intel's specs, these counters (as the other Statistic
registers) stick at 0xffffffff when this maximal value is reached.
Previously, they would reset after the max. value.

Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2015-11-12 15:26:53 +08:00
Leonid Bloch
72ea771c97 e1000: Trivial implementation of various MAC registers
These registers appear in Intel's specs, but were not implemented.
These registers are now implemented trivially, i.e. they are initiated
with zero values, and if they are RW, they can be written or read by the
driver, or read only if they are R (essentially retaining their zero
values). For these registers no other procedures are performed.

For the trivially implemented Diagnostic registers, a debug warning is
produced on read/write attempts.

PLEASE NOTE: these registers will not be active, nor will migrate, until
a compatibility flag will be set (in a later patch in this series).

The registers implemented here are:

Transmit:
RW: AIT

Management:
RW: WUC     WUS     IPAV    IP6AT*  IP4AT*  FFLT*   WUPM*   FFMT*   FFVT*

Diagnostic:
RW: RDFH    RDFT    RDFHS   RDFTS   RDFPC   PBM*    TDFH    TDFT    TDFHS
    TDFTS   TDFPC

Statistic:
RW: FCRUC
R:  RNBC    TSCTFC  MGTPRC  MGTPDC  MGTPTC  RFC     RJC     SCC     ECOL
    LATECOL MCC     COLC    DC      TNCRS   SEC     CEXTERR RLEC    XONRXC
    XONTXC  XOFFRXC XOFFTXC

Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2015-11-12 15:26:53 +08:00
Leonid Bloch
bc0f0674f0 e1000: Introduced an array to control the access to the MAC registers
The array of uint8_t's which is introduced here, contains access metadata
about the MAC registers: if a register is accessible, but partly implemented,
or if a register requires a certain compatibility flag in order to be
accessed. Currently, 6 hypothetical flags are supported (3 exist for e1000
so far) but in the future, if more than 6 flags will be needed, the datatype
of this array can simply be swapped for a larger one.

This patch is intended to solve the following current problems:

1) In a scenario of migration between different versions of QEMU, which
differ by the MAC registers implemented in them, some registers need not to
be active if a compatibility flag is set, in order to preserve the machine's
state perfectly for the older version. Checking this for each register
individually, would create a lot of clutter in the code.

2) Some registers are (or may be) only partly implemented (e.g.
placeholders that allow reading and writing, but lack other functions).
In such cases it is better to print a debug warning on read/write attempts.
As above, dealing with this functionality on a per-register level, would
require longer and more messy code.

Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2015-11-12 15:26:52 +08:00
Leonid Bloch
9e11773417 e1000: Add support for migrating the entire MAC registers' array
This patch makes the migration of the entire array of MAC registers
possible during live migration. The entire array is just 128 KB long, so
practically no penalty should be felt when transmitting it, additionally
to the previously transmitted individual registers. The advantage here is
eliminating the need to introduce new vmstate subsections in the future,
when additional MAC registers will be implemented.

Backward compatibility is preserved by introducing a e1000-specific
boolean parameter (in a later patch), which will be on by default.
Setting it to off would enable migration to older versions of QEMU.

Additionally, this parameter will be used to control the access to the
extra MAC registers in the future.

Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2015-11-12 15:26:52 +08:00
Leonid Bloch
20f3e86362 e1000: Cosmetic and alignment fixes
This fixes some alignment and cosmetic issues. The changes are made
in order that the following patches in this series will look like
integral parts of the code surrounding them, while conforming to the
coding style. Although some changes in unrelated areas are also made.

Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2015-11-12 13:48:53 +08:00
Eric Blake
455b0fde8c error: More error_setg() usage
A few uses of error_set(ERROR_CLASS_GENERIC_ERROR) were missed in
c6bd8c706, or have snuck in since.  Nuke them.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1447224690-9743-19-git-send-email-eblake@redhat.com>
Acked-by: Andreas Färber <afaerber@suse.de>
[Indentation tidied up, commit message tweaked]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2015-11-11 18:56:26 +01:00
Jean-Christophe Dubois
b72d8d257c i.MX: Standardize i.MX FEC debug
The goal is to have debug code always compiled during build.

We standardize all debug output on the following format:

[QOM_TYPE_NAME]reporting_function: debug message

The qemu_log_mask() output is following the same format as the
above debug.

Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Message-id: 57e565982db94fb433c32dfa17608888464d21de.1445781957.git.jcd@tribudubois.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-10-27 15:59:46 +00:00
Shmulik Ladkani
eedeeeffd4 vmxnet3: Do not fill stats if device is inactive
Guest OS may issue VMXNET3_CMD_GET_STATS even before device was
activated (for example in linux, after insmod but prior net-dev open).

Accessing shared descriptors prior device activation is illegal as the
VMXNET3State structures have not been fully initialized.

As a result, guest memory gets corrupted and may lead to guest OS
crashes.

Fix, by not filling the stats descriptors if device is inactive.

Reported-by: Leonid Shatz <leonid.shatz@ravellosystems.com>
Acked-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Dana Rubin <dana.rubin@ravellosystems.com>
Signed-off-by: Shmulik Ladkani <shmulik.ladkani@ravellosystems.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2015-10-27 10:30:38 +08:00
Sebastian Huber
afb4c51fad net: cadence_gem: Set initial MAC address
Set initial MAC address to the one specified by the command line.

Signed-off-by: Sebastian Huber <sebastian.huber@embedded-brains.de>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2015-10-27 10:30:30 +08:00
Thibaut Collet
3e866365e1 vhost user: add rarp sending after live migration for legacy guest
A new vhost user message is added to allow QEMU to ask to vhost user backend to
broadcast a fake RARP after live migration for guest without GUEST_ANNOUNCE
capability.

This new message is sent only if the backend supports the new
VHOST_USER_PROTOCOL_F_RARP protocol feature.
The payload of this new message is the MAC address of the guest (not known by
the backend). The MAC address is copied in the first 6 bytes of a u64 to avoid
to create a new payload message type.

This new message has no equivalent ioctl so a new callback is added in the
userOps structure to send the request.

Upon reception of this new message the vhost user backend must generate and
broadcast a fake RARP request to notify the migration is terminated.

Signed-off-by: Thibaut Collet <thibaut.collet@6wind.com>
[Rebased and fixed checkpatch errors - Marc-André]
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Thibaut Collet <thibaut.collet@6wind.com>
2015-10-22 14:34:49 +03:00
Thibaut Collet
f6f56291de vhost user: add support of live migration
Some vhost user backends are able to support live migration.
To provide this service the following features must be added:
1. Add the VIRTIO_NET_F_GUEST_ANNOUNCE capability to vhost-net when netdev
   backend is vhost-user.
2. Provide a nop receive callback to vhost-user.
   This callback is called by:
    *  qemu_announce_self after a migration to send fake RARP to avoid network
       outage for peers talking to the migrated guest.
         - For guest with GUEST_ANNOUNCE capabilities, guest already sends GARP
           when the bit VIRTIO_NET_S_ANNOUNCE is set.
           => These packets must be discarded.
         - For guest without GUEST_ANNOUNCE capabilities, migration termination
           is notified when the guest sends packets.
           => These packets can be discarded.
    * virtio_net_tx_bh with a dummy boot to send fake bootp/dhcp request.
      BIOS guest manages virtio driver to send 4 bootp/dhcp request in case of
      dummy boot.
      => These packets must be discarded.

Signed-off-by: Thibaut Collet <thibaut.collet@6wind.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Thibaut Collet <thibaut.collet@6wind.com>
2015-10-22 14:34:49 +03:00
Marc-André Lureau
21e704256d vhost: use a function for each call
Replace the generic vhost_call() by specific functions for each
function call to help with type safety and changing arguments.

While doing this, I found that "unsigned long long" and "uint64_t" were
used interchangeably and causing compilation warnings, using uint64_t
instead, as the vhost & protocol specifies.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
[Fix enum usage and MQ - Thibaut Collet]
Signed-off-by: Thibaut Collet <thibaut.collet@6wind.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Thibaut Collet <thibaut.collet@6wind.com>
2015-10-22 14:34:49 +03:00
Peter Maydell
0bf224d5da -----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
 
 iQEcBAABAgAGBQJWG2e/AAoJEO8Ells5jWIRcYcH/2D11W8cToCBjGDuw/u9K1ht
 S3oGyFasOEq3lm3+a3zQE+vDw0RDkjLEMhcTVwNskJQl6k6Ts5JleTZ6wffvUKPM
 UCozgPOCt1ZAdGskwdbByc+NhaVBHIiEsmlbDKqP22CENdDx6GWjcFW4brA4tQJQ
 AW36EH77j/M+7/KiSukcUfIexILUZJRfN+ICJVyNTpGsqUNJtFqiVPBMPyJhKCEq
 3pr3yJ2lf78SAEF5kBeBc9r/PDWUhtqExBsrK0L8Ey1FdrCy8ldqDPGecT4TsxNv
 W/KX5AqhKSsMI8DQKdbv/IKaUdjYWNjTRQ2Qjm8Vt0hcW0PhxR0NYi6bV4yjDNM=
 =f26Q
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging

# gpg: Signature made Mon 12 Oct 2015 08:56:47 BST using RSA key ID 398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F  3562 EF04 965B 398D 6211

* remotes/jasowang/tags/net-pull-request:
  tests: add test cases for netfilter object
  netfilter: add a netbuffer filter
  net/queue: export qemu_net_queue_append_iov
  netfilter: print filter info associate with the netdev
  netfilter: add an API to pass the packet to next filter
  net/queue: introduce NetQueueDeliverFunc
  net: merge qemu_deliver_packet and qemu_deliver_packet_iov
  netfilter: hook packets before net queue send
  init/cleanup of netfilter object
  vl.c: init delayed object after net_init_clients
  vmxnet3: Add support for VMXNET3_CMD_GET_ADAPTIVE_RING_INFO command
  e1000: use alias for default model
  vmxnet3: Support reading IMR registers on bar0
  net/vmxnet3: Refine l2 header validation

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-10-12 14:29:29 +01:00
Shmulik Ladkani
d62241eb6d vmxnet3: Add support for VMXNET3_CMD_GET_ADAPTIVE_RING_INFO command
Some drivers (e.g. vmware-tools) issue the VMXNET3_CMD_GET_ADAPTIVE_RING_INFO
command.

Currently, due to lack of support, a bogus value (-1) is returned.

Support this command, returning the "adaptive-ring disabled" flag.

Signed-off-by: Shmulik Ladkani <shmulik.ladkani@ravellosystems.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2015-10-12 13:19:29 +08:00
Jason Wang
8304402033 e1000: use alias for default model
Instead of duplicating the "e1000-82540em" device model as "e1000",
make the latter an alias for the former.

Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com
Reviewed-by: Markus Armbruster <armbru@redhat.com>
2015-10-12 13:19:29 +08:00
Shmulik Ladkani
c6048f849c vmxnet3: Support reading IMR registers on bar0
Instead of asserting, return the actual IMR register value.
This is aligned with what's returned on ESXi.

Signed-off-by: Shmulik Ladkani <shmulik.ladkani@ravellosystems.com>
Tested-by: Dana Rubin <dana.rubin@ravellosystems.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2015-10-12 13:19:29 +08:00
Dana Rubin
a7278b36fc net/vmxnet3: Refine l2 header validation
Validation of l2 header length assumed minimal packet size as
eth_header + 2 * vlan_header regardless of the actual protocol.

This caused crash for valid non-IP packets shorter than 22 bytes, as
'tx_pkt->packet_type' hasn't been assigned for such packets, and
'vmxnet3_on_tx_done_update_stats()' expects it to be properly set.

Refine header length validation in 'vmxnet_tx_pkt_parse_headers'.
Check its return value during packet processing flow.

As a side effect, in case IPv4 and IPv6 header validation failure,
corrupt packets will be dropped.

Signed-off-by: Dana Rubin <dana.rubin@ravellosystems.com>
Signed-off-by: Shmulik Ladkani <shmulik.ladkani@ravellosystems.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2015-10-12 13:19:29 +08:00
Markus Armbruster
778358d0a8 rocker: Use g_new() & friends where that makes obvious sense
g_new(T, n) is neater than g_malloc(sizeof(T) * n).  It's also safer,
for two reasons.  One, it catches multiplication overflowing size_t.
Two, it returns T * rather than void *, which lets the compiler catch
more type errors.

This commit only touches allocations with size arguments of the form
sizeof(T).  Same Coccinelle semantic patchas in commit b45c03f.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Jiri Pirko <jiri@resnulli.us>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2015-10-08 19:46:47 +03:00
Jason Wang
0cf33fb6b4 virtio-net: correctly drop truncated packets
When packet is truncated during receiving, we drop the packets but
neither discard the descriptor nor add and signal used
descriptor. This will lead several issues:

- sg mappings are leaked
- rx will be stalled if a lots of packets were truncated

In order to be consistent with vhost, fix by discarding the descriptor
in this case.

Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-10-01 16:16:52 +03:00
Peter Maydell
690b286fef Remove muldiv64() by using period instead of frequency
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABAgAGBQJWBVIdAAoJEPMMOL0/L748BXUP/2j+TRnsaU/gMHoL2IGP78BK
 LLLOL7yyV8ZrsrOFvyv8IW0DtpldoYsvObty/bNAr0iq+QwqwGWn9Gw4im5DtIkN
 s7e1WcxLgFHHcT1QLa70MUjjVtRrflTmcW9TVIW79PQ+HsCqnb7EmFZ96HxzH3zN
 YM93eBT6cJV3axsLwJsE82igCXsLo3raKGNb0jt8b6/XwMoR3iUb1Kgs2dJXZUJw
 TYPtHv7sJpQiLQY8Y8o4EjyyjdFuWPVeIfokgPahoOdVA1PSCx6Qh8o+FV1GZ+nF
 vmAr7Jolri6tdbMgRWtIgQQs2YSvPNIUEOYTXVu/4p287JGZPNU5790V2aIczERc
 gEPTqjI6w1AYy8/yMlO3WpfFxXWZH6ZsNBmxCmhH/mczA2dx3DzDlyI7SofQsCHW
 +81U6GSc/Ryy47C+b6m/YZNQDx3yG8rUFtY4PqCcjJwPZdSEhLEM7crC2XWJwy+0
 rg3SnVvXuE2vC/k7UHEYbnFOyVbvezUYJnigbppMilO8nfXIsyuvc7G4AT96jxbt
 4HQJT6ESGEsIToslWObJ53z3jzoAA17xp4gzkZjx7RwSofkFFIaT7jjaA/D2cxFn
 UOXZgAfde6mfg4Ak0czcBYYvm+peEjXBC+DfsBjfAcQ1dz6WSGyd3QZY0J7i9/7y
 iSNiuCE9J6Ha7XVVYzd2
 =krOI
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/vivier-misc/tags/pull-muldiv64-20150925' into staging

Remove muldiv64() by using period instead of frequency

# gpg: Signature made Fri 25 Sep 2015 14:54:37 BST using RSA key ID 3F2FBE3C
# gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>"
# gpg:                 aka "Laurent Vivier <laurent@vivier.eu>"
# gpg:                 aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F  5173 F30C 38BD 3F2F BE3C

* remotes/vivier-misc/tags/pull-muldiv64-20150925:
  net: remove muldiv64()
  bt: remove muldiv64()
  hpet: remove muldiv64()
  arm: clarify the use of muldiv64()
  openrisc: remove muldiv64()
  mips: remove muldiv64()
  pcnet: remove muldiv64()
  rtl8139: remove muldiv64()
  i6300esb: remove muldiv64()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-09-25 18:03:19 +01:00
Laurent Vivier
c6acbe861f pcnet: remove muldiv64()
Originally, timers were ticks based, and it made sense to
add ticks to current time to know when to trigger an alarm.

But since commit:

7447545 change all other clock references to use nanosecond resolution accessors

All timers use nanoseconds and we need to convert ticks to nanoseconds, by
doing something like:

    y = muldiv64(x, get_ticks_per_sec(), PCI_FREQUENCY)

where x is the number of device ticks and y the number of system ticks.

y is used as nanoseconds in timer functions,
it works because 1 tick is 1 nanosecond.
(get_ticks_per_sec() is 10^9)

But as PCI frequency is 33 MHz, we can also do:

    y = x * 30; /* 33 MHz PCI period is 30 ns */

Which is much more simple.

This implies a 33.333333 MHz PCI frequency,
but this is correct.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-09-25 14:53:50 +02:00
Laurent Vivier
37b9ab92f7 rtl8139: remove muldiv64()
Originally, timers were ticks based, and it made sense to
add ticks to current time to know when to trigger an alarm.

But since commit:

7447545 change all other clock references to use nanosecond resolution accessors

All timers use nanoseconds and we need to convert ticks to nanoseconds, by
doing something like:

    y = muldiv64(x, get_ticks_per_sec(), PCI_FREQUENCY)

where x is the number of device ticks and y the number of system ticks.

y is used as nanoseconds in timer functions,
it works because 1 tick is 1 nanosecond.
(get_ticks_per_sec() is 10^9)

But as PCI frequency is 33 MHz, we can also do:

    y = x * 30; /* 33 MHz PCI period is 30 ns */

Which is much more simple.

This implies a 33.333333 MHz PCI frequency,
but this is correct.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-09-25 14:53:29 +02:00
Changchun Ouyang
7263a0ad78 vhost-user: add a new message to disable/enable a specific virt queue.
Add a new message, VHOST_USER_SET_VRING_ENABLE, to enable or disable
a specific virt queue, which is similar to attach/detach queue for
tap device.

virtio driver on guest doesn't have to use max virt queue pair, it
could enable any number of virt queue ranging from 1 to max virt
queue pair.

Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Tested-by: Marcel Apfelbaum <marcel@redhat.com>
2015-09-24 16:27:53 +03:00
Changchun Ouyang
b931bfbf04 vhost-user: add multiple queue support
This patch is initially based a patch from Nikolay Nikolaev.

This patch adds vhost-user multiple queue support, by creating a nc
and vhost_net pair for each queue.

Qemu exits if find that the backend can't support the number of requested
queues (by providing queues=# option). The max number is queried by a
new message, VHOST_USER_GET_QUEUE_NUM, and is sent only when protocol
feature VHOST_USER_PROTOCOL_F_MQ is present first.

The max queue check is done at vhost-user initiation stage. We initiate
one queue first, which, in the meantime, also gets the max_queues the
backend supports.

In older version, it was reported that some messages are sent more times
than necessary. Here we came an agreement with Michael that we could
categorize vhost user messages to 2 types: non-vring specific messages,
which should be sent only once, and vring specific messages, which should
be sent per queue.

Here I introduced a helper function vhost_user_one_time_request(), which
lists following messages as non-vring specific messages:

        VHOST_USER_SET_OWNER
        VHOST_USER_RESET_DEVICE
        VHOST_USER_SET_MEM_TABLE
        VHOST_USER_GET_QUEUE_NUM

For above messages, we simply ignore them when they are not sent the first
time.

Signed-off-by: Nikolay Nikolaev <n.nikolaev@virtualopensystems.com>
Signed-off-by: Changchun Ouyang <changchun.ouyang@intel.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Tested-by: Marcel Apfelbaum <marcel@redhat.com>
2015-09-24 16:27:53 +03:00
Yuanhan Liu
e2051e9e00 vhost-user: add VHOST_USER_GET_QUEUE_NUM message
This is for querying how many queues the backend supports if it has mq
support(when VHOST_USER_PROTOCOL_F_MQ flag is set from the quried
protocol features).

vhost_net_get_max_queues() is the interface to export that value, and
to tell if the backend supports # of queues user requested, which is
done in the following patch.

Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Tested-by: Marcel Apfelbaum <marcel@redhat.com>
2015-09-24 16:27:52 +03:00
Yuanhan Liu
d1f8b30ec8 vhost: rename VHOST_RESET_OWNER to VHOST_RESET_DEVICE
Quote from Michael:

    We really should rename VHOST_RESET_OWNER to VHOST_RESET_DEVICE.

Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Tested-by: Marcel Apfelbaum <marcel@redhat.com>
2015-09-24 16:27:52 +03:00
Michael S. Tsirkin
dcb10c000c vhost-user: add protocol feature negotiation
Support a separate bitmask for vhost-user protocol features,
and messages to get/set protocol features.

Invoke them at init.

No features are defined yet.

[ leverage vhost_user_call for request handling -- Yuanhan Liu ]

Signed-off-by: Michael S. Tsirkin <address@hidden>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Tested-by: Marcel Apfelbaum <marcel@redhat.com>
2015-09-24 16:27:52 +03:00
Jason Wang
1f8828ef57 virtio-net: unbreak self announcement and guest offloads after migration
After commit 019a3edbb2 ("virtio: make
features 64bit wide"). Device's guest_features was actually set after
vdc->load(). This breaks the assumption that device specific load()
function can check guest_features. For virtio-net, self announcement
and guest offloads won't work after migration.

Fixing this by defer them to virtio_net_load() where guest_features
were guaranteed to be set. Other virtio devices looks fine.

Fixes: 019a3edbb2
       ("virtio: make features 64bit wide")
Cc: qemu-stable@nongnu.org
Cc: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2015-09-24 13:39:46 +03:00
Markus Armbruster
f8ed85ac99 Fix bad error handling after memory_region_init_ram()
Symptom:

    $ qemu-system-x86_64 -m 10000000
    Unexpected error in ram_block_add() at /work/armbru/qemu/exec.c:1456:
    upstream-qemu: cannot set up guest memory 'pc.ram': Cannot allocate memory
    Aborted (core dumped)

Root cause: commit ef701d7 screwed up handling of out-of-memory
conditions.  Before the commit, we report the error and exit(1), in
one place, ram_block_add().  The commit lifts the error handling up
the call chain some, to three places.  Fine.  Except it uses
&error_abort in these places, changing the behavior from exit(1) to
abort(), and thus undoing the work of commit 3922825 "exec: Don't
abort when we can't allocate guest memory".

The three places are:

* memory_region_init_ram()

  Commit 4994653 (right after commit ef701d7) lifted the error
  handling further, through memory_region_init_ram(), multiplying the
  incorrect use of &error_abort.  Later on, imitation of existing
  (bad) code may have created more.

* memory_region_init_ram_ptr()

  The &error_abort is still there.

* memory_region_init_rom_device()

  Doesn't need fixing, because commit 33e0eb5 (soon after commit
  ef701d7) lifted the error handling further, and in the process
  changed it from &error_abort to passing it up the call chain.
  Correct, because the callers are realize() methods.

Fix the error handling after memory_region_init_ram() with a
Coccinelle semantic patch:

    @r@
    expression mr, owner, name, size, err;
    position p;
    @@
            memory_region_init_ram(mr, owner, name, size,
    (
    -                              &error_abort
    +                              &error_fatal
    |
                                   err@p
    )
                                  );
    @script:python@
        p << r.p;
    @@
    print "%s:%s:%s" % (p[0].file, p[0].line, p[0].column)

When the last argument is &error_abort, it gets replaced by
&error_fatal.  This is the fix.

If the last argument is anything else, its position is reported.  This
lets us check the fix is complete.  Four positions get reported:

* ram_backend_memory_alloc()

  Error is passed up the call chain, ultimately through
  user_creatable_complete().  As far as I can tell, it's callers all
  handle the error sanely.

* fsl_imx25_realize(), fsl_imx31_realize(), dp8393x_realize()

  DeviceClass.realize() methods, errors handled sanely further up the
  call chain.

We're good.  Test case again behaves:

    $ qemu-system-x86_64 -m 10000000
    qemu-system-x86_64: cannot set up guest memory 'pc.ram': Cannot allocate memory
    [Exit 1 ]

The next commits will repair the rest of commit ef701d7's damage.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1441983105-26376-3-git-send-email-armbru@redhat.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
2015-09-18 14:39:29 +02:00
Peter Crosthwaite
271a234a23 net: smc91c111: flush packets on RCR register changes
The SOFT_RST or RXEN in the control register can be used as a condition
to unblock the net layer via can_receive(). So check for possible
flushes on RCR changes. This will drop all pending packets on soft
reset or disable which is the functional intent of the can_receive()
logic.

Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Tested-by: Richard Purdie <richard.purdie@linuxfoundation.org>
Message-id: b114d4c96f4afbdaa15f1361d9c07e3021755915.1441873621.git.crosthwaite.peter@gmail.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-09-17 12:36:03 +01:00
Peter Crosthwaite
e62cb54cd5 net: smc91c111: gate can_receive() on rx FIFO having a slot
Return false from can_receive() when the FIFO doesn't have a free RX
slot. This fixes a bug in the current code where the allocated buffer
is freed before the fifo pop, triggering a premature flush of queued RX
packets. It also will handle a corner case, where the guest manually
frees the allocated buffer before popping the rx FIFO (hence it is not
enough to just delay the flush_queued_packets()).

Reported-by: Richard Purdie <richard.purdie@linuxfoundation.org>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Tested-by: Richard Purdie <richard.purdie@linuxfoundation.org>
Message-id: 97bfdfc5cbce0bd5e0cbbbff35ce7a1bf6f8603d.1441873621.git.crosthwaite.peter@gmail.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-09-17 12:36:03 +01:00
Peter Crosthwaite
8d06b14927 net: smc91c111: guard flush_queued_packets() on can_rx()
Check that the core can once again receive packets before asking the
net layer to do a flush. This will make it more convenient to flush
packets when adding new conditions to can_receive.

Add missing if braces while moving the can_receive() core code.

Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Tested-by: Richard Purdie <richard.purdie@linuxfoundation.org>
Message-id: 92e15e12a6964274f4bc0eb71b61a7d94326f6c6.1441873621.git.crosthwaite.peter@gmail.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-09-17 12:36:03 +01:00
P J P
737d2b3c41 net: avoid infinite loop when receiving packets(CVE-2015-5278)
Ne2000 NIC uses ring buffer of NE2000_MEM_SIZE(49152)
bytes to process network packets. While receiving packets
via ne2000_receive() routine, a local 'index' variable
could exceed the ring buffer size, leading to an infinite
loop situation.

Reported-by: Qinghao Tang <luodalongde@gmail.com>
Signed-off-by: P J P <pjp@fedoraproject.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-09-15 12:51:14 +01:00
P J P
9bbdbc66e5 net: add checks to validate ring buffer pointers(CVE-2015-5279)
Ne2000 NIC uses ring buffer of NE2000_MEM_SIZE(49152)
bytes to process network packets. While receiving packets
via ne2000_receive() routine, a local 'index' variable
could exceed the ring buffer size, which could lead to a
memory buffer overflow. Added other checks at initialisation.

Reported-by: Qinghao Tang <luodalongde@gmail.com>
Signed-off-by: P J P <pjp@fedoraproject.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-09-15 12:51:14 +01:00
P J P
b947ac2bf2 e1000: Avoid infinite loop in processing transmit descriptor (CVE-2015-6815)
While processing transmit descriptors, it could lead to an infinite
loop if 'bytes' was to become zero; Add a check to avoid it.

[The guest can force 'bytes' to 0 by setting the hdr_len and mss
descriptor fields to 0.
--Stefan]

Signed-off-by: P J P <pjp@fedoraproject.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 1441383666-6590-1-git-send-email-stefanha@redhat.com
2015-09-15 12:51:02 +01:00
Veres Lajos
67cc32ebfd typofixes - v4
Signed-off-by: Veres Lajos <vlajos@gmail.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2015-09-11 10:45:43 +03:00
Markus Armbruster
012aef0734 maint: avoid useless "if (foo) free(foo)" pattern
My Coccinelle semantic patch finds a few more, because it also fixes up
the equally pointless conditional

    if (foo) {
        free(foo);
        foo = NULL;
    }

Result (feel free to squash it into your patch):

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2015-09-11 10:21:38 +03:00
Daniel P. Berrange
ef1e1e0782 maint: avoid useless "if (foo) free(foo)" pattern
The free() and g_free() functions both happily accept
NULL on any platform QEMU builds on. As such putting a
conditional 'if (foo)' check before calls to 'free(foo)'
merely serves to bloat the lines of code.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2015-09-11 10:21:38 +03:00
Daniel P. Berrange
1618d2ae7f maint: remove unused include for signal.h
A number of files were including signal.h but not using any
of the functions it provides

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2015-09-11 10:21:38 +03:00
Daniel P. Berrange
b6af097528 maint: remove / fix many doubled words
Many source files have doubled words (eg "the the", "to to",
and so on). Most of these can simply be removed, but a couple
were actual mis-spellings (eg "to to" instead of "to do").
There was even one triple word score "to to to" :-)

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2015-09-11 10:21:38 +03:00
Cornelia Huck
95129d6fc9 virtio: avoid leading underscores for helpers
Commit ef546f1275 ("virtio: add
feature checking helpers") introduced a helper __virtio_has_feature.
We don't want to use reserved identifiers, though, so let's
rename __virtio_has_feature to virtio_has_feature and virtio_has_feature
to virtio_vdev_has_feature.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-09-10 11:06:05 +03:00
Alistair Francis
7777b7a0ba cadence_gem: Correct Marvell PHY SPCFC reset value
Bit 15 of the PHY Specific Status Register is reserved and
should remain 0. Fix the reset value to ensure that the 15th
bit is not set.

Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: c795069e49040ff770fe2ece19dfe1791b729e22.1441316450.git.alistair.francis@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-09-08 17:38:45 +01:00
Jean-Christophe Dubois
fcbd8018e6 i.MX: Add FEC Ethernet Emulator
This is based on mcf_fec.c FEC implementation for Coldfire

  * A generic PHY was added (borrowwed from LAN9118)
  * The buffer management is also modified as buffers are
    slightly different between Coldfire and i.MX

Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Message-id: fb314f8a120aa49f8f6ad886f312c649b484fb5a.1441057361.git.jcd@tribudubois.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-09-07 10:39:30 +01:00
Fam Zheng
c5a9378045 ne2000: Drop ne2000_can_receive
ne2000_receive already checks the same conditions and drops the packet
if it's not ready, removing the .can_receive callback avoids the
necessity to add explicit flushes when the conditions turn true (which
is required by the new semantics of .can_receive since 6e99c63
"net/socket: Drop net_socket_can_send").

Plus the "return 1" if E8390_STOP is also suspicious.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-09-02 14:51:07 +01:00