Commit Graph

64515 Commits

Author SHA1 Message Date
Roman Kagan
b56920245c hyperv: allow passing arbitrary data to sint ack callback
Make sint ack callback accept an opaque pointer, that is stored on
sint_route at creation time.

This allows for more convenient interaction with the callback.

Besides, nothing outside hyperv.c should need to know the layout of
HvSintRoute fields any more so its declaration can be removed from the
header.

Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20180921081836.29230-6-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-19 13:44:13 +02:00
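
Note on b56920245c and bd4ed63caa: the opaque-pointer callback pattern these two commits describe can be sketched as below. This is a minimal illustration, not the code in hyperv.c; `SintRoute`, `sint_route_init`, `sint_route_ack` and `count_ack` are hypothetical names chosen for the example.

```c
#include <stddef.h>

/* Hypothetical names: this mirrors the pattern, not the QEMU declarations. */
typedef void (*SintAckFn)(void *opaque);

typedef struct SintRoute {
    SintAckFn ack_cb;   /* may be NULL when the caller wants no notification */
    void *ack_opaque;   /* stored at creation, handed back on every ack */
} SintRoute;

/* Record the callback and its opaque argument when the route is created. */
static void sint_route_init(SintRoute *r, SintAckFn cb, void *opaque)
{
    r->ack_cb = cb;
    r->ack_opaque = opaque;
}

/* Invoked on guest acknowledgement; does nothing if no callback was given. */
static void sint_route_ack(SintRoute *r)
{
    if (r->ack_cb) {
        r->ack_cb(r->ack_opaque);
    }
}

/* Example callback: the opaque pointer is the caller's own counter. */
static void count_ack(void *opaque)
{
    int *acks = opaque;
    (*acks)++;
}
```

Because the callback receives only the pointer it registered, a caller never needs to look inside the route structure, which is what lets the structure's layout stay private to one file.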
Roman Kagan
bd4ed63caa hyperv: synic: only setup ack notifier if there's a callback
There's no point setting up an sint ack notifier if no callback is
specified.

Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20180921081836.29230-5-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-19 13:44:13 +02:00
Roman Kagan
42e4b0e1fb hyperv: cosmetic: g_malloc -> g_new
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20180921081836.29230-4-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-19 13:44:13 +02:00
Roman Kagan
cc4669f065 hyperv_testdev: drop unnecessary includes
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20180921081836.29230-3-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-19 13:44:13 +02:00
Roman Kagan
1ba5c3a954 hyperv_testdev: refactor for better maintainability
Make hyperv_testdev slightly easier to follow and enhance in future.
For that, put the hyperv sint routes (wrapped in a helper structure) on
a linked list rather than a fixed-size array.  Besides, this way
HvSintRoute can be treated as an opaque structure, allowing for easier
refactoring of the core Hyper-V SynIC code in followup patches.

Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20180921081836.29230-2-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-19 13:44:13 +02:00
Paolo Bonzini
40dce4ee61 scsi-disk: fix rerror/werror=ignore
rerror=ignore was returning true from scsi_handle_rw_error but the callers were not
calling scsi_req_complete when rerror=ignore returns true (this is the correct thing
to do when true is returned after executing a passthrough command).  Fix this by
calling it in scsi_handle_rw_error.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-19 13:44:13 +02:00
Paolo Bonzini
e6aa5ba4ac scsi-disk: fix double completion of failing passthrough requests
If a command fails with a sense that scsi_sense_buf_to_errno converts to
ECANCELED/EAGAIN/ENOTCONN or with a unit attention, scsi_req_complete is
called twice.  This caused a crash.

Reported-by: Wangguang <wang.guangA@h3c.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-19 13:44:12 +02:00
Li Qiang
a519e38944 hw: edu: drop DO_UPCAST
Signed-off-by: Li Qiang <liq3ea@163.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-19 13:44:12 +02:00
Igor Mammedov
8b5e6caf01 call HotplugHandler->plug() as the last step in device realization
When [2] was fixed, it was agreed that adding a post_plug() callback
and calling it after device_reset() was a low-risk way to hotfix the
issue right before the release. So that was merged instead of moving
the existing plug() callback after device_reset(), which would have
been riskier and would have required an audit of all plug() callbacks.

Looking at the current plug() callbacks, it doesn't seem that moving
the plug() callback after device_reset() breaks anything, so here
goes the agreed-upon [3] proper fix, which essentially reverts [1][2]
and moves the plug() callback after device_reset().
This way a device always reaches the plug() stage after it has been
fully initialized (including being reset), which fixes the race
condition [2] without the need for an extra post_plug() callback.

 1. (25e897881 "qdev: add HotplugHandler->post_plug() callback")
 2. (8449bcf94 "virtio-scsi: fix hotplug ->reset() vs event race")
 3. https://www.mail-archive.com/qemu-devel@nongnu.org/msg549915.html

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1539696820-273275-1-git-send-email-imammedo@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Pierre Morel <pmorel@linux.ibm.com>
Acked-by: Pierre Morel <pmorel@linux.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-19 13:44:12 +02:00
Artem Pisarenko
ac0989f53d vl, qapi: offset calculation in RTC_CHANGE event reverted
The return value of qemu_timedate_diff(), used to calculate the offset
in the QAPI 'RTC_CHANGE' event, is restored to keep compatibility.
Since it wasn't documented that the difference is relative to host
clock advancement, this change also adds an important note to the
'RTC_CHANGE' event description to highlight established
implementation specifics.

Signed-off-by: Artem Pisarenko <artem.k.pisarenko@gmail.com>
Message-Id: <1fc12c77e8b7115d3842919a8b586d9cbe4efca6.1539846575.git.artem.k.pisarenko@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-19 13:44:12 +02:00
Artem Pisarenko
eb6a520991 Fixes RTC bug with base datetime shifts in clock=vm
This makes all current "-rtc" option parameter combinations produce a
fixed/unambiguous RTC timedate reference for hardware emulation
frontends.
It restores determinism of guest execution when used with clock=vm and
a specified base <datetime> value.

Buglink: https://bugs.launchpad.net/qemu/+bug/1797033
Signed-off-by: Artem Pisarenko <artem.k.pisarenko@gmail.com>
Message-Id: <1d963c3e013dfedafa1f6edb9fb219b7e49e39da.1539846575.git.artem.k.pisarenko@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-19 13:44:12 +02:00
Artem Pisarenko
7e166ebd8c vl: refactor -rtc option references
Improve code readability and prepare for fixing bug #1797033

Signed-off-by: Artem Pisarenko <artem.k.pisarenko@gmail.com>
Message-Id: <9330a48899f997431a34460014886d118a7c0960.1539846575.git.artem.k.pisarenko@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-19 13:44:12 +02:00
Artem Pisarenko
238d1240d9 vl: improve/fix documentation related to RTC function
The documentation describing the -rtc option is updated to better match
the current implementation and highlight some important specifics.

Signed-off-by: Artem Pisarenko <artem.k.pisarenko@gmail.com>
Message-Id: <1b245c6c0803d4bf11dcbf9eb32f34af8c2bd0b4.1539846575.git.artem.k.pisarenko@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-19 13:44:12 +02:00
Roman Bolshakov
92cc3aaa1f i386: hvf: Remove hvf_disabled
accel_init_machine sets *(acc->allowed) to true if acc->init_machine(ms)
succeeds. There's no need to have both hvf_allowed and hvf_disabled.

Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Message-Id: <20181018143051.48508-1-r.bolshakov@yadro.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-19 13:44:12 +02:00
Roman Bolshakov
b4e1af8961 i386: hvf: Fix register refs if REX is present
According to Intel(R)64 and IA-32 Architectures Software Developer's
Manual, the following one-byte registers should be fetched when REX
prefix is present (sorted by reg encoding index):
AL, CL, DL, BL, SPL, BPL, SIL, DIL, R8L - R15L

The first 8 are fetched if REX.R is zero, the last 8 if non-zero.

The following registers should be fetched for instructions without REX
prefix (also sorted by reg encoding index):
AL, CL, DL, BL, AH, CH, DH, BH

Current emulation code doesn't handle accesses to SPL, BPL, SIL, DIL
when REX is present, therefore an instruction 40883e "mov %dil,(%rsi)" is
decoded as "mov %bh,(%rsi)".

That caused an infinite loop in vp_reset:
https://lists.gnu.org/archive/html/qemu-devel/2018-10/msg03293.html

Signed-off-by: Roman Bolshakov <r.bolshakov@yadro.com>
Message-Id: <20181018134401.44471-1-r.bolshakov@yadro.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-19 13:44:12 +02:00
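
Note on b4e1af8961: the register selection rules in the message can be written as two lookup tables. The sketch below is illustrative only and is not the hvf decoder; `byte_reg_name` and its parameters are hypothetical names, with `reg` standing for the 3-bit register field and `rex_r` for the REX.R bit.

```c
#include <stdbool.h>

/*
 * Illustrative lookup for one-byte register operands, following the two
 * lists in the commit message. Only the low three bits of reg are used.
 */
static const char *byte_reg_name(unsigned reg, bool has_rex, bool rex_r)
{
    /* Without a REX prefix, encodings 4..7 select the high-byte registers. */
    static const char *const legacy[8] = {
        "al", "cl", "dl", "bl", "ah", "ch", "dh", "bh"
    };
    /* With a REX prefix, encodings 4..7 select SPL/BPL/SIL/DIL instead. */
    static const char *const with_rex[16] = {
        "al", "cl", "dl", "bl", "spl", "bpl", "sil", "dil",
        "r8l", "r9l", "r10l", "r11l", "r12l", "r13l", "r14l", "r15l"
    };

    reg &= 7;
    if (!has_rex) {
        return legacy[reg];
    }
    /* REX.R selects the upper half of the table. */
    return with_rex[reg + (rex_r ? 8 : 0)];
}
```

For the instruction bytes 40 88 3e quoted in the message, the register field is 7 and the REX prefix 0x40 has REX.R clear, so the lookup yields "dil"; ignoring the prefix yields "bh", which is the misdecoding the commit describes.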
Vitaly Kuznetsov
6b7a98303b i386/kvm: add support for Hyper-V IPI send
Hyper-V PV IPI support has been merged into KVM; enable the feature in
QEMU. When enabled, this allows Windows guests to send IPIs to other
vCPUs with a single hypercall even when there are >64 vCPUs in the
request.

Signed-off-by: Vitaly Kuznetsov <vkuznets@redhat.com>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20181009130853.6412-3-vkuznets@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-19 13:44:12 +02:00
Pavel Dovgalyuk
ca9759c2a9 replay: don't process events at virtual clock checkpoint
As QEMU becomes more multi-threaded and non-synchronized, checkpoints
move from thread to thread. And the event queue that is processed at
checkpoints should belong to the same thread in both record and replay
executions.
This patch disables asynchronous event processing at the virtual clock
checkpoint, because it may be invoked in different threads at record and
replay. This patch is a temporary fix until the checkpoints are
completely refactored.

Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Message-Id: <20181018063345.7433.11678.stgit@pasha-VirtualBox>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-19 13:44:12 +02:00
Peng Hao
a8de011500 target-i386: add q35 0xcf8 port as coalesced_pio
Signed-off-by: Peng Hao <peng.hao2@zte.com.cn>
Message-Id: <1539795177-21038-6-git-send-email-peng.hao2@zte.com.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-19 13:44:11 +02:00
Peng Hao
37abf8d234 target-i386: add i440fx 0xcf8 port as coalesced_pio
Signed-off-by: Peng Hao <peng.hao2@zte.com.cn>
Message-Id: <1539795177-21038-5-git-send-email-peng.hao2@zte.com.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-19 13:44:11 +02:00
Peng Hao
f98167ea06 target-i386: add rtc 0x70 port as coalesced_pio
Signed-off-by: Peng Hao <peng.hao2@zte.com.cn>
Message-Id: <1539890353-30273-1-git-send-email-peng.hao2@zte.com.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-19 13:44:11 +02:00
Peng Hao
e6d34aeea6 target-i386 : add coalesced_pio API
The primary API implementation.

Signed-off-by: Peng Hao <peng.hao2@zte.com.cn>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1539795177-21038-3-git-send-email-peng.hao2@zte.com.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-19 13:44:11 +02:00
Paolo Bonzini
966f2ec3ac linux-headers: update to 4.20-rc1
This brings in eVMCS and coalesced PIO support, as well as other features we do
not support yet.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-19 13:44:11 +02:00
Paolo Bonzini
b31c003895 target-i386: kvm: do not initialize padding fields
The exception.pad field is going to be renamed to pending in an upcoming
header file update.  Remove the unnecessary initialization; it was
introduced to please valgrind (commit 7e680753cf) but it was later
rendered unnecessary by commit 076796f8fd, which added the "= {}"
initializer to the declaration of "events".  Therefore the patch does
not change behavior in any way.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-19 13:44:04 +02:00
Artem Pisarenko
e81f86790f qemu-timer: avoid checkpoints for virtual clock timers in external subsystems
Adds EXTERNAL attribute definition to qemu timers subsystem and assigns
it to virtual clock timers, used in slirp (ICMP IPv6) and ui (key queue).
Virtual clock processing in rr mode can use this attribute instead of a
separate clock type.

Fixes: 87f4fe7653
Fixes: 775a412bf8
Fixes: 9888091404
Signed-off-by: Artem Pisarenko <artem.k.pisarenko@gmail.com>
Message-Id: <e771f96ab94e86b54b9a783c974f2af3009fe5d1.1539764043.git.artem.k.pisarenko@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-19 13:44:03 +02:00
Artem Pisarenko
89a603a0c8 qemu-timer: introduce timer attributes
Attributes are simple flags, associated with individual timers for their
whole lifetime.  They are intended to be used to mark individual timers
for special handling when they fire.

New/init functions family in timer interface updated and refactored (new
'attribute' argument added, timer_list replaced with timer_list_group+type
combinations, comments improved to avoid info duplication).  Also existing
aio interface extended with attribute-enabled variants of functions,
which create/initialize timers.

Signed-off-by: Artem Pisarenko <artem.k.pisarenko@gmail.com>
Message-Id: <f47b81dbce734e9806f9516eba8ca588e6321c2f.1539764043.git.artem.k.pisarenko@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-19 13:44:03 +02:00
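
Note on 89a603a0c8 and e81f86790f: the idea of per-timer attribute flags that are fixed at creation and consulted when the timer fires can be sketched as follows. All names here (`DemoTimer`, `DEMO_TIMER_ATTR_EXTERNAL`, `demo_timer_init`, `demo_timer_needs_checkpoint`) are hypothetical, and the real timer interface is considerably richer.

```c
#include <stdbool.h>

/* Hypothetical flag and struct names; QEMU's own definitions differ. */
enum {
    DEMO_TIMER_ATTR_EXTERNAL = 1 << 0,  /* timer owned by an external subsystem */
};

typedef struct DemoTimer {
    long long expire_time;  /* -1 while the timer is not armed */
    int attributes;         /* set once at init, kept for the whole lifetime */
} DemoTimer;

/* Attributes are passed at initialization and never changed afterwards. */
static void demo_timer_init(DemoTimer *t, int attributes)
{
    t->expire_time = -1;
    t->attributes = attributes;
}

/* Timers marked EXTERNAL skip the checkpoint path in this sketch. */
static bool demo_timer_needs_checkpoint(const DemoTimer *t)
{
    return !(t->attributes & DEMO_TIMER_ATTR_EXTERNAL);
}
```

Keeping the decision in a flag on the timer, rather than in a separate clock type, means timers with and without the attribute can share one clock and one timer list.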
Artem Pisarenko
05ff8dc32f Revert some patches from recent [PATCH v6] "Fixing record/replay and adding reverse debugging"
That patch series introduced new virtual clock type for use in external
subsystems. It breaks desired behavior in non-record/replay usage
scenarios due to a small change to existing behavior.  Processing of
virtual timers belonging to new clock type is kicked off to the main
loop, which makes these timers asynchronous with vCPU thread and,
in icount mode, with whole guest execution. This breaks expected
determinism in non-record/replay icount mode of emulation where these
"external subsystems" are isolated from the host (i.e. they are
external only to guest core, not to the entire emulation environment).

Example for slirp ("user" backend for network device):
User runs qemu in icount mode with rtc clock=vm without any external
communication interfaces but with "-netdev user,restrict=on". It expects
deterministic execution, because network services are emulated inside
qemu and isolated from host. There are no reasons to get reply from DHCP
server with different delay or something like that.

The next patches reimplement the same changes in a better way.
This reverts commit 87f4fe7653.
This reverts commit 775a412bf8.
This reverts commit 9888091404.

Signed-off-by: Artem Pisarenko <artem.k.pisarenko@gmail.com>
Message-Id: <18b1e7c8f155fe26976f91be06bde98eef6f8751.1539764043.git.artem.k.pisarenko@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-19 13:44:03 +02:00
Paolo Bonzini
24f7973b67 es1370: more fixes for ADC_FRAMEADR and ADC_FRAMECNT
They are not consecutive with DAC1_FRAME* and DAC2_FRAME*; Coverity
still complains about es1370_read, while es1370_write was fixed in
commit cf9270e522.

Fixes: 154c1d1f96
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2018-10-19 13:44:00 +02:00
Daniel P. Berrangé
dea7a64e4c crypto: require libgcrypt >= 1.5.0 for building QEMU
libgcrypt 1.5.0 was released in 2011 and all the distros that are build
target platforms for QEMU [1] include it:

  RHEL-7: 1.5.3
  Debian (Stretch): 1.7.6
  Debian (Jessie): 1.6.3
  OpenBSD (ports): 1.8.2
  FreeBSD (ports): 1.8.3
  OpenSUSE Leap 15: 1.8.2
  Ubuntu (Xenial): 1.6.5
  macOS (Homebrew): 1.8.3

Based on this, it is reasonable to require libgcrypt >= 1.5.0 in QEMU
which allows for some conditional version checks in the code to be
removed.

[1] https://qemu.weilnetz.de/doc/qemu-doc.html#Supported-build-platforms

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2018-10-19 12:26:57 +01:00
Daniel P. Berrangé
a0722409bc crypto: require gnutls >= 3.1.18 for building QEMU
gnutls 3.0.0 was released in 2011 and all the distros that are build
target platforms for QEMU [1] include it:

  RHEL-7: 3.1.18
  Debian (Stretch): 3.5.8
  Debian (Jessie): 3.3.8
  OpenBSD (ports): 3.5.18
  FreeBSD (ports): 3.5.18
  OpenSUSE Leap 15: 3.6.2
  Ubuntu (Xenial): 3.4.10
  macOS (Homebrew): 3.5.19

Based on this, it is reasonable to require gnutls >= 3.1.18 in QEMU
which allows for all conditional version checks in the code to be
removed.

[1] https://qemu.weilnetz.de/doc/qemu-doc.html#Supported-build-platforms

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
2018-10-19 12:26:57 +01:00
Peter Maydell
1b7490446b Add a workaround for clang bug and remove misleading comment (sparc)
-----BEGIN PGP SIGNATURE-----
 
 iQIcBAABAgAGBQJbyNhBAAoJEPMMOL0/L748u/sQALDmpdHXmqgiA9YPYGSg6Yn5
 J6TsMs9O+DcgIMmLkYcvHEajJf5R6j5hO4HRnrqefnEaAQHMtoDNxTMqTqyiRyyd
 rIeokVauBeDrnr88XxRGGDTfyKMp9qR255wjpaueKtRmloHN+EvgQ+a9vgZlqDoi
 CmpmA05wVYdW2ku3uk5QtrGsfmLsUnT9ETTs+/kU9uoVujnYe+Ix77kDb8BYe1zz
 OL2aBu5f5LdJKqvbIMsxHg7m32MxG4swLf3gjD6wl5R711Pin9Uidpg7mzVmmElp
 mUTuSSJtTbqqM15NanQbfXAoBBStM+ILH5juaHjNC5iA8Li1AL2+KVckWELNOJnd
 0tbagKS8MAiHw9sExMrREArpqsusJ6YUaHMhlLdtnV+r8YKry1iK1nFS8KIdbY3r
 4stL/H7dKfvtSlSA4bF0zcwZwqJMvX5qNKT8fXUV2j2/i6ttQahL/mwqClQDcuFA
 LkdMCcI+TXvbt04KeYE9eGbWUg2JFFlf2qiX2bD/tUqTDLjFP15YFtpY+3B0FnLW
 EUdooDKsjlyz562SIm9ccGlyNwKpsSVenUuU4n0tmCq8PVe3S7UyXQH+w+a7mWZZ
 sOOFc64gTB+8Z8wUdAm6MXCgUoKavfGUPYAIq5cbYcxEgQ8XuMunx5TtDnjC2tZ1
 EwPPfdzK8LLaVYxTLkr/
 =LSxi
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/vivier2/tags/linux-user-for-3.1-pull-request' into staging

Add a workaround for clang bug and remove misleading comment (sparc)

# gpg: Signature made Thu 18 Oct 2018 20:00:17 BST
# gpg:                using RSA key F30C38BD3F2FBE3C
# gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>"
# gpg:                 aka "Laurent Vivier <laurent@vivier.eu>"
# gpg:                 aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>"
# Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F  5173 F30C 38BD 3F2F BE3C

* remotes/vivier2/tags/linux-user-for-3.1-pull-request:
  linux-user/sparc/signal.c: Remove unnecessary comment
  linux-user: Suppress address-of-packed-member warnings in __get/put_user_e

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-10-19 11:20:05 +01:00
Peter Maydell
2ec24af237 MIPS queue October 2018, part1, v2
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQEcBAABAgAGBQJbyNNEAAoJENSXKoln91plgPMH/iHilm6MrW7r6tvJloYEBwfZ
 e8AkQzWcWq6rpWYhbiuGvWY2Qn1EWoTWFfohEvJ96gkJIZCVwO7sTqD2//58tksp
 wWpgeQwLxRCd+pB6zBMmYkpPD4WNEHGq7RYTzA+0pBIjwTEjdct0POgmLaiXBnFP
 mE5m6wohyAlxPpLLfluYEPz6cTIm20M191tv9scDoztKJnd/8u5M3j+yP8t2zbFs
 pRdOk68YsJl2fOKxgmnLsg83VxpoQahEzOgX1Zi396tcISZMCnuK+4GWJJ9MbPMl
 b1a9TdjHoNjx0v0qzNWk3RzECE7fBnnxpa8p1pil9Ff5MCq9gwFjxHvnCLZzY0s=
 =uRwj
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/amarkovic/tags/mips-queue-october-2018-part1-v2' into staging

MIPS queue October 2018, part1, v2

# gpg: Signature made Thu 18 Oct 2018 19:39:00 BST
# gpg:                using RSA key D4972A8967F75A65
# gpg: Good signature from "Aleksandar Markovic <amarkovic@wavecomp.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 8526 FBF1 5DA3 811F 4A01  DD75 D497 2A89 67F7 5A65

* remotes/amarkovic/tags/mips-queue-october-2018-part1-v2: (28 commits)
  target/mips: Add opcodes for nanoMIPS EVA instructions
  target/mips: Fix misplaced 'break' in handling of NM_SHRA_R_PH
  target/mips: Fix emulation of microMIPS R6 <SELEQZ|SELNEZ>.<D|S>
  target/mips: Implement hardware page table walker for MIPS32
  target/mips: Add reset state for PWSize and PWField registers
  target/mips: Add CP0 PWCtl register
  target/mips: Add CP0 PWSize register
  target/mips: Add CP0 PWField register
  target/mips: Add CP0 PWBase register
  target/mips: Add CP0 Config2 to DisasContext
  target/mips: Improve DSP R2/R3-related naming
  target/mips: Add availability control for DSP R3 ASE
  target/mips: Add bit definitions for DSP R3 ASE
  target/mips: Reorganize bit definitions for insn_flags (ISAs/ASEs flags)
  target/mips: Increase 'supported ISAs/ASEs' flag holder size
  target/mips: Add opcode values of MXU ASE
  target/mips: Add organizational chart of MXU ASE
  target/mips: Add assembler mnemonics list for MXU ASE
  target/mips: Add basic description of MXU ASE
  target/mips: Add a comment before each CP0 register section in cpu.h
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2018-10-19 10:08:31 +01:00
Thomas Huth
37a4442a76 qemu-options: Fix bad "macaddr" property in the documentation
When using the "-device" option, the property is called "mac".
"macaddr" is only used for the legacy "-net nic" option.

Reported-by: Harald Hoyer <harald@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-10-19 11:15:04 +08:00
Jason Wang
1001cf45a7 e1000: indicate dropped packets in HW counters
The e1000 emulation silently discards RX packets if there's
insufficient space in the ring buffer. This leads to errors
on higher-level protocols in the guest, with no indication
about the error cause.

This patch increments the "Missed Packets Count" (MPC) and
"Receive No Buffers Count" (RNBC) HW counters in this case.
As the emulation has no FIFO for buffering packets that can't
immediately be pushed to the guest, these two registers are
practically equivalent (see 10.2.7.4, 10.2.7.33 in
https://www.intel.com/content/www/us/en/embedded/products/networking/82574l-gbe-controller-datasheet.html).

On a Linux guest, the register content will be reflected in
the "rx_missed_errors" and "rx_no_buffer_count" stats from
"ethtool -S", and in the "missed" stat from "ip -s -s link show",
giving at least some hint about the error cause inside the guest.

If the cause is known, problems like this can often be avoided
easily, by increasing the number of RX descriptors in the guest
e1000 driver (e.g. under Linux, "e1000.RxDescriptors=1024").

The patch also adds a qemu trace message for this condition.

Signed-off-by: Martin Wilck <mwilck@suse.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-10-19 11:15:04 +08:00
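
Note on 1001cf45a7: the accounting described in the message, where a packet that does not fit in the RX ring is dropped but counted, can be sketched as below. The structure and function names are hypothetical and the ring is reduced to a byte budget; this is not the e1000 emulation code.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified device state; not the e1000 register file. */
typedef struct DemoNic {
    size_t rx_free_bytes;  /* space left in the guest's RX ring */
    uint32_t mpc;          /* Missed Packets Count */
    uint32_t rnbc;         /* Receive No Buffers Count */
    uint32_t delivered;    /* packets handed to the guest */
} DemoNic;

/* Returns true if the packet was delivered, false if dropped and counted. */
static bool demo_nic_receive(DemoNic *nic, size_t pkt_len)
{
    if (pkt_len > nic->rx_free_bytes) {
        /* No FIFO to park the packet in, so both counters move together. */
        nic->mpc++;
        nic->rnbc++;
        return false;
    }
    nic->rx_free_bytes -= pkt_len;
    nic->delivered++;
    return true;
}
```

With counters like these, a guest that sees protocol-level errors has at least one statistic pointing at exhausted receive buffers as the cause.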
Jason Wang
1592a99470 net: ignore packet size greater than INT_MAX
There should not be a reason for passing a packet size greater than
INT_MAX. It's usually a hint of a bug somewhere, so ignore packet sizes
greater than INT_MAX in qemu_deliver_packet_iov().

CC: qemu-stable@nongnu.org
Reported-by: Daniel Shapira <daniel@twistlock.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-10-19 11:15:04 +08:00
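
Note on 1592a99470: a size check of this kind over a scatter/gather vector might look like the sketch below. It assumes a POSIX `struct iovec`; `demo_packet_size_ok` is a hypothetical helper and not the check that was added to qemu_deliver_packet_iov().

```c
#include <limits.h>
#include <stddef.h>
#include <sys/uio.h>

/*
 * Returns 1 if the total length of the vector fits in an int, 0 if the
 * packet should be ignored. The running total never exceeds INT_MAX, so
 * the subtraction below cannot underflow.
 */
static int demo_packet_size_ok(const struct iovec *iov, int iovcnt)
{
    size_t total = 0;

    for (int i = 0; i < iovcnt; i++) {
        if (iov[i].iov_len > (size_t)INT_MAX - total) {
            return 0;   /* total would exceed INT_MAX: ignore the packet */
        }
        total += iov[i].iov_len;
    }
    return 1;
}
```

Checking each element against the remaining budget, rather than summing first and comparing afterwards, also avoids wrapping the size_t accumulator.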
Jason Wang
b1d80d12c5 pcnet: fix possible buffer overflow
In pcnet_receive(), we assign size_ to size, which converts from
size_t to int. This causes trouble when size_ is greater than
INT_MAX: size becomes negative and then passes the check of
size < MIN_BUF_SIZE, which may lead to out-of-bounds access for
both buf and buf1.

Fix this by converting the type of size to size_t.

CC: qemu-stable@nongnu.org
Reported-by: Daniel Shapira <daniel@twistlock.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-10-19 11:15:04 +08:00
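
Note on b1d80d12c5, 1a326646fe and fdc89e90fa: the three fixes share one pattern, shown in isolation below. The function names and `DEMO_MIN_BUF_SIZE` are hypothetical, and the value 60 is an assumption made for the example rather than something stated in the messages.

```c
#include <limits.h>
#include <stddef.h>

#define DEMO_MIN_BUF_SIZE 60   /* assumed value, for illustration only */

/*
 * Buggy pattern: the length is narrowed to int before the comparison.
 * For size_ > INT_MAX the conversion is implementation-defined; on common
 * compilers the result is negative, so the "too small" branch is taken.
 */
static int needs_padding_int(size_t size_)
{
    int size = (int)size_;
    return size < DEMO_MIN_BUF_SIZE;
}

/* Fixed pattern: the comparison is done on the unsigned size_t value. */
static int needs_padding_size_t(size_t size_)
{
    return size_ < DEMO_MIN_BUF_SIZE;
}
```

Both helpers agree for ordinary lengths; they can only diverge once the length exceeds INT_MAX, which is the case the reporter found.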
Jason Wang
1a326646fe rtl8139: fix possible out of bound access
In rtl8139_do_receive(), we assign size_ to size, which converts
from size_t to int. This causes trouble when size_ is greater than
INT_MAX: size becomes negative and then passes the check of
size < MIN_BUF_SIZE, which may lead to out-of-bounds access for
both buf and buf1.

Fix this by converting the type of size to size_t.

CC: qemu-stable@nongnu.org
Reported-by: Daniel Shapira <daniel@twistlock.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-10-19 11:15:04 +08:00
Jason Wang
fdc89e90fa ne2000: fix possible out of bound access in ne2000_receive
In ne2000_receive(), we assign size_ to size, which converts
from size_t to int. This causes trouble when size_ is greater than
INT_MAX: size becomes negative and then passes the check of
size < MIN_BUF_SIZE, which may lead to out-of-bounds access for
both buf and buf1.

Fix this by converting the type of size to size_t.

CC: qemu-stable@nongnu.org
Reported-by: Daniel Shapira <daniel@twistlock.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-10-19 11:15:04 +08:00
liujunjie
7da2d99fb9 clean up callback when del virtqueue
Before, we did not clear callbacks such as handle_output when deleting
the virtqueue, which may result in a segmentation fault.
The scenario is as follows:
1. Start a VM with multiqueue vhost-net.
2. Write VIRTIO_PCI_GUEST_FEATURES in PCI configuration to trigger
multiqueue disable in this VM, which deletes the virtqueue.
In this step, the tx_bh is deleted but the callback
virtio_net_handle_tx_bh still exists.
3. Finally, write VIRTIO_PCI_QUEUE_NOTIFY in PCI configuration to
notify the deleted virtqueue. virtio_net_handle_tx_bh is then
called and QEMU crashes.

Although the scenario described above is uncommon, we had better guard
against it.

CC: qemu-stable@nongnu.org
Signed-off-by: liujunjie <liujunjie23@huawei.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-10-19 11:15:03 +08:00
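
Note on 7da2d99fb9: the three-step scenario above comes down to a stale function pointer surviving queue deletion. A minimal sketch of the defensive pattern follows; `DemoQueue` and the `demo_*` functions are hypothetical and do not reproduce the virtio code.

```c
#include <stddef.h>

typedef struct DemoQueue DemoQueue;
typedef void (*DemoHandler)(DemoQueue *q);

struct DemoQueue {
    unsigned num;               /* ring size, 0 once the queue is deleted */
    DemoHandler handle_output;  /* callback run when the guest notifies */
    int kicks;                  /* how many times the callback ran */
};

/* Stand-in for a device's transmit handler. */
static void demo_handle_tx(DemoQueue *q)
{
    q->kicks++;
}

static void demo_add_queue(DemoQueue *q, unsigned num, DemoHandler h)
{
    q->num = num;
    q->handle_output = h;
    q->kicks = 0;
}

/* Deleting the queue also clears the callback, so nothing stale remains. */
static void demo_del_queue(DemoQueue *q)
{
    q->num = 0;
    q->handle_output = NULL;
}

/* A notification for a deleted queue is silently ignored. */
static void demo_notify(DemoQueue *q)
{
    if (q->handle_output) {
        q->handle_output(q);
    }
}
```

A guest-triggered notify after deletion then finds no handler to call, instead of running one whose backing state has already been freed.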
Zhang Chen
8e640892ec docs: Add COLO status diagram to COLO-FT.txt
This diagram helps users better understand COLO.
Suggested by Markus Armbruster.

Signed-off-by: Zhang Chen <zhangckid@gmail.com>
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-10-19 11:15:03 +08:00
zhanghailiang
2518aec192 COLO: quick failover process by kick COLO thread
The COLO thread may sleep at qemu_sem_wait(&s->colo_checkpoint_sem)
when failover work begins. It's better to wake it up to speed up
the process.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-10-19 11:15:03 +08:00
zhanghailiang
7b3435309d COLO: notify net filters about checkpoint/failover event
Notify all net filters about the checkpoint and failover event.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-10-19 11:15:03 +08:00
Zhang Chen
24525e93c1 filter-rewriter: handle checkpoint and failover event
After one round of checkpoint, the states of the PVM and SVM
become consistent, so it is unnecessary to adjust the sequence
of net packets for old connections. Besides, when failover
happens, filter-rewriter enters failover mode, in which it does
not need to handle new TCP connections.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Zhang Chen <zhangckid@gmail.com>
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-10-19 11:15:03 +08:00
Zhang Chen
5fbba3d659 filter: Add handle_event method for NetFilterClass
Filters need to process checkpoint/failover events, or
other events passed by the COLO framework.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Zhang Chen <zhangckid@gmail.com>
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-10-19 11:15:03 +08:00
zhanghailiang
d1955d2219 COLO: flush host dirty ram from cache
There is no need to flush all of the VM's RAM from the cache; only
flush the pages dirtied since the last checkpoint.

Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Zhang Chen <zhangckid@gmail.com>
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-10-19 11:15:03 +08:00
Zhang Chen
3f6df99d9d savevm: split the process of different stages for loadvm/savevm
There are several stages in the loadvm/savevm process. In different
stages, the incoming migration processes different types of sections.
We want to control these stages more accurately, which will benefit
COLO performance: we don't have to save sections of type
QEMU_VM_SECTION_START every time we do a checkpoint. Besides, we want
to separate the process of saving/loading memory and device state.

So we add new helper functions, qemu_load_device_state() and
qemu_savevm_live_state(), to handle the different stages during
migration.

Besides, we make qemu_loadvm_state_main() and qemu_save_device_state()
public, and simplify the code of qemu_save_device_state() by calling the
wrapper qemu_savevm_state_header().

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Zhang Chen <zhangckid@gmail.com>
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-10-19 11:15:03 +08:00
Zhang Chen
f56c0065b8 qapi: Add new command to query colo status
Libvirt or other high-level software can use this command to query COLO
status. You can test this command like this:
{'execute':'query-colo-status'}

Signed-off-by: Zhang Chen <zhangckid@gmail.com>
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-10-19 11:15:03 +08:00
Zhang Chen
41b6b77921 qapi/migration.json: Rename COLO unknown mode to none mode.
Rename COLO unknown mode to none mode, as suggested by Markus Armbruster.

Signed-off-by: Zhang Chen <zhangckid@gmail.com>
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-10-19 11:15:03 +08:00
zhanghailiang
9ecff6d66e qmp event: Add COLO_EXIT event to notify users while exited COLO
If errors happen during the VM's COLO FT stage, it's important to
notify users of this event. Together with 'x-colo-lost-heartbeat',
users can intervene in COLO's failover work immediately.
If users don't want to get involved in COLO's failover verdict,
it is still necessary to notify them that we exited COLO mode.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Zhang Chen <zhangckid@gmail.com>
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-10-19 11:15:03 +08:00
Zhang Chen
e6f4aa188c COLO: Flush memory data from ram cache
While the VM is running, the PVM may dirty some pages. We transfer the
PVM's dirty pages to the SVM and store them in the SVM's RAM cache at the
next checkpoint. So, the content of the SVM's RAM cache will always be the
same as the PVM's memory after a checkpoint.

Instead of flushing all content of the PVM's RAM cache into the SVM's
memory, we do this in a more efficient way:
only flush pages dirtied by the PVM since the last checkpoint.
In this way, we can ensure the SVM's memory is the same as the PVM's.

Besides, we must ensure the RAM cache is flushed before loading device
state.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-10-19 11:15:03 +08:00
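
Note on e6f4aa188c and d1955d2219: flushing only the pages marked in a dirty bitmap, rather than the whole cache, can be sketched as below. The page size, bitmap layout and the name `demo_flush_dirty` are assumptions made for the example; the migration code uses its own bitmap helpers and RAM block structures.

```c
#include <limits.h>
#include <stddef.h>
#include <string.h>

#define DEMO_PAGE_SIZE 4096
#define DEMO_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Copy every page whose bit is set in the bitmap from the cache into RAM,
 * clear the bit, and return how many pages were copied. Clean pages are
 * left untouched.
 */
static size_t demo_flush_dirty(unsigned char *ram, const unsigned char *cache,
                               unsigned long *bitmap, size_t npages)
{
    size_t flushed = 0;

    for (size_t i = 0; i < npages; i++) {
        unsigned long mask = 1UL << (i % DEMO_BITS_PER_LONG);
        unsigned long *word = &bitmap[i / DEMO_BITS_PER_LONG];

        if (*word & mask) {
            memcpy(ram + i * DEMO_PAGE_SIZE,
                   cache + i * DEMO_PAGE_SIZE, DEMO_PAGE_SIZE);
            *word &= ~mask;   /* page is clean again after the copy */
            flushed++;
        }
    }
    return flushed;
}
```

The cost of a flush then scales with the number of pages dirtied since the last checkpoint instead of with the size of guest memory.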
Zhang Chen
7d9acafa2c ram/COLO: Record the dirty pages that SVM received
We record the addresses of the dirty pages that are received,
which helps with flushing the pages cached in the SVM.

The trick here is that we record dirty pages by re-using the migration
dirty bitmap. In a later patch, we will start the dirty log
for the SVM, just like migration. In this way, we can record the
dirty pages caused by both the PVM and the SVM, and we only flush those
dirty pages from the RAM cache when doing a checkpoint.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Zhang Chen <zhangckid@gmail.com>
Signed-off-by: Zhang Chen <chen.zhang@intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2018-10-19 11:15:03 +08:00