* based on Eth, UDP, TCP struct present in eth.h instead of hardcoded
indexes and sizes.
* based on various macros present in eth.h.
Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
This patch introduces emulation for the Intel 82574 adapter, AKA e1000e.
This implementation is derived from the e1000 emulation code, and
utilizes the TX/RX packet abstractions that were initially developed for
the vmxnet3 device. Although some parts of the introduced code may be
shared with e1000, the differences are substantial enough so that the
only shared resources for the two devices are the definitions in
hw/net/e1000_regs.h.
Similarly to vmxnet3, the new device uses virtio headers for task
offloads (for backends that support virtio extensions). Usage of
virtio headers may be forcibly disabled via a boolean device property
"vnet" (which is enabled by default). In such case task offloads
will be performed in software, in the same way it is done on
backends that do not support virtio headers.
The device code is split into two parts:
1. hw/net/e1000e.c: QEMU-specific code for a network device;
2. hw/net/e1000e_core.[hc]: Device emulation according to the spec.
The new device name is e1000e.
Intel specifications for the 82574 controller are available at:
http://www.intel.com/content/dam/doc/datasheet/82574l-gbe-controller-datasheet.pdf
Throughput measurement results (iperf2):
Fedora 22 guest, TCP, RX
4 ++------------------------------------------+
| |
| X X X X X
3.5 ++ X X X X |
| X |
| |
3 ++ |
G | X |
b | |
/ 2.5 ++ |
s | |
| |
2 ++ |
| |
| |
1.5 X+ |
| |
+ + + + + + + + + + + +
1 ++--+---+---+---+---+---+---+---+---+---+---+
32 64 128 256 512 1 2 4 8 16 32 64
B B B B B KB KB KB KB KB KB KB
Buffer size
Fedora 22 guest, TCP, TX
18 ++-------------------------------------------+
| X |
16 ++ X X X X X
| X |
14 ++ |
| |
12 ++ |
G | X |
b 10 ++ |
/ | |
s 8 ++ |
| |
6 ++ X |
| |
4 ++ |
| X |
2 ++ X |
X + + + + + + + + + + +
0 ++--+---+---+---+---+----+---+---+---+---+---+
32 64 128 256 512 1 2 4 8 16 32 64
B B B B B KB KB KB KB KB KB KB
Buffer size
Fedora 22 guest, UDP, RX
3 ++------------------------------------------+
| X
| |
2.5 ++ |
| |
| |
2 ++ X |
G | |
b | |
/ 1.5 ++ |
s | X |
| |
1 ++ |
| |
| X |
0.5 ++ |
| X |
X + + + + +
0 ++-------+--------+-------+--------+--------+
32 64 128 256 512 1
B B B B B KB
Datagram size
Fedora 22 guest, UDP, TX
1 ++------------------------------------------+
| X
0.9 ++ |
| |
0.8 ++ |
0.7 ++ |
| |
G 0.6 ++ |
b | |
/ 0.5 ++ |
s | X |
0.4 ++ |
| |
0.3 ++ |
0.2 ++ X |
| |
0.1 ++ X |
X X + + + +
0 ++-------+--------+-------+--------+--------+
32 64 128 256 512 1
B B B B B KB
Datagram size
Windows 2012R2 guest, TCP, RX
3.2 ++------------------------------------------+
| X |
3 ++ |
| |
2.8 ++ |
| |
2.6 ++ X |
G | X X X X X
b 2.4 ++ X X |
/ | |
s 2.2 ++ |
| |
2 ++ |
| X X |
1.8 ++ |
| |
1.6 X+ |
+ + + + + + + + + + + +
1.4 ++--+---+---+---+---+---+---+---+---+---+---+
32 64 128 256 512 1 2 4 8 16 32 64
B B B B B KB KB KB KB KB KB KB
Buffer size
Windows 2012R2 guest, TCP, TX
14 ++-------------------------------------------+
| |
| X X
12 ++ |
| |
10 ++ |
| |
G | |
b 8 ++ |
/ | X |
s 6 ++ |
| |
| |
4 ++ X |
| |
2 ++ |
| X X X |
+ X X + + X X + + + + +
0 X+--+---+---+---+---+----+---+---+---+---+---+
32 64 128 256 512 1 2 4 8 16 32 64
B B B B B KB KB KB KB KB KB KB
Buffer size
Windows 2012R2 guest, UDP, RX
1.6 ++------------------------------------------X
| |
1.4 ++ |
| |
1.2 ++ |
| X |
| |
G 1 ++ |
b | |
/ 0.8 ++ |
s | |
0.6 ++ X |
| |
0.4 ++ |
| X |
| |
0.2 ++ X |
X + + + + +
0 ++-------+--------+-------+--------+--------+
32 64 128 256 512 1
B B B B B KB
Datagram size
Windows 2012R2 guest, UDP, TX
0.6 ++------------------------------------------+
| X
| |
0.5 ++ |
| |
| |
0.4 ++ |
G | |
b | |
/ 0.3 ++ X |
s | |
| |
0.2 ++ |
| |
| X |
0.1 ++ |
| X |
X X + + + +
0 ++-------+--------+-------+--------+--------+
32 64 128 256 512 1
B B B B B KB
Datagram size
Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Code that will be shared moved to a separate files.
Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
To make this device and network packets
abstractions ready for IOMMU.
Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
This patch extends the TX/RX packet abstractions with features that will
be used by the e1000e device implementation.
Changes are:
1. Support iovec lists for RX buffers
2. Deeper RX packets parsing
3. Loopback option for TX packets
4. Extended VLAN headers handling
5. RSS processing for RX packets
Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
This patch drops "vmx" prefix from packet abstractions names
to emphasize the fact they are generic and not tied to any
specific network device.
These abstractions will be reused by e1000e emulation implementation
introduced by following patches so their names need generalization.
This patch (except renamed files, adjusted comments and changes in MAINTAINTERS)
was produced by:
git grep -lz 'vmxnet_tx_pkt' | xargs -0 perl -i'' -pE "s/vmxnet_tx_pkt/net_tx_pkt/g"
git grep -lz 'vmxnet_rx_pkt' | xargs -0 perl -i'' -pE "s/vmxnet_rx_pkt/net_rx_pkt/g"
git grep -lz 'VmxnetTxPkt' | xargs -0 perl -i'' -pE "s/VmxnetTxPkt/NetTxPkt/g"
git grep -lz 'VMXNET_TX_PKT' | xargs -0 perl -i'' -pE "s/VMXNET_TX_PKT/NET_TX_PKT/g"
git grep -lz 'VmxnetRxPkt' | xargs -0 perl -i'' -pE "s/VmxnetRxPkt/NetRxPkt/g"
git grep -lz 'VMXNET_RX_PKT' | xargs -0 perl -i'' -pE "s/VMXNET_RX_PKT/NET_RX_PKT/g"
sed -ie 's/VMXNET_/NET_/g' hw/net/vmxnet_rx_pkt.c
sed -ie 's/VMXNET_/NET_/g' hw/net/vmxnet_tx_pkt.c
Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
These macros will be used by future commits introducing
e1000e device emulation and by vmxnet3 tracing code.
Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Added support for PCIe CAP v1, while reusing some of the existing v2
infrastructure.
Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
This function will be used by e1000e device code.
Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Replace legacy cpu_to_le64w()/le64_to_cpup()
calls with stq_le_p()/ldq_le_p().
Motivation for this modification is that
follow up patches add utility function
pcie_dev_ser_num_init() for PCIe DSN
capability creation which uses
pci_set_quad() with a misaligned offset.
Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Whatever passed in this variable will be appended to all
configure commands.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1464755128-32490-15-git-send-email-famz@redhat.com
The script is not prefixed with test- so it won't run with "make docker-test",
because it can take too long.
Run it with "make docker-travis@ubuntu".
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1464755128-32490-13-git-send-email-famz@redhat.com
The (currently partially commented out) configure options are suggested
by John Snow <jsnow@redhat.com>.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1464755128-32490-11-git-send-email-famz@redhat.com
"requires" checks the "FEATURE" environment for specified prerequisits,
and skip the execution of test if not found.
"build_qemu" is the central routine to compile QEMU for tests to call.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1464755128-32490-8-git-send-email-famz@redhat.com
It's better to have a launcher for all tests, to make it easier to
initialize and manage the environment.
If "DEBUG=1" a shell prompt will show up before the test runs.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1464755128-32490-7-git-send-email-famz@redhat.com
This adds a group of make targets to run docker tests, all are available
in source tree without running ./configure.
The usage is shown with "make docker".
Besides the fixed ones, dynamic targets for building each image and
running each test in each image are generated automatically by make,
scanning $(SRC_PATH)/tests/docker/ files with specific patterns.
Alternative to manually list particular targets (docker-TEST@IMAGE)
set, you can control which tests/images to run by filtering variables,
TESTS= and IMAGES=, which are expressed in Makefile pattern syntax,
"foo% %bar ...". For example:
$ make docker-test IMAGES="ubuntu fedora"
Unfortunately, it's impossible to propagate "-j $JOBS" into make in
containers, however since each combination is made a first class target
in the top Makefile, "make -j$N docker-test" still parallels the tests
coarsely.
Still, $J is made a magic variable to let all make invocations in
containers to use -j$J.
Instead of providing a live version of the source tree to the docker
container we snapshot it with git-archive. This ensures the tree is in a
pristine state for whatever operations the container is going to run on
them.
Uncommitted changes known to files known by the git index will be
included in the snapshot if there are any.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1464755128-32490-5-git-send-email-famz@redhat.com
When config-host.mak is not found it is safe to assume SRC_PATH is ".".
So, it is okay to move inclusion of ruls.mak out of the ifeq condition.
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1464755128-32490-4-git-send-email-famz@redhat.com
Using "," literal in $(call quiet-command, ...) arguments is awkward.
Add this constant to make it at least doable.
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1464755128-32490-3-git-send-email-famz@redhat.com
docker.py is added with a number of useful subcommands to manager docker
images and instances for QEMU docker testing. Subcommands are:
run: A wrapper of "docker run" (or "sudo -n docker run" if necessary),
which takes care of killing and removing the running container at
SIGINT.
clean: Tear down all the containers including inactive ones that are
started by docker_run.
build: Compare an image from given dockerfile and rebuild it if they're
different.
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1464755128-32490-2-git-send-email-famz@redhat.com
This function is from net/socket.c, move it to net.c and net.h.
Add SocketReadState to make others reuse net_fill_rstate().
suggestion from jason.
v4:
- move 'rs->finalize = finalize' to rs_init()
v3:
- remove SocketReadState init callback
- put finalize callback to net_fill_rstate()
v2:
- rename ReadState to SocketReadState
- add SocketReadState init and finalize callback
v1:
- init patch
Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
All handling of defaults (default_* variables) is inside vl.c,
move default_net there too, so we can more easily refactor that
code later.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Here's another ppc patch queue. This batch is all preliminaries
towards two significant features:
1) Full hypervisor-mode support for POWER8
Patches 1-8 start fixing various bugs with TCG's handling of
hypervisor mode
2) CPU hotplug support
Patches 9-12 make some preliminary fixes towards implementing CPU
hotplug on ppc64 (and other non-x86 platforms). These patches are
actually to generic code, not ppc, but are included here with
Paolo's ACK.
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1
iQIcBAABAgAGBQJXTN1QAAoJEGw4ysog2bOSM4kP/2TKm/wkGo3nsGm7vG0CArs+
JVIlTWI9Le7Cq5ijCkTwV9gjeG2CYz+Us2PCh2ZAoHpXgZtP7px2HRcDv07SbCnt
SaCwCS+EGf3ZO9baQrzG0zfe8XrlJF+XXTejD2zWtOZw7sZ/4OPWF9KdcZbjWqFp
PzJuXrpYOAaIyXyEPJSZFpHY+AC9NIblqHlUrKntPLLOYbqQBYP4IMxsUmOgu2IX
rFK/5A8t20BJN0lbmx8JNKh0voorFpHY/hhaH/1T7rKxsRkKMh3VbYSxD6EYs3Uc
nZ4ufQQW6C4CEFta3YHNwoClcsQUbnZQh3Ra+gKo9bXvqDzasVpq/mBgl3BDjGeG
LQPSA6sfmEA8lqtRikVdgSgdXDnwy5YXJLVmIXeAIG1KHa6eRuUxC3o+ScOkcH3A
ynLglCEBl9slsG9/yYkDcFW2u0t/txTBUvaxMfOQomAejrGjOLGZqnSWMd2UC4Gt
KQRP+b7igkXC7+bfrpBPWKyYGvOOCukESw3OV90hLBIzOthI1dI5hO0Gj61C/nlI
NXMRbx0qTztgj3tfSTTs6e9Ke8PEKnyXqgol0+t9Yxntlz28f2alTubUoyv/vZjx
8J1IOlNms3PnO26TxUVBu7/KaCGCM25eTQgllbTx8rhaqin3wAH3dRc1RTlWWhwJ
SgADl+MWf8sa7DkcxnZa
=vAIf
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.7-20160531' into staging
ppc patch queue for 2016-05-31
Here's another ppc patch queue. This batch is all preliminaries
towards two significant features:
1) Full hypervisor-mode support for POWER8
Patches 1-8 start fixing various bugs with TCG's handling of
hypervisor mode
2) CPU hotplug support
Patches 9-12 make some preliminary fixes towards implementing CPU
hotplug on ppc64 (and other non-x86 platforms). These patches are
actually to generic code, not ppc, but are included here with
Paolo's ACK.
# gpg: Signature made Tue 31 May 2016 01:39:44 BST using RSA key ID 20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-2.7-20160531:
cpu: Add a sync version of cpu_remove()
cpu: Reclaim vCPU objects
exec: Do vmstate unregistration from cpu_exec_exit()
exec: Remove cpu from cpus list during cpu_exec_exit()
ppc: Add PPC_64H instruction flag to POWER7 and POWER8
ppc: Get out of emulation on SMT "OR" ops
ppc: Fix sign extension issue in mtmsr(d) emulation
ppc: Change 'invalid' bit mask of tlbiel and tlbie
ppc: tlbie, tlbia and tlbisync are HV only
ppc: Do some batching of TCG tlb flushes
ppc: Use split I/D mmu modes to avoid flushes on interrupts
ppc: Remove MMU_MODEn_SUFFIX definitions
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* NBD bugfix (Eric)
* Memory fixes and cleanups (Paolo, Paul)
* scsi-block support for SCSI status, including persistent
reservations (Paolo)
* kvm_stat moves to the Linux repository
* SCSI bug fixes (Peter, Prasad)
* Killing qemu_char_get_next_serial, non-ARM parts (Xiaoqiang)
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v2
iQEcBAABCAAGBQJXSpYYAAoJEL/70l94x66DZ5UIAJiFDjWdeFw5eS+MgjflcjN9
zibrg+z2QFqFP7Z3FlQR5EjiML0g1S1eceUPBCyZazx6bQS3zvHkbhIWVx2RtHeI
Rtt5SYRF+R1cvSEJrQGPr3ysCLAThBFuQRayJRQHi/emiUNzfTSW1z0YGaaErimP
o4vF6Sjt1RQcnBkOPw6Qcoe4gSIbh9W4vQFbaZdB2KoWM9fOYBrjFgEFm1lo01vS
Da77vlr1dcYCjohNgo9BOZIOllTQK7WG1XWRE6HPbqa8w6oePXikxJ1lc4PQxXyT
clZy5uGJicp7QmEHf+yG8Evtm2mZG2iOEF4NbbFIm/GghOdFzwtHavxU/ahpAig=
=DaAU
-----END PGP SIGNATURE-----
Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* docs/atomics fixes and atomic_rcu_* optimization (Emilio)
* NBD bugfix (Eric)
* Memory fixes and cleanups (Paolo, Paul)
* scsi-block support for SCSI status, including persistent
reservations (Paolo)
* kvm_stat moves to the Linux repository
* SCSI bug fixes (Peter, Prasad)
* Killing qemu_char_get_next_serial, non-ARM parts (Xiaoqiang)
# gpg: Signature made Sun 29 May 2016 08:11:20 BST using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
* remotes/bonzini/tags/for-upstream: (30 commits)
exec: hide mr->ram_addr from qemu_get_ram_ptr users
memory: split memory_region_from_host from qemu_ram_addr_from_host
exec: remove ram_addr argument from qemu_ram_block_from_host
memory: remove qemu_get_ram_fd, qemu_set_ram_fd, qemu_ram_block_host_ptr
scsi-generic: Merge block max xfer len in INQUIRY response
scsi-block: always use SG_IO
scsi-disk: introduce scsi_disk_req_check_error
scsi-disk: add need_fua_emulation to SCSIDiskClass
scsi-disk: introduce dma_readv and dma_writev
scsi-disk: introduce a common base class
xen-hvm: ignore background I/O sections
docs/atomics: update comparison with Linux
atomics: do not emit consume barrier for atomic_rcu_read
atomics: emit an smp_read_barrier_depends() barrier only for Alpha and Thread Sanitizer
docs/atomics: update atomic_read/set comparison with Linux
bt: rewrite csrhci_write to avoid out-of-bounds writes
block/iscsi: avoid potential overflow of acb->task->cdb
scsi: megasas: check 'read_queue_head' index value
scsi: megasas: initialise local configuration data buffer
scsi: megasas: use appropriate property buffer size
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This sync API will be used by the CPU hotplug code to wait for the CPU to
completely get removed before flagging the failure to the device_add
command.
Sync version of this call is needed to correctly recover from CPU
realization failures when ->plug() handler fails.
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
In order to deal well with the kvm vcpus (which can not be removed without any
protection), we do not close KVM vcpu fd, just record and mark it as stopped
into a list, so that we can reuse it for the appending cpu hot-add request if
possible. It is also the approach that kvm guys suggested:
https://www.mail-archive.com/kvm@vger.kernel.org/msg102839.html
Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
[- Explicit CPU_REMOVE() from qemu_kvm/tcg_destroy_vcpu()
isn't needed as it is done from cpu_exec_exit()
- Use iothread mutex instead of global mutex during
destroy
- Don't cleanup vCPU object from vCPU thread context
but leave it to the callers (device_add/device_del)]
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
cpu_exec_init() does vmstate_register for the CPU device. This needs to be
undone from cpu_exec_exit(). This change is needed to support CPU hot
removal.
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
[dwg: added missing include to fix compile on some archs]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
CPUState *cpu gets added to the cpus list during cpu_exec_init(). It
should be removed from cpu_exec_exit().
cpu_exec_exit() is called from generic CPU::instance_finalize and some
archs like PowerPC call it from CPU unrealizefn. So ensure that we
dequeue the cpu only once.
Now -1 value for cpu->cpu_index indicates that we have already dequeued
the cpu for CONFIG_USER_ONLY case also.
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This will enable decoding of hrfid
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Otherwise tight loops at smt_low for example, which OPAL does,
eat so much CPU that we can't boot a kernel anymore. With that,
I can boot 8 CPUs just fine with powernv.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Otherwise it will trip on the forms used in recent architecture.
Ideally, we should have different handlers for different architecture
levels but our current implementation of TLB flushing is dumb enough
that this will do for now.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Not that anything remotely recent supports tlbia but ...
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
On ppc64 especially, we flush the tlb on any slbie or tlbie instruction.
However, those instructions often come in bursts of 3 or more (context
switch will favor a series of slbie's for example to an slbia if the
SLB has less than a certain number of entries in it, and tlbie's can
happen in a series, with PAPR, H_BULK_REMOVE can remove up to 4 entries
at a time.
Doing a tlb_flush() each time is a waste of time. We end up doing a memset
of the whole TLB, reloading it for the next instruction, memset'ing again,
etc...
Those instructions don't have to take effect immediately. For slbie, they
can wait for the next context synchronizing event. For tlbie, the next
tlbsync.
This implements batching by keeping a flag that indicates that we have a
TLB in need of flushing. We check it on interrupts, rfi's, isync's and
tlbsync and flush the TLB if needed.
This reduces the number of tlb_flush() on a boot to a ubuntu installer
first dialog screen from roughly 360K down to 36K.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[clg: added a 'CPUPPCState *' variable in h_remove() and
h_bulk_remove() ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
[dwg: removed spurious whitespace change, use 0/1 not true/false
consistently, since tlb_need_flush has int type]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
We rework the way the MMU indices are calculated, providing separate
indices for I and D side based on MSR:IR and MSR:DR respectively,
and thus no longer need to flush the TLB on context changes. This also
adds correct support for HV as a separate address space.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
We don't use the resulting accessors and this gets in the way of
the split I/D TLB work.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>