We forgot to initialize the spinlock introduced in 94377115b2
("cpus: protect TimerState writes with a spinlock", 2018-08-23).
Fix it.
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <20180903171831.15446-5-cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <20180903171831.15446-4-cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit 2858ab09e6 changed
PS/2 keyboard/mouse buffers to the standard size. However, the queue state
may change when migrating from the old buffer size and therefore irq needs
updating. That change was implemented incorrectly, because it discards the
whole queue when it holds too much data instead of cropping it.
That commit also updates irq (because the queue state may change).
But updating the irq may change the VM state (and determinism of
the execution). E.g., when replaying the execution, one may save
the VM state and the state of the interrupt controller will be updated
at the moment of saving, instead of using the recorded update events.
This patch makes the queue update deterministic: it removes the update_irq
call and crops the queue to prevent losing the characters and changing
the required irq status.
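The difference between discarding and cropping can be sketched as follows.
This is a minimal Python model, not the QEMU code: the function name
crop_queue and its parameters are hypothetical, and it assumes the oldest
queued bytes are the ones worth keeping.

```python
def crop_queue(data, rptr, count, new_size):
    """Crop a ring buffer to new_size entries and rebase it at index 0.

    data  -- old ring storage (list of bytes)
    rptr  -- read index into the old storage
    count -- number of queued entries
    All names are hypothetical; this only models the idea of cropping.
    """
    old_size = len(data)
    kept = min(count, new_size)
    # Walk the ring from the read pointer and keep the oldest entries.
    linear = [data[(rptr + i) % old_size] for i in range(kept)]
    new_data = linear + [0] * (new_size - kept)
    # Returns (storage, read index, write index, count).
    return new_data, 0, kept % new_size, kept
```

No interrupt state is touched here, which is the property the patch relies
on to keep replay deterministic.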
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Message-Id: <20180511081601.14610.39946.stgit@pasha-VirtualBox>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Coverity no longer sees that qemu_mutex_lock is taking a lock.
Hide all the QSP magic so that static analysis works again.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Both virtio-blk and virtio-scsi use virtio_queue_empty() as the
loop condition in VQ handlers (virtio_blk_handle_vq,
virtio_scsi_handle_cmd_vq). When a device is marked broken in
virtqueue_pop, for example if a vIOMMU address translation failed, we
want to break out of the loop.
This fixes a hanging problem when booting a CentOS 3.10.0-862.el7.x86_64
kernel with ATS enabled:
$ qemu-system-x86_64 \
... \
-device intel-iommu,intremap=on,caching-mode=on,eim=on,device-iotlb=on \
-device virtio-scsi-pci,iommu_platform=on,ats=on,id=scsi0,bus=pci.4,addr=0x0
The dead loop happens immediately when the kernel boots and initializes
the device, where virtio_scsi_data_plane_handle_cmd will not return:
> ...
> #13 0x00005586602b7793 in virtio_scsi_handle_cmd_vq
> #14 0x00005586602b8d66 in virtio_scsi_data_plane_handle_cmd
> #15 0x00005586602ddab7 in virtio_queue_notify_aio_vq
> #16 0x00005586602dfc9f in virtio_queue_host_notifier_aio_poll
> #17 0x00005586607885da in run_poll_handlers_once
> #18 0x000055866078880e in try_poll_mode
> #19 0x00005586607888eb in aio_poll
> #20 0x0000558660784561 in aio_wait_bh_oneshot
> #21 0x00005586602b9582 in virtio_scsi_dataplane_stop
> #22 0x00005586605a7110 in virtio_bus_stop_ioeventfd
> #23 0x00005586605a9426 in virtio_pci_stop_ioeventfd
> #24 0x00005586605ab808 in virtio_pci_common_write
> #25 0x0000558660242396 in memory_region_write_accessor
> #26 0x00005586602425ab in access_with_adjusted_size
> #27 0x0000558660245281 in memory_region_dispatch_write
> #28 0x00005586601e008e in flatview_write_continue
> #29 0x00005586601e01d8 in flatview_write
> #30 0x00005586601e04de in address_space_write
> #31 0x00005586601e052f in address_space_rw
> #32 0x00005586602607f2 in kvm_cpu_exec
> #33 0x0000558660227148 in qemu_kvm_cpu_thread_fn
> #34 0x000055866078bde7 in qemu_thread_start
> #35 0x00007f5784906594 in start_thread
> #36 0x00007f5784639e6f in clone
With this patch, virtio_queue_empty will now return 1 as soon as the
vdev is marked as broken, after a "virtio: zero sized buffers are not
allowed" error.
To be consistent, update virtio_queue_empty_rcu as well.
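Why the loop could not terminate, and why reporting "empty" for a broken
device fixes it, can be illustrated with a small Python model. It is a
sketch only: the VirtQueue class and handle_vq function are hypothetical
stand-ins, and a None entry plays the role of a request that fails
validation.

```python
class VirtQueue:
    def __init__(self, pending):
        self.pending = list(pending)
        self.broken = False

    def empty(self):
        # A broken device reports an empty queue so handler loops terminate.
        return self.broken or not self.pending

    def pop(self):
        req = self.pending[0]
        if req is None:
            # The bad request stays in the queue; the device is marked broken.
            self.broken = True
            return None
        return self.pending.pop(0)


def handle_vq(vq):
    handled = []
    while not vq.empty():
        req = vq.pop()
        if req is not None:
            handled.append(req)
    return handled
```

If empty() only checked self.pending, the failed request would never leave
the queue and the while loop would spin forever.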
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20180910145616.8598-2-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Merge remote-tracking branch 'remotes/dgibson/tags/libfdt-20181002' into staging
Update dtc submodule to v1.4.7
We have some upcoming things planned for ppc that will require some
newer libfdt features. In preparation, update the dtc/libfdt
submodule to upstream version v1.4.7.
# gpg: Signature made Tue 02 Oct 2018 05:23:43 BST
# gpg: using RSA key 6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/libfdt-20181002:
Update dtc/libfdt submodule to v1.4.7
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Merge remote-tracking branch 'remotes/xtensa/tags/20181001-xtensa' into staging
target/xtensa: preparation for FLIX support
Separate generation of per-instruction code (such as raising exceptions
and terminating TB) from per-opcode code.
# gpg: Signature made Mon 01 Oct 2018 19:14:34 BST
# gpg: using RSA key 51F9CC91F83FA044
# gpg: Good signature from "Max Filippov <filippov@cadence.com>"
# gpg: aka "Max Filippov <max.filippov@cogentembedded.com>"
# gpg: aka "Max Filippov <jcmvbkbc@gmail.com>"
# Primary key fingerprint: 2B67 854B 98E5 327D CDEB 17D8 51F9 CC91 F83F A044
* remotes/xtensa/tags/20181001-xtensa:
target/xtensa: extract gen_check_interrupts call
target/xtensa: make rsr/wsr helpers return void
target/xtensa: extract unconditional TB termination via slot 0
target/xtensa: always end TB on CCOUNT access/CCOMPARE write
target/xtensa: change SR number checks to assertions
target/xtensa: extract unconditional TB termination
target/xtensa: extract test for division by zero
target/xtensa: extract test for cpdisabled exception
target/xtensa: extract test for alloca exception
target/xtensa: extract test for window underflow exception
target/xtensa: extract test for window overflow exception
target/xtensa: extract test for debug exception
target/xtensa: extract test for syscall instruction
target/xtensa: extract test for privileged instruction
target/xtensa: extract test for an illegal instruction
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
dtc v1.4.7 contains a number of improvements that make libfdt safer when
handling a corrupted or malicious tree, which is a good thing to have. It
also includes an explicit fdt checking function that we'll want in the
future.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
- mark instructions that affect active IRQ level;
- put call for gen_check_interrupts right after the instruction
translation; when FLIX is enabled it will need to appear before
other exits from the TB as well;
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Now that all logic for TB termination is extracted from rsr/wsr their
return value is not used and may be dropped.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
- mark instructions that require TB termination via slot 0;
- put TB termination right after the instruction translation loop, if
termination w/o TB linking wasn't requested;
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Currently we only end TB in icount mode, because accesses to CCOUNT and
writes to CCOMPARE are IO operations. Simplify the behaviour a bit and
end TB unconditionally.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Opcode decoding with libisa takes care of the range of valid group SRs,
like CCOMPARE, IBREAKA, DBREAKA or DBREAKC. Turn range checks in wsr
implementations into assertions.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
- mark all instructions that exit TB and require dynamic search for the
next TB;
- put TB termination right after the instruction translation loop;
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
- mark quos/quou/rems/remu instructions;
- drop parameter 0 from the translate_quou and split translate_remu from
it;
- put test for division by zero exception right after the coprocessor
exception test;
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
- add XtensaOpcodeOps::coprocessor with bitmask of coprocessors used by
the instruction;
- replace coprocessor id parameter of gen_check_cpenable with the
bitmask of used coprocessors;
- collect coprocessor IDs used by an instruction in the disassembly
loop;
- put test for coprocessor disabled exception after the alloca test;
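The bitmask approach in the list above can be sketched in Python. This is
an illustration under assumptions, not the translator's code: opcodes are
modelled as dicts with an optional "coprocessor" mask, and both function
names are hypothetical.

```python
def collect_coprocessors(ops):
    """OR together the coprocessor masks of all opcodes in one instruction."""
    mask = 0
    for op in ops:
        mask |= op.get("coprocessor", 0)
    return mask


def disabled_coprocessors(used_mask, cpenable):
    """Return the used coprocessors whose enable bit is clear."""
    return used_mask & ~cpenable
```

A single test of the collected mask then replaces one check per
coprocessor id.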
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
- mark retw and retw.n instructions;
- extract window underflow test from retw helper;
- put underflow exception check generation right after the overflow
check;
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
- add ps.callinc to the TB flags, that allows testing all instructions
for window overflow statically;
- drop gen_window_check* functions; replace them with get_window_check
that accepts bitmask of used registers;
- add XtensaOpcodeOps::test_overflow that returns bitmask of implicitly
used registers; use it for entry and call{,x}{4,8,12};
- drop window overflow test from the entry helper;
- drop parameter 0 from translate_[di]cache and use translate_nop for
d/i cache opcodes that don't need memory accessibility check;
- add bitmask XtensaOpcodeOps::windowed_register_op that marks opcode
arguments that refer to windowed registers;
- translate windowed_register_op mask to a mask of actually used
registers in the disassembly loop;
- add check for window overflow right after the check for debug
exception;
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
- mark break and break.n instructions;
- collect debug cause bits from parameter 0 of instructions marked for
debug exception;
- put debug exception check right after syscall check;
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
- mark privileged instructions;
- put single privileged instruction check after disassembly loop;
- translate_[di]cache: drop parameter 0, shift parameters one down;
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
- TB flags: add XTENSA_TBFLAG_CWOE that corresponds to the architectural
CWOE state;
- entry: move CWOE check from the helper to the test_ill_entry;
- retw: move CWOE check from the helper to the test_ill_retw;
- separate instruction disassembly loop and translation loop; save
disassembly results in local array;
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches:
- qcow2 cache option default changes (Linux: 32 MB maximum, limited by
whatever cache size can be made use of with the specific image;
default cache-clean-interval of 10 minutes)
- reopen: Allow specifying unchanged child node references, and changing
a few generic options (discard, detect-zeroes)
- Fix werror/rerror defaults for -device drive=<node-name>
- Test case fixes
# gpg: Signature made Mon 01 Oct 2018 18:17:35 BST
# gpg: using RSA key 7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream: (23 commits)
tests/test-bdrv-drain: Fix too late qemu_event_reset()
test-replication: Lock AioContext around blk_unref()
qcow2: Fix cache-clean-interval documentation
block-backend: Set werror/rerror defaults in blk_new()
qcow2: Explicit number replaced by a constant
qcow2: Set the default cache-clean-interval to 10 minutes
qcow2: Resize the cache upon image resizing
qcow2: Increase the default upper limit on the L2 cache size
qcow2: Assign the L2 cache relatively to the image size
qcow2: Avoid duplication in setting the refcount cache size
qcow2: Make sizes more humanly readable
include: Add a lookup table of sizes
qcow2: Options' documentation fixes
block: Allow changing 'detect-zeroes' on reopen
block: Allow changing 'discard' on reopen
file-posix: Forbid trying to change unsupported options during reopen
block: Forbid trying to change unsupported options during reopen
block: Allow child references on reopen
block: Don't look for child references in append_open_options()
block: Remove child references from bs->{options,explicit_options}
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
qemu_event_reset() must be called before the AIO request in a different
iothread is submitted. Otherwise the request could be completed before
we do the qemu_event_reset() and the test would hang in
qemu_event_wait().
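The ordering requirement is the same for any resettable event. A minimal
Python analogue using threading.Event (the Requester class is hypothetical
and only models the pattern, not the test case itself):

```python
import threading


class Requester:
    def __init__(self):
        # One event reused across requests, like a long-lived QemuEvent.
        self.done = threading.Event()

    def submit_and_wait(self, work, timeout=5.0):
        # Reset first: a completion that arrives right after submission
        # must not be wiped out by a late reset.
        self.done.clear()

        def run():
            work()
            self.done.set()

        worker = threading.Thread(target=run)
        worker.start()                        # "submit" the request
        completed = self.done.wait(timeout)   # wait for the completion signal
        worker.join()
        return completed
```

Moving the clear() call after worker.start() would reintroduce the race:
a fast worker could set the event before it is cleared, and the wait would
then block until the timeout.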
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Tested-by: Max Reitz <mreitz@redhat.com>
Recently, the test case has started failing because some job related
functions want to drop the AioContext lock even though it hasn't been
taken:
(gdb) bt
#0 0x00007f51c067c9fb in raise () from /lib64/libc.so.6
#1 0x00007f51c067e77d in abort () from /lib64/libc.so.6
#2 0x0000558c9d5dde7b in error_exit (err=<optimized out>, msg=msg@entry=0x558c9d6fe120 <__func__.18373> "qemu_mutex_unlock_impl") at util/qemu-thread-posix.c:36
#3 0x0000558c9d6b5263 in qemu_mutex_unlock_impl (mutex=mutex@entry=0x558c9f3999a0, file=file@entry=0x558c9d6fd36f "util/async.c", line=line@entry=516) at util/qemu-thread-posix.c:96
#4 0x0000558c9d6b0565 in aio_context_release (ctx=ctx@entry=0x558c9f399940) at util/async.c:516
#5 0x0000558c9d5eb3da in job_completed_txn_abort (job=0x558c9f68e640) at job.c:738
#6 0x0000558c9d5eb227 in job_finish_sync (job=0x558c9f68e640, finish=finish@entry=0x558c9d5eb8d0 <job_cancel_err>, errp=errp@entry=0x0) at job.c:986
#7 0x0000558c9d5eb8ee in job_cancel_sync (job=<optimized out>) at job.c:941
#8 0x0000558c9d64d853 in replication_close (bs=<optimized out>) at block/replication.c:148
#9 0x0000558c9d5e5c9f in bdrv_close (bs=0x558c9f41b020) at block.c:3420
#10 bdrv_delete (bs=0x558c9f41b020) at block.c:3629
#11 bdrv_unref (bs=0x558c9f41b020) at block.c:4685
#12 0x0000558c9d62a3f3 in blk_remove_bs (blk=blk@entry=0x558c9f42a7c0) at block/block-backend.c:783
#13 0x0000558c9d62a667 in blk_delete (blk=0x558c9f42a7c0) at block/block-backend.c:402
#14 blk_unref (blk=0x558c9f42a7c0) at block/block-backend.c:457
#15 0x0000558c9d5dfcea in test_secondary_stop () at tests/test-replication.c:478
#16 0x00007f51c1f13178 in g_test_run_suite_internal () from /lib64/libglib-2.0.so.0
#17 0x00007f51c1f1337b in g_test_run_suite_internal () from /lib64/libglib-2.0.so.0
#18 0x00007f51c1f1337b in g_test_run_suite_internal () from /lib64/libglib-2.0.so.0
#19 0x00007f51c1f13552 in g_test_run_suite () from /lib64/libglib-2.0.so.0
#20 0x00007f51c1f13571 in g_test_run () from /lib64/libglib-2.0.so.0
#21 0x0000558c9d5de31f in main (argc=<optimized out>, argv=<optimized out>) at tests/test-replication.c:581
It is not yet clear whether this should really be considered a bug in the
test case or whether blk_unref() should work for callers that haven't
taken the AioContext lock, but in order to fix the build tests quickly,
just take the AioContext lock around blk_unref().
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Fixing cache-clean-interval documentation following the recent change to
a default of 600 seconds on supported platforms (only Linux currently).
Signed-off-by: Leonid Bloch <lbloch@janustech.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Currently, the default values for werror and rerror have to be set
explicitly with blk_set_on_error() by the callers of blk_new(). The only
caller actually doing this is blockdev_init(), which is called for
BlockBackends created using -drive.
In particular, anonymous BlockBackends created with
-device ...,drive=<node-name> didn't get the correct default set and
instead defaulted to the integer value 0 (= BLOCKDEV_ON_ERROR_REPORT).
This is the intended default for rerror anyway, but the default for
werror should be BLOCKDEV_ON_ERROR_ENOSPC.
Set the defaults in blk_new() instead so that they apply no matter what
way the BlockBackend was created.
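The design choice can be sketched as follows. This is a simplified Python
model with hypothetical names (OnError, Backend); the enum values are
placeholder strings, not the integer values QEMU uses.

```python
from enum import Enum


class OnError(Enum):
    REPORT = "report"
    ENOSPC = "enospc"


class Backend:
    def __init__(self):
        # Defaults live in the constructor, so every creation path
        # starts from the same values.
        self.on_read_error = OnError.REPORT
        self.on_write_error = OnError.ENOSPC

    def set_on_error(self, on_read_error, on_write_error):
        # Callers that want something else can still override the defaults.
        self.on_read_error = on_read_error
        self.on_write_error = on_write_error
```

With the defaults in one place, a caller that forgets to set them no longer
ends up with an unintended write error policy.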
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Leonid Bloch <lbloch@janustech.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The default cache-clean-interval is set to 10 minutes, in order to lower
the overhead of the qcow2 caches (previously the default was 0, i.e.
disabled).
* For non-Linux platforms the default is kept at 0, because
cache-clean-interval is not supported there yet.
Signed-off-by: Leonid Bloch <lbloch@janustech.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The caches are now recalculated upon image resizing. This is done
because the new default behavior of assigning the L2 cache relative to
the image size implies that the cache will be adapted accordingly
after an image resize.
Signed-off-by: Leonid Bloch <lbloch@janustech.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The upper limit on the L2 cache size is increased from 1 MB to 32 MB
on Linux platforms, and to 8 MB on other platforms (this difference is
caused by the ability to set intervals for cache cleaning on Linux
platforms only).
This is done in order to allow default full coverage with the L2 cache
for images of up to 256 GB in size (was 8 GB). Note that only the
amount needed to cover the full image is allocated. The value changed
here is just the upper limit on the L2 cache size, beyond which it will
not grow, even if the size of the image would require it to.
Signed-off-by: Leonid Bloch <lbloch@janustech.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Sufficient L2 cache can noticeably improve the performance when using
large images with frequent I/O.
Previously, unless 'cache-size' was specified and was large enough, the
L2 cache was set to a certain size without taking the virtual image size
into account.
Now, the L2 cache assignment is aware of the virtual size of the image,
and will cover the entire image, unless the cache size needed for that is
larger than a certain maximum. This maximum is set to 1 MB by default
(enough to cover an 8 GB image with the default cluster size) but can
be increased or decreased using the 'l2-cache-size' option. This option
was previously documented as the *maximum* L2 cache size, and this patch
makes it behave as such, instead of as a constant size. Also, the
existing option 'cache-size' can limit the sum of both L2 and refcount
caches, as previously.
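The sizing rule can be worked through with a short calculation. The sketch
below is a simplification, not the driver's code: it assumes 8-byte L2
entries, one entry per cluster and a 64 KiB default cluster size, and it
ignores alignment and minimum-size handling. The function name is
hypothetical.

```python
KIB = 1024
MIB = 1024 * KIB
GIB = 1024 * MIB

L2_ENTRY_SIZE = 8  # bytes per L2 table entry, one entry per cluster


def l2_cache_size(virtual_size, cluster_size=64 * KIB, max_l2_cache=1 * MIB):
    """Cache needed to cover the whole image, capped at max_l2_cache."""
    clusters = -(-virtual_size // cluster_size)  # ceiling division
    needed = clusters * L2_ENTRY_SIZE
    return min(needed, max_l2_cache)
```

For example, a 1 GB image needs only 128 KB of L2 cache, an 8 GB image
exactly reaches the 1 MB limit, and with a 32 MB limit the same formula
covers images of up to 256 GB.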
Signed-off-by: Leonid Bloch <lbloch@janustech.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The refcount cache size does not need to be set to its minimum value in
read_cache_sizes(), as it is set to at least its minimum value in
qcow2_update_options_prepare().
Signed-off-by: Leonid Bloch <lbloch@janustech.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Leonid Bloch <lbloch@janustech.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Adding a lookup table for the powers of two, with the appropriate size
prefixes. This is needed when a size has to be stringified, in which
case something like '(1 * KiB)' would become a literal '(1 * (1L << 10))'
string. Powers of two are used very often for sizes, so such a table
will also make it easier and more intuitive to write them.
This table is generated using the following AWK script:
    BEGIN {
        suffix="KMGTPE";
        for(i=10; i<64; i++) {
            val=2**i;
            s=substr(suffix, int(i/10), 1);
            n=2**(i%10);
            pad=21-int(log(n)/log(10));
            printf("#define S_%d%siB %*d\n", n, s, pad, val);
        }
    }
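For readers without an AWK that supports the ** operator, the same table
can be produced with a Python port. This is a sketch, not part of the
patch; the function name size_defines is hypothetical, and the padding
mirrors the AWK expression by using the number of decimal digits of n.

```python
def size_defines():
    """Return the '#define S_<n><prefix>iB <value>' lines for 2**10..2**63."""
    suffix = "KMGTPE"
    lines = []
    for i in range(10, 64):
        val = 2 ** i
        s = suffix[i // 10 - 1]        # AWK substr() is 1-indexed
        n = 2 ** (i % 10)
        pad = 21 - (len(str(n)) - 1)   # same as 21 - int(log10(n)) for n <= 512
        lines.append(f"#define S_{n}{s}iB {val:>{pad}d}")
    return lines
```

Printing the result with "\n".join(size_defines()) gives one line per power
of two, with the values right-aligned.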
Signed-off-by: Leonid Bloch <lbloch@janustech.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Leonid Bloch <lbloch@janustech.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
'detect-zeroes' is one of the basic BlockdevOptions available for all
drivers, but it's not handled by bdrv_reopen_prepare(), so any attempt
to change it results in an error:
(qemu) qemu-io virtio0 "reopen -o detect-zeroes=on"
Cannot change the option 'detect-zeroes'
Since there's no reason why we shouldn't allow changing it and the
implementation is simple, let's just do it.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
'discard' is one of the basic BlockdevOptions available for all
drivers, but it's not handled by bdrv_reopen_prepare() so any attempt
to change it results in an error:
(qemu) qemu-io virtio0 "reopen -o discard=on"
Cannot change the option 'discard'
Since there's no reason why we shouldn't allow changing it and the
implementation is simple, let's just do it.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The file-posix code is used for the "file", "host_device" and
"host_cdrom" drivers, and it allows reopening images. However the only
option that is actually processed is "x-check-cache-dropped", and
changes in all other options (e.g. "filename") are silently ignored:
(qemu) qemu-io virtio0 "reopen -o file.filename=no-such-file"
While we could allow changing some of the other options, let's keep
things as they are for now but return an error if the user tries to
change any of them.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The bdrv_reopen_prepare() function checks all options passed to each
BlockDriverState (in the reopen_state->options QDict) and makes all
necessary preparations to apply the option changes requested by the
user.
Options are removed from the QDict as they are processed, so at the
end of bdrv_reopen_prepare() only the options that can't be changed
are left. Then a loop goes over all remaining options and verifies
that the old and new values are identical, returning an error if
they're not.
The problem is that at the moment there are options that are removed
from the QDict although they can't be changed. The consequence of this
is any modification to any of those options is silently ignored:
(qemu) qemu-io virtio0 "reopen -o discard=on"
This happens when all options from bdrv_runtime_opts are removed
from the QDict but then only a few of them are processed. Since
it's especially important that "node-name" and "driver" are not
changed, the code puts them back into the QDict so they are checked
at the end of the function. Instead of putting only those two options
back into the QDict, this patch puts all unprocessed options using
qemu_opts_to_qdict().
update_flags_from_options() also needs to be modified to prevent
BDRV_OPT_CACHE_NO_FLUSH, BDRV_OPT_CACHE_DIRECT and BDRV_OPT_READ_ONLY
from going back to the QDict.
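The "remove what was processed, then compare the rest" scheme can be
modelled in a few lines of Python. This is a sketch with hypothetical names
(reopen_prepare, changeable), using plain dicts in place of QDicts; it is
not the block layer code.

```python
def reopen_prepare(old_options, new_options, changeable):
    """Apply options that may change; reject changes to all others."""
    remaining = dict(new_options)
    applied = {}
    # Options are taken out of the dict only when they are really processed.
    for key in list(remaining):
        if key in changeable:
            applied[key] = remaining.pop(key)
    # Whatever is left cannot be changed, so it must match the old value.
    for key, value in remaining.items():
        if old_options.get(key) != value:
            raise ValueError(f"Cannot change the option '{key}'")
    return applied
```

The bug described above corresponds to popping a key in the first loop
without applying it: the final comparison never sees it, so the change is
silently dropped.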
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In the previous patches we removed all child references from
bs->{options,explicit_options} because keeping them is useless and
wrong.
Because of this, any attempt to reopen a BlockDriverState using a
child reference as one of its options would result in a failure,
because bdrv_reopen_prepare() would detect that there's a new option
(the child reference) that wasn't present in bs->options.
But passing child references on reopen can be useful. It's a way to
specify a BDS's child without having to pass recursively all of the
child's options, and if the reference points to a different BDS then
this can allow us to replace the child.
However, replacing the child is something that needs to be implemented
case by case and only when it makes sense. For now, this patch allows
passing a child reference as long as it points to the current child of
the BlockDriverState.
It's also important to remember that, as a consequence of the
previous patches, this child reference will be removed from
bs->{options,explicit_options} after the reopening has been completed.
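The rule for child references can be sketched as a small check. The Python
below is illustrative only: check_child_reference is a hypothetical name
and children are modelled as a dict from child name to node name.

```python
def check_child_reference(current_children, child_name, reference):
    """Accept a child reference only if it names the current child."""
    current = current_children.get(child_name)
    if reference != current:
        # Replacing a child would need per-case support, so reject it.
        raise ValueError(
            f"Cannot change the option '{child_name}': "
            f"'{reference}' is not the current child"
        )
    return current
```

A reference that names the current child is accepted and has no further
effect; anything else is an error for now.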
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In the previous patch we removed child references from bs->options, so
there's no need to look for them here anymore.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Block drivers allow opening their children using a reference to an
existing BlockDriverState. These references remain stored in the
'options' and 'explicit_options' QDicts, but we don't need to keep
them once everything is open.
What is more important, these values can become wrong if the children
change:
$ qemu-img create -f qcow2 hd0.qcow2 10M
$ qemu-img create -f qcow2 hd1.qcow2 10M
$ qemu-img create -f qcow2 hd2.qcow2 10M
$ $QEMU -drive if=none,file=hd0.qcow2,node-name=hd0 \
-drive if=none,file=hd1.qcow2,node-name=hd1,backing=hd0 \
-drive file=hd2.qcow2,node-name=hd2,backing=hd1
After this hd2 has hd1 as its backing file. Now let's remove it using
block_stream:
(qemu) block_stream hd2 0 hd0.qcow2
Now hd0 is the backing file of hd2, but hd2's options QDicts still
contain backing=hd1.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The default value of x-check-cache-dropped is false. There's no reason
to use the previous value as a default in raw_reopen_prepare() because
bdrv_reopen_queue_child() already takes care of putting the old
options in the BDRVReopenState.options QDict.
If x-check-cache-dropped was previously set but is now missing from
the reopen QDict then it should be reset to false.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
"qemu-io reopen" doesn't allow changing the writethrough setting of
the cache, but the check is wrong, causing an error even on a simple
reopen with the default parameters:
$ qemu-img create -f qcow2 hd.qcow2 1M
$ qemu-system-x86_64 -monitor stdio -drive if=virtio,file=hd.qcow2
(qemu) qemu-io virtio0 reopen
Cannot change cache.writeback: Device attached
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>