mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Max Reitz	d1b9d19f99	qcow2: Limit total allocation range to INT_MAX When the COW areas are included, the size of an allocation can exceed INT_MAX. This is kind of limited by handle_alloc() in that it already caps avail_bytes at INT_MAX, but the number of clusters still reflects the original length. This can have all sorts of effects, ranging from the storage layer write call failing to image corruption. (If there were no image corruption, then I suppose there would be data loss because the .cow_end area is forced to be empty, even though there might be something we need to COW.) Fix all of it by limiting nb_clusters so the equivalent number of bytes will not exceed INT_MAX. Cc: qemu-stable@nongnu.org Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-10-14 17:12:48 +02:00
Kevin Wolf	495bf893b0	qemu-nbd: Support help options for --object Instead of parsing help options as normal object properties and returning an error, provide the same help functionality as the system emulator in qemu-nbd, too. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2019-10-14 17:12:48 +02:00
Kevin Wolf	c6e5cdfd4b	qemu-img: Support help options for --object Instead of parsing help options as normal object properties and returning an error, provide the same help functionality as the system emulator in qemu-img, too. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2019-10-14 17:12:48 +02:00
Kevin Wolf	4fa1f0dc05	qemu-io: Support help options for --object Instead of parsing help options as normal object properties and returning an error, provide the same help functionality as the system emulator in qemu-io, too. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2019-10-14 17:12:48 +02:00
Kevin Wolf	3e9297f365	vl: Split off user_creatable_print_help() Printing help for --object is something that we not only want in the system emulator, but also in tools that support --object. Move it into a separate function in qom/object_interfaces.c to make the code accessible for tools. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com>	2019-10-14 17:12:48 +02:00
Max Reitz	48c8d3ce6d	iotests/028: Fix for long $TEST_DIRs For long test image paths, the order of the "Formatting" line and the "(qemu)" prompt after a drive_backup HMP command may be reversed. In fact, the interaction between the prompt and the line may lead to the "Formatting" to being greppable at all after "read"-ing it (if the prompt injects an IFS character into the "Formatting" string). So just wait until we get a prompt. At that point, the block job must have been started, so "info block-jobs" will only return "No active jobs" once it is done. Reported-by: Thomas Huth <thuth@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com> Reviewed-by: John Snow <jsnow@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-10-14 17:12:48 +02:00
Alberto Garcia	f2208fdc5b	block: Reject misaligned write requests with BDRV_REQ_NO_FALLBACK The BDRV_REQ_NO_FALLBACK flag means that an operation should only be performed if it can be offloaded or otherwise performed efficiently. However a misaligned write request requires a RMW so we should return an error and let the caller decide how to proceed. This hits an assertion since commit `c8bb23cbdb` if the required alignment is larger than the cluster size: qemu-img create -f qcow2 -o cluster_size=2k img.qcow2 4G qemu-io -c "open -o driver=qcow2,file.align=4k blkdebug::img.qcow2" \ -c 'write 0 512' qemu-io: block/io.c:1127: bdrv_driver_pwritev: Assertion `!(flags & BDRV_REQ_NO_FALLBACK)' failed. Aborted The reason is that when writing to an unallocated cluster we try to skip the copy-on-write part and zeroize it using BDRV_REQ_NO_FALLBACK instead, resulting in a write request that is too small (2KB cluster size vs 4KB required alignment). Signed-off-by: Alberto Garcia <berto@igalia.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-10-14 17:12:48 +02:00
Pavel Dovgalyuk	e4ec5ad464	replay: add BH oneshot event for block layer Replay is capable of recording normal BH events, but sometimes there are single use callbacks scheduled with aio_bh_schedule_oneshot function. This patch enables recording and replaying such callbacks. Block layer uses these events for calling the completion function. Replaying these calls makes the execution deterministic. Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru> Acked-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-10-14 17:12:48 +02:00
Pavel Dovgalyuk	ae25dccb7d	replay: finish record/replay before closing the disks After recent updates block devices cannot be closed on qemu exit. This happens due to the block request polling when replay is not finished. Therefore now we stop execution recording before closing the block devices. Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-10-14 17:12:48 +02:00
Pavel Dovgalyuk	c8aa7895eb	replay: don't drain/flush bdrv queue while RR is working In record/replay mode bdrv queue is controlled by replay mechanism. It does not allow saving or loading the snapshots when bdrv queue is not empty. Stopping the VM is not blocked by nonempty queue, but flushing the queue is still impossible there, because it may cause deadlocks in replay mode. This patch disables bdrv_drain_all and bdrv_flush_all in record/replay mode. Stopping the machine when the IO requests are not finished is needed for the debugging. E.g., breakpoint may be set at the specified step, and forcing the IO requests to finish may break the determinism of the execution. Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru> Acked-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-10-14 17:12:48 +02:00
Pavel Dovgalyuk	de499eb67c	replay: update docs for record/replay with block devices This patch updates the description of the command lines for using record/replay with attached block devices. Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-10-14 17:12:48 +02:00
Pavel Dovgalyuk	2586397544	replay: disable default snapshot for record/replay This patch disables setting '-snapshot' option on by default in record/replay mode. This is needed for creating vmstates in record and replay modes. Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru> Acked-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-10-14 17:12:48 +02:00
Pavel Dovgalyuk	3c6c4348f2	block: implement bdrv_snapshot_goto for blkreplay This patch enables making snapshots with blkreplay used in block devices. This function is required to make bdrv_snapshot_goto without calling .bdrv_open which is not implemented. Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru> Acked-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-10-14 17:12:48 +02:00
Peter Lieven	6caaad46de	block/vhdx: add check for truncated image files qemu is currently not able to detect truncated vhdx image files. Add a basic check if all allocated blocks are reachable at open and report all errors during bdrv_co_check. Signed-off-by: Peter Lieven <pl@kamp.de> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2019-10-14 17:12:48 +02:00
Peter Maydell	22dbfdecc3	VFIO update 2019-10-10 - Fix MSI error path double free (Evgeny Yakovlev) -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.14 (GNU/Linux) iQIcBAABAgAGBQJdn4F7AAoJECObm247sIsiOF8QAIDn3WqfnCJME8JN5Jwq4M9k HBxSfjE+VkjTcrHa3eEq58RF8SDyV0dseo61WPkix5QxmKAa/+pC9LmHGv6TGfav tUQRELjhPW3ZXavudHovhjdYDISeOSzvALejK0WcBpYZVGBIPochXi1Dqt7dwh+1 8KUAS93F5W9h5AY2g3sZfBFNlObR1xxa30e1LVPccnv7hnbDolp1F2AZhIdu4pc3 bjV1MYNTjZS5lTpteX81jJKEJbQapFMexV5PmPMgM5ABb6hhkzuDXUUK1AV/4Dcr T+VvleQk1mli/qrNxd54ZZcPlQvDNzjgnsqNNn5U2d/PH++mUxJJZM8hXl6GRjs3 2cgzmv41ebplbWuDGCCj5T4kg1tkZPHTti724du27Kmrm2cTJfLwNHx7zki/kk2X t5Ndgo595O4xkZ9HtKqkEm/JGsoyx4nWXoPlgF4nFPYSpyvTPkF5v64xiBamdSy1 6tDSOhttxpQOYKUCL4Ffu0dTKtOw16U4I2iSgDDHV/5GAvsHNgCSvCl8G6S54uV4 BGcuXuGByGMtOJYkERld8LSGLzdM4pvmh4MSwUkdGa9tBFU4NSKliDtfe817H1PL uwcQT2L9V8Bkceh8eF1TF42vlvODo+O1sL4ykIFA11j909xjGFU+EPQP+NqPRUtA LcCa3h7BEQSclFZjVMT4 =Fxpr -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/awilliam/tags/vfio-update-20191010.0' into staging VFIO update 2019-10-10 - Fix MSI error path double free (Evgeny Yakovlev) # gpg: Signature made Thu 10 Oct 2019 20:07:39 BST # gpg: using RSA key 239B9B6E3BB08B22 # gpg: Good signature from "Alex Williamson <alex.williamson@redhat.com>" [full] # gpg: aka "Alex Williamson <alex@shazbot.org>" [full] # gpg: aka "Alex Williamson <alwillia@redhat.com>" [full] # gpg: aka "Alex Williamson <alex.l.williamson@gmail.com>" [full] # Primary key fingerprint: 42F6 C04E 540B D1A9 9E7B 8A90 239B 9B6E 3BB0 8B22 * remotes/awilliam/tags/vfio-update-20191010.0: hw/vfio/pci: fix double free in vfio_msi_disable Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-10-14 15:09:08 +01:00
Peter Maydell	c8b2bc5185	The most notable change is that we now detect cross-device setups in the host since it may cause inode number collision and mayhem in the guest. A new fsdev property is added for the user to choose the appropriate policy to handle that: either remap all inode numbers or fail I/Os to another host device or just print out a warning (default behaviour). This is also my last PR as _active_ maintainer of 9pfs. -----BEGIN PGP SIGNATURE----- iQIzBAABCAAdFiEEtIKLr5QxQM7yo0kQcdTV5YIvc9YFAl2fEn8ACgkQcdTV5YIv c9bnTxAApYimbNUT+OjfNfPDjMHrezHCLnczuAWya3JcUCEkZC2E+qEwYdCzdwvq TGcdXPcbiUKUNY/3V3pEefuckPJ2+UVmqPpzYcuRjZNYrxqo7SzVPyxxMtG3f5Fh +dMu6Hx1s/vkoWf81HO1tnkTdL9aiOMQS7yUtEYidD8yoqJRLwbKGB+uGZrY6aDy 65n9z/0uwwzOwJsFlRjLMeifkmMC4tA1DLIZHQxGLCUk9K0/xCcI2CbYITgt1T4m 2xf/0t/+RQT/n6sXheskDpI8hf3A0rvEDETrvHp90zal3iDq93ZfvPd134LFRZIu tWsRYNKsaJE4ecIHa/wp535isb4uQa7PL10+oD075o+BF98Nk10ALyAQf7RTefkC 90lkXeRAGfJaMCuDuTmxFVBmQPgUjXsfKvASG8V4yweqO7oUSl5D8m+aOu7t3+f4 8n+DhEZp1ANQPgLv4raAxwFhlsVl+BImOZRv/SGKzqgf0jy+NT1/ebfTFyPttFff vn7kYfm1V/hPhQVVm7xqGwyRybP+V8td3mWo8hVsiqziZIN4x1wb/qFpJeuHuFSj IcJymcH7BgeBYWyjpmn+W94DdIoj20cLwcLHxU6d2L61oUrhKHd7R2g1Ow/aXh4L ohoK104GUqTBPbmxn0Dpal/Xz26X4k4l0JvVXzwPdBv99JkRF4I= =TqfQ -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/gkurz/tags/9p-next-2019-10-10' into staging The most notable change is that we now detect cross-device setups in the host since it may cause inode number collision and mayhem in the guest. A new fsdev property is added for the user to choose the appropriate policy to handle that: either remap all inode numbers or fail I/Os to another host device or just print out a warning (default behaviour). This is also my last PR as _active_ maintainer of 9pfs. # gpg: Signature made Thu 10 Oct 2019 12:14:07 BST # gpg: using RSA key B4828BAF943140CEF2A3491071D4D5E5822F73D6 # gpg: Good signature from "Greg Kurz <groug@kaod.org>" [full] # gpg: aka "Gregory Kurz <gregory.kurz@free.fr>" [full] # gpg: aka "[jpeg image of size 3330]" [full] # Primary key fingerprint: B482 8BAF 9431 40CE F2A3 4910 71D4 D5E5 822F 73D6 * remotes/gkurz/tags/9p-next-2019-10-10: MAINTAINERS: Downgrade status of virtio-9p to "Odd Fixes" 9p: Use variable length suffixes for inode remapping 9p: stat_to_qid: implement slow path 9p: Added virtfs option 'multidevs=remap\|forbid\|warn' 9p: Treat multiple devices on one export as an error fsdev: Add return value to fsdev_throttle_parse_opts() 9p: Simplify error path of v9fs_device_realize_common() 9p: unsigned type for type, version, path Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-10-14 13:34:39 +01:00
Peter Maydell	088d67096d	Block patches: - Parallelized request handling for qcow2 - Backup job refactoring to use a filter node instead of before-write notifiers - Add discard accounting information to file-posix nodes - Allow trivial reopening of nbd nodes - Some iotest fixes -----BEGIN PGP SIGNATURE----- iQFGBAABCAAwFiEEkb62CjDbPohX0Rgp9AfbAGHVz0AFAl2fGLISHG1yZWl0ekBy ZWRoYXQuY29tAAoJEPQH2wBh1c9A5JEH/2Hluzk0kfpYK+Ju3Mpf6syE2XdtYL7q zJNQgx4aIQOnBkCnUhQckNnRLWbiv9DxcJQ9iueRyst5nQhOpGisNw5LS4vYUbKV rHA3oITdV9Ozsr2d8SL+ncvY91I5zpzFySDsYIwMD6Y7H42NVcs7yvUjilHW2SmN 9bPFp0mocMLeH/2keQE3H5mJGb+tAogM9FW/jQ/fjD5eql05gb9McDjjPD6jHLk6 AEzWxWh6M56krEgke390gxy/N7r9u5+HHRAfFldEGoI+jw0iTt3L1MXcz0zytxfx Gdh+gyihauQVTIfTvyAAHHYaOuXUwSWJOjlbILleLEhnTYd/cFGRYzM= =u0I2 -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/maxreitz/tags/pull-block-2019-10-10' into staging Block patches: - Parallelized request handling for qcow2 - Backup job refactoring to use a filter node instead of before-write notifiers - Add discard accounting information to file-posix nodes - Allow trivial reopening of nbd nodes - Some iotest fixes # gpg: Signature made Thu 10 Oct 2019 12:40:34 BST # gpg: using RSA key 91BEB60A30DB3E8857D11829F407DB0061D5CF40 # gpg: issuer "mreitz@redhat.com" # gpg: Good signature from "Max Reitz <mreitz@redhat.com>" [full] # Primary key fingerprint: 91BE B60A 30DB 3E88 57D1 1829 F407 DB00 61D5 CF40 * remotes/maxreitz/tags/pull-block-2019-10-10: (36 commits) iotests/162: Fix for newer Linux 5.3+ tests: fix I/O test for hosts defaulting to LUKSv2 nbd: add empty .bdrv_reopen_prepare block/backup: use backup-top instead of write notifiers block: introduce backup-top filter driver block/block-copy: split block_copy_set_callbacks function block/backup: move write_flags calculation inside backup_job_create block/backup: move in-flight requests handling from backup to block-copy iotests: Use stat -c %b in 125 iotests: Disable 125 on broken XFS versions iotests: Fix 125 for growth_mode = metadata qapi: query-blockstat: add driver specific file-posix stats file-posix: account discard operations scsi: account unmap operations scsi: move unmap error checking to the complete callback scsi: store unmap offset and nb_sectors in request struct ide: account UNMAP (TRIM) operations block: add empty account cookie type qapi: add unmap to BlockDeviceStats qapi: group BlockDeviceStats fields ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-10-14 12:26:37 +01:00
Peter Maydell	cdfc44ac3c	- MMU DAT translation rewrite and cleanup - Implement more TCG CPU features related to the MMU (e.g., IEP) - Add the current instruction length to unwind data and clean up - Resolve one TODO for the MVCL instruction -----BEGIN PGP SIGNATURE----- iQJFBAABCAAvFiEEG9nKrXNcTDpGDfzKTd4Q9wD/g1oFAl2fFRIRHGRhdmlkQHJl ZGhhdC5jb20ACgkQTd4Q9wD/g1pLZBAApbNdwchTcYR483BWFCDRktU+jILOvepe yZ8ek6kvVAL0U0psVPYrUw74C11ig7c06JADL9ON3aF5RHRppNpLG/ZNVSat/1P5 hcgCMwNYkXVJeL5PDWW2sDVjBvY9n8sDH6rlslmtZB+uetIpTS6ixfv/GhZak4E5 YHJPK2eNAJHMOuasvGeBdnObQNSTAr+pE9I7k4+wt4OKHZiT+k6Dlm44JYQtiv6s DRJClt25pdxSxjrMzG9nEDm5Ql+K/9qJ23sSniqfTD4UtgILBHODc9p9VNVf+92Q y2iMRVnHHA8wlp6UI6uJLPIoVPcEKcBQYFnEN1zTKGwoPHIxlzh1zj6wgbQTJfns bUASRu3o6coUdUAX1YeLczzP5Gac+nWzhbF8jf8p9WdKAgcMyMgQfox+sC8GRh+v Gdc6/tFCLZjpgRXFZL8cL6xRrHMBGjP9DmZC7tzVJUGpfLei7RE9WBJ3HHzUiQIp Q/zg/SkriJJwTiWh5QiSMYcQlHsUfA5qaex7ZbQKM+JKUuLAbydb7yN82HcaVyam zhopRFkNsobIev4ywCGYeypQ5MhO2DDzDqyH6g4P+Q2DO2l8wVj+vNcGgNZ1lu9R Bn/NSsREae6jCTTSKc4TFYs63R5xaCkfSoz0NlLwSEgO7BbcilX9tjlk80UO7eFE VuQDlPl/Sg0= =+rtL -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/davidhildenbrand/tags/s390x-tcg-2019-10-10' into staging - MMU DAT translation rewrite and cleanup - Implement more TCG CPU features related to the MMU (e.g., IEP) - Add the current instruction length to unwind data and clean up - Resolve one TODO for the MVCL instruction # gpg: Signature made Thu 10 Oct 2019 12:25:06 BST # gpg: using RSA key 1BD9CAAD735C4C3A460DFCCA4DDE10F700FF835A # gpg: issuer "david@redhat.com" # gpg: Good signature from "David Hildenbrand <david@redhat.com>" [unknown] # gpg: aka "David Hildenbrand <davidhildenbrand@gmail.com>" [full] # Primary key fingerprint: 1BD9 CAAD 735C 4C3A 460D FCCA 4DDE 10F7 00FF 835A * remotes/davidhildenbrand/tags/s390x-tcg-2019-10-10: (31 commits) s390x/tcg: MVCL: Exit to main loop if requested target/s390x: Remove ILEN_UNWIND target/s390x: Remove ilen argument from trigger_pgm_exception target/s390x: Remove ilen argument from trigger_access_exception target/s390x: Remove ILEN_AUTO target/s390x: Rely on unwinding in s390_cpu_virt_mem_rw target/s390x: Rely on unwinding in s390_cpu_tlb_fill target/s390x: Simplify helper_lra target/s390x: Remove fail variable from s390_cpu_tlb_fill target/s390x: Return exception from translate_pages target/s390x: Return exception from mmu_translate target/s390x: Remove exc argument to mmu_translate_asce target/s390x: Return exception from mmu_translate_real target/s390x: Handle tec in s390_cpu_tlb_fill target/s390x: Push trigger_pgm_exception lower in s390_cpu_tlb_fill target/s390x: Use tcg_s390_program_interrupt in TCG helpers target/s390x: Remove ilen parameter from s390_program_interrupt target/s390x: Remove ilen parameter from tcg_s390_program_interrupt target/s390x: Add ilen to unwind data s390x/cpumodel: Add new TCG features to QEMU cpu model ... Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2019-10-14 10:42:35 +01:00
Evgeny Yakovlev	d964d3b5ab	hw/vfio/pci: fix double free in vfio_msi_disable The following guest behaviour patter leads to double free in VFIO PCI: 1. Guest enables MSI interrupts vfio_msi_enable is called, but fails in vfio_enable_vectors. In our case this was because VFIO GPU device was in D3 state. Unhappy path in vfio_msi_enable will g_free(vdev->msi_vectors) but not set this pointer to NULL 2. Guest still sees MSI an enabled after that because emulated config write is done in vfio_pci_write_config unconditionally before calling vfio_msi_enable 3. Guest disables MSI interrupts vfio_msi_disable is called and tries to g_free(vdev->msi_vectors) in vfio_msi_disable_common => double free Signed-off-by: Evgeny Yakovlev <wrfsh@yandex-team.ru> Signed-off-by: Alex Williamson <alex.williamson@redhat.com>	2019-10-10 11:07:28 -06:00
Greg Kurz	e410bbc52a	MAINTAINERS: Downgrade status of virtio-9p to "Odd Fixes" Latest submissions for 9pfs made me realize that I no longer have time and motivation to actively support it. I'll stay around for odd fixes though. Signed-off-by: Greg Kurz <groug@kaod.org>	2019-10-10 12:36:28 +02:00
David Hildenbrand	1f6493be08	s390x/tcg: MVCL: Exit to main loop if requested MVCL is interruptible and we should check for interrupts and process them after writing back the variables to the registers. Let's check for any exit requests and exit to the main loop. Introduce a new helper function for that: cpu_loop_exit_requested(). When booting Fedora 30, I can see a handful of these exits and it seems to work reliable. Also, Richard explained why this works correctly even when MVCL is called via EXECUTE: (1) TB with EXECUTE runs, at address Ae - env->psw_addr stored with Ae. - helper_ex() runs, memory address Am computed from D2a(X2a,B2a) or from psw.addr+RI2. - env->ex_value stored with memory value modified by R1a (2) TB of executee runs, - env->ex_value stored with 0. - helper_mvcl() runs, using and updating R1b, R1b+1, R2b, R2b+1. (3a) helper_mvcl() completes, - TB of executee continues, psw.addr += ilen. - Next instruction is the one following EXECUTE. (3b) helper_mvcl() exits to main loop, - cpu_loop_exit_restore() unwinds psw.addr = Ae. - Next instruction is the EXECUTE itself... - goto 1. As the PoP mentiones that an interruptible instruction called via EXECUTE should avoid modifying storage/registers that are used by EXECUTE itself, it is fine to retrigger EXECUTE. Cc: Alex Bennée <alex.bennee@linaro.org> Cc: Peter Maydell <peter.maydell@linaro.org> Cc: Paolo Bonzini <pbonzini@redhat.com> Suggested-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Richard Henderson <richard.henderson@linaro.org> Reviewed-by: Alex Bennée <alex.bennee@linaro.org> Signed-off-by: David Hildenbrand <david@redhat.com>	2019-10-10 12:27:15 +02:00
Max Reitz	35f05b2e2e	iotests/162: Fix for newer Linux 5.3+ Linux 5.3 has made 0.0.0.0/8 a working IPv4 subnet. As such, "42" is now a valid host, and the connection to it will (hopefully) time out over a long period rather than quickly return with EINVAL. So let us use a negative integer for testing that NBD will not crash when it receives integer hosts. This way, the connection will again fail quickly and reliably. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20191002174052.5773-1-mreitz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-10 12:13:23 +02:00
Christian Schoenebeck	6b6aa8285d	9p: Use variable length suffixes for inode remapping Use variable length suffixes for inode remapping instead of the fixed 16 bit size prefixes before. With this change the inode numbers on guest will typically be much smaller (e.g. around >2^1 .. >2^7 instead of >2^48 with the previous fixed size inode remapping. Additionally this solution is more efficient, since inode numbers in practice can take almost their entire 64 bit range on guest as well, so there is less likely a need for generating and tracking additional suffixes, which might also be beneficial for nested virtualization where each level of virtualization would shift up the inode bits and increase the chance of expensive remapping actions. The "Exponential Golomb" algorithm is used as basis for generating the variable length suffixes. The algorithm has a parameter k which controls the distribution of bits on increasing indeces (minimum bits at low index vs. maximum bits at high index). With k=0 the generated suffixes look like: Index Dec/Bin -> Generated Suffix Bin 1 [1] -> [1] (1 bits) 2 [10] -> [010] (3 bits) 3 [11] -> [110] (3 bits) 4 [100] -> [00100] (5 bits) 5 [101] -> [10100] (5 bits) 6 [110] -> [01100] (5 bits) 7 [111] -> [11100] (5 bits) 8 [1000] -> [0001000] (7 bits) 9 [1001] -> [1001000] (7 bits) 10 [1010] -> [0101000] (7 bits) 11 [1011] -> [1101000] (7 bits) 12 [1100] -> [0011000] (7 bits) ... 65533 [1111111111111101] -> [1011111111111111000000000000000] (31 bits) 65534 [1111111111111110] -> [0111111111111111000000000000000] (31 bits) 65535 [1111111111111111] -> [1111111111111111000000000000000] (31 bits) Hence minBits=1 maxBits=31 And with k=5 they would look like: Index Dec/Bin -> Generated Suffix Bin 1 [1] -> [000001] (6 bits) 2 [10] -> [100001] (6 bits) 3 [11] -> [010001] (6 bits) 4 [100] -> [110001] (6 bits) 5 [101] -> [001001] (6 bits) 6 [110] -> [101001] (6 bits) 7 [111] -> [011001] (6 bits) 8 [1000] -> [111001] (6 bits) 9 [1001] -> [000101] (6 bits) 10 [1010] -> [100101] (6 bits) 11 [1011] -> [010101] (6 bits) 12 [1100] -> [110101] (6 bits) ... 65533 [1111111111111101] -> [0011100000000000100000000000] (28 bits) 65534 [1111111111111110] -> [1011100000000000100000000000] (28 bits) 65535 [1111111111111111] -> [0111100000000000100000000000] (28 bits) Hence minBits=6 maxBits=28 Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Signed-off-by: Greg Kurz <groug@kaod.org>	2019-10-10 11:36:23 +02:00
Antonios Motakis	f3fe4a2d92	9p: stat_to_qid: implement slow path stat_to_qid attempts via qid_path_prefixmap to map unique files (which are identified by 64 bit inode nr and 32 bit device id) to a 64 QID path value. However this implementation makes some assumptions about inode number generation on the host. If qid_path_prefixmap fails, we still have 48 bits available in the QID path to fall back to a less memory efficient full mapping. Signed-off-by: Antonios Motakis <antonios.motakis@huawei.com> [CS: - Rebased to https://github.com/gkurz/qemu/commits/9p-next (SHA1 7fc4c49e91). - Updated hash calls to new xxhash API. - Removed unnecessary parantheses in qpf_lookup_func(). - Removed unnecessary g_malloc0() result checks. - Log error message when running out of prefixes in qid_path_fullmap(). - Log warning message about potential degraded performance in qid_path_prefixmap(). - Wrapped qpf_table initialization to dedicated qpf_table_init() function. - Fixed typo in comment. ] Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Signed-off-by: Greg Kurz <groug@kaod.org>	2019-10-10 11:36:14 +02:00
Antonios Motakis	1a6ed33cc5	9p: Added virtfs option 'multidevs=remap\|forbid\|warn' 'warn' (default): Only log an error message (once) on host if more than one device is shared by same export, except of that just ignore this config error though. This is the default behaviour for not breaking existing installations implying that they really know what they are doing. 'forbid': Like 'warn', but except of just logging an error this also denies access of guest to additional devices. 'remap': Allows to share more than one device per export by remapping inodes from host to guest appropriately. To support multiple devices on the 9p share, and avoid qid path collisions we take the device id as input to generate a unique QID path. The lowest 48 bits of the path will be set equal to the file inode, and the top bits will be uniquely assigned based on the top 16 bits of the inode and the device id. Signed-off-by: Antonios Motakis <antonios.motakis@huawei.com> [CS: - Rebased to https://github.com/gkurz/qemu/commits/9p-next (SHA1 7fc4c49e91). - Added virtfs option 'multidevs', original patch simply did the inode remapping without being asked. - Updated hash calls to new xxhash API. - Updated docs for new option 'multidevs'. - Fixed v9fs_do_readdir() not having remapped inodes. - Log error message when running out of prefixes in qid_path_prefixmap(). - Fixed definition of QPATH_INO_MASK. - Wrapped qpp_table initialization to dedicated qpp_table_init() function. - Dropped unnecessary parantheses in qpp_lookup_func(). - Dropped unnecessary g_malloc0() result checks. ] Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> [groug: - Moved "multidevs" parsing to the local backend. - Added hint to invalid multidevs option error. - Turn "remap" into "x-remap". ] Signed-off-by: Greg Kurz <groug@kaod.org>	2019-10-10 11:36:05 +02:00
Antonios Motakis	3b5ee9e86b	9p: Treat multiple devices on one export as an error The QID path should uniquely identify a file. However, the inode of a file is currently used as the QID path, which on its own only uniquely identifies files within a device. Here we track the device hosting the 9pfs share, in order to prevent security issues with QID path collisions from other devices. We only print a warning for now but a subsequent patch will allow users to have finer control over the desired behaviour. Failing the I/O will be one the proposed behaviour, so we also change stat_to_qid() to return an error here in order to keep other patches simpler. Signed-off-by: Antonios Motakis <antonios.motakis@huawei.com> [CS: - Assign dev_id to export root's device already in v9fs_device_realize_common(), not postponed in stat_to_qid(). - error_report_once() if more than one device was shared by export. - Return -ENODEV instead of -ENOSYS in stat_to_qid(). - Fixed typo in log comment. ] Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> [groug, changed to warning, updated message and changelog] Signed-off-by: Greg Kurz <groug@kaod.org>	2019-10-10 11:36:05 +02:00
Greg Kurz	ea52cdd443	fsdev: Add return value to fsdev_throttle_parse_opts() It is more convenient to use the return value of the function to notify errors, rather than to be tied up setting up the &local_err boilerplate. Signed-off-by: Greg Kurz <groug@kaod.org>	2019-10-10 11:36:05 +02:00
Greg Kurz	c0da0cb761	9p: Simplify error path of v9fs_device_realize_common() Make v9fs_device_unrealize_common() idempotent and use it for rollback, in order to reduce code duplication. Signed-off-by: Greg Kurz <groug@kaod.org>	2019-10-10 11:36:04 +02:00
Antonios Motakis	8703283352	9p: unsigned type for type, version, path There is no need for signedness on these QID fields for 9p. Signed-off-by: Antonios Motakis <antonios.motakis@huawei.com> [CS: - Also make QID type unsigned. - Adjust donttouch_stat() to new types. - Adjust trace-events to new types. ] Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com> Signed-off-by: Greg Kurz <groug@kaod.org>	2019-10-10 11:36:04 +02:00
Daniel P. Berrangé	da51e998f8	tests: fix I/O test for hosts defaulting to LUKSv2 Some distros are now defaulting to LUKS version 2 which QEMU cannot process. For our I/O test that validates interoperability between the kernel/cryptsetup and QEMU, we need to explicitly ask for version 1 of the LUKS format. Signed-off-by: Daniel P. Berrangé <berrange@redhat.com> Message-id: 20190927101155.25896-1-berrange@redhat.com Tested-by: Maxim Levitsky <mlevitsk@redhat.com> Reviewed-by: Maxim Levitsky <mlevitsk@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-10 10:56:18 +02:00
Maxim Levitsky	e99754b42e	nbd: add empty .bdrv_reopen_prepare Fixes commit job / qemu-img commit, when commiting qcow2 file which is based on nbd export. Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1718727 Signed-off-by: Maxim Levitsky <mlevitsk@redhat.com> Message-id: 20190930213820.29777-2-mlevitsk@redhat.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-10 10:56:18 +02:00
Vladimir Sementsov-Ogievskiy	00e30f05de	block/backup: use backup-top instead of write notifiers Drop write notifiers and use filter node instead. = Changes = 1. Add filter-node-name argument for backup qmp api. We have to do it in this commit, as 257 needs to be fixed. 2. There are no more write notifiers here, so is_write_notifier parameter is dropped from block-copy paths. 3. To sync with in-flight requests at job finish we now have drained removing of the filter, we don't need rw-lock. 4. Block-copy is now using BdrvChildren instead of BlockBackends 5. As backup-top owns these children, we also move block-copy state into backup-top's ownership. = Iotest changes = 56: op-blocker doesn't shoot now, as we set it on source, but then check on filter, when trying to start second backup. To keep the test we instead can catch another collision: both jobs will get 'drive0' job-id, as job-id parameter is unspecified. To prevent interleaving with file-posix locks (as they are dependent on config) let's use another target for second backup. Also, it's obvious now that we'd like to drop this op-blocker at all and add a test-case for two backups from one node (to different destinations) actually works. But not in these series. 141: Output changed: prepatch, "Node is in use" comes from bdrv_has_blk check inside qmp_blockdev_del. But we've dropped block-copy blk objects, so no more blk objects on source bs (job blk is on backup-top filter bs). New message is from op-blocker, which is the next check in qmp_blockdev_add. 257: The test wants to emulate guest write during backup. They should go to filter node, not to original source node, of course. Therefore we need to specify filter node name and use it. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20191001131409.14202-6-vsementsov@virtuozzo.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-10 10:56:18 +02:00
Vladimir Sementsov-Ogievskiy	7df7868b96	block: introduce backup-top filter driver Backup-top filter caches write operations and does copy-before-write operations. The driver will be used in backup instead of write-notifiers. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20191001131409.14202-5-vsementsov@virtuozzo.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-10 10:56:18 +02:00
Vladimir Sementsov-Ogievskiy	0f4b02b73e	block/block-copy: split block_copy_set_callbacks function Split block_copy_set_callbacks out of block_copy_state_new. It's needed for further commit: block-copy will use BdrvChildren of backup-top filter, so it will be created from backup-top filter creation function. But callbacks will still belong to backup job and will be set in separate. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20191001131409.14202-4-vsementsov@virtuozzo.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-10 10:56:18 +02:00
Vladimir Sementsov-Ogievskiy	843670f30f	block/backup: move write_flags calculation inside backup_job_create This is logic-less refactoring, which simplifies further patch, as we'll need write_flags for backup-top filter creation and backup-top should be created before block job creation. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20191001131409.14202-3-vsementsov@virtuozzo.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-10 10:56:18 +02:00
Vladimir Sementsov-Ogievskiy	a6ffe1998c	block/backup: move in-flight requests handling from backup to block-copy Move synchronization mechanism to block-copy, to be able to use one block-copy instance from backup job and backup-top filter in parallel. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20191001131409.14202-2-vsementsov@virtuozzo.com Reviewed-by: Max Reitz <mreitz@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-10 10:56:18 +02:00
Max Reitz	f2d86ade4d	iotests: Use stat -c %b in 125 125 should not use qemu-img to get the on-disk image size, because that reports it in a human-readable format that is useless to us. Just use stat instead (like we do to get the image file length). Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20190925183231.11196-4-mreitz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-10 10:56:18 +02:00
Max Reitz	285f595d74	iotests: Disable 125 on broken XFS versions And by that I mean all XFS versions, as far as I can tell. All details are in the comment below. We never noticed this problem because we only read the first number from qemu-img info's "disk size" output -- and that is effectively useless, because qemu-img prints a human-readable value (which generally includes a decimal point). That will be fixed in the next patch. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20190925183231.11196-3-mreitz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-10 10:56:18 +02:00
Max Reitz	e6e8db0337	iotests: Fix 125 for growth_mode = metadata If we use growth_mode = metadata, it is very much possible that the file uses more disk space after we have written something to the added area. We did indeed want to test for this case, but unfortunately we evidently just copied the code from the "Test creation preallocation" section and forgot to replace "$create_mode" by "$growth_mode". We never noticed because we only read the first number from qemu-img info's "disk size" output -- and that is effectively useless, because qemu-img prints a human-readable value (which generally includes a decimal point). That will be fixed in the patch after the next one. Signed-off-by: Max Reitz <mreitz@redhat.com> Message-id: 20190925183231.11196-2-mreitz@redhat.com Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-10 10:56:18 +02:00
Anton Nefedov	d924559953	qapi: query-blockstat: add driver specific file-posix stats A block driver can provide a callback to report driver-specific statistics. file-posix driver now reports discard statistics Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Acked-by: Markus Armbruster <armbru@redhat.com> Message-id: 20190923121737.83281-10-anton.nefedov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-10 10:56:18 +02:00
Anton Nefedov	1c45036636	file-posix: account discard operations This will help to identify how many of the user-issued discard operations (accounted on a device level) have actually suceeded down on the host file (even though the numbers will not be exactly the same if non-raw format driver is used (e.g. qcow2 sending metadata discards)). Note that these numbers will not include discards triggered by write-zeroes + MAY_UNMAP calls. Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20190923121737.83281-9-anton.nefedov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-10 10:56:18 +02:00
Anton Nefedov	4989ef5781	scsi: account unmap operations Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20190923121737.83281-8-anton.nefedov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-10 10:56:18 +02:00
Anton Nefedov	90ebf8431e	scsi: move unmap error checking to the complete callback This will help to account the operation in the following commit. The difference is that we don't call scsi_disk_req_check_error() before the 1st discard iteration anymore. That function also checks if the request is cancelled, however it shouldn't get canceled until it yields in blk_aio() functions anyway. Same approach is already used for emulate_write_same. Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Message-id: 20190923121737.83281-7-anton.nefedov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-10 10:56:18 +02:00
Anton Nefedov	6d0680823c	scsi: store unmap offset and nb_sectors in request struct it allows to report it in the error handler Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com> Message-id: 20190923121737.83281-6-anton.nefedov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-10 10:56:18 +02:00
Anton Nefedov	99f18035b5	ide: account UNMAP (TRIM) operations Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20190923121737.83281-5-anton.nefedov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-10 10:56:18 +02:00
Anton Nefedov	f344446654	block: add empty account cookie type Each block_acct_done/failed call is designed to correspond to a previous block_acct_start call, which initializes the stats cookie. However sometimes it is not the case, e.g. some error paths might report the same cookie twice because it is hard to accurately track if the cookie was reported yet or not. This patch cleans the cookie after report. (Note: block_acct_failed/done without a previous block_acct_start at all should be avoided. Uninitialized cookie might hold a garbage value and there is still "< BLOCK_MAX_IOTYPE" assertion for that) It will be particularly useful in ide code where it's hard to keep track whether the request done its accounting or not: in the following patch of the series, trim requests will do the accounting separately. Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20190923121737.83281-4-anton.nefedov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-10 10:56:18 +02:00
Anton Nefedov	159f85ddc8	qapi: add unmap to BlockDeviceStats Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Eric Blake <eblake@redhat.com> Message-id: 20190923121737.83281-3-anton.nefedov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-10 10:56:18 +02:00
Anton Nefedov	329d27e38c	qapi: group BlockDeviceStats fields Make the stat fields definition slightly more readable. Also reorder total_time_ns stats read-write-flush as done elsewhere. Cosmetic change only. Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com> Reviewed-by: Alberto Garcia <berto@igalia.com> Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Message-id: 20190923121737.83281-2-anton.nefedov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-10 10:56:18 +02:00
Vladimir Sementsov-Ogievskiy	15e4e6e6ad	iotests: 257: drop device_add SCSI devices are unused in test, drop them. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20190920142056.12778-12-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-10 10:56:18 +02:00
Vladimir Sementsov-Ogievskiy	f164845479	iotests: 257: drop unused Drive.device field After previous commit Drive.device is actually unused. Drop it together with .name property. While being here reuse .node in qmp commands instead of writing 'drive0' twice. Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Message-id: 20190920142056.12778-11-vsementsov@virtuozzo.com Signed-off-by: Max Reitz <mreitz@redhat.com>	2019-10-10 10:56:17 +02:00

1 2 3 4 5 ...

72384 Commits