mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Liu Yuan	6f74c260b4	sheepdog: pass vdi_id to sheep daemon for sd_close() Sheep daemon needs vdi_id to identify which vdi is closed to release resources such as object cache. Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Cc: Kevin Wolf <kwolf@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Liu Yuan <tailai.ly@taobao.com> Reviewed-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-02-01 14:58:28 +01:00
Othmar Pasteka	7f2039f611	vmdk: Allow selecting SCSI adapter in image creation Introduce a new option "adapter_type" when converting to vmdk images. It can be one of the following: ide (default), buslogic, lsilogic or legacyESX (according to the vmdk spec from vmware). In case of a non-ide adapter, heads is set to 255 instead of the 16. The latter is used for "ide". Also see LP#545089 Signed-off-by: Othmar Pasteka <pasteka@kabsi.at> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-02-01 14:58:28 +01:00
Markus Armbruster	6528499fa4	g_malloc(0) and g_malloc0(0) return NULL; simplify Once upon a time, it was decided that qemu_malloc(0) should abort. Switching to glib retired that bright idea. Some code that was added to cope with it (e.g. in commits `702ef63`, `b76b6e9`) is still around. Bury it. Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-01-30 11:14:46 +01:00
Paolo Bonzini	88ff0e48ee	mirror: do nothing on zero-sized disk On a zero-sized disk we need to break out of the job successfully before bdrv_dirty_iter_init is called, otherwise you will get an assertion failure with the next patch. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Laszlo Ersek <lersek@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2013-01-25 18:18:35 +01:00
Stefan Weil	0e87ba2ccb	block/vdi: Check for bad signature vdi_open did not check for a bad signature. This check was only in vdi_probe. Signed-off-by: Stefan Weil <sw@weilnetz.de> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2013-01-25 18:18:35 +01:00
Stefan Weil	8937f8222c	block/vdi: Improved return values from vdi_open vdi_open returned -1 in case of any error, but it should return an error code (negative value of errno or -EMEDIUMTYPE). Signed-off-by: Stefan Weil <sw@weilnetz.de> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2013-01-25 18:18:35 +01:00
Stefan Weil	9f0470bb2d	block/vdi: Improve debug output for signature The signature is a 32 bit value and needs up to 8 hex digits for printing. Signed-off-by: Stefan Weil <sw@weilnetz.de> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2013-01-25 18:18:35 +01:00
Stefan Weil	15bac0d54f	block: Use error code EMEDIUMTYPE for wrong format in some block drivers This improves error reports for bochs, cow, qcow, qcow2, qed and vmdk when a file with the wrong format is selected. Signed-off-by: Stefan Weil <sw@weilnetz.de> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2013-01-25 18:18:35 +01:00
Paolo Bonzini	884fea4e87	mirror: support arbitrarily-sized iterations Yet another optimization is to extend the mirroring iteration to include more adjacent dirty blocks. This limits the number of I/O operations and makes mirroring efficient even with a small granularity. Most of the infrastructure is already in place; we only need to put a loop around the computation of the origin and sector count of the iteration. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2013-01-25 18:18:35 +01:00
Paolo Bonzini	402a47411b	mirror: support more than one in-flight AIO operation With AIO support in place, we can start copying more than one chunk in parallel. This patch introduces the required infrastructure for this: the buffer is split into multiple granularity-sized chunks, and there is a free list to access them. Because of copy-on-write, a single operation may already require multiple chunks to be available on the free list. In addition, two different iterations on the HBitmap may want to copy the same cluster. We avoid this by keeping a bitmap of in-flight I/O operations, and blocking until the previous iteration completes. This should be a pretty rare occurrence, though; as long as there is no overlap the next iteration can start before the previous one finishes. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2013-01-25 18:18:35 +01:00
Paolo Bonzini	08e4ed6cde	mirror: add buf-size argument to drive-mirror This makes sense when the next commit starts using the extra buffer space to perform many I/O operations asynchronously. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2013-01-25 18:18:34 +01:00
Paolo Bonzini	bd48bde8f0	mirror: switch mirror_iteration to AIO There is really no change in the behavior of the job here, since there is still a maximum of one in-flight I/O operation between the source and the target. However, this patch already introduces the AIO callbacks (which are unmodified in the next patch) and some of the logic to count in-flight operations and only complete the job when there is none. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2013-01-25 18:18:34 +01:00
Paolo Bonzini	eee13dfe30	mirror: allow customizing the granularity The desired granularity may be very different depending on the kind of operation (e.g. continuous replication vs. collapse-to-raw) and whether the VM is expected to perform lots of I/O while mirroring is in progress. Allow the user to customize it, while providing a sane default so that in general there will be no extra allocated space in the target compared to the source. Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2013-01-25 18:18:34 +01:00
Paolo Bonzini	50717e941b	block: allow customizing the granularity of the dirty bitmap Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2013-01-25 18:18:34 +01:00
Paolo Bonzini	acc906c6c5	block: return count of dirty sectors, not chunks Reviewed-by: Laszlo Ersek <lersek@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2013-01-25 18:18:33 +01:00
Paolo Bonzini	b812f6719c	mirror: perform COW if the cluster size is bigger than the granularity When mirroring runs, the backing files for the target may not yet be ready. However, this means that a copy-on-write operation on the target would fill the missing sectors with zeros. Copy-on-write only happens if the granularity of the dirty bitmap is smaller than the cluster size (and only for clusters that are allocated in the source after the job has started copying). So far, the granularity was fixed to 1MB; to avoid the problem we detected the situation and required the backing files to be available in that case only. However, we want to lower the granularity for efficiency, so we need a better solution. The solution is to always copy a whole cluster the first time it is touched. The code keeps a bitmap of clusters that have already been allocated by the mirroring job, and only does "manual" copy-on-write if the chunk being copied is zero in the bitmap. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2013-01-25 18:18:33 +01:00
Paolo Bonzini	8f0720ecbc	block: implement dirty bitmap using HBitmap This actually uses the dirty bitmap in the block layer, and converts mirroring to use an HBitmapIter. Reviewed-by: Laszlo Ersek <lersek@redhat.com> (except block/mirror.c parts) Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2013-01-25 18:18:33 +01:00
Peter Lieven	7371d56fb2	iscsi: add support for iovectors This patch adds support for directly passing the iovec array from QEMUIOVector if libiscsi supports it (1.8.0 or newer). Signed-off-by: Peter Lieven <pl@kamp.de> [Preserve the improvements from commit `4cc841b`, iscsi: partly avoid iovec linearization in iscsi_aio_writev, 2012-11-19 - Paolo] Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-01-24 15:37:55 +01:00
Paolo Bonzini	4790b03d30	iscsi: do not leak acb->buf when commands are aborted acb->buf is freed in the WRITE(16) callback, but this may not get called at all when commands are aborted. Add another free in the ABORT TASK callback, which requires setting acb->buf to NULL everywhere. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-01-24 15:37:55 +01:00
Anthony Liguori	177f7fc688	Merge remote-tracking branch 'bonzini/scsi-next' into staging # By Peter Lieven (3) and others # Via Paolo Bonzini * bonzini/scsi-next: scsi: Drop useless null test in scsi_unit_attention() lsi: use qbus_reset_all to reset SCSI bus scsi: fix segfault with 0-byte disk iscsi: add support for iSCSI NOPs [v2] iscsi: partly avoid iovec linearization in iscsi_aio_writev iscsi: add iscsi_create support	2013-01-23 09:08:54 -06:00
Peter Lieven	5b5d34ec98	iscsi: add support for iSCSI NOPs [v2] This patch will send NOP-Out PDUs every 5 seconds to the iSCSI target. If a consecutive number of NOP-In replies fail a reconnect is initiated. iSCSI NOPs help to ensure that the connection to the target is still operational. This should not, but in reality may be the case even if the TCP connection is still alive if there are bugs in either the target or the initiator implementation. v2: - track the NOPs inside libiscsi so libiscsi can reset the counter in case it initiates a reconnect. Signed-off-by: Peter Lieven <pl@kamp.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-01-22 15:07:03 +01:00
Peter Lieven	4cc841b57c	iscsi: partly avoid iovec linearization in iscsi_aio_writev libiscsi expects all write16 data in a linear buffer. If the iovec only contains one buffer we can skip the linearization step as well as the additional malloc/free and pass the buffer directly. Reported-by: Ronnie Sahlberg <ronniesahlberg@gmail.com> Signed-off-by: Peter Lieven <pl@kamp.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-01-22 15:07:03 +01:00
Peter Lieven	de8864e5ae	iscsi: add iscsi_create support This patch adds support for bdrv_create. This allows e.g. to use qemu-img to convert from any supported device to an iscsi backed storage as destination. Signed-off-by: Peter Lieven <pl@kamp.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-01-22 15:07:03 +01:00
Anthony Liguori	8b17ed4caa	Merge remote-tracking branch 'stefanha/block' into staging # By Kevin Wolf (4) and others # Via Stefan Hajnoczi * stefanha/block: dataplane: support viostor virtio-pci status bit setting dataplane: avoid reentrancy during virtio_blk_data_plane_stop() win32-aio: use iov utility functions instead of open-coding them win32-aio: Fix memory leak win32-aio: Fix vectored reads aio: Fix return value of aio_poll() ide: Remove wrong assertion block: fix null-pointer bug on error case in block commit	2013-01-20 11:01:10 -06:00
Andreas Färber	c36dd8a09f	block/raw-posix: Make hdev_aio_discard() available outside Linux Fixes the build on OpenBSD among others. Suggested-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Andreas Färber <andreas.faerber@web.de> Cc: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Blue Swirl <blauwirbel@gmail.com>	2013-01-19 14:35:02 +00:00
Michael Tokarev	3249dbe661	win32-aio: use iov utility functions instead of open-coding them We have iov_from_buf() and iov_to_buf(), use them instead of open-coding these in block/win32-aio.c Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-01-18 09:57:51 +01:00
Kevin Wolf	e8bccad5ac	win32-aio: Fix memory leak The buffer is allocated for both reads and writes, and obviously it should be freed even if an error occurs. Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-01-17 10:58:09 +01:00
Kevin Wolf	bcbbd234d4	win32-aio: Fix vectored reads Copying data in the right direction really helps a lot! Cc: qemu-stable@nongnu.org Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-01-17 10:57:13 +01:00
Jeff Cody	6d759117d3	block: fix null-pointer bug on error case in block commit This is a bug that was caught by a coverity run by Markus. In the error case when we errored out to exit_restore_open early in the function, 'overlay_bs' was still NULL at that point, although it is used to look up flags and perform a bdrv_reopen(). Move the overlay_bs lookup to where it is needed, and check for NULL before restoring the flags. Also get rid of the unneeded parameter initialization. Reported-By: Markus Armbruster <armbru@redhat.com> Signed-off-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-01-17 10:51:11 +01:00
Markus Armbruster	7191bf311e	block: Fix how mirror_run() frees its buffer It allocates with qemu_blockalign(), therefore it must free with qemu_vfree(), not g_free(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-01-15 17:28:55 +01:00
Markus Armbruster	7479acdbce	win32-aio: Fix how win32_aio_process_completion() frees buffer win32_aio_submit() allocates it with qemu_blockalign(), therefore it must be freed with qemu_vfree(), not g_free(). Signed-off-by: Markus Armbruster <armbru@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-01-15 16:47:45 +01:00
Liu Yuan	f700f8e346	sheepdog: clean up sd_aio_setup() The last two parameters of sd_aio_setup() are never used, so remove them. Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Cc: Kevin Wolf <kwolf@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Liu Yuan <tailai.ly@taobao.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-01-15 13:40:10 +01:00
Liu Yuan	4778307278	sheepdog: multiplex the rw FD to flush cache This will reduce sockfds connected to the sheep server to one, which simplifies future hacks. Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Cc: Kevin Wolf <kwolf@redhat.com> Cc: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Liu Yuan <tailai.ly@taobao.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-01-15 11:18:49 +01:00
Paolo Bonzini	8238010b26	block: make discard asynchronous This is easy with the thread pool, because we can use s->is_xfs and s->has_discard from the worker function. QEMU has a widespread assumption that each I/O operation writes less than 2^32 bytes. This patch doesn't fix it throughout of course, but it starts correcting struct RawPosixAIOData so that there is no regression with respect to the synchronous discard implementation. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-01-15 10:03:47 +01:00
Paolo Bonzini	fcd9d45552	raw: support discard on block devices Block devices use an ioctl instead of fallocate, so add a separate implementation. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-01-15 10:03:47 +01:00
Paolo Bonzini	c85191e5c9	raw-posix: remember whether discard failed Avoid sending system calls repeatedly if they shall fail. This does not apply to XFS: if the filesystem-specific ioctl fails, something weird is happening. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-01-15 10:03:47 +01:00
Kusanagi Kouichi	3d4fa43e64	raw-posix: support discard on more filesystems Linux 2.6.38 introduced the filesystem independent interface to deallocate part of a file. As of Linux 3.7, btrfs, ext4, ocfs2, tmpfs and xfs support it. Even though the system calls here are in practice issued on Linux, the code is structured to allow plugging in alternatives for other Unix variants. EOPNOTSUPP is used unconditionally in this patch, but it is supported in both OpenBSD and Mac OS X since forever (see for example http://lists.debian.org/debian-glibc/2006/02/msg00337.html). Signed-off-by: Kusanagi Kouichi <slash@ac.auone-net.jp> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-01-15 10:03:47 +01:00
Kevin Wolf	8d2497c355	qcow2: Fix segfault on zero-length write One of the recent refactoring patches (commit `f50f88b9`) didn't take care to initialise l2meta properly, so with zero-length writes, which don't even enter the write loop, qemu just segfaulted. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-01-15 09:08:55 +01:00
Anthony Liguori	da758bd7a3	Merge remote-tracking branch 'kwolf/for-anthony' into staging * kwolf/for-anthony: dataplane: handle misaligned virtio-blk requests dataplane: extract virtio-blk read/write processing into do_rdwr_cmd() block: make qiov_is_aligned() public raw-posix: fix bdrv_aio_ioctl sheepdog: implement direct write semantics block: do not probe zero-sized disks Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>	2013-01-14 10:26:26 -06:00
Stefan Hajnoczi	c53b1c5114	block: make qiov_is_aligned() public The qiov_is_aligned() function checks whether a QEMUIOVector meets a BlockDriverState's alignment requirements. This is needed by virtio-blk-data-plane so: 1. Move the function from block/raw-posix.c to block/block.c. 2. Make it public in block/block.h. 3. Rename to bdrv_qiov_is_aligned(). 4. Change return type from int to bool. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2013-01-14 10:06:56 +01:00
Paolo Bonzini	b608c8dc02	raw-posix: fix bdrv_aio_ioctl When the raw-posix aio=thread code was moved from posix-aio-compat.c to block/raw-posix.c, there was an unintended change to the ioctl code. The code used to return the ioctl command, which posix_aio_read() would later morph into a zero. This hack is not necessary anymore, and in fact breaks scsi-generic (which expects a zero return code). Remove it. Cc: qemu-stable@nongnu.org Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2013-01-14 10:06:56 +01:00
Liu Yuan	0e7106d8b5	sheepdog: implement direct write semantics Sheepdog supports both writeback and writethrough writes but has not yet supported DIRECTIO semantics, which bypass the cache completely even if the Sheepdog daemon is set up with cache enabled. Supposing cache is enabled on the Sheepdog daemon side, the new cache control is: cache=writeback # enable the writeback semantics for write cache=writethrough # enable the emulated writethrough semantics for write cache=directsync # disable cache completely Toggling guest WCE at run time to switch between writeback and writethrough is also supported. Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Cc: Kevin Wolf <kwolf@redhat.com> Cc: Stefan Hajnoczi <stefanha@gmail.com> Signed-off-by: Liu Yuan <tailai.ly@taobao.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2013-01-14 10:06:56 +01:00
Paolo Bonzini	4d4545743f	qemu-option: move standard option definitions out of qemu-config.c Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2013-01-12 17:17:53 +01:00
Stefan Weil	eb7ff6fb0b	Replace remaining gmtime, localtime by gmtime_r, localtime_r This allows removing of MinGW specific code and improves reentrancy for POSIX hosts. [Removed unused ret variable in qemu_get_timedate() to fix warning: vl.c: In function ‘qemu_get_timedate’: vl.c:451:16: error: variable ‘ret’ set but not used [-Werror=unused-but-set-variable] -- Stefan Hajnoczi] Signed-off-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-01-11 09:44:37 +01:00
Liu Yuan	d6b1ef89a1	sheepdog: pass oid directly to send_pending_req() Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Cc: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Liu Yuan <tailai.ly@taobao.com> Reviewed-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-01-02 16:09:00 +01:00
Liu Yuan	bd751f2204	sheepdog: don't update inode when create_and_write fails For the error case such as SD_RES_NO_SPACE, we shouldn't update the inode bitmap to avoid the scenario that the object is allocated but wasn't created at the server side. This will result in VM's IO error on the failed object. Cc: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Cc: Kevin Wolf <kwolf@redhat.com> Signed-off-by: Liu Yuan <tailai.ly@taobao.com> Reviewed-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-01-02 16:08:58 +01:00
Stefan Weil	fccedc624c	block/raw-win32: Fix compiler warnings (wrong format specifiers) Commit `fbcad04d6b` added fprintf statements with wrong format specifiers. GetLastError() returns a DWORD which is unsigned long, so %lu must be used. Signed-off-by: Stefan Weil <sw@weilnetz.de> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-01-02 16:08:57 +01:00
Stefan Hajnoczi	4065742ac0	raw-posix: add raw_get_aio_fd() for virtio-blk-data-plane The raw_get_aio_fd() function allows virtio-blk-data-plane to get the file descriptor of a raw image file with Linux AIO enabled. This interface is really a layering violation that can be resolved once the block layer is able to run outside the global mutex - at that point virtio-blk-data-plane will switch from custom Linux AIO code to using the block layer. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2013-01-02 15:31:39 +01:00
Paolo Bonzini	9c17d615a6	softmmu: move include files to include/sysemu/ Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-12-19 08:32:45 +01:00
Paolo Bonzini	1de7afc984	misc: move include files to include/qemu/ Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-12-19 08:32:39 +01:00
Paolo Bonzini	caf71f86a3	migration: move include files to include/migration/ Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-12-19 08:31:32 +01:00
Paolo Bonzini	737e150e89	block: move include files to include/block/ Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-12-19 08:31:31 +01:00
Paolo Bonzini	7b1b5d1913	qapi: move include files to include/qobject/ Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-12-19 08:31:31 +01:00
Paolo Bonzini	f8fe796407	janitor: do not include qemu-char everywhere Touching char/char.h basically causes the whole of QEMU to be rebuilt. Avoid this, it is usually unnecessary. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-12-19 08:29:59 +01:00
Paolo Bonzini	077805fa92	janitor: do not rely on indirect inclusions of or from qemu-char.h Various header files rely on qemu-char.h including qemu-config.h or main-loop.h, but they really do not need qemu-char.h at all (particularly interesting is the case of the block layer!). Clean this up, and also add missing inclusions of qemu-char.h itself. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-12-19 08:29:52 +01:00
Paolo Bonzini	525877c999	build: move rules from Makefile to */Makefile.objs Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-12-19 08:29:06 +01:00
Kevin Wolf	226c3c26b9	qcow2: Factor out handle_dependencies() Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-12-13 15:37:59 +01:00
Kevin Wolf	4e95314e2b	qcow2: Execute run_dependent_requests() without lock There's no reason for run_dependent_requests() to hold s->lock, and a later patch will require that in fact the lock is not held. Also, before this patch, run_dependent_requests() not only does what its name suggests, but also removes the l2meta from the list of in-flight requests. When changing this, it becomes a one-liner, so just inline it completely. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-12-13 15:37:59 +01:00
Kevin Wolf	280d373579	qcow2: Enable dirty flag in qcow2_alloc_cluster_link_l2 This is closer to where the dirty flag is really needed, and it avoids having checks for special cases related to cluster allocation directly in the writev loop. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-12-13 15:37:59 +01:00
Kevin Wolf	f50f88b9fe	qcow2: Allocate l2meta only for cluster allocations Even for writes to already allocated clusters, an l2meta is allocated, though it stays effectively unused. After this patch, only allocating requests still have one. Each l2meta now describes an in-flight request that writes to clusters that are not yet hooked up in the L2 table. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-12-13 15:37:59 +01:00
Kevin Wolf	060bee8943	qcow2: Drop l2meta.cluster_offset There's no real reason to have an l2meta for normal requests that don't allocate anything. Before we can get rid of it, we must return the host cluster offset in a different way. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-12-13 15:37:59 +01:00
Kevin Wolf	cf5c1a231e	qcow2: Allocate l2meta dynamically As soon as delayed COW is introduced, the l2meta struct is needed even after completion of the request, so it can't live on the stack. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-12-13 15:37:59 +01:00
Kevin Wolf	593fb83cac	qcow2: Introduce Qcow2COWRegion This makes it easier to address the areas for which a COW must be performed. As a nice side effect, the COW code in qcow2_alloc_cluster_link_l2 becomes really trivial. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-12-13 15:37:59 +01:00
Kevin Wolf	1d3afd649b	qcow2: Round QCowL2Meta.offset down to cluster boundary The offset within the cluster is already present as n_start and this is what the code uses. QCowL2Meta.offset is only needed at a cluster granularity. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-12-13 15:37:59 +01:00
Kevin Wolf	67a7a0ebe5	qcow2: Move BLKDBG_EVENT out of the lock We want to use these events to suspend requests for testing concurrent AIO requests. Suspending requests while they are holding the CoMutex is rather boring for this purpose. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-12-12 12:33:48 +01:00
Kevin Wolf	3c90c65d7a	blkdebug: Implement suspend/resume of AIO requests This allows more systematic AIO testing. The patch adds three new operations to blkdebug: * Setting a "breakpoint" on a blkdebug event. The next request that triggers this breakpoint is suspended and is tagged with a name. The breakpoint is removed after a request has triggered it. * A suspended request (identified by its tag) can be resumed * It's possible to check whether a suspended request with a given tag exists. This can be used for waiting for an event. Ideally, we would instead tag requests right when they are created and set breakpoints for individual requests. However, at this point the block layer doesn't allow this easily, and breakpoints that trigger for any request already allow a lot of useful testing. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-12-12 12:33:48 +01:00
Kevin Wolf	9e35542b0f	blkdebug: Factor out remove_rule() The cleanup work to remove a rule depends on the type of the rule. It's easy for the existing rules as there is no data that must be cleaned up and is specific to a type yet, but the next patch will change this. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-12-12 12:33:48 +01:00
Kevin Wolf	312a2ba0eb	blkdebug: Allow usage without config file As soon as new rules can be set during runtime, as introduced by the next patch, blkdebug makes sense even without a config file. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-12-12 12:33:48 +01:00
Fabien Chouteau	fbcad04d6b	Fix error code checking for SetFilePointer() call An error has occurred if the return value is INVALID_SET_FILE_POINTER and GetLastError() doesn't return NO_ERROR. Signed-off-by: Fabien Chouteau <chouteau@adacore.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2012-12-11 11:36:57 +01:00
Stefan Priebe	473c7f0255	rbd: Fix race between aio completion and aio cancel This one fixes a race between cancellation and I/O completion, which qemu also had in the iscsi block driver. qemu_rbd_aio_cancel was not synchronously waiting for the end of the command. To achieve this it introduces a new status flag which uses -EINPROGRESS. Signed-off-by: Stefan Priebe <s.priebe@profihost.ag> Reviewed-by: Stefan Hajnoczi <stefanha@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-12-11 11:05:11 +01:00
Paolo Bonzini	c208e8c2d8	raw-posix: inline paio_ioctl into hdev_aio_ioctl clang now warns about an unused function: CC block/raw-posix.o block/raw-posix.c:707:26: warning: unused function paio_ioctl [-Wunused-function] static BlockDriverAIOCB *paio_ioctl(BlockDriverState *bs, int fd, ^ 1 warning generated. because the only use of paio_ioctl() is inside a #if defined(__linux__) guard and it is static now. Reported-by: Peter Maydell <peter.maydell@linaro.org> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2012-12-11 11:04:26 +01:00
Charles Arnold	258d2edbcd	block: vpc support for ~2 TB disks The VHD specification allows for up to a 2 TB disk size. The current implementation in qemu emulates EIDE and ATA-2 hardware which only allows for up to 127 GB. This disk size limitation can be overridden by allowing up to 255 heads instead of the normal 4 bit limitation of 16. Doing so allows disk images to be created of up to nearly 2 TB. This change does not violate the VHD format specification nor does it change how smaller disks (ie, <=127GB) are defined. [Charles Arnold also writes: "In analyzing a 160 GB VHD fixed disk image created on Windows 2008 R2, it appears that MS is also ignoring the CHS values in the footer geometry field in whatever driver they use for accessing the image. The CHS values are set at 65535,16,255 which obviously doesn't represent an image size of 160 GB." -- Stefan] Signed-off-by: Charles Arnold <carnold@suse.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2012-12-11 11:04:26 +01:00
Charles Arnold	1fe1fa510a	block: vpc initialize the uuid footer field Initialize the uuid field in the footer with a generated uuid. Signed-off-by: Charles Arnold <carnold@suse.com> Reviewed-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2012-12-11 11:04:25 +01:00
Kevin Wolf	c57b6656c3	aio: Get rid of qemu_aio_flush() There are no remaining users, and new users should probably be using bdrv_drain_all() in the first place. Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-12-11 11:04:25 +01:00
Peter Lieven	f807ecd574	iscsi: do not assume device is zero initialized Without any complex checks we can't assume that an iscsi target is initialized to zero. Signed-off-by: Peter Lieven <pl@kamp.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-11-28 12:51:58 +01:00
Peter Lieven	e829b0bb05	iscsi: fix deadlock during login If the connection is interrupted before the first login is successfully completed, qemu-kvm waits forever in qemu_aio_wait(). This is fixed by performing a sync login to the target. If the connection breaks after the first successful login, errors are handled internally by libiscsi. Signed-off-by: Peter Lieven <pl@kamp.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-11-28 12:50:56 +01:00
Peter Lieven	8da1e18b0c	iscsi: fix segfault in url parsing If an invalid URL is specified iscsi_get_error(iscsi) is called with iscsi == NULL. Signed-off-by: Peter Lieven <pl@kamp.de> Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-11-28 12:46:13 +01:00
Stefan Priebe	08448d5195	use int64_t for return values from rbd instead of int rbd / rados tends to return pretty often length of writes or discarded blocks. These values might be bigger than int. The steps to reproduce are: mkfs.xfs -f a whole device bigger than int in bytes. mkfs.xfs sends a discard. Important is that you use scsi-hd and set discard_granularity=512. Otherwise rbd disabled discard support. Signed-off-by: Stefan Priebe <s.priebe@profihost.ag> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2012-11-21 09:43:23 +01:00
Stefan Hajnoczi	8ba2aae32c	vdi: don't override libuuid symbols It's poor symbol hygiene to provide global symbols that collide with a common library like libuuid. If QEMU links against a shared library that depends on uuid_generate() it can end up calling our stub version of the function. This exact scenario happened with GlusterFS libgfapi.so, which depends on libglusterfs.so's uuid_generate(). Scope the uuid stubs for vdi.c only and avoid affecting other shared objects. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com>	2012-11-21 09:40:29 +01:00
Jeff Cody	1bc6b705ee	block: add bdrv_reopen() support for raw hdev, floppy, and cdrom For hdev, floppy, and cdrom, the reopen() handlers are the same as for the file reopen handler. For floppy and cdrom types, however, we keep O_NONBLOCK, as in the _open function. Signed-off-by: Jeff Cody <jcody@redhat.com> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2012-11-21 09:40:29 +01:00
Gerhard Wiesinger	b1649fae49	vmdk: Fix data corruption bug in WRITE and READ handling Fixed a MAJOR BUG in VMDK files on file boundaries on reads and ALSO ON WRITES WHICH MIGHT CORRUPT THE IMAGE AND DATA!!!!!! Triggered for example with the following VMDK file (partly listed): RW 4193792 FLAT "XP-W1-f001.vmdk" 0 RW 2097664 FLAT "XP-W1-f002.vmdk" 0 RW 4193792 FLAT "XP-W1-f003.vmdk" 0 RW 512 FLAT "XP-W1-f004.vmdk" 0 RW 4193792 FLAT "XP-W1-f005.vmdk" 0 RW 2097664 FLAT "XP-W1-f006.vmdk" 0 RW 4193792 FLAT "XP-W1-f007.vmdk" 0 RW 512 FLAT "XP-W1-f008.vmdk" 0 Patch includes: 1.) Patch fixes wrong calculation on extent boundaries. Especially it fixes the relativeness of the sector number to the current extent. Verified correctness with: 1.) Converted either with Virtualbox to VDI and then with qemu-img and then with qemu-img only: VBoxManage clonehd --format vdi /VM/XP-W/new/XP-W1.vmdk ~/.VirtualBox/Harddisks/XP-W1-new-test.vdi ./qemu-img convert -O raw ~/.VirtualBox/Harddisks/XP-W1-new-test.vdi /root/QEMU/VM-XP-W1/XP-W1-via-VBOX.img md5sum /root/QEMU/VM-XP-W/XP-W1-direct.img md5sum /root/QEMU/VM-XP-W/XP-W1-via-VBOX.img => same MD5 hash 2.) Verified debug log files 3.) Run Windows XP successfully 4.) chkdsk run successfully without any errors Signed-off-by: Gerhard Wiesinger <lists@wiesinger.com> Acked-by: Fam Zheng <famcool@gmail.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-11-14 18:19:23 +01:00
Stefan Hajnoczi	d7331bed11	aio: rename AIOPool to AIOCBInfo Now that AIOPool no longer keeps a freelist, it isn't really a "pool" anymore. Rename it to AIOCBInfo and make it const since it no longer needs to be modified. Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-11-14 18:19:21 +01:00
Stefan Weil	cee40d2d2d	block: Workaround for older versions of MinGW gcc Versions before gcc-4.6 don't support unnamed fields in initializers (see http://gcc.gnu.org/bugzilla/show_bug.cgi?id=10676). Offset and OffsetHigh belong to an unnamed struct which is part of an unnamed union. Therefore the original code does not work with older versions of gcc. Signed-off-by: Stefan Weil <sw@weilnetz.de> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-11-14 18:19:21 +01:00
Kevin Wolf	a354807706	qcow2: Fix refcount table size calculation A missing factor for the refcount table entry size in the calculation could mean that too little memory was allocated for the in-memory representation of the table, resulting in a buffer overflow. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Michael Tokarev <mjt@tls.msk.ru> Tested-by: Michael Tokarev <mjt@tls.msk.ru>	2012-11-14 18:19:21 +01:00
Paolo Bonzini	1d7d2a9d21	nbd: accept URIs The URI syntax is consistent with the Gluster syntax. Export names are specified in the path, preceded by one or more (otherwise unused) slashes. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-11-12 14:38:28 +01:00
Paolo Bonzini	d04b0bbbc9	nbd: accept relative path to Unix socket Adding the "is_unix" member now will simplify the parsing of NBD URIs. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-11-12 11:33:29 +01:00
Paolo Bonzini	f563a5d7a8	Merge remote-tracking branch 'origin/master' into threadpool Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-10-31 10:42:51 +01:00
Paolo Bonzini	a27365265c	raw-win32: implement native asynchronous I/O With the new support for EventNotifiers in the AIO event loop, we can hook a completion port to every opened file and use asynchronous I/O on them. Wine's support is extremely inefficient, also because it really does the I/O synchronously on regular files. (!) But it works, and it is good to keep the Win32 and POSIX ports as similar as possible. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-10-31 10:38:13 +01:00
Paolo Bonzini	10fb6e0682	raw-posix: move linux-aio.c to block/ Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-10-31 10:38:13 +01:00
Paolo Bonzini	fc4edb84bf	raw-win32: add emulated AIO support Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-10-31 10:38:13 +01:00
Paolo Bonzini	9f8540ecef	raw-posix: rename raw-posix-aio.h, hide unavailable prototypes Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-10-31 10:38:12 +01:00
Paolo Bonzini	de81a16936	raw: merge posix-aio-compat.c into block/raw-posix.c Making the qemu_paiocb specific to raw devices will let us access members of the BDRVRawState arbitrarily. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-10-31 10:38:12 +01:00
Paolo Bonzini	47e6b251a5	block: switch posix-aio-compat to threadpool This is not meant for portability, but to remove code duplication. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-10-31 10:38:12 +01:00
Paolo Bonzini	f42b22077b	aio: add Win32 implementation The Win32 implementation will only accept EventNotifiers, thus a few drivers are disabled under Windows. EventNotifiers are a good match for the GSource implementation, too, because the Win32 port of glib allows to place their HANDLEs in a GPollFD. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>	2012-10-30 09:30:53 +01:00
Paolo Bonzini	b952b5589a	mirror: add support for on-source-error/on-target-error Error management is important for mirroring; otherwise, an error on the target (even something as "innocent" as ENOSPC) requires to start again with a full copy. Similar to on_read_error/on_write_error, two separate knobs are provided for on_source_error (reads) and on_target_error (writes). The default is 'report' for both. The 'ignore' policy will leave the sector dirty, so that it will be retried later. Thus, it will not cause corruption. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-10-24 10:26:22 +02:00
Paolo Bonzini	d63ffd87ac	mirror: implement completion Switching to the target of the migration is done mostly asynchronously, and reported to management via the BLOCK_JOB_COMPLETED event; the only synchronous phase is opening the backing files. bdrv_open_backing_file can always be done, even for migration of the full image (aka sync: 'full'). In this case, qmp_drive_mirror will create the target disk with no backing file at all, and bdrv_open_backing_file will be a no-op. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-10-24 10:26:22 +02:00
Paolo Bonzini	893f7ebafe	mirror: introduce mirror job This patch adds the implementation of a new job that mirrors a disk to a new image while letting the guest continue using the old image. The target is treated as a "black box" and data is copied from the source to the target in the background. This can be used for several purposes, including storage migration, continuous replication, and observation of the guest I/O in an external program. It is also a first step in replacing the inefficient block migration code that is part of QEMU. The job is possibly never-ending, but it is logically structured into two phases: 1) copy all data as fast as possible until the target first gets in sync with the source; 2) keep target in sync and ensure that reopening to the target gets a correct (full) copy of the source data. The second phase is indicated by the progress in "info block-jobs" reporting the current offset to be equal to the length of the file. When the job is cancelled in the second phase, QEMU will run the job until the source is clean and quiescent, then it will report successful completion of the job. In other words, the BLOCK_JOB_CANCELLED event means that the target may _not_ be consistent with a past state of the source; the BLOCK_JOB_COMPLETED event means that the target is consistent with a past state of the source. (Note that it could already happen that management lost the race against QEMU and got a completion event instead of cancellation). It is not yet possible to complete the job and switch over to the target disk. The next patches will fix this and add many refinements to the basic idea introduced here. These include improved error management, some tunable knobs and performance optimizations. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-10-24 10:26:19 +02:00
Paolo Bonzini	65f4632243	block: rename block_job_complete to block_job_completed The imperative will be used for the QMP command. Signed-off-by: Paolo Bonzini <pbonzini@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-10-24 10:26:19 +02:00
Jeff Cody	d5208c45be	block: in commit, determine base image from the top image This simplifies some code and error checking, and also fixes a bug. bdrv_find_backing_image() should only be passed absolute filenames, or filenames relative to the chain. In the QMP message handler for block commit, when looking up the base do so from the determined top image, so we know it is reachable from top. Some of the error messages put out by block-commit have changed slightly, which causes 2 tests cases for block-commit to fail. This patch updates the test cases to look for the correct error output. Signed-off-by: Jeff Cody <jcody@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2012-10-24 10:26:19 +02:00
MORITA Kazutaka	2f5368017f	sheepdog: use bool for boolean variables This improves readability. Signed-off-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>	2012-10-12 10:47:35 +02:00
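
Note on commit 258d2edbcd (vpc support for ~2 TB disks): the size limits quoted in that message follow from plain CHS arithmetic. The snippet below is a minimal sketch, not QEMU code. It assumes 512-byte sectors and reuses the 65535 cylinders and 255 sectors per track quoted in the commit message; the function name `chs_capacity_bytes` is hypothetical.

```python
SECTOR_SIZE = 512  # bytes per sector (assumption, not stated in the commit)
GIB = 1 << 30


def chs_capacity_bytes(cylinders, heads, sectors_per_track):
    """Return the number of bytes addressable by a CHS geometry."""
    return cylinders * heads * sectors_per_track * SECTOR_SIZE


if __name__ == "__main__":
    # Compare the 4-bit head limit (16) with the relaxed limit (255).
    for heads in (16, 255):
        size = chs_capacity_bytes(65535, heads, 255)
        print(f"{heads:3d} heads: {size} bytes ({size / GIB:.1f} GiB)")
```

With 16 heads the geometry tops out at roughly 127 GiB, while 255 heads reaches just under 2 TiB, which matches the limits the commit message describes.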


854 Commits
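
Note on commit b1649fae49 (vmdk data corruption fix): the message says the fix concerns making the sector number relative to the current extent. The snippet below illustrates that idea only; it is a hedged sketch and not the actual patch. It assumes a flat list of extent sizes in sectors (the first four sizes are taken from the descriptor quoted in the commit), and the names `locate` and `split_request` are hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate

# Extent sizes in sectors, from the descriptor quoted in the commit message.
EXTENTS = [4193792, 2097664, 4193792, 512]


def locate(extent_sizes, sector):
    """Map an image-wide sector number to (extent index, sector within extent)."""
    ends = list(accumulate(extent_sizes))  # exclusive end sector of each extent
    if not ends or sector < 0 or sector >= ends[-1]:
        raise ValueError("sector outside image")
    index = bisect_right(ends, sector)
    start = ends[index - 1] if index else 0
    return index, sector - start


def split_request(extent_sizes, sector, count):
    """Split a request into per-extent chunks of (index, relative sector, count)."""
    chunks = []
    while count > 0:
        index, rel = locate(extent_sizes, sector)
        n = min(count, extent_sizes[index] - rel)  # never cross the extent end
        chunks.append((index, rel, n))
        sector += n
        count -= n
    return chunks
```

A request that straddles a boundary is cut at the extent end, and the second chunk starts at relative sector 0 of the next extent rather than at the image-wide sector number.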