Commit Graph

38331 Commits

Author SHA1 Message Date
Peter Lieven
9eac3622a2 block/iscsi: use the allocationmap also if cache.direct=on
the allocationmap has only a hint character. The driver always
double checks that blocks marked unallocated in the cache are
still unallocated before taking the fast path and return zeroes.
So using the allocationmap is migration safe and can
also be enabled with cache.direct=on.

Signed-off-by: Peter Lieven <pl@kamp.de>
Message-id: 1429193313-4263-10-git-send-email-pl@kamp.de
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28 15:36:10 +02:00
Peter Lieven
03e40fef46 block/iscsi: bump year in copyright notice
Signed-off-by: Peter Lieven <pl@kamp.de>
Message-id: 1429193313-4263-9-git-send-email-pl@kamp.de
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28 15:36:10 +02:00
Peter Lieven
e380aff831 block/iscsi: handle SCSI_STATUS_TASK_SET_FULL
a target may issue a SCSI_STATUS_TASK_SET_FULL status
if there is more than one "BUSY" command queued already.

Signed-off-by: Peter Lieven <pl@kamp.de>
Message-id: 1429193313-4263-8-git-send-email-pl@kamp.de
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28 15:36:10 +02:00
Peter Lieven
59dd0a22ca block/iscsi: increase retry count
The idea is that a command is retried in a BUSY condition
up a time of approx. 60 seconds before it is failed. This should
be far higher than any command timeout in the guest.

Signed-off-by: Peter Lieven <pl@kamp.de>
Message-id: 1429193313-4263-7-git-send-email-pl@kamp.de
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28 15:36:10 +02:00
Peter Lieven
73b5394e2e block/iscsi: optimize WRITE10/16 if cache.writeback is not set
SCSI allowes to tell the target to not return from a write command
if the date is not written to the disk. Use this so called FUA
bit if it is supported to optimize WRITE commands if writeback is
not allowed.

In this case qemu always issues a WRITE followed by a FLUSH. This
is 2 round trip times. If we set the FUA bit we can ignore the
following FLUSH.

Signed-off-by: Peter Lieven <pl@kamp.de>
Message-id: 1429193313-4263-6-git-send-email-pl@kamp.de
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28 15:36:10 +02:00
Peter Lieven
752ce45150 block/iscsi: store DPOFUA bit from the modesense command
Signed-off-by: Peter Lieven <pl@kamp.de>
Message-id: 1429193313-4263-5-git-send-email-pl@kamp.de
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28 15:36:10 +02:00
Peter Lieven
7191f2080c block/iscsi: rename iscsi_write_protected and let it return void
Signed-off-by: Peter Lieven <pl@kamp.de>
Message-id: 1429193313-4263-4-git-send-email-pl@kamp.de
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28 15:36:10 +02:00
Peter Lieven
0a386e4852 block/iscsi: change all iscsilun properties from uint8_t to bool
Signed-off-by: Peter Lieven <pl@kamp.de>
Message-id: 1429193313-4263-3-git-send-email-pl@kamp.de
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28 15:36:10 +02:00
Peter Lieven
20474e9aa0 block/iscsi: do not forget to logout from target
We actually were always impolitely dropping the connection and
not cleanly logging out.

CC: qemu-stable@nongnu.org
Signed-off-by: Peter Lieven <pl@kamp.de>
Message-id: 1429193313-4263-2-git-send-email-pl@kamp.de
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28 15:36:09 +02:00
Alberto Garcia
d5a8ee60a0 qmp: fill in the image field in BlockDeviceInfo
The image field in BlockDeviceInfo is supposed to contain an ImageInfo
object. However that is being filled in by bdrv_query_info(), not by
bdrv_block_device_info(), which is where BlockDeviceInfo is actually
created.

Anyone calling bdrv_block_device_info() directly will get a null image
field. As a consequence of this, the HMP command 'info block -n -v'
crashes QEMU.

This patch moves the code that fills in that field from
bdrv_query_info() to bdrv_block_device_info().

Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: 1429271563-3765-1-git-send-email-berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28 15:36:09 +02:00
Stefan Hajnoczi
9419874f70 Revert "hmp: fix crash in 'info block -n -v'"
This reverts commit 638b836620.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28 15:36:09 +02:00
Alberto Garcia
dc881b441d block: add 'node-name' field to BLOCK_IMAGE_CORRUPTED
Since this event can occur in nodes that cannot have a device name
associated, include also a field with the node name.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 147cec5b3594f4bec0cb41c98afe5fcbfb67567c.1428485266.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28 15:36:09 +02:00
Alberto Garcia
81e5f78a9f block: use bdrv_get_device_or_node_name() in error messages
There are several error messages that identify a BlockDriverState by
its device name. However those errors can be produced in nodes that
don't have a device name associated.

In those cases we should use bdrv_get_device_or_node_name() to fall
back to the node name and produce a more meaningful message. The
messages are also updated to use the more generic term 'node' instead
of 'device'.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 9823a1f0514fdb0692e92868661c38a9e00a12d6.1428485266.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28 15:36:09 +02:00
Alberto Garcia
9b2aa84f87 block: add bdrv_get_device_or_node_name()
This function gets the device name associated with a BlockDriverState,
or its node name if the device name is empty.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 4fa30aa8d61d9052ce266fd5429a59a14e941255.1428485266.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28 15:36:09 +02:00
Stefan Hajnoczi
ec683d6040 block: document block-stream in qmp-commands.hx
The 'block-stream' QMP command is documented in block-core.json but not
qmp-commands.hx.  Add a summary of the command to qmp-commands.hx
(similar to the documentation for 'block-commit').

Reported-by: Kashyap Chamarthy <kchamart@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1429094622-26218-1-git-send-email-stefanha@redhat.com
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28 15:36:09 +02:00
Stefan Hajnoczi
c485cf9c92 m25p80: fix s->blk usage before assignment
Delay the call to blk_blockalign() until s->blk has been assigned.

This never caused a crash because blk_blockalign(NULL, size) defaults to
4096 alignment but it's technically incorrect.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1429091024-25098-1-git-send-email-stefanha@redhat.com
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28 15:36:09 +02:00
Paolo Bonzini
d07063e460 m25p80: add missing blk_attach_dev_nofail
Of the block devices that poked into -drive options via drive_get_next,
m25p80 was the only one who also did not attach itself to the BlockBackend.

Since sd does it, and all other devices go through a "drive" property,
with this change all block backends attached to the guest will have a
non-NULL result for blk_get_attached_dev().

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Message-id: 1429025387-11077-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28 15:36:09 +02:00
Michael S. Tsirkin
4eb867e98c virtio_blk: comment fix
update virtio blk header from latest linux, include comment fixups.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1428854036-12806-1-git-send-email-mst@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28 15:36:09 +02:00
Paolo Bonzini
0b5a24454f block: avoid unnecessary bottom halves
bdrv_aio_* APIs can use coroutines to achieve asynchronicity.  However,
the coroutine may terminate without having yielded back to the caller
(for example because of something that invokes a nested event loop,
or because the coroutine is doing nothing at all).  In this case,
the bdrv_aio_* API must delay the completion to the next iteration
of the main loop, because bdrv_aio_* will never invoke the callback
before returning.

This can be done with a bottom half, and indeed bdrv_aio_* is always
using one for simplicity.  It is possible to gain some performance
(~3%) by avoiding this in the common case.  A new field in the
BlockAIOCBCoroutine struct is set to true until the first time the
corotine has yielded to its creator, and completion goes through a
new function bdrv_co_complete.  If the flag is false, bdrv_co_complete
invokes the callback immediately.  If it is true, the caller will
notice that the coroutine has completed and schedule the bottom
half itself.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1427524638-28157-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28 15:36:09 +02:00
Fam Zheng
a7282330c0 blockjob: Update function name in comments
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 1428069921-2957-5-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28 15:36:09 +02:00
Fam Zheng
e62303a437 qemu-iotests: Test that "stop" doesn't drain block jobs
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 1428069921-2957-4-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28 15:36:09 +02:00
Fam Zheng
69da3b0b47 block: Pause block jobs in bdrv_drain_all
This is necessary to suppress more IO requests from being generated from
block job coroutines.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 1428069921-2957-3-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28 15:36:09 +02:00
Fam Zheng
751ebd76e6 blockjob: Allow nested pause
This patch changes block_job_pause to increase the pause counter and
block_job_resume to decrease it.

The counter will allow calling block_job_pause/block_job_resume
unconditionally on a job when we need to suspend the IO temporarily.

From now on, each block_job_resume must be paired with a block_job_pause
to keep the counter balanced.

The user pause from QMP or HMP will only trigger block_job_pause once
until it's resumed, this is achieved by adding a user_paused flag in
BlockJob.

One occurrence of block_job_resume in mirror_complete is replaced with
block_job_enter which does what is necessary.

In block_job_cancel, the cancel flag is good enough to instruct
coroutines to quit loop, so use block_job_enter to replace the unpaired
block_job_resume.

Upon block job IO error, user is notified about the entering to the
pause state, so this pause belongs to user pause, set the flag
accordingly and expect a matching QMP resume.

[Extended doc comments as suggested by Paolo Bonzini
<pbonzini@redhat.com>.
--Stefan]

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 1428069921-2957-2-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28 15:36:09 +02:00
Fam Zheng
199667a8c8 MAINTAINERS: Add Fam Zheng as Null block driver maintainer
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1427852740-24315-4-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28 15:36:09 +02:00
Fam Zheng
1c2b49a172 block/null: Support reopen
Reopen is used in block-commit. With this always-succeed operation, it
is now possible to test committing to a null drive, by specifying
"null-aio://" or "null-co://" as the backing image when creating the
qcow2 image.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1427852740-24315-3-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28 15:36:09 +02:00
Fam Zheng
e5e51dd3af block/null: Latency simulation by adding new option "latency-ns"
Aio context switch should just work because the requests will be
drained, so the scheduled timer(s) on the old context will be freed.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1427852740-24315-2-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28 15:36:09 +02:00
Stefan Hajnoczi
9eddd6a4b3 scripts: add 'qemu coroutine' command to qemu-gdb.py
The 'qemu coroutine <coroutine-address>' GDB command prints the
backtrace for a CoroutineUContext.  This is useful for peeking inside
yielded coroutines that are waiting for file descriptor events, timers,
etc.

For example:

  $ gdb tests/test-coroutine
  (gdb) b test_yield
  (gdb) r
  (gdb) b qemu_coroutine_enter
  (gdb) c
  (gdb) c
  Continuing.

  Breakpoint 2, qemu_coroutine_enter (co=0x555555c66520, opaque=0x0) at qemu-coroutine.c:103
  103	{
  (gdb) source scripts/qemu-gdb.py
  (gdb) qemu coroutine 0x555555c66520
  #0  0x000055555557a740 in qemu_coroutine_switch (from_=<optimized out>, to_=0x7ffff7f90a70, action=COROUTINE_YIELD) at coroutine-ucontext.c:177
  #1  0x0000555555566af9 in yield_5_times (opaque=0x7fffffffdbb7) at tests/test-coroutine.c:107
  #2  0x000055555557a7aa in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at coroutine-ucontext.c:80
  #3  0x00007ffff08de000 in __start_context () at /lib64/libc.so.6

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1427409754-8556-1-git-send-email-stefanha@redhat.com
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28 15:36:09 +02:00
Stefan Hajnoczi
1faa5bb732 thread-pool: clean up thread_pool_completion_bh()
This patch simplifies thread_pool_completion_bh().

The function first checks elem->state:

  if (elem->state != THREAD_DONE) {
      continue;
  }

It then goes on to check elem->state == THREAD_DONE although we already
know this must be the case.

The QLIST_REMOVE() is duplicated down both branches of an if-else
statement so that can be lifted out as well.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1427992762-10126-1-git-send-email-stefanha@redhat.com
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28 15:36:09 +02:00
Kevin Wolf
d1a126c53d vhdx: Fix zero-fill iov length
Fix the length of the zero-fill for the back, which was accidentally
using the same value as for the front. This is caught by qemu-iotests
033.

For consistency, change the code for the front as well to use the length
stored in the iov (it is the same value, copied four lines above).

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Jeff Cody <jcody@redhat.com>
2015-04-28 15:36:09 +02:00
Kevin Wolf
8eedfbd4a5 blkdebug: Add bdrv_truncate()
This is, amongst others, required for qemu-iotests 033 to run as
intended on VHDX, which uses explicit bdrv_truncate() calls to bs->file
when allocating new blocks.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
2015-04-28 15:36:09 +02:00
Kevin Wolf
e4f5874923 qemu-iotests: Some qemu-img convert tests
This adds a regression test for some problems that the qemu-img convert
rewrite just fixed.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-04-28 15:36:09 +02:00
Kevin Wolf
690c730160 qemu-img convert: Rewrite copying logic
The implementation of qemu-img convert is (a) messy, (b) buggy, and
(c) less efficient than possible. The changes required to beat some
sense into it are massive enough that incremental changes would only
make my and the reviewers' life harder. So throw it away and reimplement
it from scratch.

Let me give some examples what I mean by messy, buggy and inefficient:

(a) The copying logic of qemu-img convert has two separate branches for
    compressed and normal target images, which roughly do the same -
    except for a little code that handles actual differences between
    compressed and uncompressed images, and much more code that
    implements just a different set of optimisations and bugs. This is
    unnecessary code duplication, and makes the code for compressed
    output (unsurprisingly) suffer from bitrot.

    The code for uncompressed ouput is run twice to count the the total
    length for the progress bar. In the first run it just takes a
    shortcut and runs only half the loop, and when it's done, it toggles
    a boolean, jumps out of the loop with a backwards goto and starts
    over. Works, but pretty is something different.

(b) Converting while keeping a backing file (-B option) is broken in
    several ways. This includes not writing to the image file if the
    input has zero clusters or data filled with zeros (ignoring that the
    backing file will be visible instead).

    It also doesn't correctly limit every iteration of the copy loop to
    sectors of the same status so that too many sectors may be copied to
    in the target image. For -B this gives an unexpected result, for
    other images it just does more work than necessary.

    Conversion with a compressed target completely ignores any target
    backing file.

(c) qemu-img convert skips reading and writing an area if it knows from
    metadata that copying isn't needed (except for the bug mentioned
    above that ignores a status change in some cases). It does, however,
    read from the source even if it knows that it will read zeros, and
    then search for non-zero bytes in the read buffer, if it's possible
    that a write might be needed.

This reimplementation of the copying core reorganises the code to remove
the duplication and have a much more obvious code flow, by essentially
splitting the copy iteration loop into three parts:

1. Find the number of contiguous sectors of the same status at the
   current offset (This can also be called in a separate loop before the
   copying loop in order to determine the total sectors for the progress
   bar.)

2. Read sectors. If the status implies that there is no data there to
   read (zero or unallocated cluster), don't do anything.

3. Write sectors depending on the status. If it's data, write it. If
   we want the backing file to be visible (with -B), don't write it. If
   it's zeroed, skip it if you can, otherwise use bdrv_write_zeroes() to
   optimise the write at least where possible.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-04-28 15:36:09 +02:00
Kevin Wolf
0df89e8e6f block-backend: Expose bdrv_write_zeroes()
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-04-28 15:36:08 +02:00
Paolo Bonzini
a0710f7995 iothread: release iothread around aio_poll
This is the first step towards having fine-grained critical sections in
dataplane threads, which resolves lock ordering problems between
address_space_* functions (which need the BQL when doing MMIO, even
after we complete RCU-based dispatch) and the AioContext.

Because AioContext does not use contention callbacks anymore, the
unit test has to be changed.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1424449612-18215-4-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28 15:36:08 +02:00
Paolo Bonzini
49110174f8 AioContext: acquire/release AioContext during aio_poll
This is the first step in pushing down acquire/release, and will let
rfifolock drop the contention callback feature.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1424449612-18215-3-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28 15:36:08 +02:00
Paolo Bonzini
e98ab09709 aio-posix: move pollfds to thread-local storage
By using thread-local storage, aio_poll can stop using global data during
g_poll_ns.  This will make it possible to drop callbacks from rfifolock.

[Moved npfd = 0 assignment to end of walking_handlers region as
suggested by Paolo.  This resolves the assert(npfd == 0) assertion
failure in pollfds_cleanup().
--Stefan]

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1424449612-18215-2-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28 15:36:08 +02:00
Fam Zheng
de50a20a4c block: Switch to host monotonic clock for IO throttling
Currently, throttle timers won't make any progress when VCPU is not
running, which would stall the request queue in utils, qtest, vm
suspending, and live migration, without special handling.

Block jobs are confusingly inconsistent between with and without
throttling: if user sets a bps limit, stops the vm, then start a block
job, the block job will not make any progress; in contrary, if user
unsets the bps limit, or if it's not set, the block job will run
normally.

After this patch, with the host clock, even if the VCPUs are stopped,
the throttle queues will be processed.

This patch also enables potential to add throttle to bdrv_drain_all.
Currently all requests are drained immediately. In other words whenever
it is called, IO throttling goes ineffective (examples: system reset,
migration and many block job operations.). This is a loophole that guest
could exploit. If we use the host clock, we can later just trust the
nested poll. This could be done on top.

Note that for qemu-iotests case 093, which uses qtest, we still keep vm
clock so the script can control the clock stepping in order to be
deterministic.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1427268446-6426-1-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28 15:36:08 +02:00
Stefan Hajnoczi
8b6ee9aeb3 checkpatch: complain about ffs(3) calls
The ffs(3) family of functions is not portable.  MinGW doesn't always
provide the function.

Use ctz32() or ctz64() instead.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1427124571-28598-10-git-send-email-stefanha@redhat.com
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28 15:36:08 +02:00
Stefan Hajnoczi
f450a85899 os-win32: drop ffs(3) prototype
The lack of ffs(3) in the MinGW headers is a hint that we shouldn't rely
on it.  MinGW 4.9.2 does not make it available for linking when QEMU's
./configure --enable-debug is used (release builds are fine though).

Now that all QEMU code has been switched to ctz32() there is no need for
ffs(3).

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1427124571-28598-9-git-send-email-stefanha@redhat.com
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28 15:36:08 +02:00
Paolo Bonzini
41074f3d3f omap_intc: convert ffs(3) to ctz32() in omap_inth_sir_update()
Rewrite the loop using level &= level - 1 to clear the least significant
bit after each iteration.  This simplifies the loop and makes it easy to
replace ffs(3) with ctz32().

Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1427124571-28598-8-git-send-email-stefanha@redhat.com
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28 15:36:08 +02:00
Stefan Hajnoczi
c9d9331851 sd: convert sd_normal_command() ffs(3) call to ctz32()
ffs() cannot be replaced with ctz32() when the argument might be zero,
because ffs(0) returns 0 while ctz32(0) returns 32.

The ffs(3) call in sd_normal_command() is a special case though.  It can
be converted to ctz32() + 1 because the argument is never zero:

  if (!(req.arg >> 8) || (req.arg >> (ctz32(req.arg & ~0xff) + 1))) {
      ~~~~~~~~~~~~~~~
            ^--------------- req.arg cannot be zero

Cc: Markus Armbruster <armbru@redhat.com>
Cc: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1427124571-28598-7-git-send-email-stefanha@redhat.com
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28 15:36:08 +02:00
Stefan Hajnoczi
bd2a88840e Convert ffs() != 0 callers to ctz32()
There are a number of ffs(3) callers that do roughly:

  bit = ffs(val);
  if (bit) {
      do_something(bit - 1);
  }

This pattern can be converted to ctz32() like this:

  zeroes = ctz32(val);
  if (zeroes != 32) {
      do_something(zeroes);
  }

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1427124571-28598-6-git-send-email-stefanha@redhat.com
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28 15:36:08 +02:00
Stefan Hajnoczi
786a4ea82e Convert (ffs(val) - 1) to ctz32(val)
This commit was generated mechanically by coccinelle from the following
semantic patch:

@@
expression val;
@@
- (ffs(val) - 1)
+ ctz32(val)

The call sites have been audited to ensure the ffs(0) - 1 == -1 case
never occurs (due to input validation, asserts, etc).  Therefore we
don't need to worry about the fact that ctz32(0) == 32.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1427124571-28598-5-git-send-email-stefanha@redhat.com
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28 15:36:08 +02:00
Stefan Hajnoczi
5863d374a3 uninorth: convert ffs(3) to ctz32()
It is not clear from the code how a 0 parameter should be handled by the
hardware.  Keep the same behavior as ffs(0) - 1 == -1.

Cc: Alexander Graf <agraf@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1427124571-28598-4-git-send-email-stefanha@redhat.com
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28 15:36:08 +02:00
Stefan Hajnoczi
ad5f5fdca8 hw/arm/nseries: convert ffs(3) to ctz32()
It is not clear from the code how a 0 parameter should be handled by the
hardware.  Keep the same behavior as ffs(0) - 1 == -1.

Cc: Andrzej Zaborowski <balrog@zabor.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1427124571-28598-3-git-send-email-stefanha@redhat.com
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28 15:36:08 +02:00
Stefan Hajnoczi
588ef9d411 bt-sdp: fix broken uuids power-of-2 calculation
The binary search in sdp_uuid_match() only works when the number of
elements to search is a power of two.

  lo = record->uuid;
  hi = record->uuids;
  while (hi >>= 1)
      if (lo[hi] <= val)
          lo += hi;

  return *lo == val;

I noticed that the record->uuids calculation in
sdp_service_record_build() was suspect:

  record->uuids = 1 << ffs(record->uuids - 1);

Unlike most ffs(val) - 1 users, the expression is ffs(val - 1)!

Actually ffs() is the wrong function to use for power-of-2.  Use
pow2ceil() to achieve the correct effect.  Now the record->uuid[] array
is sized correctly and the binary search in sdp_uuid_match() should
work.

I'm not sure how to run/test this code.

Cc: Andrzej Zaborowski <balrog@zabor.org>
Cc: qemu-stable@nongnu.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1427124571-28598-2-git-send-email-stefanha@redhat.com
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28 15:36:08 +02:00
Alberto Garcia
ecdda9e03d MAINTAINERS: Add myself as the maintainer of the Quorum driver
Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: 1426522925-14444-1-git-send-email-berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28 15:36:08 +02:00
Yi Wang
407bc15033 savevm: create snapshot failed when id_str already exists
The command "virsh create" will fail in such condition: vm has two
disks: vda and vdb. vda has snapshot s1 with id "1", vdb doesn't have
s1 but has snapshot s2 with id "1".  When we want to run command "virsh
create s1", del_existing_snapshots() only deletes s1 in vda, and
bdrv_snapshot_create() tries to create vdb's snapshot s1 with id "1",
but id "1" alreay exists in vdb with name "s2"!

The simplest way is call find_new_snapshot_id() unconditionally.

Signed-off-by: Yi Wang <up2wing@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-04-28 15:36:08 +02:00
Peter Maydell
84cbd63f87 X86 queue, 2015-04-27 (v2)
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABCAAGBQJVPoMfAAoJECgHk2+YTcWmzuQQAIOf4o/0t/lV87uR1/smP+E/
 mvdWm9FiaRXkc0SYrIdpNoPyyMfNnzwDocywIaMcMNEzxMe1oP/objm3yWuOpeeH
 Q26oMt/phXsPBRl5vxhAQ29LxuCagDivRFyTUxa7E8/D9GKmVFE6/orwHeMfwUEs
 frILNM5lCLiXk4dgeQPW9xWb8iAm/RVRWFRIya5HVh51PjjsN/iLQDtSLM5Ivvvn
 Pn5+cspcjD6bjM3tGiXYJZtUdoKh/7Dc8QLHhOOGd0XGysnfEbsmNQbk5Kl62PL/
 3Ug0yQWpecM5IrmPWi/Omh2YJNdTbrjsOMaLWZv2LtWl/o8mD51rPJeK9Pfym/26
 dsQgcYK/0GowEOxb/PUkWoX7eNziUyKryksgtdLxyrcllO5QKhrkTibDh8UOKYwI
 X6qEqrhCJ8pN6GhFJYK/2ieMajOoa3TE7WE5CenEPFyEmDqQkKWrjUyc0YZjcZ98
 cHJESCyXxaetzpwt17WT+vaEwpKyj7RBBS55fuV7S6+QRpWewyvxsKcEe7nfHDPe
 k7VOkG7YD0LX9dnoFGLvHbxms/QUji5K2rprvZYdA5PNywTwRsyW58XR/G1jOU2Q
 oDkLL+B52A3+cBUz4n7yrP4GumvxTR9mfOGbvdKXRYHyJka74aoGecPblAVz3ew4
 UTgqXmdlJzA5SzyPObcq
 =t4Wd
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/ehabkost/tags/x86-pull-request' into staging

X86 queue, 2015-04-27 (v2)

# gpg: Signature made Mon Apr 27 19:42:39 2015 BST using RSA key ID 984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF  D1AA 2807 936F 984D C5A6

* remotes/ehabkost/tags/x86-pull-request:
  target-i386: Remove AMD feature flag aliases from CPU model table
  target-i386: X86CPU::xlevel2 QOM property
  target-i386: Make "level" and "xlevel" properties static
  qemu-config: Accept empty option values
  MAINTAINERS: Change status of X86 to Maintained
  MAINTAINERS: Add myself to X86

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-04-28 12:22:20 +01:00
Peter Maydell
54965ee61d NUMA queue, 2015-04-27
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABCAAGBQJVPnmrAAoJECgHk2+YTcWm9fkQALwhU06fh1C3zOqW162cYnPi
 omkrj89606UzbpC9/NxP1XsjJcXysjD2Dn+yZ5F4WKwHOuqRcoj9wdp/PdE1+lUt
 +Dwl3+c0waVmRm91fBA71EgkENGpd1+d0BmddAKhCcoM/R1qtH/RJBVmmojHnt7g
 43uAVunV2PLpnW00nug4fJZnjS/MN3d5MDyJLw0AbaLLGDojG7r9sI6UMbOK8I2l
 zjCX+RUDAzf4QPPszwvwD+2JsZOSUoqNRts42FxGe1AOMvH4REBnIJZzbFoZ58lr
 tqL+1vJYx6HHvvZu0/gWSD4tmQenWelhlfVIig1L02srJKLq2Zw6oq18D9QOJXdd
 7a/lkM+RRCFYwBQR/74ZIFEvS5VxtJzQbxpuHAf0A5tjHzV7E97Nz7SwvuSAAdCv
 aKQSKqMa5aQFxCaUK6sl7u5X08Av9o6cNg2z7Ri+hFFVXQu1JIKoNFxPH9As/gDZ
 YtGSrsVDBEYHxLF70iLXWEMdxM5j5Aec1HIBYWGXA930hSXDxZYRSFnbRhiPe7HI
 wArOPvk81w/DrD0Pqy+4r9emxPsiDbAn/U03VYHVoSZXl6BbHN8gMnyF+X0sZT3f
 kaLQEFihzTHron7b3nzfrFsktkbrRWnxCeqVhhMn0yPCZwJV6I/MD01FcDdTe7y0
 f16IYLpvqMDTDT+HUt/d
 =P+kR
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/ehabkost/tags/numa-pull-request' into staging

NUMA queue, 2015-04-27

# gpg: Signature made Mon Apr 27 19:02:19 2015 BST using RSA key ID 984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF  D1AA 2807 936F 984D C5A6

* remotes/ehabkost/tags/numa-pull-request:
  MAINTAINERS: Add myself as NUMA code maintainer

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-04-28 11:33:47 +01:00