Since cc0681c454 ("block: Enable the new
throttling code in the block layer.") bdrv_drain_all() no longer spins.
The code used to look as follows:
do {
busy = qemu_aio_wait();
/* FIXME: We do not have timer support here, so this is effectively
* a busy wait.
*/
QTAILQ_FOREACH(bs, &bdrv_states, list) {
while (qemu_co_enter_next(&bs->throttled_reqs)) {
busy = true;
}
}
} while (busy);
Note that throttle requests are kicked but I/O throttling limits are
still in effect. The loop spins until the vm_clock time allows the
request to make progress and complete.
The new throttling code introduced bdrv_start_throttled_reqs(). This
function not only kicks throttled requests but also temporarily disables
throttling so requests can run.
The outdated FIXME comment can be removed. Also drop the busy = true
assignment since we overwrite it immediately afterwards.
Reviewed-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
when doing very large jobs updating the progress only every 2%
is too rare.
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
this patch shortens requests to end at an aligned sector so that
the next request starts aligned.
[Squashed Peter's fix for bdrv_get_info() failure discussed on the
mailing list.
--Stefan]
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Adding xhci support to seabios made it jump over the 128k line.
Changing the bios size breaks migration, so we have to keep a
128k seabios binary for old machine types. New machine types can
use a large 256k bios which should be big enougth for a while.
This patch updates the seabios build process to build seabios twice,
once full featured and once with xen and xhci turned off so the
resulting binary is small enougth to fit into 128k.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Updates seabios to git master snapshot. seabios is in freeze now,
update to final 1.7.4 will follow later this year.
Summary of major changes:
* Support for acpi table loading from qemu.
* Support for the xhci host adapter.
* Support for the pvscsi HBA.
* Various minor bug fixes.
* Lots of cleanups.
Full shortlog since 1.7.3 (note that some of these changes have been
cherry-picked into 1.7.3-stable):
Evgeny Budilovsky (1):
Add pvscsi boot support
Gerd Hoffmann (27):
coreboot: add cbmem console support
Add CONFIG_DEBUG_COREBOOT config option
apm: fix shutdown
ahci: add missing check for allocation failure
bochsvga: fallback to stdvga if dispi interface isn't present
Add generic qemu detection
Drop coreboot qemu detection
Add qemu detection to csm
uas: add (temporary) superspeed stopgap
usb: add usb_update_pipe()
usb: add xhci support
fix buildversion.sh
build: simplify cross builds
build: create output dirs in do-kconfig
build: explicitly set ROM size
Add qemu_cfg_e820 function.
Add support for etc/e820 fw_cfg file
pci: don't reorder entries when moving to 64bit list
pci: don't map usb host adapters above 4G
pci: align 64bit pci regions to 1G
pci: tweak + comment minimum allocations
pci: log pci windows
pci: map 64-bit BARs at location provided by emulator
ahci: zap real mode macros
ahci: remote some parentheses
ahci: alloc structs in high memory
add hw/serialio.c to SRC32SEG
Jonathan A. Kollasch (1):
vgahooks: add SM720 VGA BIOS hooks for WIN Enterprises MB-60470
Kevin O'Connor (80):
Fix USB EHCI detection that was broken in hlist conversion of PCIDevices.
Update README to include info on VARLOW variables.
PIC code cleanups.
Move internal timer code from clock.c to a new file timer.c.
Don't pass khz to pmtimer_setup - it's always PM_TIMER_FREQUENCY.
Add helper functions to convert timer irqs to milliseconds.
Improve accuracy of internal timers.
Rename cpu_khz to TimerKHz.
Shift CPU TSC down to reduce need for 64bit variables.
Rename check_timer() function (and similar) to irqtimer_check().
Rename check_tsc() (and similar) to timer_check() and use u32.
Separate out timer setup code.
Unify pmtimer_read() and pittimer_read() code.
Default unused UMB areas to be read-only.
Add missing mathcp_setup() call to CSM code.
Fix bug in CBFS file walking with compressed files.
Support custom boot menu prompt and custom boot menu key.
Minor cleanups to smm assembler.
Add config option to support memory allocations in 9-segment.
Minor - no need to declare MaxCountCPUs as VARFSEG.
Minor - simplify rom_reserve().
Rename tools/ directory to scripts/ directory.
Update kconfig to latest version.
build: Don't use vpath makefile directive.
Move code centered around specific hardware devices to src/hw/
Move code cenetered around firmware initialization to src/fw/
build: Reorder makefile source list to group like files together.
README: Update readme to note scripts/ directory rename and vgasrc/ directory.
vgabios: Rename stdvga_bpp_factor to stdvga_vram_ratio.
vgabios: Limit the range of the VBE number of "pages" parameter.
readme: Minor - fix typo in readme.
Split x86 specific functions out of util.c/h to new files x86.c/h.
Move keyboard calling code from util.c to boot.c.
Rename util.c to string.c and introduce string.h.
build: Perform compile checking on vgasrc code.
Move stacks.c definitions from util.h to new file stacks.h.
Move romfile definitions from util.h to new file romfile.h.
Move malloc code from pmm.c to new files malloc.c and malloc.h.
Move function definitions for output.c from util.h to new file output.h.
Move definition of struct segoff_s from farptr.h to types.h.
build: Fix import of gcc dependency files.
Move pirtable definitions from hw/pci.h to std/pirtable.h and util.h.
Move optionroms.h to std/optionrom.h and util.h.
Move vbe.h to std/vbe.h.
Move fw/LegacyBios.h to std/LegacyBios.h and remove csm.h.
Move fw/smbios.h to std/smbios.h.
Move fw/mptable.h to std/mptable.h.
Move fw/acpi.h to std/acpi.h.
Move pnpbios definition to new file std/pnpbios.h.
Move pmm definitions to new file std/pmm.h.
Split disk.h into block.h and std/disk.h.
Move standard bda type info from biosvar.h to std/bda.h.
Merge bmp.h, boot.h, jpeg.h, and post.h into util.h.
Sort the sections of util.h.
Move PIT setup from clock.c to hw/timer.c.
Rename hw/cmos.h to hw/rtc.h and copy RTC code from clock.c to hw/rtc.c.
Move dma code to new file hw/dma.c.
Remove ioport.h; disperse its contents to other header files.
Minor - update file comments in src/malloc.c.
Rename fields of 'struct chs_s' and use in floppy lba2chs().
Rearrange stack_hop_back() call in wait_irq, check_irqs, and _farcall16.
Minor - move call16 assembler in romlayout.S.
Make __call16 use C calling convention and support two passed parameters.
Update _farcall16() to pass segment of callregs explicitly.
Support call16() calls after entering 32bit mode from call32().
Run ahci code entirely in 32bit mode.
Build different final files for QEMU, coreboot, and CSM.
Convert op->drive_g from a 16bit pointer to a 32 bit "GLOBALFLAT" pointer.
megasas: Don't attempt to access 'struct pci_device' at runtime.
Minor - eliminate the SET_GLOBAL macro.
Move low-level hardware writing from output.c to new file hw/serialio.c.
vgabios: Load the DAC palette in "packed" modes on Cirrus and BochsVGA.
vgabios: Support custom fonts in vga framebuffer text writing.
vgabios: Add bochsvga "HDTV" resolutions.
vgabios: Avoid possible divide by zero in bochsvga_set_displaystart.
vgabios: Work around lack of support for "calll" in x86emu emulation.
Minor - update file comment on bootsplash.c.
vgabios: Support allocating an extra stack for vgabios calls and default on.
vgabios: Move initialization code to new file vgainit.c.
floppy: Minor - add warnings if timeouts occur.
Michael S. Tsirkin (6):
acpi: sync FADT flags from PIIX4 to Q35
acpi_extract.py: document DEVICE directives
biostables: support looking up RSDP
romfile_loader: utility to patch in-memory ROM files
acpi: load and link tables through romfile loader
acpi: strip compiler info in built-in DSDT if any
Paul Menzel (2):
ACPI DSDT: Make control method `IQCR` serialized
hw/usb-xhci.c: Code refactoring to not override initializers in `speed_from_xhci[16]`
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Fix cpuid leaf 0x0d which incorrectly parsed eax and ebx.
However, before this patch the CPUID worked fine -- the .offset
field contained the size _and_ was stored in the register that
is supposed to hold the size (eax), and likewise the .size field
contained the offset _and_ was stored in the register trhat is
supposed to hold the offset (ebx).
Signed-off-by: Liu Jinsong <jinsong.liu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
since the convert process is basically a sync operation it might
be benificial in some case to change the hardcoded I/O buffer
size to a greater value.
This patch increases the I/O buffer size if the output
driver advertises an optimal transfer length or discard alignment
that is greater than the default buffer size of 2M.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
this patch aims to set bdi->cluster_size to the internal page size
of the iscsi target so that enabled callers can align requests
properly.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
we currently do not check if a sector is allocated during convert.
This means if a sector is unallocated that we allocate a bounce
buffer of zeroes, find out its zero later and do not write it
in the best case. In the worst case this can lead to reading
blocks from a raw device (like iSCSI) altough we could easily
know via get_block_status that they are zero and simply skip them.
This patch also fixes the progress output not being at 100% after
a successful conversion.
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Now qemu-img convert have similar options as qemu-nbd for internal
snapshot.
Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This case can't run when IMGPROTO=nbd, since it needs to create some
internal snapshot which would fail for EOF write request, even when
TEST_IMG is exported with "-f raw" in common.rc, so set _supported_proto
to file.
_require_command() is changed to tip what util is missing, instead
of printing a blank.
Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Now it is possible to directly export an internal snapshot, which
can be used to probe the snapshot's contents without qemu-img
convert.
Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Since later this function will be used so improve it. The only caller of it
now is qemu-img, and it is not impacted by introduce function
bdrv_snapshot_load_tmp_by_id_or_name() that call bdrv_snapshot_load_tmp()
twice to keep old search logic. bdrv_snapshot_load_tmp_by_id_or_name() return
int to let caller know the errno, and errno will be used later.
Also fix a typo in comments of bdrv_snapshot_delete().
Signed-off-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Format "raw" doesn't always work on certain file systems (e.g. tmpfs).
Use qcow2 to make the allocation status explicit and split into a new
case.
[Resolved merge conflict due to "qemu-io> " prompt filter, added 074 to
group file, and fixed up s/048/074/ copy-paste mistake.
--Stefan]
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
So that the tests can run faster.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This replaces _unsupported_qemu_io_options and check for support of
current cache mode, and allow to provide a default if user didn't
specify.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This will allow overriding cache mode from the "-c mode" option.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The option sets cache mode used in the tests. "-nocache" is changed to
an alias to "-c none", and internally passes "-t none" to qemu-io.
Python scripts will make use of option this in the next commit.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Strictly speaking, this is only required for has_zero_init() == false,
but it's easy enough to just do a cluster-aligned write that is padded
with zeros after the header.
This fixes that after 'qemu-img create' header extensions are attempted
to be parsed that are really just random leftover data.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Leaving the backing file open although it is not needed anymore can
cause problems if it is opened through a block driver which allows
exclusive access only and if the create function of the block driver
used for the top image (the one being created) tries to close and reopen
the image file (which will include opening the backing file a second
time).
In particular, this will happen with a backing file opened through
qemu-nbd and using qcow2 as the top image file format (which reopens the
image to flush it to disk).
In addition, the BlockDriverState in bdrv_img_create() is used for the
backing file only; it should therefore be made local to the respective
block.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Fetch the data to be written from the input buffer. If it is all zeroes,
we can use the write_zeroes call (possibly with the new MAY_UNMAP flag).
Otherwise, do as many write cycles as needed, writing 512k at a time.
Strictly speaking, this is still incorrect because a zero cluster should
only be written if the MAY_UNMAP flag is set. But this is a bug in qcow2
and the other formats, not in the SCSI code.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Since we report ANC_SUP==0 in VPD page B2h, we need to return
an error (ILLEGAL REQUEST/INVALID FIELD IN CDB) for all WRITE SAME
requests with ANCHOR==1.
Inspired by a similar patch to the LIO in-kernel target.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This is the same that is already done for WRITE SAME.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The code is similar to the implementation of discard and write_zeroes
with UNMAP. However, failure must be propagated up to block.c.
The stale page cache problem can be reproduced as follows:
# modprobe scsi-debug lbpws=1 lbprz=1
# ./qemu-io /dev/sdXX
qemu-io> write -P 0xcc 0 2M
qemu-io> write -z 0 1M
qemu-io> read -P 0x00 0 512
Pattern verification failed at offset 0, 512 bytes
qemu-io> read -v 0 512
00000000: cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc cc ................
...
# ./qemu-io --cache=none /dev/sdXX
qemu-io> write -P 0xcc 0 2M
qemu-io> write -z 0 1M
qemu-io> read -P 0x00 0 512
qemu-io> read -v 0 512
00000000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
...
And similarly with discard instead of "write -z".
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
See the next commit for the description of the Linux kernel problem
that is worked around in raw_open_common.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Writing zeroes to a file can be done by punching a hole if
MAY_UNMAP is set.
Note that in this case ENOTSUP is not ignored, but makes
the block layer fall back to the generic implementation.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The current check is right for MAY_UNMAP=1. For MAY_UNMAP=0, just
try and fall back to regular writes as soon as a WRITE SAME command
fails.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
added myself to reflect recent work on the iscsi block driver.
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
since commit 3ac21627 the default value changed to 0.
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This will let misaligned but large requests use zero clusters. This
is important because the cluster size is not guest visible.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Right now, bdrv_co_do_write_zeroes will only try to align the
beginning of the request. However, it is simpler for many
formats to expect the block layer to separate both the head *and*
the tail. This makes sure that the format's bdrv_co_write_zeroes
function will be called with aligned sector_num and nb_sectors for
the bulk of the request.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Similar to write_zeroes, let the generic code receive a ENOTSUP for
discard operations. Since bdrv_discard has advisory semantics,
we can just swallow the error.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This will be used by the SCSI layer.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This lets bdrv_co_do_rw receive flags, so that it can be used for
zero writes.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
bdrv_co_discard is only covering drivers which have a .bdrv_co_discard()
implementation, but not those with .bdrv_aio_discard(). Not very nice,
and easy to avoid.
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>