mirrors/qemu - qemu - SynapseOS git

Author	SHA1	Message	Date
Kevin Wolf	83fd6dd3e7	block: Move bdrv_commit() to block/commit.c No code changes, just moved from one file to another. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-05 16:46:27 +02:00
Kevin Wolf	adad6496c5	block: Convert bdrv_co_do_readv/writev to BdrvChild Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-05 16:46:27 +02:00
Kevin Wolf	0d1049c7d1	block: Convert bdrv_aio_writev() to BdrvChild Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-05 16:46:26 +02:00
Kevin Wolf	ebb7af2173	block: Convert bdrv_aio_readv() to BdrvChild Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-05 16:46:26 +02:00
Kevin Wolf	25ec177d90	block: Convert bdrv_co_writev() to BdrvChild Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-05 16:46:26 +02:00
Kevin Wolf	28b04a8f65	block: Convert bdrv_co_readv() to BdrvChild Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-05 16:46:26 +02:00
Kevin Wolf	db1e80ee2e	vhdx: Some more BlockBackend use in vhdx_create() This does some easy conversions from bdrv_* to blk_* functions in vhdx_create(). We should avoid bypassing the BlockBackend layer whenever possible. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-05 16:46:26 +02:00
Kevin Wolf	e858a9705c	blkreplay: Convert to byte-based I/O The blkreplay driver only forwards the requests it gets, so converting it to byte granularity is trivial. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Max Reitz <mreitz@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-05 16:46:26 +02:00
Kevin Wolf	eecc77473b	vvfat: Use BdrvChild for s->qcow vvfat uses a temporary qcow file to cache written data in read-write mode. In order to do things properly, this should show up in the BDS graph and I/O should go through BdrvChild like for every other node. Signed-off-by: Kevin Wolf <kwolf@redhat.com> Acked-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-05 16:46:26 +02:00
Kevin Wolf	a9d52a7563	block/qdev: Fix NULL access when using BB twice BlockBackend has only a single pointer to its guest device, so it makes sure that only a single guest device is attached to it. device-add returns an error if you try to attach a second device to a BB. In order to make the error message nicer, -device that manually connects to a if=none block device get a different message than -drive that implicitly creates a guest device. The if=... option is stored in DriveInfo. However, since blockdev-add exists, not every BlockBackend has a DriveInfo any more. Check that it exists before we dereference it. QMP reproducer resulting in a segfault: {"execute":"blockdev-add","arguments":{"options":{"id":"disk","driver":"file","filename":"/tmp/test.img"}}} {"execute":"device_add","arguments":{"driver":"virtio-blk-pci","drive":"disk"}} {"execute":"device_add","arguments":{"driver":"virtio-blk-pci","drive":"disk"}} Signed-off-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Eric Blake <eblake@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>	2016-07-05 16:46:26 +02:00
Denis V. Lunev	1c42f149dd	block: fix return code for partial write for Linux AIO Partial write most likely means that there is not space rather than "something wrong happens". Thus it would be more natural to return ENOSPC rather than EINVAL. The problem actually happens with NBD server, which has reported EINVAL rather then ENOSPC on the first error using its protocol, which makes report to the user wrong. Signed-off-by: Denis V. Lunev <den@openvz.org> CC: Pavel Borzenkov <pborzenkov@virtuozzo.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:26 +02:00
Eric Blake	5411541270	block: Use bool as appropriate for BDS members Using int for values that are only used as booleans is confusing. While at it, rearrange a couple of members so that all the bools are contiguous. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:26 +02:00
Eric Blake	8cc9c6e92b	block: Fix error message style error_setg() is not supposed to be used for multi-sentence messages; tweak the message to append a hint instead. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:26 +02:00
Eric Blake	a5b8dd2ce8	block: Move request_alignment into BlockLimit It makes more sense to have ALL block size limit constraints in the same struct. Improve the documentation while at it. Simplify a couple of conditionals, now that we have audited and documented that request_alignment is always non-zero. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:26 +02:00
Eric Blake	d9e0dfa246	block: Split bdrv_merge_limits() from bdrv_refresh_limits() During bdrv_merge_limits(), we were computing initial limits based on another BDS in two places. At first glance, the two computations are not identical (one is doing straight copying, the other is doing merging towards or away from zero) - but when you realize that the first round is starting with all-0 memory, all of the merging happens to work. Factoring out the merging makes it easier to track how two BDS limits are merged, in case we have future reasons to merge in even more limits. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:26 +02:00
Eric Blake	ad82be2f4f	block: Drop raw_refresh_limits() The raw block driver was blindly copying all limits from bs->file, even though: 1. the main bdrv_refresh_limits() already does this for many of the limits, and 2. blindly copying from the children can weaken any stricter limits that were already inherited from the backing chain during the main bdrv_refresh_limits(). Also, a future patch is about to move .request_alignment into BlockLimits, and that is a limit that should NOT be copied from other layers in the BDS chain. Thus, we can completely drop raw_refresh_limits(), and rely on the block layer setting up the proper limits. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:25 +02:00
Eric Blake	b9f7855a50	block: Switch discard length bounds to byte-based Sector-based limits are awkward to think about; in our on-going quest to move to byte-based interfaces, convert max_discard and discard_alignment. Rename them, using 'pdiscard' as an aid to track which remaining discard interfaces need conversion, and so that the compiler will help us catch the change in semantics across any rebased code. The BlockLimits type is now completely byte-based; and in iscsi.c, sector_limits_lun2qemu() is no longer needed. pdiscard_alignment is made unsigned (we use power-of-2 alignments as bitmasks, where unsigned is easier to think about) while leaving max_pdiscard signed (since we still have an 'int' interface); this is comparable to what commit `cf081fc` did for write zeroes limits. We may later want to make everything an unsigned 64-bit limit - but that requires a bigger code audit. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:25 +02:00
Eric Blake	29cc6a6834	block: Wording tweaks to write zeroes limits Improve the documentation of the write zeroes limits, to mention additional constraints that drivers should observe. Worth squashing into commit `cf081fca`, if that hadn't been pushed already :) Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:25 +02:00
Eric Blake	5def6b80e1	block: Switch transfer length bounds to byte-based Sector-based limits are awkward to think about; in our on-going quest to move to byte-based interfaces, convert max_transfer_length and opt_transfer_length. Rename them (dropping the _length suffix) so that the compiler will help us catch the change in semantics across any rebased code, and improve the documentation. Use unsigned values, so that we don't have to worry about negative values and so that bit-twiddling is easier; however, we are still constrained by 2^31 of signed int in most APIs. When a value comes from an external source (iscsi and raw-posix), sanitize the results to ensure that opt_transfer is a power of 2. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:25 +02:00
Eric Blake	79ba8c986a	block: Set default request_alignment during bdrv_refresh_limits() We want to eventually stick request_alignment alongside other BlockLimits, but first, we must ensure it is populated at the same time as all other limits, rather than being a special case that is set only when a block is first opened. Now that all drivers have been updated to supply an override of request_alignment during their .bdrv_refresh_limits(), as needed, the block layer itself can defer setting the default alignment until part of the overall bdrv_refresh_limits(). Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:25 +02:00
Eric Blake	a65064816d	block: Set request_alignment during .bdrv_refresh_limits() We want to eventually stick request_alignment alongside other BlockLimits, but first, we must ensure it is populated at the same time as all other limits, rather than being a special case that is set only when a block is first opened. Add a .bdrv_refresh_limits() to all four of our legacy devices that will always be sector-only (bochs, cloop, dmg, vvfat), in spite of their recent conversion to expose a byte interface. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:25 +02:00
Eric Blake	2914a1de99	raw-win32: Set request_alignment during .bdrv_refresh_limits() We want to eventually stick request_alignment alongside other BlockLimits, but first, we must ensure it is populated at the same time as all other limits, rather than being a special case that is set only when a block is first opened. In this case, raw_probe_alignment() already did what we needed, so just fix its signature and wire it in correctly. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:25 +02:00
Eric Blake	a84178ccff	qcow2: Set request_alignment during .bdrv_refresh_limits() We want to eventually stick request_alignment alongside other BlockLimits, but first, we must ensure it is populated at the same time as all other limits, rather than being a special case that is set only when a block is first opened. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:25 +02:00
Eric Blake	c8b3b998e2	iscsi: Set request_alignment during .bdrv_refresh_limits() We want to eventually stick request_alignment alongside other BlockLimits, but first, we must ensure it is populated at the same time as all other limits, rather than being a special case that is set only when a block is first opened. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:25 +02:00
Eric Blake	835db3ee7b	blkdebug: Set request_alignment during .bdrv_refresh_limits() We want to eventually stick request_alignment alongside other BlockLimits, but first, we must ensure it is populated at the same time as all other limits, rather than being a special case that is set only when a block is first opened. Note that when the user does not provide "align", then we were defaulting to bs->request_alignment - but at this stage in the initialization, that was always 512. We were also rejecting an explicit "align":0 from the user; this patch now allows that, as an explicit request for the default alignment (which may not always be 512 in the future). qemu-iotests 77 is particularly sensitive to the fact that we can specify an artificial alignment override in blkdebug, and that override must continue to work even when limits are refreshed on an already open device. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:25 +02:00
Eric Blake	24ce9a2026	block: Give nonzero result to blk_get_max_transfer_length() Making all callers special-case 0 as unlimited is awkward, and we DO have a hard maximum of BDRV_REQUEST_MAX_SECTORS given our current block layer API limits. In the case of scsi, this means that we now always advertise a limit to the guest, even in cases where the underlying layers previously use 0 for no inherent limit beyond the block layer. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:25 +02:00
Eric Blake	efaf4781a9	scsi: Advertise limits by blocksize, not 512 s->blocksize may be larger than 512, in which case our tweaks to max_xfer_len and opt_xfer_len must be scaled appropriately. CC: qemu-stable@nongnu.org Reported-by: Fam Zheng <famz@redhat.com> Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:25 +02:00
Eric Blake	f9e95af0a6	iscsi: Advertise realistic limits to block layer The function sector_limits_lun2qemu() returns a value in units of the block layer's 512-byte sector, and can be as large as 0x40000000, which is much larger than the block layer's inherent limit of BDRV_REQUEST_MAX_SECTORS. The block layer already handles '0' as a synonym to the inherent limit, and it is nicer to return this value than it is to calculate an arbitrary maximum, for two reasons: we want to ensure that the block layer continues to special-case '0' as 'no limit beyond the inherent limits'; and we want to be able to someday expand the block layer to allow 64-bit limits, where auditing for uses of BDRV_REQUEST_MAX_SECTORS will help us make sure we aren't artificially constraining iscsi to old block layer limits. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:25 +02:00
Eric Blake	202204717a	nbd: Advertise realistic limits to block layer We were basing the advertisement of maximum discard and transfer length off of UINT32_MAX, but since the rest of the block layer has signed int limits on a transaction, nothing could ever reach that maximum, and we risk overflowing an int once things are converted to byte-based rather than sector-based limits. What's more, we DO have a much smaller limit: both the current kernel and qemu-nbd have a hard limit of 32M on a read or write transaction, and while they may also permit up to a full 32 bits on a discard transaction, the upstream NBD protocol is proposing wording that without any explicit advertisement otherwise, clients should limit ALL requests to the same limits as read and write, even though the other requests do not actually require as many bytes across the wire. So the better limit to tell the block layer is 32M for both values. Behavior doesn't actually change with this patch (the block layer is currently ignoring the max_transfer advertisements); but when that problem is fixed in a later series, this patch will prevent the exposure of a latent bug. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:24 +02:00
Eric Blake	476b923c32	nbd: Allow larger requests The NBD layer was breaking up request at a limit of 2040 sectors (just under 1M) to cater to old qemu-nbd. But the server limit was raised to 32M in commit `2d8214885` to match the kernel, more than three years ago; and the upstream NBD Protocol is proposing documentation that without any explicit communication to state otherwise, a client should be able to safely assume that a 32M transaction will work. It is time to rely on the larger sizing, and any downstream distro that cares about maximum interoperability to older qemu-nbd servers can just tweak the value of #define NBD_MAX_SECTORS. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Acked-by: Paolo Bonzini <pbonzini@redhat.com> Cc: qemu-stable@nongnu.org Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:24 +02:00
Eric Blake	82524274ea	block: Fix harmless off-by-one in bdrv_aligned_preadv() If the amount of data to read ends exactly on the total size of the bs, then we were wasting time creating a local qiov to read the data in preparation for what would normally be appending zeroes beyond the end, even though this corner case has nothing further to do. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:24 +02:00
Eric Blake	a604fa2ba5	block: Document supported flags during bdrv_aligned_preadv() We don't pass any flags on to drivers to handle. Tighten an assert to explain why we pass 0 to bdrv_driver_preadv(), and add some comments on things to be aware of if we want to turn on per-BDS BDRV_REQ_FUA support during reads in the future. Also, document that we may want to consider using unmap during copy-on-read operations where the read is all zeroes. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:24 +02:00
Eric Blake	cff86b38ac	block: Tighter assertions on bdrv_aligned_pwritev() For symmetry with bdrv_aligned_preadv(), assert that the caller really has aligned things properly. This requires adding an align parameter, which is used now only in the new asserts, but will come in handy in a later patch that adds auto-fragmentation to the max transfer size, since that value need not always be a multiple of the alignment, and therefore must be rounded down. Signed-off-by: Eric Blake <eblake@redhat.com> Reviewed-by: Kevin Wolf <kwolf@redhat.com> Reviewed-by: Fam Zheng <famz@redhat.com> Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:24 +02:00
Denis V. Lunev	cfef6a45c7	qemu-img: fix failed autotests There are 9 iotests failed on Ubuntu 15.10 at the moment. The problem is that options parsing in qemu-img is broken by the following commit: commit `10985131e3` Author: Denis V. Lunev <den@openvz.org> Date: Fri Jun 17 17:44:13 2016 +0300 qemu-img: move common options parsing before commands processing This strange command line reports error ./qemu-img create -f qcow2 TEST_DIR/t.qcow2 -- 1024 qemu-img: Invalid image size specified! while original code parses it successfully. The problem is that getopt_long state should be reset. This could be done using this assignment according to the manual: optind = 0 Signed-off-by: Denis V. Lunev <den@openvz.org> CC: Eric Blake <eblake@redhat.com> CC: Kevin Wolf <kwolf@redhat.com> CC: Max Reitz <mreitz@redhat.com> Signed-off-by: Kevin Wolf <kwolf@redhat.com>	2016-07-05 16:46:24 +02:00
Peter Maydell	60a0f1af07	ipxe: update submodule from 4e03af8ec to 041863191 e1000e+vmxnet3: add boot rom -----BEGIN PGP SIGNATURE----- Version: GnuPG v2.0.22 (GNU/Linux) iQIcBAABAgAGBQJXegFqAAoJEEy22O7T6HE4BasP/A3kDhgn6J3dqJN85eVw9jcb 52zxttQuDJBibVKZaSIRiZAnBCTEn5WbUvZ5ZRTt5MVH3Whhv09ZhLfdWUJho9az Dnebv3wqyHg3SlAWFNBE3CjSkHswwIAhTrERqjJqXL2jU2erPAHIdTzprx1LdGc7 hKxNy9tPHjyEeGWLUWrTCIC/YcTNkIp1gdSNcVIQ8TbEHIqo0vSK1Jyl429GYVYD bZO6VbX5D/QAEHs6a+Twf5lFnLZ4hlplm+9SbfjsfaVocx/+D5VtMD0AgYukR60M sHNKqzRimHsLQoz1PV4z9+ZexRDdge4y0eNlVHM8XdXYq0AHybv2/MYW6DHBAXaE nayEPPrwNnOVnE95Xgn03CtpddbQ2hAUXObm8gm6aAJEpXh8SeL7YLAYrYJtF5Oq cpB0ks0zDZZRPAr/c0gy075IbB3ITGUcphFid9lpn5ork/YYD/y5//GVB22IrOvQ qT0IXtmL7OA71i7XOvxYejdTWcArpzgWnOXmh7eVjumxDicC+B3JIORZutKaXHDu w9LFnz5l1DyIM0nLEJb2onP4lXRlYdyeiSHjQi4HH6fLEqZERfKlWvftRiE7agGq nt6QaFFeSXqaaDOjfmPFLF2RmIgTHzit279WcWGUQXkbWKvQSCgRB26LBGZ6kFB2 Z1lTOF1b2zUjpKEsuFH2 =zaj0 -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/kraxel/tags/pull-ipxe-20160704-1' into staging ipxe: update submodule from 4e03af8ec to 041863191 e1000e+vmxnet3: add boot rom # gpg: Signature made Mon 04 Jul 2016 07:25:46 BST # gpg: using RSA key 0x4CB6D8EED3E87138 # gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>" # gpg: aka "Gerd Hoffmann <gerd@kraxel.org>" # gpg: aka "Gerd Hoffmann (private) <kraxel@gmail.com>" # Primary key fingerprint: A032 8CFF B93A 17A7 9901 FE7D 4CB6 D8EE D3E8 7138 * remotes/kraxel/tags/pull-ipxe-20160704-1: build: add pc-bios to config-host.mak deps ipxe: add new roms to BLOBS ipxe: update prebuilt binaries vmxnet3: add boot rom e1000e: add boot rom ipxe: add vmxnet3 rom ipxe: add e1000e rom ipxe: update submodule from 4e03af8ec to 041863191 Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2016-07-05 12:46:18 +01:00
Peter Maydell	8662d7db39	ppc patch queue for 2016-07-05 Here's the current ppc, sPAPR and related drivers patch queue. * The big addition is dynamic DMA window support (this includes some core VFIO changes) * There are also several fixes to the MMU emulation for bugs introduced with the HV mode patches * Several other bugfixes and cleanups Changes in v2: I messed up and forgot to make a fix in the last patch which BenH pointed out (introduced by my rebasing). That's fixed in this version, and I'm replacing the tag in place with the revised version. -----BEGIN PGP SIGNATURE----- Version: GnuPG v1 iQIcBAABAgAGBQJXe0WaAAoJEGw4ysog2bOSKacP/3C6ouCDPgHXKvXMzd81LCjb Z3eHxl8SvG/dh8s9yn606Bwt0WfmG1JHkEF3KjHLch12QOuEAxk+npJGWSjHDQlR vt2RknKS6tLMkYCWvzrEg6iDKyxTo4Ux2eXnEDjVDylB8cA2C5q0wqfKFYkfINwC Q3xyVgx9ohLjZDdGZhwW85dNQ9d/mu5DpKTkINSn6yPXtT+Bvimw3J/HhTFSKgr6 w1oWDZh/AYebLKopUFKdL/IFUT6fWV3vOiMwrg7eS4slWxc8JiyAqi7jFqO7sxNg 4tZmG2WnU72sfgLDmTYpFJ1Gy3EwLeRKUslPEqQPtBctrQBa3nQRtduKFnVvxpaS yj5ZIux6Rm7j+vI9VTVpgVm0ybWFqJwm7h4Po0LGpp3oTKRV5TjYLwP1oogU1XUW s0bQoyId88zt17AuV/wjP3Boz4KLzGiaOzgcT8Ct9zM1nxG4vr8CdXtp/RfHSQMe j+p4GgIdmxJtK+6r1iU1dYx62MJl/uR2Xa5atK253FJIFjX2k8qTGOu1uHXaJHhO Wccaq+ibqte59FsvdQpRmRZGggeoHxmJJ5CUdq5MKbqStd+e6JmeEGr/UpiVXDVL MknaavRvoBaRdjsvk22Ta5IQPKxGHmDNrknI2vtS/P+diHFTkovUJfWnYUSB/HWM JOOXB4Ds+PBpHxU6nsU0 =g+wU -----END PGP SIGNATURE----- Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.7-20160705' into staging ppc patch queue for 2016-07-05 Here's the current ppc, sPAPR and related drivers patch queue. * The big addition is dynamic DMA window support (this includes some core VFIO changes) * There are also several fixes to the MMU emulation for bugs introduced with the HV mode patches * Several other bugfixes and cleanups Changes in v2: I messed up and forgot to make a fix in the last patch which BenH pointed out (introduced by my rebasing). That's fixed in this version, and I'm replacing the tag in place with the revised version. # gpg: Signature made Tue 05 Jul 2016 06:28:58 BST # gpg: using RSA key 0x6C38CACA20D9B392 # gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>" # gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>" # gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>" # gpg: WARNING: This key is not certified with sufficiently trusted signatures! # gpg: It is not certain that the signature belongs to the owner. # Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392 * remotes/dgibson/tags/ppc-for-2.7-20160705: ppc/hash64: Fix support for LPCR:ISL ppc/hash64: Add proper real mode translation support target-ppc: Return page shift from PTEG search target-ppc: Simplify HPTE matching target-ppc: Correct page size decoding in ppc_hash64_pteg_search() ppc: simplify ppc_hash64_hpte_page_shift_noslb() spapr_pci/spapr_pci_vfio: Support Dynamic DMA Windows (DDW) vfio/spapr: Create DMA window dynamically (SPAPR IOMMU v2) vfio: Add host side DMA window capabilities vfio: spapr: Add DMA memory preregistering (SPAPR IOMMU v2) spapr_iommu: Realloc guest visible TCE table when starting/stopping listening ppc: simplify max_smt initialization in ppc_cpu_realizefn() spapr: Ensure thread0 of CPU core is always realized first ppc: Fix xsrdpi, xvrdpi and xvrspi rounding Signed-off-by: Peter Maydell <peter.maydell@linaro.org>	2016-07-05 11:14:27 +01:00
Benjamin Herrenschmidt	2c7ad80443	ppc/hash64: Fix support for LPCR:ISL We need to ignore the segment page size and essentially treat all pages as coming from a 4K segment. Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [dwg: Adjusted for differences in my version of the prereq patches] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-07-05 15:18:26 +10:00
Benjamin Herrenschmidt	912acdf487	ppc/hash64: Add proper real mode translation support This adds proper support for translating real mode addresses based on the combination of HV and LPCR bits. This handles HRMOR offset for hypervisor real mode, and both RMA and VRMA modes for guest real mode. PAPR mode adjusts the offsets appropriately to match the RMA used in TCG, but we need to limit to the max supported by the implementation (16G). This includes some fixes by Cédric Le Goater <clg@kaod.org> Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org> [dwg: Adjusted for differences in my version of the prereq patches] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-07-05 14:31:08 +10:00
David Gibson	949868633f	target-ppc: Return page shift from PTEG search ppc_hash64_pteg_search() now decodes a PTEs page size encoding, which it didn't previously do. This means we're now double decoding the page size because we check it int he fault path after ppc64_hash64_htab_lookup() returns. To avoid this duplication have ppc_hash64_pteg_search() and ppc_hash64_htab_lookup() return the page size from the PTE and use that in the callers instead of decoding again. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2016-07-05 14:31:08 +10:00
David Gibson	073de86aa9	target-ppc: Simplify HPTE matching ppc_hash64_pteg_search() explicitly checks each HPTE's VALID and SECONDARY bits, then uses the HPTE64_V_COMPARE() macro to check the B field and AVPN. However, a small tweak to HPTE64_V_COMPARE() means we can check all of these bits at once with a suitable ptem value. So, consolidate all the comparisons for simplicity. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2016-07-05 14:31:08 +10:00
David Gibson	651060aba7	target-ppc: Correct page size decoding in ppc_hash64_pteg_search() The architecture specifies that when searching a PTEG for PTEs, entries with a page size encoding that's not valid for the current segment should be ignored, continuing the search. The current implementation does this with ppc_hash64_pte_size_decode() which is a very incomplete implementation of this check. We already have code to do a full and correct page size decode in hpte_page_shift(). This patch moves hpte_page_shift() so it can be used in ppc_hash64_pteg_search() and adjusts the latter's parameters to include a full SLBE instead of just a segment page shift. Signed-off-by: David Gibson <david@gibson.dropbear.id.au> Reviewed-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>	2016-07-05 14:31:08 +10:00
Cédric Le Goater	1f0252e66e	ppc: simplify ppc_hash64_hpte_page_shift_noslb() The segment page shift parameter is never used. Let's remove it. Signed-off-by: Cédric Le Goater <clg@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-07-05 14:31:08 +10:00
Alexey Kardashevskiy	ae4de14cd3	spapr_pci/spapr_pci_vfio: Support Dynamic DMA Windows (DDW) This adds support for Dynamic DMA Windows (DDW) option defined by the SPAPR specification which allows to have additional DMA window(s) The "ddw" property is enabled by default on a PHB but for compatibility the pseries-2.6 machine and older disable it. This also creates a single DMA window for the older machines to maintain backward migration. This implements DDW for PHB with emulated and VFIO devices. The host kernel support is required. The advertised IOMMU page sizes are 4K and 64K; 16M pages are supported but not advertised by default, in order to enable them, the user has to specify "pgsz" property for PHB and enable huge pages for RAM. The existing linux guests try creating one additional huge DMA window with 64K or 16MB pages and map the entire guest RAM to. If succeeded, the guest switches to dma_direct_ops and never calls TCE hypercalls (H_PUT_TCE,...) again. This enables VFIO devices to use the entire RAM and not waste time on map/unmap later. This adds a "dma64_win_addr" property which is a bus address for the 64bit window and by default set to 0x800.0000.0000.0000 as this is what the modern POWER8 hardware uses and this allows having emulated and VFIO devices on the same bus. This adds 4 RTAS handlers: * ibm,query-pe-dma-window * ibm,create-pe-dma-window * ibm,remove-pe-dma-window * ibm,reset-pe-dma-window These are registered from type_init() callback. These RTAS handlers are implemented in a separate file to avoid polluting spapr_iommu.c with PCI. This changes sPAPRPHBState::dma_liobn to an array to allow 2 LIOBNs and updates all references to dma_liobn. However this does not add 64bit LIOBN to the migration stream as in fact even 32bit LIOBN is rather pointless there (as it is a PHB property and the management software can/should pass LIOBNs via CLI) but we keep it for the backward migration support. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-07-05 14:31:08 +10:00
Alexey Kardashevskiy	2e4109de8e	vfio/spapr: Create DMA window dynamically (SPAPR IOMMU v2) New VFIO_SPAPR_TCE_v2_IOMMU type supports dynamic DMA window management. This adds ability to VFIO common code to dynamically allocate/remove DMA windows in the host kernel when new VFIO container is added/removed. This adds a helper to vfio_listener_region_add which makes VFIO_IOMMU_SPAPR_TCE_CREATE ioctl and adds just created IOMMU into the host IOMMU list; the opposite action is taken in vfio_listener_region_del. When creating a new window, this uses heuristic to decide on the TCE table levels number. This should cause no guest visible change in behavior. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> [dwg: Added some casts to prevent printf() warnings on certain targets where the kernel headers' __u64 doesn't match uint64_t or PRIx64] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-07-05 14:31:08 +10:00
Alexey Kardashevskiy	f4ec5e26ed	vfio: Add host side DMA window capabilities There are going to be multiple IOMMUs per a container. This moves the single host IOMMU parameter set to a list of VFIOHostDMAWindow. This should cause no behavioral change and will be used later by the SPAPR TCE IOMMU v2 which will also add a vfio_host_win_del() helper. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-07-05 14:31:08 +10:00
Alexey Kardashevskiy	318f67ce13	vfio: spapr: Add DMA memory preregistering (SPAPR IOMMU v2) This makes use of the new "memory registering" feature. The idea is to provide the userspace ability to notify the host kernel about pages which are going to be used for DMA. Having this information, the host kernel can pin them all once per user process, do locked pages accounting (once) and not spent time on doing that in real time with possible failures which cannot be handled nicely in some cases. This adds a prereg memory listener which listens on address_space_memory and notifies a VFIO container about memory which needs to be pinned/unpinned. VFIO MMIO regions (i.e. "skip dump" regions) are skipped. The feature is only enabled for SPAPR IOMMU v2. The host kernel changes are required. Since v2 does not need/support VFIO_IOMMU_ENABLE, this does not call it when v2 is detected and enabled. This enforces guest RAM blocks to be host page size aligned; however this is not new as KVM already requires memory slots to be host page size aligned. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> [dwg: Fix compile error on 32-bit host] Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-07-05 14:30:54 +10:00
Alexey Kardashevskiy	606b54986d	spapr_iommu: Realloc guest visible TCE table when starting/stopping listening The sPAPR TCE tables manage 2 copies when VFIO is using an IOMMU - a guest view of the table and a hardware TCE table. If there is no VFIO presense in the address space, then just the guest view is used, if this is the case, it is allocated in the KVM. However since there is no support yet for VFIO in KVM TCE hypercalls, when we start using VFIO, we need to move the guest view from KVM to the userspace; and we need to do this for every IOMMU on a bus with VFIO devices. This implements the callbacks for the sPAPR IOMMU - notify_started() reallocated the guest view to the user space, notify_stopped() does the opposite. This removes explicit spapr_tce_set_need_vfio() call from PCI hotplug path as the new callbacks do this better - they notify IOMMU at the exact moment when the configuration is changed, and this also includes the case of PCI hot unplug. Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru> Reviewed-by: David Gibson <david@gibson.dropbear.id.au> Acked-by: Alex Williamson <alex.williamson@redhat.com> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-07-05 10:43:02 +10:00
Greg Kurz	c4e6c42353	ppc: simplify max_smt initialization in ppc_cpu_realizefn() kvmppc_smt_threads() returns 1 if KVM is not enabled. Signed-off-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-07-05 10:43:02 +10:00
Bharata B Rao	7093645a84	spapr: Ensure thread0 of CPU core is always realized first During CPU core realization, we create all the thread objects and parent them to the core object in a loop. However, the realization of thread objects is done separately by walking the threads of a core using object_child_foreach(). With this, there is no guarantee on the order in which the child thread objects get realized. Since CPU device tree properties are currently derived from the CPU thread object, we assume thread0 of the core to be the representative thread of the core when creating device tree properties for the core. If thread0 is not the first thread that gets realized, then we would end up having an incorrect dt_id for the core and this causes hotplug failures from the guest. Fix this by realizing each thread object by walking the core's thread object list thereby ensuring that thread0 and other threads are always realized in the correct order. Future TODO: CPU DT nodes are per-core properties and we should ideally base the creation of CPU DT nodes on core objects rather than the thread objects. Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com> Reviewed-by: Greg Kurz <groug@kaod.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-07-05 10:43:02 +10:00
Anton Blanchard	158c87e5de	ppc: Fix xsrdpi, xvrdpi and xvrspi rounding xsrdpi, xvrdpi and xvrspi use the round ties away method, not round nearest even. Signed-off-by: Anton Blanchard <anton@samba.org> Signed-off-by: David Gibson <david@gibson.dropbear.id.au>	2016-07-05 10:43:02 +10:00

1 2 3 4 5 ...

46980 Commits