Commit Graph

54465 Commits

Author SHA1 Message Date
Cédric Le Goater
f2b14e3a9f spapr: introduce the XIVE_EXPLOIT option in CAS
On POWER9, the Client Architecture Support (CAS) negotiation process
determines whether the guest operates in XIVE Legacy compatibility
(the former POWER8 interrupt model) or in XIVE exploitation mode (the
newer POWER9 interrupt model).

Bit 7 of Byte 23 of vector 5 is used for this purpose.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-07-11 11:04:02 +10:00
Greg Kurz
92e926e1e3 ppc/kvm: have the "family" CPU alias to point to TYPE_HOST_POWERPC_CPU
When running KVM on POWER, we allow the user to pass "-cpu POWERx" instead
of "-cpu host". This is achieved by patching the ppc_cpu_aliases[] array
so that "POWERx" points to the CPU class with the same PVR as the host CPU.
This causes CPUs to be instantiated from this CPU class instead of the
TYPE_HOST_POWERPC_CPU class which is used with "-cpu host". These CPUs thus
miss all the KVM specific tuning from kvmppc_host_cpu_class_init().

This currently causes QEMU with "-cpu POWER9" to fail when running KVM on a
POWER9 DD1 host:

qemu-system-ppc64: Register sync failed... If you're using kvm-hv.ko, only
 "-cpu host" is possible
kvm_init_vcpu failed: Invalid argument

Let's have the "POWERx" alias to point to TYPE_HOST_POWERPC_CPU directly,
so that "-cpu POWERx" instantiates CPUs from the same class as "-cpu host".

Signed-off-by: Greg Kurz <groug@kaod.org>
Tested-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-07-11 11:04:02 +10:00
David Gibson
2a0d90fed5 spapr: Only report host/guest IOMMU page size mismatches on KVM
We print a warning if the spapr IOMMU isn't configured to support a page
size matching the host page size backing RAM.  When that's the case we need
more complex logic to translate VFIO mappings, which is slower.

But, it's not so slow that it would be at all noticeable against the
general slowness of TCG.  So, only warn when using KVM.  This removes some
noisy and unhelpful warnings from make check on hosts with page sizes
which typically differ from those on POWER (e.g. Sparc).

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
2017-07-11 11:04:02 +10:00
Greg Kurz
160bb67885 spapr: fix memory hotplug error path
QEMU shouldn't abort if spapr_add_lmbs()->spapr_drc_attach() fails.
Let's propagate the error instead, like it is done everywhere else
where spapr_drc_attach() is called.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Daniel Barboza <danielhb@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-07-11 11:04:02 +10:00
Suraj Jitindar Singh
95cb065776 target/ppc: Add debug function for radix mmu translation
In target/ppc/mmu-hash64.c there already exists the function
ppc_hash64_get_phys_page_debug() to get the physical (real) address for
a given effective address in hash mode.

Implement the function ppc_radix64_get_phys_page_debug() to allow a real
address to be obtained for a given effective address in radix mode.
This is used when a debugger is attached to qemu.

Previously we just had a comment saying this is unimplemented which then
fell through to the default case and caused an abort due to
unrecognised mmu model as the default had no case for the V3 mmu, which
was misleading at best.

We reuse ppc_radix64_walk_tree() which is used by the radix fault
handler since the process of walking the radix tree is identical.

Reported-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-07-11 11:04:02 +10:00
Suraj Jitindar Singh
6a042827b6 target/ppc: Refactor tcg radix mmu code
The mmu-radix64.c file implements functions to enable the radix mmu
emulation in tcg mode. There is a function ppc_radix64_walk_tree() which
performs the radix tree walk and also implicitly checks the pte
protection.

Move the protection checking of the pte from the ppc_radix64_walk_tree()
function into the caller. This means the ppc_radix64_walk_tree() function
can be used without protection checking which is useful for debugging.

ppc_radix64_walk_tree() no longer needs to take the rwx and prot variables.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-07-11 11:04:02 +10:00
David Gibson
3340e5c4f2 spapr: Use unplug_request for PCI hot unplug
AIUI, ->unplug_request in the HotplugHandler is used for "soft"
unplug, where acknowledgement from the guest is required before
completing the unplug, whereas ->unplug is used for "hard" unplug
where qemu unilaterally removes the device, and the guest just has to
cope with its sudden absence.  For spapr we (correctly) use
->unplug_request for CPU and memory hot unplug but we use ->unplug for
PCI.

While I think it might be possible to support "hard" PCI unplug within
the PAPR model, that's not how it actually works now.  Although it's
called from ->unplug, the PCI unplug path will usually just mark the
device for removal, with completion of the unplug delayed until
userspace responds to the unplug notification. If the guest doesn't
respond as expected, that could delay the unplug completion arbitrarily
long.

To reflect that, change the PCI unplug path to be called from
->unplug_request.  We also rename spapr_phb_hot_plug_child() and
spapr_phb_hot_unplug_child() to spapr_pci_plug() and
spapr_pci_unplug_request() to more obviously reflect the callbacks they're
implementing.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
2017-07-11 11:04:02 +10:00
David Gibson
5c1da81215 spapr: Remove unnecessary differences between hotplug and coldplug paths
spapr_drc_attach() has a 'coldplug' parameter which sets the DRC into
configured state initially, instead of the usual ISOLATED/UNUSABLE state.
It turns out this is unnecessary: although coldplugged devices do need to
be in CONFIGURED state once the guest starts, that will already be
accomplished by the reset code which will move DRCs for already plugged
devices into a coldplug equivalent state.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
2017-07-11 11:04:01 +10:00
David Gibson
6b762f29a8 spapr: Add DRC release method
At the moment, spapr_drc_release() has an ugly switch on the DRC type to
call the right, device-specific release function.  This cleans it up by
doing that via a proper QOM method.

It's still arguably an abstraction violation for the DRC code to call into
the specific device code, but one mess at a time.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
2017-07-11 11:04:01 +10:00
David Gibson
6caf3ac613 spapr: Uniform DRC reset paths
DRC objects have a regular device reset method.  However, it only gets
called in the usual way for PCI DRCs.  Because of where CPU and LMB DRCs
are in the QOM tree, their device reset method isn't automatically called.
So, the machine manually registers reset handlers to call device_reset().

This patch removes the device reset method, and instead always explicitly
registers the reset handler from realize().  This means the callers don't
have to worry about the two cases, and we always get proper resets.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2017-07-11 11:04:01 +10:00
David Gibson
f8dc29834c spapr: Leave DR-indicator management to the guest
The DR-indicator is essentially a "virtual LED" attached to a hotpluggable
device, which the guest can set to various states for the attention of
the operator or management layers.

It's mostly guest managed, except that we once-off set it to
ACTIVE/INACTIVE in the attach/detach path.  While that makes certain sense,
there's no indication in PAPR that the hypervisor should do this, and the
drmgr code on the guest side doesn't appear to need it (it will already set
the indicator to ACTIVE on hotplug, and INACTIVE on remove).

So, leave the DR-indicator entirely to the guest; the only thing we need
to do is ensure it's in a sane state on reset.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
2017-07-11 11:04:01 +10:00
Aaron Larson
0ee604abce target-ppc: SPR_BOOKE_ESR not set on FP exceptions
Properly set the book E exception syndrome register when a floating
point exception occurs.

Currently on a book E processor, the POWERPC_EXCP_FP exception handler
fails to set "env->spr[SPR_BOOKE_ESR] = ESR_FP;" as required by the
book E specification.

Signed-off-by: Aaron Larson <alarson@ddci.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-07-11 11:04:01 +10:00
Laurent Vivier
e806b4db14 spapr: fix migration to pseries machine < 2.8
since commit 5c4537bd ("spapr: Fix 2.7<->2.8 migration of PCI host bridge"),
some migration fields are forged from the new ones in spapr_pci_pre_save().

It works well, except when the number of MSI devices is 0,
because in this case the function exits immediately.

This fix moves the migration code before the exit code.

The problem can be reproduced with these commands:

source qemu-2.9:

    qemu-system-ppc64 -monitor stdio -M pseries-2.6 -nodefaults -S

destination qemu-2.6:

    qemu-system-ppc64 -monitor stdio -M pseries-2.6 -nodefaults \
                      -incoming tcp:0:4444

on the source:

    migrate tcp:localhost:4444

Destination fails with the following error:

    qemu-system-ppc64: error while loading state for
                       instance 0x0 of device 'spapr_pci'
    qemu-system-ppc64: load of migration failed: Invalid argument

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-07-11 11:04:01 +10:00
Greg Kurz
f3728f9cbb spapr: fix bogus function name in comment
$ git grep spapr_ppc_reset
hw/ppc/spapr.c: * as part of spapr_ppc_reset().

$ git grep ppc_spapr_reset
hw/ppc/spapr.c:static void ppc_spapr_reset(void)
hw/ppc/spapr.c:    mc->reset = ppc_spapr_reset;
hw/ppc/spapr_hcall.c:        /* If ppc_spapr_reset() did not set up a HPT
 but one is necessary

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-07-11 11:04:01 +10:00
Greg Kurz
498cd99544 spapr: refresh "platform-specific" hcalls comment
We have more of these since the addition of KVMPPC_H_LOGICAL_MEMOP in 2012.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-07-11 11:04:01 +10:00
Greg Kurz
04d0ffbd52 spapr: make spapr_populate_hotplug_cpu_dt() static
Since commit ff9006ddbf ("spapr: move spapr_core_[foo]plug() callbacks
close to machine code in spapr.c"), this function doesn't need to be extern
anymore.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-07-11 11:04:01 +10:00
Peter Maydell
6b06e3e49e nbd patches for 2017-07-10
- Eric Blake: MAINTAINERS: Promote NBD to supported, with new maintainer
 - Vladimir Sementsov-Ogievskiy: [00/10] nbd refactoring part 2
 -----BEGIN PGP SIGNATURE-----
 Comment: Public key at http://people.redhat.com/eblake/eblake.gpg
 
 iQEcBAABCAAGBQJZY5ZGAAoJEKeha0olJ0Nq8w8H/23OZ/tr2Nb3qNu6mOMVSeDp
 gvBATbfatfaiBENDtaA76SAWbUlj262OiHGEfWt1VBnBFLIq+sTxGbUIRh34pqre
 AYkDyq+691YDuI54dl6m3KDLusTIZlPskEEsh/88+H1fuZP4lew4Fg5SBRQR00Uf
 N/s5NyS+FTy22GCS5nGWaDzcuKPb5QlVjB8D3vQ4ZWWUw1RMGHTiZ0VfnhBytg7e
 TFaq++Qn7uPVtodLeM5qNsc8XivzqlymBvM7Y9JC4oS0XgoYVFN1uQg47A3ZDoGz
 cTaTt9/RdkQnfPw9RLZBxKq/kEVF+sJIAtbGNc1oUagFOcWLgLtiWfjkA+1CT3k=
 =C+wT
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/ericb/tags/pull-nbd-2017-07-10-v2' into staging

nbd patches for 2017-07-10

- Eric Blake: MAINTAINERS: Promote NBD to supported, with new maintainer
- Vladimir Sementsov-Ogievskiy: [00/10] nbd refactoring part 2

# gpg: Signature made Mon 10 Jul 2017 15:59:18 BST
# gpg:                using RSA key 0xA7A16B4A2527436A
# gpg: Good signature from "Eric Blake <eblake@redhat.com>"
# gpg:                 aka "Eric Blake (Free Software Programmer) <ebb9@byu.net>"
# gpg:                 aka "[jpeg image of size 6874]"
# Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2  F3AA A7A1 6B4A 2527 436A

* remotes/ericb/tags/pull-nbd-2017-07-10-v2:
  nbd: use generic trace subsystem instead of TRACE macro
  nbd: refactor tracing
  nbd/server: rename clientflags var in nbd_negotiate_options
  nbd/server: fix TRACE in nbd_negotiate_send_rep_len
  nbd/client: refactor TRACE of NBD_MAGIC
  nbd/common: nbd_tls_handshake: remove extra TRACE
  nbd/server: add errp to nbd_send_reply()
  nbd/server: use errp instead of LOG
  nbd/server: refactor nbd_negotiate
  nbd/server: nbd_negotiate: return 1 on NBD_OPT_ABORT
  MAINTAINERS: Promote NBD to supported, with new maintainer

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-07-10 16:12:47 +01:00
Vladimir Sementsov-Ogievskiy
9588463e74 nbd: use generic trace subsystem instead of TRACE macro
Let NBD use the trace mechanisms already present in qemu. Now you can
use the -trace optino of qemu, or the -T/--trace option of qemu-img,
qemu-io, and qemu-nbd, to select nbd traces. For qemu, the QMP commands
trace-event-{get,set}-state can also toggle tracing on the fly.

Example:
   qemu-nbd --trace 'nbd_*' <image file> # enables all nbd traces

Recompilation with CFLAGS=-DDEBUG_NBD is no more needed, furthermore,
DEBUG_NBD macro is removed from the code.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20170707152918.23086-11-vsementsov@virtuozzo.com>
[eblake: minor tweaks to a couple of traces]
Signed-off-by: Eric Blake <eblake@redhat.com>
2017-07-10 09:57:24 -05:00
Vladimir Sementsov-Ogievskiy
6fb2b9726c nbd: refactor tracing
Reorganize traces: move, reword, add information, drop extra ones.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20170707152918.23086-10-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2017-07-10 09:57:24 -05:00
Vladimir Sementsov-Ogievskiy
7f9039cdaa nbd/server: rename clientflags var in nbd_negotiate_options
Rename 'clientflags' to just 'option'. This variable has nothing to do
with flags, but is a single integer representing the option requested
by the client.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20170707152918.23086-9-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2017-07-10 09:57:24 -05:00
Vladimir Sementsov-Ogievskiy
4875196163 nbd/server: fix TRACE in nbd_negotiate_send_rep_len
Fix wrong order of TRACE arguments.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20170707152918.23086-8-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2017-07-10 09:57:24 -05:00
Vladimir Sementsov-Ogievskiy
458d7a6939 nbd/client: refactor TRACE of NBD_MAGIC
We are going to switch from TRACE macro to trace points,
this TRACE complicates things, this patch simplifies it.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20170707152918.23086-7-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2017-07-10 09:57:24 -05:00
Vladimir Sementsov-Ogievskiy
3e6bb543c2 nbd/common: nbd_tls_handshake: remove extra TRACE
Error is propagated to the caller, TRACE is not needed.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20170707152918.23086-6-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2017-07-10 09:57:24 -05:00
Vladimir Sementsov-Ogievskiy
c7b9728250 nbd/server: add errp to nbd_send_reply()
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20170707152918.23086-5-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2017-07-10 09:57:24 -05:00
Vladimir Sementsov-Ogievskiy
2fd2c8407e nbd/server: use errp instead of LOG
Move to modern errp scheme from just LOGging errors.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20170707152918.23086-4-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2017-07-10 09:57:24 -05:00
Vladimir Sementsov-Ogievskiy
76ff081d91 nbd/server: refactor nbd_negotiate
Combine two successive "if (oldStyle) {...} else {...}" into one.

Block "if (client->tlscreds)" under "if (oldStyle)" is unreachable,
as we have "oldStyle = client->exp != NULL && !client->tlscreds;".
So, delete this block.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20170707152918.23086-3-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2017-07-10 09:57:24 -05:00
Vladimir Sementsov-Ogievskiy
1e120ffead nbd/server: nbd_negotiate: return 1 on NBD_OPT_ABORT
Separate the case when a client sends NBD_OPT_ABORT from all other
errors. It will be needed for the following patch, where errors will be
reported.
This particular case is not actually an error - it honestly follows the
NBD protocol. Therefore it should not be reported like an error.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20170707152918.23086-2-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
2017-07-10 09:57:24 -05:00
Eric Blake
99c62e70aa MAINTAINERS: Promote NBD to supported, with new maintainer
We are promising more than just odd fixes, and Paolo is hoping
to offload the pull requests to me.  Also, enough of NBD is related
to the block layer that it is worth including qemu-block on patches.

While at it, include blockdev-nbd.c and qemu-nbd.texi in the set
of maintained files.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20170707182151.29872-1-eblake@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
2017-07-10 09:56:01 -05:00
Peter Maydell
94c56652b9 Block layer patches
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v2.0.22 (GNU/Linux)
 
 iQIcBAABAgAGBQJZY2R0AAoJEH8JsnLIjy/WFNcQALUMh6O/eaXzHrAER3xAA3oF
 WghkekAbVa6TiGoH90UEOrdBS64I7HuWYtwVQQ/9FVV/3Hc1Z15VnBUMrsvcwhvw
 xwIfMilj/av6bk/DoG1S5wb8T+6yUPBpuBzQhsiGnrnI0o9wMzWpSZn41alxCqpe
 jVMpOvDbLIDB9foQhyNAFUGBjhvkBIAe8O8yIkninZB07r87gUN78388NTybLLNh
 I15R9kWKJlqQxWrX4OT7ShcPnuTJClQyjfu1P4BAHzXpv8ZicBYZ/ZR6Q2verZpu
 kBwXlBxnWKJDOPyd1sKBNe3Mo3KqKXfSL0NOg00PHPRTcgh+e+QYZ4ZxG0Dv6Cyl
 rb6Cy0FBuM6JtjhuCLt9yZ1eErCtpHJq901T7DMb4LQ3IuFrDJToojdPTaOx4st4
 lh7tiraQtp3OKzh/H7smZ5WT7V8Sg2l3mPAs3r7iPihzceS9yPUbka3yB2xbwnpn
 e7H8IzwHFgqRprR+Ii8Z+0eWHApOJDSK2mCcR1OGJEpxXz2gYn6+WrvKBKxDScln
 deIo7Kz6/5OehRKjPGWJmYsgpQBzlLoPUXk1K75OMOu2y4sLGaFr/ZqZgLjOxunZ
 jJ2h5TGgzkRZOsIKQAhrBkBZ09MOI7PMyxkbGu4o+qbovG3InE9kHzH2HBsIIDKS
 zhG+zHFGha8YsyKS6a6Q
 =+05n
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging

Block layer patches

# gpg: Signature made Mon 10 Jul 2017 12:26:44 BST
# gpg:                using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* remotes/kevin/tags/for-upstream: (40 commits)
  block: Make bdrv_is_allocated_above() byte-based
  block: Minimize raw use of bds->total_sectors
  block: Make bdrv_is_allocated() byte-based
  backup: Switch backup_run() to byte-based
  backup: Switch backup_do_cow() to byte-based
  backup: Switch block_backup.h to byte-based
  backup: Switch BackupBlockJob to byte-based
  block: Drop unused bdrv_round_sectors_to_clusters()
  mirror: Switch mirror_iteration() to byte-based
  mirror: Switch mirror_do_read() to byte-based
  mirror: Switch mirror_cow_align() to byte-based
  mirror: Update signature of mirror_clip_sectors()
  mirror: Switch mirror_do_zero_or_discard() to byte-based
  mirror: Switch MirrorBlockJob to byte-based
  commit: Switch commit_run() to byte-based
  commit: Switch commit_populate() to byte-based
  stream: Switch stream_run() to byte-based
  stream: Drop reached_end for stream_complete()
  stream: Switch stream_populate() to byte-based
  trace: Show blockjob actions via bytes, not sectors
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-07-10 14:06:49 +01:00
Eric Blake
51b0a48888 block: Make bdrv_is_allocated_above() byte-based
We are gradually moving away from sector-based interfaces, towards
byte-based.  In the common case, allocation is unlikely to ever use
values that are not naturally sector-aligned, but it is possible
that byte-based values will let us be more precise about allocation
at the end of an unaligned file that can do byte-based access.

Changing the signature of the function to use int64_t *pnum ensures
that the compiler enforces that all callers are updated.  For now,
the io.c layer still assert()s that all callers are sector-aligned,
but that can be relaxed when a later patch implements byte-based
block status.  Therefore, for the most part this patch is just the
addition of scaling at the callers followed by inverse scaling at
bdrv_is_allocated().  But some code, particularly stream_run(),
gets a lot simpler because it no longer has to mess with sectors.
Leave comments where we can further simplify by switching to
byte-based iterations, once later patches eliminate the need for
sector-aligned operations.

For ease of review, bdrv_is_allocated() was tackled separately.

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:07 +02:00
Eric Blake
c00716beb3 block: Minimize raw use of bds->total_sectors
bdrv_is_allocated_above() was relying on intermediate->total_sectors,
which is a field that can have stale contents depending on the value
of intermediate->has_variable_length.  An audit shows that we are safe
(we were first calling through bdrv_co_get_block_status() which in
turn calls bdrv_nb_sectors() and therefore just refreshed the current
length), but it's nicer to favor our accessor functions to avoid having
to repeat such an audit, even if it means refresh_total_sectors() is
called more frequently.

Suggested-by: John Snow <jsnow@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:07 +02:00
Eric Blake
d6a644bbfe block: Make bdrv_is_allocated() byte-based
We are gradually moving away from sector-based interfaces, towards
byte-based.  In the common case, allocation is unlikely to ever use
values that are not naturally sector-aligned, but it is possible
that byte-based values will let us be more precise about allocation
at the end of an unaligned file that can do byte-based access.

Changing the signature of the function to use int64_t *pnum ensures
that the compiler enforces that all callers are updated.  For now,
the io.c layer still assert()s that all callers are sector-aligned
on input and that *pnum is sector-aligned on return to the caller,
but that can be relaxed when a later patch implements byte-based
block status.  Therefore, this code adds usages like
DIV_ROUND_UP(,BDRV_SECTOR_SIZE) to callers that still want aligned
values, where the call might reasonbly give non-aligned results
in the future; on the other hand, no rounding is needed for callers
that should just continue to work with byte alignment.

For the most part this patch is just the addition of scaling at the
callers followed by inverse scaling at bdrv_is_allocated().  But
some code, particularly bdrv_commit(), gets a lot simpler because it
no longer has to mess with sectors; also, it is now possible to pass
NULL if the caller does not care how much of the image is allocated
beyond the initial offset.  Leave comments where we can further
simplify once a later patch eliminates the need for sector-aligned
requests through bdrv_is_allocated().

For ease of review, bdrv_is_allocated_above() will be tackled
separately.

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:07 +02:00
Eric Blake
6f8e35e241 backup: Switch backup_run() to byte-based
We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based.  Change the internal
loop iteration of backups to track by bytes instead of sectors
(although we are still guaranteed that we iterate by steps that
are cluster-aligned).

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:06 +02:00
Eric Blake
03f5d60bbf backup: Switch backup_do_cow() to byte-based
We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based.  Convert another internal
function (no semantic change).

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:06 +02:00
Eric Blake
f6ac207893 backup: Switch block_backup.h to byte-based
We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based.  Continue by converting
the public interface to backup jobs (no semantic change), including
a change to CowRequest to track by bytes instead of cluster indices.

Note that this does not change the difference between the public
interface (starting point, and size of the subsequent range) and
the internal interface (starting and end points).

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Xie Changlong <xiechanglong@cmss.chinamobile.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:06 +02:00
Eric Blake
cf79cdf662 backup: Switch BackupBlockJob to byte-based
We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based.  Continue by converting an
internal structure (no semantic change), and all references to
tracking progress.  Drop a redundant local variable bytes_per_cluster.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:06 +02:00
Eric Blake
e8a81e9cad block: Drop unused bdrv_round_sectors_to_clusters()
Now that the last user [mirror_iteration()] has converted to using
bytes, we no longer need a function to round sectors to clusters.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:06 +02:00
Eric Blake
fb2ef7919b mirror: Switch mirror_iteration() to byte-based
We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based.  Change the internal
loop iteration of mirroring to track by bytes instead of sectors
(although we are still guaranteed that we iterate by steps that
are both sector-aligned and multiples of the granularity).  Drop
the now-unused mirror_clip_sectors().

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:06 +02:00
Eric Blake
ae4cc8777b mirror: Switch mirror_do_read() to byte-based
We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based.  Convert another internal
function, preserving all existing semantics, and adding one more
assertion that things are still sector-aligned (so that conversions
to sectors in mirror_read_complete don't need to round).

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:06 +02:00
Eric Blake
782d97efec mirror: Switch mirror_cow_align() to byte-based
We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based.  Convert another internal
function (no semantic change), and add mirror_clip_bytes() as a
counterpart to mirror_clip_sectors().  Some of the conversion is
a bit tricky, requiring temporaries to convert between units; it
will be cleared up in a following patch.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:06 +02:00
Eric Blake
931e52607f mirror: Update signature of mirror_clip_sectors()
Rather than having a void function that modifies its input
in-place as the output, change the signature to reduce a layer
of indirection and return the result.

Suggested-by: John Snow <jsnow@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:06 +02:00
Eric Blake
e6f2419389 mirror: Switch mirror_do_zero_or_discard() to byte-based
We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based.  Convert another internal
function (no semantic change).

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:06 +02:00
Eric Blake
b436982f04 mirror: Switch MirrorBlockJob to byte-based
We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based.  Continue by converting an
internal structure (no semantic change), and all references to the
buffer size.

Add an assertion that our use of s->granularity >> BDRV_SECTOR_BITS
(necessary for interaction with sector-based dirty bitmaps, until
a later patch converts those to be byte-based) does not suffer from
truncation problems.

[checkpatch has a false positive on use of MIN() in this patch]

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:06 +02:00
Eric Blake
317a6676a2 commit: Switch commit_run() to byte-based
We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based.  Change the internal
loop iteration of committing to track by bytes instead of sectors
(although we are still guaranteed that we iterate by steps that
are sector-aligned).

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:06 +02:00
Eric Blake
d8a9858408 commit: Switch commit_populate() to byte-based
We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based.  Start by converting an
internal function (no semantic change).

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:06 +02:00
Eric Blake
d535435f4a stream: Switch stream_run() to byte-based
We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based.  Change the internal
loop iteration of streaming to track by bytes instead of sectors
(although we are still guaranteed that we iterate by steps that
are sector-aligned).

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:06 +02:00
Eric Blake
158c649257 stream: Drop reached_end for stream_complete()
stream_complete() skips the work of rewriting the backing file if
the job was cancelled, if data->reached_end is false, or if there
was an error detected (non-zero data->ret) during the streaming.
But note that in stream_run(), data->reached_end is only set if the
loop ran to completion, and data->ret is only 0 in two cases:
either the loop ran to completion (possibly by cancellation, but
stream_complete checks for that), or we took an early goto out
because there is no bs->backing.  Thus, we can preserve the same
semantics without the use of reached_end, by merely checking for
bs->backing (and logically, if there was no backing file, streaming
is a no-op, so there is no backing file to rewrite).

Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:06 +02:00
Eric Blake
8493211c02 stream: Switch stream_populate() to byte-based
We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based.  Start by converting an
internal function (no semantic change).

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:06 +02:00
Eric Blake
5cb1a49e01 trace: Show blockjob actions via bytes, not sectors
Upcoming patches are going to switch to byte-based interfaces
instead of sector-based.  Even worse, trace_backup_do_cow_enter()
had a weird mix of cluster and sector indices.

The trace interface is low enough that there are no stability
guarantees, and therefore nothing wrong with changing our units,
even in cases like trace_backup_do_cow_skip() where we are not
changing the trace output.  So make the tracing uniformly use
bytes.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:06 +02:00
Eric Blake
f3e4ce4af3 blockjob: Track job ratelimits via bytes, not sectors
The user interface specifies job rate limits in bytes/second.
It's pointless to have our internal representation track things
in sectors/second, particularly since we want to move away from
sector-based interfaces.

Fix up a doc typo found while verifying that the ratelimit
code handles the scaling difference.

Repetition of expressions like 'n * BDRV_SECTOR_SIZE' will be
cleaned up later when functions are converted to iterate over
images by bytes rather than by sectors.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-07-10 13:18:06 +02:00