New device, has never been released, so we can still improve things
without worrying about compatibility.
Naming is a mess. The code calls the device driver CirMemCharDriver,
the public API calls it "memory", "memchardev", or "memchar", and the
special commands are named like "memchar-FOO". "memory" is a
particularly unfortunate choice, because there's another character
device driver called MemoryDriver. Moreover, the device's distinctive
property is that it's a ring buffer, not that's in memory. Therefore:
* Rename CirMemCharDriver to RingBufCharDriver, and call the thing a
"ringbuf" in the API.
* Rename QMP and HMP commands from memchar-FOO to ringbuf-FOO.
* Rename device parameter from maxcapacity to size (simple words are
good for you).
* Clearly mark the parameter as optional in documentation.
* Fix error reporting so that chardev-add reports to current monitor,
not stderr.
* Replace cirmem in C identifiers by ringbuf.
* Rework documentation. Document the impact of our crappy UTF-8
handling on reading.
* QMP examples that even work.
I could split this up into multiple commits, but they'd change the
same documentation lines multiple times. Not worth it.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
New errors should be generic unless there's a real use case for rich
errors.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
The data returned has a well-defined size, which makes the size
returned along with it redundant at best. Drop it.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Command memchar-write takes data and size parameter. Begs the
question what happens when data doesn't match size.
With format base64, qmp_memchar_write() copies the full data argument,
regardless of size argument.
With format utf8, qmp_memchar_write() copies size bytes from data,
happily reading beyond data. Copies crap from the heap or even
crashes.
Drop the size parameter, and always copy the full data argument.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Recently, linux support multiqueue tap which could let userspace call TUNSETIFF
for a signle device many times to create multiple file descriptors as
independent queues. User could also enable/disabe a specific queue through
TUNSETQUEUE.
The patch adds the generic infrastructure to create multiqueue taps. To achieve
this a new parameter "queues" were introduced to specify how many queues were
expected to be created for tap by qemu itself. Alternatively, management could
also pass multiple pre-created tap file descriptors separated with ':' through a
new parameter fds like -netdev tap,id=hn0,fds="X:Y:..:Z". Multiple vhost file
descriptors could also be passed in this way.
Each TAPState were still associated to a tap fd, which mean multiple TAPStates
were created when user needs multiqueue taps. Since each TAPState contains one
NetClientState, with the multiqueue nic support, an N peers of NetClientState
were built up.
A new parameter, mq_required were introduce in tap_open() to create multiqueue
tap fds.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
# By Paolo Bonzini (14) and others
# Via Kevin Wolf
* kwolf/for-anthony: (24 commits)
ide: Add fall through annotations
block: Create proper size file for disk mirror
ahci: Add migration support
ahci: Change data types in preparation for migration
ahci: Remove unused AHCIDevice fields
hbitmap: add assertion on hbitmap_iter_init
mirror: do nothing on zero-sized disk
block/vdi: Check for bad signature
block/vdi: Improved return values from vdi_open
block/vdi: Improve debug output for signature
block: Use error code EMEDIUMTYPE for wrong format in some block drivers
block: Add special error code for wrong format
mirror: support arbitrarily-sized iterations
mirror: support more than one in-flight AIO operation
mirror: add buf-size argument to drive-mirror
mirror: switch mirror_iteration to AIO
mirror: allow customizing the granularity
block: allow customizing the granularity of the dirty bitmap
block: return count of dirty sectors, not chunks
mirror: perform COW if the cluster size is bigger than the granularity
...
This makes sense when the next commit starts using the extra buffer space
to perform many I/O operations asynchronously.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The desired granularity may be very different depending on the kind of
operation (e.g. continuous replication vs. collapse-to-raw) and whether
the VM is expected to perform lots of I/O while mirroring is in progress.
Allow the user to customize it, while providing a sane default so that
in general there will be no extra allocated space in the target compared
to the source.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Next commit will re-enable balloon stats with a different interface, but
this old code conflicts with it. Let's drop it.
It's important to note that the QMP and HMP interfaces are also dropped
by this commit. That shouldn't be a problem though, because:
1. All QMP fields are optional
2. This feature has always been disabled
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
The ptsname is returned directly, so there is no need to
use query-chardev to figure the pty device path.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
qemu_chr_open_socket is split into two functions. All initialization
after creating the socket file handler is split away into the new
qemu_chr_open_socket_fd function.
chr->filename doesn't get filled from QemuOpts any more. Qemu gathers
the information using getsockname and getnameinfo instead. This way it
will also work correctly for file handles passed via file descriptor
passing.
Finally qmp_chardev_open_socket() is the actual qmp hotplug
implementation which basically just calls socket_listen or
socket_connect and the new qemu_chr_open_socket_fd function.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Similar to file, except that no separate in/out files are supported
because it's pointless for direct device access. Also the special
tty ioctl hooks (pass through linespeed settings etc) are activated
on Unix.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Add chardev-add and chardev-remove qmp commands. Hotplugging
a null chardev is supported for now, more will be added later.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
This patch will allow the user to include the domain-search option in
replies from the built-in DHCP server. The domain suffixes can be
specified by adding dnssearch= entries to the "-net user" parameter.
[Jan: tiny style adjustments]
Signed-off-by: Klaus Stengel <Klaus.Stengel@asamnet.de>
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
* kwolf/for-anthony: (32 commits)
osdep: Less restrictive F_SEFL in qemu_dup_flags()
qemu-iotests: add testcases for mirroring on-source-error/on-target-error
qmp: add pull_event function
mirror: add support for on-source-error/on-target-error
iostatus: forward block_job_iostatus_reset to block job
qemu-iotests: add mirroring test case
mirror: implement completion
qmp: add drive-mirror command
mirror: introduce mirror job
block: introduce BLOCK_JOB_READY event
block: add block-job-complete
block: rename block_job_complete to block_job_completed
block: export dirty bitmap information in query-block
block: introduce new dirty bitmap functionality
block: add bdrv_open_backing_file
block: add bdrv_query_stats
block: add bdrv_query_info
qemu-config: Add new -add-fd command line option
monitor: Prevent removing fd from set during init
monitor: Enable adding an inherited fd to an fd set
...
Conflicts:
vl.c
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
* 'queue/qmp' of git://repo.or.cz/qemu/qmp-unstable:
migration: go to paused state after finishing incoming migration with -S
qmp: handle stop/cont in INMIGRATE state
hmp: fix info cpus for sparc targets
Right now, stop followed by an incoming migration will let the
virtual machine start. cont before an incoming migration instead
will fail.
This is bad because the actual behavior is not predictable; it is
racy with respect to the start of the incoming migration. That's
because incoming migration is blocking, and thus will delay the
processing of stop/cont until the end of the migration.
In addition, there's nothing that really prevents the user from
typing the block device's passwords before incoming migration is
done, so returning the DeviceEncrypted error is also helpful in
the QMP case.
Both things can be fixed by just toggling the autostart variable when
stop/cont are called in INMIGRATE state.
Note that libvirt is currently working around the race by looping
if the MigrationExpected answer is returned. After this patch, the
command will return right away without ever raising an error.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Error management is important for mirroring; otherwise, an error on the
target (even something as "innocent" as ENOSPC) requires to start again
with a full copy. Similar to on_read_error/on_write_error, two separate
knobs are provided for on_source_error (reads) and on_target_error (writes).
The default is 'report' for both.
The 'ignore' policy will leave the sector dirty, so that it will be
retried later. Thus, it will not cause corruption.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This adds the monitor commands that start the mirroring job.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This patch adds the implementation of a new job that mirrors a disk to
a new image while letting the guest continue using the old image.
The target is treated as a "black box" and data is copied from the
source to the target in the background. This can be used for several
purposes, including storage migration, continuous replication, and
observation of the guest I/O in an external program. It is also a
first step in replacing the inefficient block migration code that is
part of QEMU.
The job is possibly never-ending, but it is logically structured into
two phases: 1) copy all data as fast as possible until the target
first gets in sync with the source; 2) keep target in sync and
ensure that reopening to the target gets a correct (full) copy
of the source data.
The second phase is indicated by the progress in "info block-jobs"
reporting the current offset to be equal to the length of the file.
When the job is cancelled in the second phase, QEMU will run the
job until the source is clean and quiescent, then it will report
successful completion of the job.
In other words, the BLOCK_JOB_CANCELLED event means that the target
may _not_ be consistent with a past state of the source; the
BLOCK_JOB_COMPLETED event means that the target is consistent with
a past state of the source. (Note that it could already happen
that management lost the race against QEMU and got a completion
event instead of cancellation).
It is not yet possible to complete the job and switch over to the target
disk. The next patches will fix this and add many refinements to the
basic idea introduced here. These include improved error management,
some tunable knobs and performance optimizations.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Even for jobs that need to be manually completed, management may want
to take care itself of the completion, not requiring the user to issue
a command to terminate the job. In this case we want to avoid that
they poll us continuously, waiting for completion to become available.
Thus, add a new event that signals the phase switch and the availability
of the block-job-complete command.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
While streaming can be dropped as soon as it progressed through the whole
image, mirroring needs to be completed manually for two reasons: 1) so that
management knows exactly when the VM switches to the target; 2) because
for other use cases such as replication, we may leave the operation running
for the whole life of the virtual machine.
Add a new block job command that manually completes background operations.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The first call to add an fd to an fd set was previously not
allowed to choose the fd set ID. The ID was generated as
the first available and ensuing calls could add more fds by
specifying the fd set ID. This change allows users to
choose the fd set ID on the first call.
Signed-off-by: Corey Bryant <coreyb@linux.vnet.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Adding an NBD server inside QEMU is trivial, since all the logic is
in nbd.c and can be shared easily between qemu-nbd and QEMU itself.
The main difference is that qemu-nbd serves a single unnamed export,
while QEMU serves named exports.
Acked-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* sstabellini/xen-2012-10-03:
xen: Set the vram dirty when an error occur.
exec, memory: Call to xen_modified_memory.
exec: Introduce helper to set dirty flags.
xen: Introduce xen_modified_memory.
QMP, Introduce xen-set-global-dirty-log command.
qemu/xen: Add 64 bits big bar support on qemu
xen: Fix, no unplug of pt device by platform device.
* kwolf/for-anthony: (30 commits)
qemu-iotests: add tests for streaming error handling
qemu-iotests: map underscore to dash in QMP argument names
blkdebug: process all set_state rules in the old state
stream: add on-error argument
block: introduce block job error
iostatus: reorganize io error code
iostatus: change is_read to a bool
iostatus: move BlockdevOnError declaration to QAPI
iostatus: rename BlockErrorAction, BlockQMPEventAction
qemu-iotests: add test for pausing a streaming operation
qmp: add block-job-pause and block-job-resume
block: add support for job pause/resume
qmp: add 'busy' member to BlockJobInfo
block: add block_job_query
block: move job APIs to separate files
block: fix documentation of block_job_cancel_sync
qerror/block: introduce QERR_BLOCK_JOB_NOT_ACTIVE
qemu-iotests: add initial tests for live block commit
QAPI: add command for live block commit, 'block-commit'
block: helper function, to find the base image of a chain
...
This command is used during a migration of a guest under Xen. It calls
memory_global_dirty_log_start or memory_global_dirty_log_stop according to the
argument pass to the command.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
This patch adds support for error management to streaming.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The following behaviors are possible:
'report': The behavior is the same as in 1.1. An I/O error,
respectively during a read or a write, will complete the job immediately
with an error code.
'ignore': An I/O error, respectively during a read or a write, will be
ignored. For streaming, the job will complete with an error and the
backing file will be left in place. For mirroring, the sector will be
marked again as dirty and re-examined later.
'stop': The job will be paused and the job iostatus will be set to
failed or nospace, while the VM will keep running. This can only be
specified if the block device has rerror=stop and werror=stop or enospc.
'enospc': Behaves as 'stop' for ENOSPC errors, 'report' for others.
In all cases, even for 'report', the I/O error is reported as a QMP
event BLOCK_JOB_ERROR, with the same arguments as BLOCK_IO_ERROR.
It is possible that while stopping the VM a BLOCK_IO_ERROR event will be
reported and will clobber the event from BLOCK_JOB_ERROR, or vice versa.
This is not really avoidable since stopping the VM completes all pending
I/O requests. In fact, it is already possible now that a series of
BLOCK_IO_ERROR events are reported with rerror=stop, because vm_stop
calls bdrv_drain_all and this can generate further errors.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This will let block-stream reuse the enum. Places that used the enums
are renamed accordingly.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add QMP commands matching the functionality.
Paused jobs cannot be canceled without first resuming them. This
ensures that I/O errors are never missed by management. However, an
optional force argument can be specified to allow that.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Job pausing reuses the existing support for cancellable sleeps. A pause
happens at the next sleeping point and lasts until the coroutine is
re-entered explicitly. Cancellation was already doing a forced resume,
so implement it explicitly in terms of resume.
Paused jobs cannot be canceled without first resuming them. This ensures
that I/O errors are never missed by management.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Because pausing a job is asynchronous, we need to know whether it has
completed. This is described by the "busy" field of BlockJob; copy it
to BlockJobInfo.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The command for live block commit is added, which has the following
arguments:
device: the block device to perform the commit on (mandatory)
base: the base image to commit into; optional (if not specified,
it is the underlying original image)
top: the top image of the commit - all data from inside top down
to base will be committed into base (mandatory for now; see
note, below)
speed: maximum speed, in bytes/sec
Note: Eventually this command will support merging down the active layer,
but that code is not yet complete. If the active layer is passed
in as top, then an error will be returned. Once merging down the
active layer is supported, the 'top' argument may become optional,
and default to the active layer.
The is done as a block job, so upon completion a BLOCK_JOB_COMPLETED will
be emitted.
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Before the qapi conversion, the sendkey command could be used to
send key codes in hex directly to the guest. In HMP, this would
be like:
(qemu) sendkey 0xdc
However, the qapi conversion broke this, as it only supports sending
QKeyCode values to the guest. That's a regression.
This commit fixes the problem by adding hex value support down
the QMP interface, qmp_send_key().
In more detail, this commit:
1. Adds the KeyValue union. This can represent an hex value or
a QKeyCode value
2. *Changes* the QMP send-key command to take an KeyValue argument
instead of a QKeyCode one
3. Adapt hmp_send_key() to the QMP interface changes
Item 2 is an incompatible change, but as we're in development phase
(and this command has been merged a few weeks ago) this shouldn't be
a problem.
Finally, it's not possible to split this commit without breaking the
build.
Reported-by: Avi Kivity <avi@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
o Add a note about memory allocation with paging=true
o Fix indentation
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Also fixes a few issues while there:
1. The fd returned by monitor_get_fd() leaks in most error conditions
2. monitor_get_fd() return value is not checked. Best case we get
an error that is not correctly reported, worse case one of the
functions using the fd (with value of -1) will explode
3. A few error conditions aren't reported
4. We now "use up" @fdname always. Before, it was left alone for
invalid @protocol
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>