Commit Graph

3568 Commits

Author SHA1 Message Date
Xiao Guangrong
87252e1b61 nvdimm acpi: build ACPI NFIT table
NFIT is defined in ACPI 6.0: 5.2.25 NVDIMM Firmware Interface Table (NFIT)

Currently, we only support PMEM mode. Each device has 3 structures:
- SPA structure, defines the PMEM region info

- MEM DEV structure, it has the @handle which is used to associate specified
  ACPI NVDIMM  device we will introduce in later patch.
  Also we can happily ignored the memory device's interleave, the real
  nvdimm hardware access is hidden behind host

- DCR structure, it defines vendor ID used to associate specified vendor
  nvdimm driver. Since we only implement PMEM mode this time, Command
  window and Data window are not needed

The NVDIMM functionality is controlled by the parameter, 'nvdimm', which
is introduced for the machine, there is a example to enable it:
-machine pc,nvdimm -m 8G,maxmem=100G,slots=100  -object \
memory-backend-file,id=mem1,share,mem-path=/tmp/nvdimm1,size=10G -device \
nvdimm,memdev=mem1,id=nv1

It is disabled on default

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-12-22 18:39:20 +02:00
Xiao Guangrong
8870ca0e94 acpi: support specified oem table id for build_header
Let build_header() support specified OEM table id so that we can build
multiple SSDT later

If the oem table id is not specified (aka, NULL), we use the default id
instead as the previous behavior

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-12-22 18:39:20 +02:00
Xiao Guangrong
5c42eef243 nvdimm: implement NVDIMM device abstract
Introduce "nvdimm" device which is based on pc-dimm device type

Currently, nothing is specific for nvdimm but hotplug is disabled

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-12-22 18:39:20 +02:00
Eduardo Habkost
c9c0afbb19 hw/compat.h: Change indentation of HW_COMPAT_* to 4 spaces
Cosmetic change only.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
2015-12-22 18:39:20 +02:00
Eduardo Habkost
276a65ba4b pc: Change indentation of PC_COMPAT_* to 4 spaces
Cosmetic change only.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
2015-12-22 18:39:20 +02:00
Eduardo Habkost
240240d5da pc: Add pc-*-2.6 machine classes
Add pc-i440fx-2.6 and pc-q35-2.6 machine classes.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
2015-12-22 18:39:19 +02:00
Corey Minyard
90b6180500 ipmi: Add a firmware configuration repository
Add a way for IPMI devices to register their firmware information
with the IPMI subsystem so that various firmware entities can pull
that information later for adding to firmware tables.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-12-22 18:39:19 +02:00
Corey Minyard
23076bb34b Add a base IPMI interface
Add the basic IPMI types and infrastructure to QEMU.  Low-level
interfaces and simulation interfaces will register with this; it's
kind of the go-between to tie them together.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-12-22 18:39:19 +02:00
Eduardo Habkost
13fc834308 pc: Group and document related PCMachineState/PCMachineclass fields
Group related PCMachineState and PCMachineClass fields into
sections, and move existing field descriptions to doc comments.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
2015-12-22 17:45:13 +02:00
Eduardo Habkost
34be1e7c92 q35: Remove MCHPCIState.guest_info field
The field is not used for anything.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
2015-12-22 17:45:13 +02:00
Marcel Apfelbaum
81ed6482a3 hw/i386: extend pxb query for all PC machines
Add bus property to PC machines and use it when looking
for primary PCI root bus (bus 0).

Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
2015-12-22 17:45:13 +02:00
Marcel Apfelbaum
02b07434be hw/pxb: introduce pxb-pcie expander for PCIe machines
The pxb-pcie is the counterpart of pxb for PCI express machines.
The new device re-uses the pxb code, but appears to the guests
as a different device. The pxb-pcie device does not have an internal
pci-pci bridge and exposes a PCIe root bus instead of a PCI one.

Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-12-22 17:45:13 +02:00
Eduardo Habkost
71ae9e94d9 pc: Move option_rom_has_mr/rom_file_has_mr globals to MachineClass
This way, these settings can be simply set on the corresponding
machine_options() function, instead of requiring code in
pc_compat_*() functions.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
2015-12-22 17:45:12 +02:00
Eduardo Habkost
cdedce0564 pc: Remove enforce-aligned-dimm QOM property
The property is read-only and not used for anything.

Cc: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-12-22 17:45:12 +02:00
Eduardo Habkost
16a9e8a5bc pc: Move enforce_aligned_dimm to PCMachineClass
enforce_aligned_dimm never changes after the machine is
initialized, so it can be simply set in PCMachineClass like all
the other compat fields.

Cc: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2015-12-22 17:45:12 +02:00
Eduardo Habkost
cd4040ec18 pc: Move acpi_data_size global to PCMachineClass
This way we don't need code in pc_compat_*() functions to set the legacy
acpi_data_size value.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
2015-12-22 17:45:12 +02:00
Eduardo Habkost
2b0ddf6612 pc: Move legacy_acpi_table_size global to PCMachineClass
This way we can set legacy_acpi_table_size on the machine_options()
functions, instead of requirng code in pc_compat_*() functions.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
2015-12-22 17:45:12 +02:00
Eduardo Habkost
7102fa7073 pc: Move compat boolean globals to PCMachineClass
This way the compat flags can be initialized in the machine_options()
function. This will help us to eventually eliminate the pc_compat_*()
functions.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
2015-12-22 17:45:12 +02:00
Andrew Baumann
723697551a sdhci: add optional quirk property to disable card insertion/removal interrupts
This is needed for a quirk of the Raspberry Pi (bcm2835/6) MMC
controller, where the card insert bit is documented as unimplemented
(always reads zero, doesn't generate interrupts) but is in fact
observed on hardware as set at power on, but is cleared (and remains
clear) on subsequent controller resets.

Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 1450738069-18664-4-git-send-email-Andrew.Baumann@microsoft.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-12-22 16:34:26 +08:00
Stefan Hajnoczi
648296e067 block-backend: add blk_get_max_iov()
Add a function to query BlockLimits.max_iov.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-12-22 16:01:07 +08:00
Stefan Hajnoczi
bd44feb754 block: add BlockLimits.max_iov field
The maximum number of struct iovec elements depends on the
BlockDriverState.  The raw-posix and iSCSI protocols have a maximum of
IOV_MAX but others could have different values.

Cc: Peter Lieven <pl@kamp.de>
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2015-12-22 16:01:07 +08:00
Peter Maydell
c688084506 Merge QCryptoSecret object support
-----BEGIN PGP SIGNATURE-----
 Version: GnuPG v1
 
 iQIcBAABCAAGBQJWdDmJAAoJEL6G67QVEE/fJ4EP+gNC4ErBDpbg+I4RhLHv/FsF
 i2iEYkmBfFzUmiB8iSFlJY12XJ/CPnbrWks0WNHIoarUgtGuvfqH91KGkERxJQFE
 TkD65EOhNHWlP1zaI5r5ZMizhOdO6EPa0pbS/QH/UCy5qwu5IbG5EesOw00d7nL8
 5gT39ehXAFljutRygWsa1JyOkDB04WNehVsly/l2t9v16aQUDJOC5mWVuoInrDEo
 ye1VG2Cx7Y1/FRo3fFCDwzYD+8jgxIBAu8Igwjk/95VbfBVl769PBAilQRc5zMHt
 //eMNdul6GooVKmu/K1JWkmjIZJFUiboEMgPJElWV1y8bWmhh++4J6EVb55owoDk
 VRv84cqiaYErVb+56gaImr92GSKezll0APWz6YlDsFZgPClCPnUjDSr39+23t1h4
 LprirtbkAjw73T92kuQ7kzbXElWm7rSfcx5u1/S6YPP+EDzZpW9+h62lKGGnuS2M
 bzwFOOmWHe1MhbRSh+BOzGBf1wWhMSCKgLAOmPuRQ8slS91vfE66bIlqpIKBGgfn
 42t0wZCEW8bqIe8xry5pC5UoDfm3cVDhgGHGyMLWWDMez0qDchaAkWNkIDtc8Juv
 a1WqE/0lP/sVb36yLVANvt1/Qvpg6M3JwMbTjVaJl2eTDDtwho4PK+Chxx0a8BGl
 Z6oGj1rmvCDD2Dsi/EXI
 =g+9G
 -----END PGP SIGNATURE-----

Merge remote-tracking branch 'remotes/berrange/tags/pull-qcrypto-secrets-base-2015-12-18-1' into staging

Merge QCryptoSecret object support

# gpg: Signature made Fri 18 Dec 2015 16:51:21 GMT using RSA key ID 15104FDF
# gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>"
# gpg:                 aka "Daniel P. Berrange <berrange@redhat.com>"

* remotes/berrange/tags/pull-qcrypto-secrets-base-2015-12-18-1:
  crypto: add support for loading encrypted x509 keys
  crypto: add QCryptoSecret object class for password/key handling
  qga: convert to use error checked base64 decode
  qemu-char: convert to use error checked base64 decode
  util: add base64 decoding function

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2015-12-18 17:04:15 +00:00
Daniel P. Berrange
1d7b5b4afd crypto: add support for loading encrypted x509 keys
Make use of the QCryptoSecret object to support loading of
encrypted x509 keys. The optional 'passwordid' parameter
to the tls-creds-x509 object type, provides the ID of a
secret object instance that holds the decryption password
for the PEM file.

 # printf "123456" > mypasswd.txt
 # $QEMU \
    -object secret,id=sec0,filename=mypasswd.txt \
    -object tls-creds-x509,passwordid=sec0,id=creds0,\
            dir=/home/berrange/.pki/qemu,endpoint=server \
    -vnc :1,tls-creds=creds0

This requires QEMU to be linked to GNUTLS >= 3.1.11. If
GNUTLS is too old an error will be reported if an attempt
is made to pass a decryption password.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2015-12-18 16:25:08 +00:00
Daniel P. Berrange
ac1d887849 crypto: add QCryptoSecret object class for password/key handling
Introduce a new QCryptoSecret object class which will be used
for providing passwords and keys to other objects which need
sensitive credentials.

The new object can provide secret values directly as properties,
or indirectly via a file. The latter includes support for file
descriptor passing syntax on UNIX platforms. Ordinarily passing
secret values directly as properties is insecure, since they
are visible in process listings, or in log files showing the
CLI args / QMP commands. It is possible to use AES-256-CBC to
encrypt the secret values though, in which case all that is
visible is the ciphertext.  For ad hoc developer testing though,
it is fine to provide the secrets directly without encryption
so this is not explicitly forbidden.

The anticipated scenario is that libvirtd will create a random
master key per QEMU instance (eg /var/run/libvirt/qemu/$VMNAME.key)
and will use that key to encrypt all passwords it provides to
QEMU via '-object secret,....'.  This avoids the need for libvirt
(or other mgmt apps) to worry about file descriptor passing.

It also makes life easier for people who are scripting the
management of QEMU, for whom FD passing is significantly more
complex.

Providing data inline (insecure, only for ad hoc dev testing)

  $QEMU -object secret,id=sec0,data=letmein

Providing data indirectly in raw format

  printf "letmein" > mypasswd.txt
  $QEMU -object secret,id=sec0,file=mypasswd.txt

Providing data indirectly in base64 format

  $QEMU -object secret,id=sec0,file=mykey.b64,format=base64

Providing data with encryption

  $QEMU -object secret,id=master0,file=mykey.b64,format=base64 \
        -object secret,id=sec0,data=[base64 ciphertext],\
	           keyid=master0,iv=[base64 IV],format=base64

Note that 'format' here refers to the format of the ciphertext
data. The decrypted data must always be in raw byte format.

More examples are shown in the updated docs.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2015-12-18 16:25:08 +00:00
Daniel P. Berrange
89bc0b6cae util: add base64 decoding function
The standard glib provided g_base64_decode doesn't provide any
kind of sensible error checking on its input. Add a QEMU custom
wrapper qbase64_decode which can be used with untrustworthy
input that can contain invalid base64 characters, embedded
NUL characters, or not be NUL terminated at all.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2015-12-18 16:25:08 +00:00
Fam Zheng
bbe1ef2686 block: Remove prototype of bdrv_swap from header
The function has gone.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-12-18 14:34:43 +01:00
Max Reitz
8b13976d3f block: Add opaque value to the amend CB
Add an opaque value which is to be passed to the bdrv_amend_options()
status callback.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2015-12-18 14:34:43 +01:00
Kevin Wolf
145f598e4a block: Introduce bs->explicit_options
bs->options doesn't only contain options that the user explicitly
requested, but also option that were derived from flags, the filename or
inherited from the parent node.

For reopen, it is important to know the difference because reopening the
parent can change inherited values in child nodes, but it shouldn't
change any options that were explicitly specified for the child.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-12-18 14:34:43 +01:00
Kevin Wolf
8e2160e2c7 block: Add infrastructure for option inheritance
Options are not actually inherited from the parent node yet, but this
commit lays the grounds for doing so.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-12-18 14:34:42 +01:00
Kevin Wolf
4cdd01d32e block: Pass driver-specific options to .bdrv_refresh_filename()
In order to decide whether a blkdebug: filename can be produced or a
json: one is necessary, blkdebug checked whether bs->options had more
options than just "config", "x-image" or "image" (the latter including
nested options). That doesn't work well when generic block layer options
are present.

This patch passes an option QDict to the driver that contains only
driver-specific options, i.e. the options for the general block layer as
well as child nodes are already filtered out. Works much better this
way.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
2015-12-18 14:34:42 +01:00
Kevin Wolf
260fecf13b block: Exclude nested options only for children in append_open_options()
Some drivers have nested options (e.g. blkdebug rule arrays), which
don't belong to a child node and shouldn't be removed. Don't remove all
options with "." in their name, but check for the complete prefixes of
actually existing child nodes.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-12-18 14:34:42 +01:00
Kevin Wolf
d9b7b05703 block: Allow references for backing files
For bs->file, using references to existing BDSes has been possible for a
while already. This patch enables the same for bs->backing.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2015-12-18 14:34:42 +01:00
Kevin Wolf
5365f44dfa qcow2: Add .bdrv_join_options callback
qcow2 accepts a few driver-specific options that overlap semantically
(e.g. "overlap-check" is an alias of "overlap-check.template", and any
missing cache size option is derived from the given ones).

When bdrv_reopen() merges the set of updated options with left out
options that should be kept at their old value, we need to consider this
and filter out any duplicates (which would generally cause errors
because new and old value would contradict each other).

This patch adds a .bdrv_join_options callback to BlockDriver and
implements it for qcow2.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
2015-12-18 14:34:42 +01:00
Daniel P. Berrange
d98e4eb7de io: add QIOChannelBuffer class
Add a QIOChannel subclass that is capable of performing I/O
to/from a memory buffer. This implementation does not attempt
to support concurrent readers & writers. It is designed for
serialized access where by a single thread at a time may write
data, seek and then read data back out.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2015-12-18 12:18:31 +00:00
Daniel P. Berrange
195e14d026 io: add QIOChannelCommand class
Add a QIOChannel subclass that is capable of performing I/O
to/from a separate process, via a pair of pipes. The command
can be used for unidirectional or bi-directional I/O.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2015-12-18 12:18:31 +00:00
Daniel P. Berrange
2d1d0e70cf io: add QIOChannelWebsock class
Add a QIOChannel subclass that can run the websocket protocol over
the top of another QIOChannel instance. This initial implementation
is only capable of acting as a websockets server. There is no support
for acting as a websockets client yet.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2015-12-18 12:18:31 +00:00
Daniel P. Berrange
ed8ee42c40 io: add QIOChannelTLS class
Add a QIOChannel subclass that can run the TLS protocol over
the top of another QIOChannel instance. The object provides a
simplified API to perform the handshake when starting the TLS
session. The layering of TLS over the underlying channel does
not have to be setup immediately. It is possible to take an
existing QIOChannel that has done some handshake and then swap
in the QIOChannelTLS layer. This allows for use with protocols
which start TLS right away, and those which start plain text
and then negotiate TLS.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2015-12-18 12:18:31 +00:00
Daniel P. Berrange
d6e48869a4 io: add QIOChannelFile class
Add a QIOChannel subclass that is capable of operating on things
that are files, such as plain files, pipes, character/block
devices, but notably not sockets.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2015-12-18 12:18:31 +00:00
Daniel P. Berrange
559607ea17 io: add QIOChannelSocket class
Implement a QIOChannel subclass that supports sockets I/O.
The implementation is able to manage a single socket file
descriptor, whether a TCP/UNIX listener, TCP/UNIX connection,
or a UDP datagram. It provides APIs which can listen and
connect either asynchronously or synchronously. Since there
is no asynchronous DNS lookup API available, it uses the
QIOTask helper for spawning a background thread to ensure
non-blocking operation.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2015-12-18 12:18:31 +00:00
Daniel P. Berrange
b02db2d920 io: add QIOTask class for async operations
A number of I/O operations need to be performed asynchronously
to avoid blocking the main loop. The caller of such APIs need
to provide a callback to be invoked on completion/error and
need access to the error, if any. The small QIOTask provides
a simple framework for dealing with such probes. The API
docs inline provide an outline of how this is to be used.

Some functions don't have the ability to run asynchronously
(eg getaddrinfo always blocks), so to facilitate their use,
the task class provides a mechanism to run a blocking
function in a thread, while triggering the completion
callback in the main event loop thread. This easily allows
any synchronous function to be made asynchronous, albeit
at the cost of spawning a thread.

In this series, the QIOTask class will be used for things like
the TLS handshake, the websockets handshake and TCP connect()
progress.

The concept of QIOTask is inspired by the GAsyncResult
interface / GTask class in the GIO libraries. The min
version requirements on glib don't allow those to be
used from QEMU, so QIOTask provides a facsimilie which
can be easily switched to GTask in the future if the
min version is increased.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2015-12-18 12:18:30 +00:00
Daniel P. Berrange
1c809fa01d io: add helper module for creating watches on FDs
A number of the channel implementations will require the
ability to create watches on file descriptors / sockets.
To avoid duplicating this code in each channel, provide a
helper API for dealing with file descriptor watches.

There are two watch implementations provided. The first
is useful for bi-directional file descriptors such as
sockets, regular files, character devices, etc. The
second works with a pair of unidirectional file descriptors
such as pipes.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2015-12-18 12:18:05 +00:00
Daniel P. Berrange
666a3af9c8 io: add abstract QIOChannel classes
Start the new generic I/O channel framework by defining a
QIOChannel abstract base class. This is designed to feel
similar to GLib's GIOChannel, but with the addition of
support for using iovecs, qemu error reporting, file
descriptor passing, coroutine integration and use of
the QOM framework for easier sub-classing.

The intention is that anywhere in QEMU that almost
anywhere that deals with sockets will use this new I/O
infrastructure, so that it becomes trivial to then layer
in support for TLS encryption. This will at least include
the VNC server, char device backend and migration code.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2015-12-18 12:18:05 +00:00
Paolo Bonzini
f6d153f1bf rcu: optimize rcu_read_lock
rcu_read_lock cannot change rcu_gp_ongoing from true to false
(the previous value of p_rcu_reader->ctr is zero), hence
there is no need to check p_rcu_reader->waiting and wake up
a concurrent synchronize_rcu.

While at it mark the wakeup as unlikely in rcu_read_unlock.

Reviewed-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1450265542-4323-1-git-send-email-pbonzini@redhat.com>
2015-12-17 17:33:49 +01:00
Paolo Bonzini
3cc8f88499 memory: try to inline constant-length reads
memcpy can take a large amount of time for small reads and writes.
Handle the common case of reading s/g descriptors from memory (there
is no corresponding "write" case that is as common, because writes
often use address_space_st* functions) by inlining the relevant
parts of address_space_read into the caller.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-17 17:33:49 +01:00
Paolo Bonzini
1619d1fe73 memory: inline a few small accessors
These are used in the address_space_* fast paths.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-17 17:33:49 +01:00
Paolo Bonzini
a203ac702e memory: extract first iteration of address_space_read and address_space_write
We want to inline the case where there is only one iteration, because
then the compiler can also inline the memcpy.  As a start, extract
everything after the first address_space_translate call.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-17 17:33:49 +01:00
Paolo Bonzini
612263cf33 memory: avoid unnecessary object_ref/unref
For the common case of DMA into non-hotplugged RAM, it is unnecessary
but expensive to do object_ref/unref.  Add back an owner field to
MemoryRegion, so that these memory regions can skip the reference
counting.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-17 17:33:48 +01:00
Paolo Bonzini
a676854f34 memory: reorder MemoryRegion fields
Order fields so that all fields accessed during a RAM read/write fit in
the same cache line.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-17 17:33:48 +01:00
Paolo Bonzini
49b24afcb1 exec: always call qemu_get_ram_ptr within rcu_read_lock
Simplify the code and document the assumption.  The only caller
that is not within rcu_read_lock is memory_region_get_ram_ptr.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-17 17:33:48 +01:00
Paolo Bonzini
1382902055 user: introduce "-d page"
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2015-12-17 17:33:48 +01:00