Commit Graph

18162 Commits

Author SHA1 Message Date
Gleb Natapov
a0fa82085e enable architectural PMU cpuid leaf for kvm
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-12-22 14:53:01 -02:00
Vasilis Liaskovitis
991dfefdee Set numa topology for max_cpus
qemu-kvm passes numa/SRAT topology information for smp_cpus to SeaBIOS. However
SeaBIOS always expects to setup max_cpus number of SRAT cpu entries
(MaxCountCPUs variable in build_srat function of Seabios). When qemu-kvm runs
with smp_cpus != max_cpus (e.g. -smp 2,maxcpus=4), Seabios will mistakenly use
memory SRAT info for setting up CPU SRAT entries for the offline CPUs. Wrong
SRAT memory entries are also created. This breaks NUMA in a guest.
Fix by setting up SRAT info for max_cpus in qemu-kvm.

Signed-off-by: Vasilis Liaskovitis <vasilis.liaskovitis@profitbricks.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-12-22 14:53:01 -02:00
Jan Kiszka
cce47516cd kvm: x86: Drop redundant apic base and tpr update from kvm_get_sregs
The latter was already commented out, the former is redundant as well.
We always get the latest changes after return from the guest via
kvm_arch_post_run.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-12-22 14:53:01 -02:00
Jan Kiszka
fabacc0f79 kvm: x86: Avoid runtime allocation of xsave buffer
Keep a per-VCPU xsave buffer for kvm_put/get_xsave instead of
continuously allocating and freeing it on state sync.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-12-22 14:53:01 -02:00
Jan Kiszka
6b42494b21 kvm: x86: Use symbols for all xsave field
Field 0 (FCW+FSW) and 1 (FTW+FOP) were hard-coded so far.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
2011-12-22 14:53:00 -02:00
Paolo Bonzini
44f76b289a nbd: add myself as maintainer
Not planning to do much else, hence listing it as "Odd Fixes".

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-12-22 11:53:59 +01:00
Paolo Bonzini
41996e3803 qemu-nbd: throttle requests
Limiting the number of in-flight requests is implemented very simply
with a can_read callback.  It does not require a semaphore, unlike the
client side in block/nbd.c, because we can throttle directly the creation
of coroutines.  The client side can have a coroutine created at any time
when an I/O request is made.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-12-22 11:53:59 +01:00
Paolo Bonzini
262db38871 qemu-nbd: asynchronous operation
Using coroutines enable asynchronous operation on both the network and
the block side.  Network can be owned by two coroutines at the same time,
one writing and one reading.  On the send side, mutual exclusion is
guaranteed by a CoMutex.  On the receive side, mutual exclusion is
guaranteed because new coroutines immediately start receiving data,
and no new coroutines are created as long as the previous one is receiving.

Between receive and send, qemu-nbd can have an arbitrary number of
in-flight block transfers.  Throttling is implemented by the next
patch.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-12-22 11:53:59 +01:00
Paolo Bonzini
72deddc5e6 qemu-nbd: add client pointer to NBDRequest
By attaching a client to an NBDRequest, we can avoid passing around the
socket descriptor and data buffer.

Also, we can now manage the reference count for the client in
nbd_request_get/put request instead of having to do it ourselved in
nbd_read.  This simplifies things when coroutines are used.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-12-22 11:53:59 +01:00
Paolo Bonzini
1743b51586 qemu-nbd: move client handling to nbd.c
This patch sets up the fd handler in nbd.c instead of qemu-nbd.c.  It
introduces NBDClient, which wraps the arguments to nbd_trip in a single
structure, so that we can add a notifier to it.  This way, qemu-nbd can
know about disconnections.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-12-22 11:53:59 +01:00
Paolo Bonzini
a61c67828d qemu-nbd: use common main loop
Using a single main loop for sockets will help yielding from the socket
coroutine back to the main loop, and later reentering it.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-12-22 11:53:59 +01:00
Paolo Bonzini
cbcfa0418f link the main loop and its dependencies into the tools
Using the main loop code from QEMU enables tools to operate fully
asynchronously.  Advantages include better Windows portability (for some
definition of portability) over glib's.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-12-22 11:53:58 +01:00
Paolo Bonzini
d9a7380658 qemu-nbd: introduce NBDRequest
Move the buffer from NBDExport to a new structure, so that it will be
possible to have multiple in-flight requests for the same export
(and for the same client too---we get that for free).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-12-22 11:53:58 +01:00
Paolo Bonzini
af49bbbe78 qemu-nbd: introduce NBDExport
Wrap the common parameters of nbd_trip and nbd_negotiate in a
single opaque struct.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-12-22 11:53:58 +01:00
Paolo Bonzini
a030b347aa qemu-nbd: introduce nbd_do_receive_request
Group the receiving of a response and the associated data into a new function.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-12-22 11:53:58 +01:00
Paolo Bonzini
fae6941629 qemu-nbd: more robust handling of invalid requests
Fail invalid requests with EINVAL instead of dropping them into
the void.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-12-22 11:53:58 +01:00
Paolo Bonzini
2204559203 qemu-nbd: introduce nbd_do_send_reply
Group the sending of a reply and the associated data into a new function.
Without corking, the caller would be forced to leave 12 free bytes at the
beginning of the data pointer.  Not too ugly, but still ugly. :)

Using nbd_do_send_reply everywhere will help when the routine will set up
the write handler that re-enters the send coroutine.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-12-22 11:53:58 +01:00
Paolo Bonzini
a478f6e595 qemu-nbd: simplify nbd_trip
Use TCP_CORK to remove a violation of encapsulation, that would later
require nbd_trip to know too much about an NBD reply.

We could also switch to sendmsg (qemu_co_sendv) later, it is even
easier once coroutines are in.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-12-22 11:53:58 +01:00
Paolo Bonzini
128aa58947 move corking functions to osdep.c
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-12-22 11:53:58 +01:00
Paolo Bonzini
3777b09fd7 qemu-nbd: remove data_size argument to nbd_trip
The size of the buffer is in practice part of the protocol.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-12-22 11:53:58 +01:00
Paolo Bonzini
94607e7a77 qemu-nbd: remove offset argument to nbd_trip
The argument is write-only.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-12-22 11:53:58 +01:00
Chunyan Liu
3e05c78551 Update ioctl order in nbd_init() to detect EBUSY
Update ioctl(s) in nbd_init() to detect device busy early.

Current nbd_init() issues NBD_CLEAR_SOCKET before NBD_SET_SOCKET, if issuing
"qemu-nbd -c /dev/nbd0 disk.img" twice, the second time won't detect EBUSY in
nbd_init(), but in nbd_client will report EBUSY and do clear socket (the 1st
time command will be affacted too because of no socket any more.)

No change to previous version.

Signed-off-by: Chunyan Liu <cyliu@suse.com>

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-12-22 11:53:58 +01:00
Paolo Bonzini
7a706633e9 nbd: add support for NBD_CMD_TRIM
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-12-22 11:53:57 +01:00
Paolo Bonzini
1486d04a1b nbd: add support for NBD_CMD_FLUSH
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-12-22 11:53:57 +01:00
Paolo Bonzini
2c7989a9b1 nbd: add support for NBD_CMD_FLAG_FUA
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-12-22 11:53:57 +01:00
Paolo Bonzini
adcf6302de nbd: fix error handling in the server
bdrv_read and bdrv_write return negative errno values, not -1.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-12-22 11:53:57 +01:00
Paolo Bonzini
ecda3447d1 nbd: allow multiple in-flight requests
Allow sending up to 16 requests, and drive the replies to the coroutine
that did the request.  The code is written to be exactly the same as
before this patch when MAX_NBD_REQUESTS == 1 (modulo the extra mutex
and state).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-12-22 11:53:57 +01:00
Paolo Bonzini
d9b09f13ca nbd: split requests
qemu-nbd has a limit of slightly less than 1M per request.  Work
around this in the nbd block driver.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-12-22 11:53:57 +01:00
Paolo Bonzini
ae255e523c nbd: switch to asynchronous operation
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-12-22 11:53:57 +01:00
Paolo Bonzini
8c5135f90e sheepdog: move coroutine send/recv function to generic code
Outside coroutines, avoid busy waiting on EAGAIN by temporarily
making the socket blocking.

The API of qemu_recvv/qemu_sendv is slightly different from
do_readv/do_writev because they do not handle coroutines.  It
returns the number of bytes written before encountering an
EAGAIN.  The specificity of yielding on EAGAIN is entirely in
qemu-coroutine.c.

Reviewed-by: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-12-22 11:53:53 +01:00
Amit Shah
03ecd2c80a virtio-serial-bus: Ports are expected to implement 'have_data' callback
There's no need to check if ports can accept any incoming data from the
guest each time the guest sends data.  Check if the port implements such
functionality during port initialisation.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-12-21 15:00:29 -06:00
Amit Shah
05e7af694c virtio-console: Properly initialise class methods
The earlier code really was a hack: initialising class methods in an
object init function as noted by Anthony.

The motivation for that was to not have the virtio-serial-bus call into
the callback functions if there was no chardev backend registered.
However, that really wasn't a worthwhile optimisation, and definitely
not one that was well-implemented.  Get rid of it.

Reported-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-12-21 15:00:29 -06:00
Amit Shah
6640422c17 virtio-console: Check if chardev backends available before calling into them
For the callback functions invoked by the virtio-serial-bus code, check
if we have chardev backends registered before we call into the chardev
functions.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Reported-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-12-21 15:00:29 -06:00
Paolo Bonzini
993295fedc add qemu_send_full and qemu_recv_full
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2011-12-21 15:00:24 +01:00
Harsh Prateek Bora
058a96ed50 scripts/analyse-9p-simpletrace.py: Add symbolic names for 9p operations.
Currently, we just print the numerical value of 9p operation identifier in
case of RERROR which is less meaningful for readability. Mapping 9p
operation ids to symbolic names provides a better tracelog:

	RERROR (tag = 1 , id = TWALK , err = " No such file or directory ")
	RERROR (tag = 1 , id = TUNLINKAT , err = " Directory not empty ")

This patch provides a dictionary of all possible 9p operation symbols mapped
to their numerical identifiers which are likely to be used in future at
various places in this script.

Signed-off-by: Harsh Prateek Bora <harsh@linux.vnet.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
2011-12-21 12:37:23 +05:30
Aneesh Kumar K.V
e4027caf93 hw/9pfs: iattr_valid flags are kernel internal flags map them to 9p values.
Kernel internal values can change, add protocol values for these constant and
use them.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
2011-12-21 12:37:23 +05:30
Aneesh Kumar K.V
2f008a8c97 hw/9pfs: Use the correct signed type for different variables
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
2011-12-21 12:37:23 +05:30
Stefan Hajnoczi
302a0d3ed7 hw/9pfs: replace iovec manipulation with QEMUIOVector
The v9fs_read() and v9fs_write() functions rely on iovec[] manipulation
code should be replaced with QEMUIOVector to avoid duplicating code.
In the future it may be possible to make the code even more concise by
using QEMUIOVector consistently across virtio and 9pfs.

The "v" format specifier for pdu_marshal() and pdu_unmarshal() is
dropped since it does not actually pack/unpack anything.  The specifier
was also not implemented to update the offset variable and could only be
used at the end of a format string, another sign that this shouldn't
really be a format specifier.  Instead, see the new
v9fs_init_qiov_from_pdu() function.

This change avoids a possible iovec[] buffer overflow when indirect
vrings are used since the number of vectors is now limited by the
underlying VirtQueueElement and cannot be out-of-bounds.

Signed-off-by: Stefan Hajnoczi <stefanha@linux.vnet.ibm.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
2011-12-21 12:37:22 +05:30
Andrzej Zaborowski
3799ce4ab6 sd: Remember to reset .expecting_acmd on reset.
Signed-off-by: Andrzej Zaborowski <andrew.zaborowski@intel.com>
2011-12-21 05:04:21 +01:00
Peter Maydell
fcfa9351c5 hw/sd.c: Clear status bits when read via response r6
Response format r6 includes a subset of the status bits;
clear the clear-on-read bits which are read by an r6 response.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andrzej Zaborowski <andrew.zaborowski@intel.com>
2011-12-21 05:01:49 +01:00
Peter Maydell
1d06cb7ab9 hw/sd.c: Correct handling of APP_CMD status bit
Fix some bugs in our implementation of the APP_CMD status bit:
 * the response to an ACMD should have APP_CMD set, not cleared
 * if an illegal ACMD is sent then the next command should be
   handled as a normal command

This requires that we split "card is expecting an ACMD" from
the state of the APP_CMD status bit (the latter indicates
both "expecting ACMD" and "that was an ACMD").

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andrzej Zaborowski <andrew.zaborowski@intel.com>
2011-12-21 05:01:46 +01:00
Peter Maydell
10a412dab3 hw/sd.c: Correct handling of type B SD status bits
Correct how we handle the type B ("cleared on valid command")
status bits. In particular, the CURRENT_STATE bits in a response
should be the state of the card when it received that command,
not the state when it received the preceding command. (This is
one of the issues noted in LP:597641.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andrzej Zaborowski <andrew.zaborowski@intel.com>
2011-12-21 05:01:42 +01:00
Peter Maydell
5b08bfe2e9 hw/sd.c: Set ILLEGAL_COMMAND for ACMDs in invalid state
App commands in an invalid state should set ILLEGAL_COMMAND, not
merely return a zero response.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andrzej Zaborowski <andrew.zaborowski@intel.com>
2011-12-21 05:01:39 +01:00
Peter Maydell
b1f517ed43 hw/sd.c: Handle CRC and locked-card errors in normal code path
Handle returning CRC and locked-card errors in the same code path
we use for other responses. This makes no difference in behaviour
but means that these error responses will be printed by the debug
logging code.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andrzej Zaborowski <andrew.zaborowski@intel.com>
2011-12-21 05:01:35 +01:00
Peter Maydell
53bb8cc485 hw/sd.c: Handle illegal commands in sd_do_command
Add an extra sd_illegal value to the sd_rsp_type_t enum so that
sd_app_command() and sd_normal_command() can tell sd_do_command()
that the command was illegal. This is needed so we can do things
like reset certain status bits only on receipt of a valid command.
For the moment, just use it to pull out the setting of the
ILLEGAL_COMMAND status bit into sd_do_command().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andrzej Zaborowski <andrew.zaborowski@intel.com>
2011-12-21 05:01:31 +01:00
Peter Maydell
e30d59388b hw/sd.c: When setting ADDRESS_ERROR bit, don't clear everything else
Fix a typo that meant that ADDRESS_ERRORs setting or clearing write
protection would clear every other bit in the status register.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andrzej Zaborowski <andrew.zaborowski@intel.com>
2011-12-21 05:01:27 +01:00
Peter Maydell
abda1f37ee hw/sd.c: On CRC error, set CRC error status bit rather than clearing it
If we fail to validate the CRC for an SD command we should be setting
COM_CRC_ERROR, not clearing it. (This bug actually has no effect currently
because sd_req_crc_validate() always returns success.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andrzej Zaborowski <andrew.zaborowski@intel.com>
2011-12-21 05:01:21 +01:00
Peter Maydell
b8d334c828 hw/sd.c: Add comment regarding CARD_STATUS_* defines
Add a clarifying comment about what the CARD_STATUS_[ABC]
macros are defining.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andrzej Zaborowski <andrew.zaborowski@intel.com>
2011-12-21 05:01:17 +01:00
Peter Maydell
25881d3390 hw/sd.c: Fix the set of commands which are failed when card is locked
Fix bugs in the code determining whether to accept a command when the
SD card is locked. Most notably, we had the condition completely
reversed, so we would accept all the commands we should refuse and
refuse all the commands we should accept. Correct this by refactoring
the enormous if () clause into a separate function.
We had also missed ACMD42 off the list of commands which are accepted
in locked state: add it.

This is one of the two problems reported in LP:597641.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andrzej Zaborowski <andrew.zaborowski@intel.com>
2011-12-21 04:59:49 +01:00
Peter Maydell
e114fead27 hw/sysbus.c: Remove unnecessary conditionals
Now that all sysbus MMIO regions are MemoryRegions, mmio[n].memory
is never NULL, and we can remove some unnecessary conditionals.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
2011-12-20 15:44:31 -06:00