Commit Graph

22765 Commits

Author SHA1 Message Date
Aurelien Jarno
a52ad07e7c tcg: always mark dead input arguments as dead
Always mark dead input arguments as dead, even if the op is at the basic
block end. This will allow to check that all temps are correctly saved.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2012-10-28 14:54:22 +01:00
Aurelien Jarno
c29c1d7edf tcg: rewrite tcg_reg_alloc_mov()
Now that the liveness analysis provides more information, rewrite
tcg_reg_alloc_mov(). This changes the behaviour about propagating
constants and memory accesses. We now take the assumption that once
a value is loaded into a register (from memory or from a constant),
it's better to keep it there than to reload it later. This assumption
is now always almost correct given that we are now sure the
corresponding temp is going to be used later (otherwise it would have
been synchronized and marked as dead already). The assumption is wrong
if one of the op after clobbers some registers including the one
of the holding the temp (this can be avoided by allocating clobbered
registers last, which is what most TCG target do), or in case of lack
of available register.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2012-10-28 14:54:22 +01:00
Aurelien Jarno
4c4e1ab26b tcg: improve tcg_reg_alloc_movi()
Now that the liveness analysis might mark some output temps as dead, call
temp_dead() if needed.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2012-10-28 14:54:21 +01:00
Aurelien Jarno
9c43b68de6 tcg: rework liveness analysis
Rework the liveness analysis by tracking temps that need to go back to
memory in addition to dead temps tracking. This allows to mark output
arguments as "need sync", and to synchronize them back to memory as soon
as they are not written anymore. This way even arguments mapping to
globals can be marked as "dead", avoiding moves to a new register when
input and outputs are aliased.

In addition it means that registers are freed as soon as temps are not
used anymore, instead of waiting for a basic block end or an op with side
effects. This reduces register spilling especially on CPUs with few
registers, and spread the mov over all the TB, increasing the
performances on in-order CPUs.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2012-10-28 14:54:21 +01:00
Aurelien Jarno
ec7a869d31 tcg: sync output arguments on liveness request
Synchronize an output argument when requested by the liveness analysis.
This is needed so that the temp can be declared dead later.

For that, add a new op_sync_args table in which each bit tells if the
corresponding output argument needs to be synchronized with the memory.
Pass it to the tcg_reg_alloc_* functions, and honor this bit. We need to
synchronize the argument before marking it as dead, and we have to make
sure all the infos about the temp are correctly filled.

At the same time change some types from unsigned int to uint16_t when
passing op_dead_args.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2012-10-28 14:54:21 +01:00
Aurelien Jarno
1ad80729be tcg: add temp_sync()
Add a new function temp_sync() to synchronize the canonical location
of a temp with the value in the corresponding register, but without
freeing the associated register. Rewrite temp_save() to call
temp_sync() followed by temp_dead().

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2012-10-28 14:54:21 +01:00
Aurelien Jarno
7f6ceedf9c tcg: add tcg_reg_sync()
Add a new function tcg_reg_sync() to synchronize the canonical location
of a temp with the value in the associated register, but without freeing
it. Rewrite tcg_reg_free() to first call tcg_reg_sync() and then to free
the register.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2012-10-28 14:54:21 +01:00
Aurelien Jarno
639368dd68 tcg: add temp_dead()
A lot of code is duplicated to mark a temporary as dead. Replace it
by temp_dead(), which in addition marks the temp as saved in memory
for globals and local temps, instead of doing this a posteriori in
temp_save().

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2012-10-28 14:54:21 +01:00
Aurelien Jarno
17b914912d tcg/i386: remove ld/st third argument register constraint
On x86_64, remove the constraint on the third argument register which
is not needed:
 - For loads the helper arguments are env, addr, mem_idx. The addr
   value should not be in the two first argument registers as they are
   used in tcg_out_tlb_load().
 - For stores the helper arguments are env, addr, data, mem_idx.
   The addr and data values should not be in the two first argument
   registers as they are used in tcg_out_tlb_load(). The data value
   should also not be in the two first argument registers, but could
   be in the third argument register in which case it would be already
   loaded at the right location.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2012-10-28 14:54:15 +01:00
Aurelien Jarno
166792f7bb tcg/i386: remove suboptimal register shifting
Now that CONFIG_TCG_PASS_AREG0 has been removed, it's easier to get
an optimal code for the load/store functions.

First swap the two registers used in tcg_out_tlb_load() so that the
address end-up in the second register instead of the first one. Adjust
tcg_out_qemu_ld() and tcg_out_qemu_st() to respectively call
tcg_out_qemu_ld_direct() and tcg_out_qemu_st_direct() with the correct
registers. Then replace the register shifting by direct load of the
arguments.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
2012-10-28 14:54:05 +01:00
Max Filippov
50cd721482 hw/xtensa_sim: get rid of intermediate xtensa_sim_init
Remove xtensa_sim_init that only explodes machine init args, rename
sim_init to xtensa_sim_init.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-10-27 15:04:00 +00:00
Max Filippov
d64ed08eec hw/xtensa_lx60: don't prematurely explode QEMUMachineInitArgs
Don't explode QEMUMachineInitArgs before passing it to lx_init.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-10-27 15:03:59 +00:00
Peter Maydell
d1bd2423a9 Makefile: Forbid out-of-tree build from a source tree that has been built in
If we try to do an out-of-tree build but the source tree we're building from
has been used in the past for an in-tree build then things will go
confusingly wrong. Specifically, some parts of the build process will pull
in generated files from the old in-tree build (because SRC_PATH is on
the vpath). Diagnose this situation so we can produce a useful error
message and tell the user how to fix it (run distclean in the source tree).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-10-27 14:41:13 +00:00
Catalin Patulea
49cdaea18b tests/tcg: fix a few warnings
Signed-off-by: Catalin Patulea <catalinp@google.com>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-10-27 14:37:25 +00:00
Aurelien Jarno
0356404b0f target-sparc64: disable VGA cirrus
OpenBIOS on sparc64 only support Standard VGA and not Cirrus VGA. Don't
build Cirrus VGA support so that it can't be selected.

This fixes the breakage introduced by commit f2898771.

Reported-by: Richard Henderson <rth@twiddle.net>
Cc: Blue Swirl <blauwirbel@gmail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Tested-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-10-27 14:36:04 +00:00
Blue Swirl
8a52731705 Merge branch 'target-arm.for-upstream' of git://git.linaro.org/people/pmaydell/qemu-arm
* 'target-arm.for-upstream' of git://git.linaro.org/people/pmaydell/qemu-arm:
  target-arm: Remove out of date FIXME regarding saturating arithmetic
  target-arm: Implement abs_i32 inline rather than as a helper
  target-arm: Use TCG operation for Neon 64 bit negation
  arm-semi.c: Handle get/put_user() failure accessing arguments
2012-10-27 14:21:37 +00:00
Bruce Rogers
9bca81624e configure: avoid compiler warning in pipe2 detection
When building qemu-kvm for openSUSE:Factory, I am getting a
warning in the pipe2 detection performed by configure, which
prevents using --enable-werror.

Change detection code to use return value of pipe2.

Signed-off-by: Bruce Rogers <brogers@suse.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-10-27 14:21:04 +00:00
Peter Maydell
c1556a812a configure: Disable (clang) initializer-overrides warnings
Disable clang's initializer-overrides warnings, as QEMU makes significant
use of the pattern of initializing an array with a range-based default
entry like
    [0 ... 0x1ff] = { GPIO_NONE, 0 }
followed by specific entries which override that default, and clang
would otherwise warn "initializer overrides prior initialization of
this subobject" when it encountered the specific entry.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
2012-10-27 14:20:06 +00:00
Gerd Hoffmann
0ebfb144e8 xhci: fix usb name in caps
Used to be "UTB" not "USB".

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2012-10-25 14:38:12 +02:00
Gerd Hoffmann
91062ae00f xhci: make number of interrupters and slots configurable
Add properties to tweak the numbers of available interrupters and slots.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2012-10-25 14:37:34 +02:00
Gerd Hoffmann
e099ad4b7e xhci: allow disabling interrupters
For secondary interrupters this is explicitly allowed in the specs.
For the primary interrupter behavior is undefined, lets be friendly
and allow disabling too.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2012-10-25 14:35:55 +02:00
Gerd Hoffmann
3f973ee84e xhci: flush endpoint context unconditinally
Not updating the endpoint context in case the state didn't change is
wrong.  Other context fields might have changed, for example the
dequeue pointer in response to a CR_SET_TR_DEQUEUE command.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2012-10-25 14:35:47 +02:00
Gerd Hoffmann
79a8af3509 xhci: fix function name in error message
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2012-10-25 14:35:39 +02:00
Hans de Goede
6fe30910ab uhci: Use only one queue for ctrl endpoints
ctrl endpoints use different pids for different phases of a control
transfer, this patch makes us use only one queue for a ctrl ep, rather
then 3.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2012-10-25 09:08:12 +02:00
Hans de Goede
8928c9c43d uhci: Retry to fill the queue while waiting for td completion
If the guest is using multiple transfers to try and keep the usb bus busy /
used at maximum efficiency, currently we would see / do the following:

1) submit transfer 1 to the device
2) submit transfer 2 to the device
3) report transfer 1 completion to guest
4) report transfer 2 completion to guest
5) submit transfer 1 to the device
6) report transfer 1 completion to guest
7) submit transfer 2 to the device
8) report transfer 2 completion to guest
etc.

So after the initial submission we would effectively only have 1 transfer
in flight, rather then 2. This is caused by us not checking the queue for
addition of new transfers by the guest (ie the resubmission of a recently
finished transfer), while waiting for a pending transfer to complete.
This patch does add a check for this, changing the sequence to:

1) submit transfer 1 to the device
2) submit transfer 2 to the device
3) report transfer 1 completion to guest
4) submit transfer 1 to the device
5) report transfer 2 completion to guest
6) submit transfer 2 to the device
etc.

Thus keeping 2 transfers in flight (most of the time, and always 1),
as intended by the guest.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2012-10-25 09:08:12 +02:00
Hans de Goede
3905097ea8 uhci: Always mark a queue valid when we encounter it
Before this patch we would not mark a queue valid when its head was a
non-active td. This causes us to misbehave in the following scenario:

1) queue with multiple input transfers queued
2) We hit some latency issue, causing qemu to get behind processing frames
3) When qemu gets to run again, it notices the first transfer ends short,
   marking the head td non-active
4) It now processes 32+ frames in a row without giving the guest a chance
   to run since it is behind
5) valid is decreased to 0, causing the queue to get cancelled also cancelling
   already queued up further input transfers
6) guest gets to run, notices the inactive td, cleanups up further tds
   from the short transfer, and lets the queue continue at the first td of
   the next input transfer
7) we re-start the queue, issuing the second input transfer for the *second*
   time, and any data read by the first time we issued it has been lost

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2012-10-25 09:08:12 +02:00
Hans de Goede
420ca987d5 uhci: When the guest marks a pending td non-active, cancel the queue
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2012-10-25 09:08:12 +02:00
Hans de Goede
8c75a899f8 uhci: Detect guest td re-use
A td can be reused by the guest in a different queue, before we notice
the original queue has been unlinked. So search for tds by addr only, detect
guest td reuse, and cancel the original queue, this is necessary to keep our
packet ids unique.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2012-10-25 09:08:11 +02:00
Hans de Goede
66a08cbe6a uhci: Verify queue has not been changed by guest
According to the spec a guest can unlink a qh, and then as soon as frindex
has changed by 1 since the unlink, assume it is idle and re-use it. However
for various reasons, we cannot simply consider a qh as unlinked if we've not
seen it for 1 frame. This means that it is possible for a guest to re-use /
restart the queue while we still see its old state. This patch adds a safety
check for this, and "early" retires queues when they were changed by the guest.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2012-10-25 09:08:11 +02:00
Hans de Goede
5ad23e873c uhci: Immediately free queues on device disconnect
There is no need to just cancel any in-flight packets, and then wait
for validate-end to clean things up, we can simply clean things up
immediately on device removal.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2012-10-25 09:08:11 +02:00
Hans de Goede
11d15e402b uhci: Store ep in UHCIQueue
This avoids the need to repeatedly lookup the device, and ep.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2012-10-25 09:08:11 +02:00
Hans de Goede
a4f30cd766 uhci: Make uhci_fill_queue() actually operate on an UHCIQueue
And move its calling point to handle_td, this removes the ep_ret ugliness,
and prepates the way for further cleanups in the follow-up patches in this
patch-set.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2012-10-25 09:08:11 +02:00
Hans de Goede
963a68b54f uhci: Add uhci_read_td() helper function
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2012-10-25 09:08:11 +02:00
Hans de Goede
1f250cc772 uhci: Rename UHCIAsync->td to UHCIAsync->td_addr
We use the name td both to refer to a UHCI_TD read from guest memory as
well as to refer to the guest address where a td is stored, switch over
to always use td_addr in the second case for consistency.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2012-10-25 09:08:11 +02:00
Hans de Goede
4050737726 uhci: Move emptying of the queue's asyncs' queue to uhci_queue_free
Cleanup: all callers of uhci_queue_free first unconditionally cancel
all remaining asyncs in the queue, so lets move this to uhci_queue_free().

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2012-10-25 09:08:11 +02:00
Hans de Goede
3c87c76d1a uhci: Drop unnecessary forward declaration of some static functions
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2012-10-25 09:08:11 +02:00
Hans de Goede
a89e255b0c uhci: Don't retry on error
Since we are either dealing with emulated devices, where retrying is
not going to help, or with redirected devices where the host OS will
have already retried, don't bother retrying on failed transfers.

Also move some common/indentical code out of all the error cases
into the generic error path.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2012-10-25 09:08:10 +02:00
Hans de Goede
2f2ee2689f uhci: cleanup: Add an unlink call to uhci_async_cancel()
All callers of uhci_async_cancel() call uhci_async_unlink() first, so
lets move the unlink call to uhci_async_cancel()

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2012-10-25 09:08:10 +02:00
Hans de Goede
5b352ed537 uhci: No need to handle async completion of isoc packets
No devices ever return async for isoc endpoints and the core
already enforces this.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2012-10-25 09:08:10 +02:00
Hans de Goede
aaac74343d usb: Enforce iso endpoints never returing USB_RET_ASYNC
ehci was already testing for this, and we depend in various places
on no devices doing this, so lets move the check for this to the
usb core.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2012-10-25 09:08:10 +02:00
Hans de Goede
a6fb2ddb14 usb: Add an int_req flag to USBPacket
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2012-10-25 09:08:10 +02:00
Hans de Goede
6ba43f1f6b usb: Move short-not-ok handling to the core
After a short-not-ok packet ending short, we should not advance the queue.
Move enforcing this to the core, rather then handling it in the hcd code.

This may result in the queue now actually containing multiple input packets
(which would not happen before), and this requires special handling in
combination with pipelining, so disable pipleining for input endpoints
(for now).

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2012-10-25 09:08:10 +02:00
Hans de Goede
0cae7b1a00 usb: Move clearing of queue on halt to the core
hcds which queue up more then one packet at once (uhci, ehci and xhci),
must clear the queue after an error which has caused the queue to halt.

Currently this is handled as a special case inside the hcd code, this
patch instead adds an USB_RET_REMOVE_FROM_QUEUE packet result code, teaches
the 3 hcds about this and moves the clearing of the queue on a halt into
the USB core.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2012-10-25 09:08:10 +02:00
Hans de Goede
36dfe324fd usb: Add USB_RET_ADD_TO_QUEUE packet result code
This can be used by usb-device code which wishes to process an entire endpoint
queue at once, to do this the usb-device code returns USB_RET_ADD_TO_QUEUE
from its handle_data class method and defines a flush_ep_queue class method
to call when the hcd is done queuing up packets.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2012-10-25 09:08:09 +02:00
Hans de Goede
d0ff81b871 usb: Rename __usb_packet_complete to usb_packet_complete_one
And make it available for use outside of core.c

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2012-10-25 09:08:09 +02:00
Hans de Goede
3151f2096d xhci: Add a xhci_ep_nuke_one_xfer helper function
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2012-10-25 09:08:09 +02:00
Hans de Goede
b4ea866499 ehci: Retry to fill the queue while waiting for td completion
If the guest is using multiple transfers to try and keep the usb bus busy /
used at maximum efficiency, currently we would see / do the following:

1) submit transfer 1 to the device
2) submit transfer 2 to the device
3) report transfer 1 completion to guest
4) report transfer 2 completion to guest
5) submit transfer 1 to the device
6) report transfer 1 completion to guest
7) submit transfer 2 to the device
8) report transfer 2 completion to guest
etc.

So after the initial submission we would effectively only have 1 transfer
in flight, rather then 2. This is caused by us not checking the queue for
addition of new transfers by the guest (ie the resubmission of a recently
finished transfer), while waiting for a pending transfer to complete.
This patch does add a check for this, changing the sequence to:

1) submit transfer 1 to the device
2) submit transfer 2 to the device
3) report transfer 1 completion to guest
4) submit transfer 1 to the device
5) report transfer 2 completion to guest
6) submit transfer 2 to the device
etc.

Thus keeping 2 transfers in flight (most of the time, and always 1),
as intended by the guest.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2012-10-25 09:08:09 +02:00
Hans de Goede
e3a36bce1d ehci: Detect going in circles when filling the queue
For ctrl endpoints Windows (atleast Win7) creates circular td lists, so far
these were not a problem because we would stop filling the queue if altnext
was set. Since further patches in this patchset remove the altnext check this
does become a problem and we need detection for going in circles.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2012-10-25 09:08:09 +02:00
Hans de Goede
44272b0f88 ehci: Speed up the timer of raising int from the async schedule
Often the guest will queue up new packets in response to a packet, in the
async schedule with its IOC flag set, completing. By speeding up the
frame-timer, we notice these new packets earlier. This increases the
speed (MB/s) of a Linux guest reading from a USB mass storage device by a
factor of 1.15 on top of the "Improve latency of interrupt delivery"
speed-ups, both with and without input pipelining enabled.

I've not tested the speed-up of this patch without the
"Improve latency of interrupt delivery" patch.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2012-10-25 09:08:09 +02:00
Hans de Goede
0262f65aaa ehci: Improve latency of interrupt delivery and async schedule scanning
While doing various performance tests of reading from USB mass storage devices
I noticed the following::
1) When an async handled packet completes, we don't immediately report an
   interrupt to the guest, instead we wait for the frame-timer to run and
   report it from there
2) If 1) has been fixed and an async handled packet takes a while to complete,
   then async_stepdown will become a high value, which means that there
   will be a large latency before any new packets queued by the guest in
   response to the interrupt get seen

1) was done deliberately as part of commit f0ad01f92:
http://www.kraxel.org/cgit/qemu/commit/?h=usb.57&id=f0ad01f92ca02eee7cadbfd225c5de753ebd5fce
Since setting the interrupt immediately on async packet completion was causing
issues with Linux guests, I believe this recently fixed Linux bug explains
why this is happening:
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux.git;a=commitdiff;h=361aabf395e4a23cf554cf4ec0c0c6963b8beb01

Note that we can *not* count on this fix being present in all Linux guests!

I was hoping that the recently added support for Interrupt Threshold Control
would fix the issues with Linux guests, but adding a simple ehci_commit_irq()
call to ehci_async_bh() still caused problems with Linux guests.

The problem is, that when doing ehci_commit_irq() from ehci_async_bh(),
the "old" frindex value is used to calculate usbsts_frindex, and when
the frame-timer then runs possibly very shortly after ehci_async_bh(),
it increases the frame-timer, and thus any interrupts raised from that
frame-timer run, will also get reported to the guest immediately, rather
then being delayed to the next frame-timer run.

Luckily the solution for this is simple, this means that we need to
increase frindex before calling ehci_commit_irq() from ehci_async_bh(),
which in the end boils down to simple calling ehci_frame_timer() instead
of ehci_async_bh() from the bh.

This may seem like it causes a lot of extra work to be done, but this
is not true. Any work done from the frame-timer processing the periodic
schedule is work which then does not need to be done the next time the
frame timer runs, also the frame-timer will re-arm itself at (possibly)
a later time then it was armed for saving a vmexit at that time.

As an additional advantage moving to simply calling the frame-timer also
fixes 2) as the packet completion will set async_stepdown to 0, and the
re-arming of the timer with an async_stepdown of 0 ensures that any
newly queued up packets get seen in a reasonable amount of time.

This improves the speed (MB/s) of a Linux guest reading from a USB mass
storage device by a factor of 1.5 - 1.7 with input pipelining disabled,
and by a factor of 1.8 with input pipelining enabled.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2012-10-25 09:08:09 +02:00