disks to other domains) from Jed Davis, <jld@panix.com>:
* Issue multiple requests when necessary rather than
assuming that arbitrary requests can be mapped into single
contiguous virtual address ranges.
* Don't assume that all data for a request is consecutive
in memory. With some client OSes, it's not.
The above two changes fix data corruption issues with Linux
clients with certain filesystem block sizes.
* Gracefully handle memory or pool allocation failures after
starting to process a request from the ring.
* Merge contiguous requests to avoid the "64K turns into 44K + 20K
and doubles the transactions per second at the disk" problem
caused by the 11-page limit imposed by the structure of Xen
ring entries (see the sketch after this entry). This causes a
very slight performance decrease (about 1%) for sequential 64K
I/O if the disk is not already saturated with requests, but
halves -- or better -- the transactions per second we hit the
disk with. It even compensates for bizarre Linux behaviour like
breaking long requests up into 5.5K pieces.
* Probably some stuff I forgot to mention.
Disk throughput (though not latency) is now much, much closer to the
"raw hardware" case than it was before.
for xen_microtime(). This matches more closely what is done on a real i386
(where we read the RTC), and seems to fix gettimeofday() sometimes going
backwards by several seconds for me.
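
The symptom here is a time source that occasionally steps backwards.
As a toy illustration only (the actual fix is to use a better source,
as described above; read_raw_us() and its values are made up), this is
the clamp a microtime routine would otherwise need:

    #include <stdio.h>
    #include <stdint.h>

    static uint64_t last_us;   /* last value handed out */

    /* Hypothetical raw clock read that can step backwards. */
    static uint64_t
    read_raw_us(void)
    {
        static const uint64_t fake[] = { 1000, 2000, 1500, 3000 };
        static unsigned i;
        return fake[i++ % 4];
    }

    /* Return a timestamp that never decreases, even if the
     * underlying clock source steps backwards. */
    static uint64_t
    monotonic_us(void)
    {
        uint64_t now = read_raw_us();
        if (now < last_us)
            now = last_us;     /* clamp instead of going backwards */
        last_us = now;
        return now;
    }

    int
    main(void)
    {
        for (int i = 0; i < 4; i++)
            printf("%llu\n", (unsigned long long)monotonic_us());
        return 0;   /* prints 1000, 2000, 2000, 3000 */
    }
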
initialise the microtime state again. It was initialised at the first
clock tick, before the time of day was set. Should fix port-xen/29846.
Also, try harder to call microset() once a second; the clock interrupt
handler can be called more often than HZ in some cases.
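
A toy model of the "at least once a second" logic: count the
handler's own invocations instead of assuming it fires exactly HZ
times per second. microset() here is a stub; the real
resynchronisation is more involved:

    #include <stdio.h>

    #define HZ 100

    static int ticks_since_set;

    /* Stub for microset(): resynchronise the microtime state
     * from the authoritative clock source. */
    static void
    microset(void)
    {
        printf("microset: resync microtime state\n");
        ticks_since_set = 0;
    }

    /* Clock interrupt handler; under Xen it may fire more often
     * than HZ times per second, so count invocations rather than
     * trusting wall-clock assumptions. */
    static void
    clock_intr(void)
    {
        if (++ticks_since_set >= HZ)
            microset();   /* at least once per nominal second */
    }

    int
    main(void)
    {
        for (int i = 0; i < 250; i++)
            clock_intr();   /* microset() fires twice */
        return 0;
    }
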
them from interrupt context.
xpq_flush_queue() is called from IPL_NET in if_xennet.c, and
other xpq_queue* functions may be called from interrupt context via
pmap_kenter*(). Should fix port-xen/30153.
Thanks to Jason Thorpe and YAMAMOTO Takashi for enlightenment on this issue.
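
The pattern is to raise the IPL around every access to the
pending-update queue. A self-contained toy with fake splvm()/splx();
the real pmap code, queue layout and hypercall batching are more
involved:

    #include <stdio.h>

    static int cur_ipl;

    /* Toy spl: raise to a blocking level, return the old one. */
    static int splvm(void) { int s = cur_ipl; cur_ipl = 5; return s; }
    static void splx(int s) { cur_ipl = s; }

    #define XPQ_MAX 16
    static unsigned long xpq_ops[XPQ_MAX];
    static int xpq_idx;

    /* Flush all queued MMU updates in one batch (a single
     * hypercall in the real code). */
    static void
    xpq_flush_queue(void)
    {
        int s = splvm();   /* block interrupt-context callers */
        printf("flushing %d queued updates\n", xpq_idx);
        xpq_idx = 0;
        splx(s);
    }

    /* Queue one MMU update; now safe from interrupt context,
     * e.g. via pmap_kenter*(). */
    static void
    xpq_queue_op(unsigned long op)
    {
        int s = splvm();
        xpq_ops[xpq_idx++] = op;
        if (xpq_idx == XPQ_MAX)   /* queue full: flush now */
            xpq_flush_queue();
        splx(s);
    }

    int
    main(void)
    {
        for (unsigned long i = 0; i < 20; i++)
            xpq_queue_op(i);   /* one flush of 16 along the way */
        xpq_flush_queue();     /* flushes the remaining 4 */
        return 0;
    }
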
event not to be sent when one is needed. Fixing this would require
one hypercall per packet, instead of one per NB_XMIT_PAGES_BATCH pages.
It's not worth it, so always send an event at the end of xennetback_ifstart()
- there is no callback mechanism to notify us when a guest has handled
packets we sent. If we stop transmitting because the ring is full or we're
out of pages when the ifq is also full, nothing will call
xennetback_ifstart() again and transmit is stalled. Abuse the watchdog
to kick the transmit queue one second after an out-of-resources
condition (see the sketch below).
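
A sketch of the watchdog trick with heavily simplified types: the
real xennetback manages ring slots and grant pages, while here
out_of_resources is a toy flag and struct ifnet is reduced to its
if_timer field:

    #include <stdio.h>
    #include <stdbool.h>

    struct ifnet {
        int if_timer;   /* watchdog countdown, in seconds */
    };

    static bool out_of_resources = true;   /* toy stand-in */

    static void
    xennetback_ifstart(struct ifnet *ifp)
    {
        if (out_of_resources) {
            /* Can't send and nobody will wake us: arm the
             * watchdog so we are retried in one second. */
            ifp->if_timer = 1;
            return;
        }
        printf("transmitting queued packets\n");
    }

    /* Called by the stack when if_timer counts down to zero. */
    static void
    xennetback_ifwatchdog(struct ifnet *ifp)
    {
        out_of_resources = false;   /* pretend pages freed up */
        xennetback_ifstart(ifp);    /* kick the transmit queue */
    }

    int
    main(void)
    {
        struct ifnet ifp = { 0 };

        xennetback_ifstart(&ifp);        /* stalls, arms watchdog */
        if (ifp.if_timer == 1)
            xennetback_ifwatchdog(&ifp); /* "one second later" */
        return 0;
    }
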
where the printing of `version' is already performed.
This has the benefit of allowing the copyright to be available
via dmesg(8) on platforms which need the `msgbuf' to be set up
in cpu_startup() before printed output is remembered.
of relying on the fact that the domain controller is still getting interrupts
when xenconscn_getc()/xenconscn_putc() is called.
Now that polling is fixed, move IPL_CTRL down between IPL_SOFTSERIAL and
IPL_TTY, fixing port-xen/29999 by YAMAMOTO Takashi.
Now that IPL_CTRL is low enough, remove the softintr stuff from the
domain controller, and call wakeup() directly.
Thanks to YAMAMOTO Takashi and Christian Limpach for feedback and suggestions.
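
The polling idea in a toy model: spin on a shared ring's producer
index instead of waiting for an interrupt, so the routine works
regardless of IPL. The ring layout and names below are simplified
assumptions, not the real Xen console interface:

    #include <stdio.h>

    #define RING_SIZE 8

    struct cons_ring {
        volatile unsigned cons, prod;   /* consumer/producer */
        char buf[RING_SIZE];
    };

    static struct cons_ring ring = {
        .cons = 0, .prod = 3, .buf = { 'f', 'o', 'o' },
    };

    /* Polled getc: busy-wait until the producer has stored a
     * byte. No interrupt needed, so it is safe at any IPL. */
    static int
    xenconscn_getc_poll(struct cons_ring *r)
    {
        while (r->cons == r->prod)
            ;   /* spin */
        return r->buf[r->cons++ % RING_SIZE];
    }

    int
    main(void)
    {
        for (int i = 0; i < 3; i++)
            putchar(xenconscn_getc_poll(&ring));
        putchar('\n');   /* prints "foo" */
        return 0;
    }
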
- sort the ih_evt_handler list by IPL, higher first (see the sketch
after this entry). Otherwise some handlers would have been delayed,
even if they could run at the current IPL.
- As ih_evt_handler is sorted, remove IPLs that have been processed for
an event when calling hypervisor_set_ipending()
- In hypervisor_set_ipending(), enter the event in ipl_evt_mask only
for the lowest IPL. As deferred IPLs are processed from high to low,
this ensures that hypervisor_enable_event() will be called only when all
callbacks have been called for an event. We don't need the evtch_maskcount[]
counters any more.
Thanks to YAMAMOTO Takashi for ideas and feedback.
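
The sorted insert itself is an ordinary linked-list walk; a minimal
sketch with an assumed evt_handler layout (the real entries also
carry the handler function, its argument and the event channel):

    #include <stdio.h>

    struct evt_handler {
        int ih_ipl;                 /* IPL the handler runs at */
        struct evt_handler *ih_next;
    };

    /* Insert so the list stays ordered highest IPL first:
     * handlers runnable at the current IPL are reached before
     * any that must be deferred. */
    static void
    evt_insert_sorted(struct evt_handler **head, struct evt_handler *ih)
    {
        struct evt_handler **p = head;

        while (*p != NULL && (*p)->ih_ipl >= ih->ih_ipl)
            p = &(*p)->ih_next;
        ih->ih_next = *p;
        *p = ih;
    }

    int
    main(void)
    {
        struct evt_handler a = { 3, NULL }, b = { 6, NULL };
        struct evt_handler c = { 4, NULL };
        struct evt_handler *head = NULL;

        evt_insert_sorted(&head, &a);
        evt_insert_sorted(&head, &b);
        evt_insert_sorted(&head, &c);
        for (struct evt_handler *ih = head; ih != NULL; ih = ih->ih_next)
            printf("IPL %d\n", ih->ih_ipl);   /* 6, 4, 3 */
        return 0;
    }
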
cause an event to be both handled and marked as pending, or to be
marked as pending twice (triggering the diagnostic check
evtch_maskcount[port] == 0 in hypervisor_set_ipending()):
mask and clear events by 32-bit words in do_hypervisor_event() or
stipending(), instead of by individual bits in do_event() or
xenevt_event(). In addition, this is marginally more efficient.
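
A sketch of the word-at-a-time idea: snapshot, mask and clear all 32
channels of a word first, then dispatch from the private snapshot, so
no bit can be observed both as handled and as still pending. The toy
pending/mask words stand in for the shared-info bit arrays, and
__builtin_ctz() (GCC/clang) is used for brevity:

    #include <stdio.h>
    #include <stdint.h>

    int
    main(void)
    {
        /* Toy shared state for 32 event channels. */
        volatile uint32_t pending = 0x00000414;   /* 2, 4, 10 */
        volatile uint32_t mask = 0;

        /* Grab the whole word first: masking and clearing bit
         * by bit leaves a window where a re-raised event is seen
         * both as handled and as pending. */
        uint32_t snapshot = pending;
        mask |= snapshot;        /* mask the whole word */
        pending &= ~snapshot;    /* clear the whole word */

        /* Dispatch from the private snapshot. */
        while (snapshot != 0) {
            int port = __builtin_ctz(snapshot);   /* lowest bit */
            snapshot &= snapshot - 1;             /* drop it */
            printf("dispatch event channel %d\n", port);
        }
        (void)mask;
        return 0;
    }
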