- rather than embedding bufq_state in the driver softc, have a pointer
to it (see the sketch after this list).
- move bufq related functions from kern/subr_disk.c to kern/subr_bufq.c.
- rename method to strategy for consistency.
- move some definitions which don't need to be exposed to the rest of the
kernel from sys/bufq.h to sys/bufq_impl.h.
(would it be better to move it to kern/ or somewhere else?)
- fix some obvious breakage in dev/qbus/ts.c. (not tested)
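For illustration, a minimal sketch of a converted driver under the new
interface; the driver name and strategy choice are made up, and the
bufq_alloc()/bufq_put()/bufq_get() calls follow today's bufq(9):

#include <sys/buf.h>
#include <sys/bufq.h>

struct xyz_softc {
        /* ... other driver state ... */
        struct bufq_state *sc_bufq;     /* was: struct bufq_state sc_bufq; */
};

static void
xyz_attach(struct xyz_softc *sc)
{
        /* bufq_state is now opaque: allocate it by strategy name */
        bufq_alloc(&sc->sc_bufq, "fcfs", BUFQ_SORT_RAWBLOCK);
}

static void
xyz_strategy(struct xyz_softc *sc, struct buf *bp)
{
        bufq_put(sc->sc_bufq, bp);      /* queued per the chosen strategy */
        /* ... start I/O; the completion path uses bufq_get() ... */
}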
xennetback instances. To support this, move more fields from xni_page to
xni_pkt.
This would also have the effect of moving more code into the
while (!SLIST_EMPTY(&pkt_page->xni_pkt_head)) loop in xennetback_tx_free().
By passing xni_pkt instead of xni_page to xennetback_tx_free() we can
avoid the loop, and the xni_pkt_head list, completely. There is even a slight
performance increase if the domU deals with xni_txring->event properly.
Based on comments from YAMAMOTO Takashi.
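A rough before/after sketch of that change (the SLIST link field and the
actual cleanup are illustrative):

/* before: given the page, free every packet still attached to it */
void
xennetback_tx_free(struct xni_page *pkt_page)
{
        struct xni_pkt *pkt;

        while (!SLIST_EMPTY(&pkt_page->xni_pkt_head)) {
                pkt = SLIST_FIRST(&pkt_page->xni_pkt_head);
                SLIST_REMOVE_HEAD(&pkt_page->xni_pkt_head, pkt_next);
                /* ... release pkt ... */
        }
}

/* after: the caller already knows which packet completed, so both
 * the loop and the xni_pkt_head list go away */
void
xennetback_tx_free(struct xni_pkt *pkt)
{
        /* ... release pkt ... */
}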
In theory mbufs can have an infinite lifetime and could block the transmit
ring (as slots are released when the mbuf external storage is freed). To
avoid this, when we're processing the last slot of the ring, copy the buffer
and release the slot immediately.
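A hedged sketch of that policy; the helpers marked hypothetical are
illustrative, only m_devget(9) and the copy-on-last-slot idea are real:

#include <sys/mbuf.h>
#include <net/if.h>

static struct mbuf *
tx_slot_to_mbuf(char *slot_va, int len, struct ifnet *ifp, int idx)
{
        struct mbuf *m;

        if (tx_slot_is_last(idx)) {             /* hypothetical */
                /* copy into mbuf-owned storage (m_devget(9)) so the
                 * slot can be released and reused right away */
                m = m_devget(slot_va, len, 0, ifp, NULL);
                tx_slot_release(idx);           /* hypothetical */
        } else {
                /* attach the mapped page as external storage; the
                 * slot is only freed when that storage is freed */
                m = tx_slot_attach_ext(slot_va, len);   /* hypothetical */
        }
        return m;
}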
receive system:
- on the receive side, attach the mapped buffer as external storage
instead of copying it. As a mapped buffer may now live much longer, we
have to deal with the fact that one page of buffer may contain several
packets, and it's not possible to map them several times. Use a hashed list
to keep track of mapped pages, and use reference counters.
- on the transmit side, when MCLBYTES == PAGE_SIZE, give away the mbuf
cluster page when possible instead of copying it. Keep a pool of physical
pages to map in place of the page we give away. When copying, use a
pool_cache(9) to manage copy buffers (use mclpool_cache when
MCLBYTES == PAGE_SIZE, otherwise use a private pool/pool_cache; see the
sketch after this list) instead of a local list. This should reduce the
number of hypercalls and MMU
operations in the copy case as well.
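For the private copy-buffer pool, a minimal sketch using the current
pool_cache(9) interface (the API differed when this went in; the cache
name and IPL choice are assumptions):

#include <sys/param.h>
#include <sys/pool.h>

static pool_cache_t xnb_copybuf_cache;

static void
xnb_copybuf_init(void)
{
        /* page-sized, page-aligned buffers, so a whole buffer can be
         * handed over in place of the page we give away */
        xnb_copybuf_cache = pool_cache_init(PAGE_SIZE, PAGE_SIZE, 0, 0,
            "xnbcopy", NULL, IPL_NET, NULL, NULL, NULL);
}

/* transmit path, copy case:
 *      buf = pool_cache_get(xnb_copybuf_cache, PR_NOWAIT);
 *      ... m_copydata() the packet into buf, give buf away ...
 *      pool_cache_put(xnb_copybuf_cache, buf);  (on completion)
 */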
send queue. This gives the upper layer an opportunity to queue up all
available packets before starting to process them. This significantly
reduces the number of interrupts generated on the backend and the time
spent doing hypercalls.
the resources (this can lead to a dom0 kernel crash when destroying a domain
with active I/O in progress). For this, add a refcount to struct
xbdback_instance,
and defer CMSG_BLKIF_BE_DISCONNECT reply until refcount is back to 0 (which
means all queued buffers have completed).
Based on patch sent by Jed Davis on port-xen.
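A hedged sketch of the refcount scheme (field and function names are
illustrative, not the committed code):

#include <sys/types.h>

struct xbdback_instance {
        /* ... existing fields ... */
        int xbdi_refcnt;                /* one reference per queued buffer */
        bool xbdi_disconnect_pending;
};

static void
xbdi_get(struct xbdback_instance *xbdi)
{
        xbdi->xbdi_refcnt++;            /* taken when a buffer is queued */
}

static void
xbdi_put(struct xbdback_instance *xbdi)
{
        if (--xbdi->xbdi_refcnt == 0 && xbdi->xbdi_disconnect_pending)
                /* all queued buffers have completed; it is now safe
                 * to answer CMSG_BLKIF_BE_DISCONNECT */
                xbdback_disconnect_reply(xbdi);         /* hypothetical */
}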
instead of always trying PG_RW and falling back to PG_RO if this fails.
Use uvm_map_checkprot() in IOCTL_PRIVCMD_MMAP and IOCTL_PRIVCMD_MMAP_BATCH
to compute the appropriate vm_prot_t for pmap_remap_pages().
Thanks to Jed Davis for pointing out uvm_map_checkprot().
In pmap_remap_pages() new mappings are created (PG_RW|PG_M). When saving
a domain, the hypervisor will refuse to map the foreign pages RW.
As a temporary measure, retry the mapping read-only if PG_RW fails, so that
domain save will work. Also fix the PTP's wire_count if the MMU update
fails (preventing a kernel panic).
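A sketch of the protection computation (uvm_map_checkprot() is the real
UVM primitive; the wrapper itself is illustrative):

#include <sys/errno.h>
#include <uvm/uvm_extern.h>
#include <uvm/uvm_map.h>

static int
privcmd_get_prot(struct vm_map *map, vaddr_t va, vsize_t len,
    vm_prot_t *protp)
{
        if (uvm_map_checkprot(map, va, va + len,
            VM_PROT_READ | VM_PROT_WRITE))
                *protp = VM_PROT_READ | VM_PROT_WRITE;  /* map PG_RW */
        else if (uvm_map_checkprot(map, va, va + len, VM_PROT_READ))
                *protp = VM_PROT_READ;                  /* map PG_RO */
        else
                return EINVAL;  /* range lacks a usable protection */
        return 0;               /* *protp feeds pmap_remap_pages() */
}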
which attach to the hypervisor. This allows using config_found_ia() instead
of config_found(), rather than relying on the order in which devices are
written in ioconf.c.
From Quentin Garnier.
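The shape of the change, roughly (the "xendevbus" attribute name is an
assumption):

/* before: children attach in whatever order ioconf.c lists them */
config_found(self, &xa, xen_print);

/* after: attach through a named interface attribute, independent of
 * the device order in ioconf.c */
config_found_ia(self, "xendevbus", &xa, xen_print);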
- Define _BUS_AVAIL_END to 0xffffffff, as we don't have an easy way to
find the upper bound for our machine address space (and this can change
when we swap pages with the hypervisor).
- implement _xen_bus_dmamem_alloc_range(), which will request a contiguous
set of pages from the hypervisor if the pages returned by uvm_pglistalloc()
don't fit the constraints (sketched below).
We can't deal with the low/high constraints yet, because Xen doesn't offer a
way to get pages in a specific range of addresses.
Based on patches from Dave Thompson (in private mail), with heavy hacking
by me.
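Roughly, the fallback works like this (the machine-address check and the
hypervisor exchange are condensed into hypothetical helpers):

#include <uvm/uvm_extern.h>

static int
alloc_range_sketch(bus_size_t size, paddr_t low, paddr_t high,
    paddr_t alignment, paddr_t boundary, int nsegs, int waitok)
{
        struct pglist mlist;
        int error;

        error = uvm_pglistalloc(size, low, high, alignment, boundary,
            &mlist, nsegs, waitok);
        if (error)
                return error;

        /* uvm_pglistalloc() returns physically contiguous pages, but
         * under Xen the machine addresses behind them need not be
         * contiguous; if they aren't, trade the pages for a
         * machine-contiguous range from the hypervisor. */
        if (!xen_ma_contiguous(&mlist))                 /* hypothetical */
                error = xen_exchange_pages(&mlist);     /* hypothetical */
        return error;
}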
the generic pci_enumerate_bus() in Xen because the hypervisor may
hide function 0 but give access to other functions of the same device,
and pci_enumerate_bus() will stop if function 0 isn't available.
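So the Xen-specific enumeration probes every function itself, roughly
like this (illustrative, not the committed code):

#include <dev/pci/pcireg.h>
#include <dev/pci/pcivar.h>

static void
xen_pci_scan_bus(pci_chipset_tag_t pc, int bus)
{
        pcitag_t tag;
        pcireg_t id;
        int dev, func;

        for (dev = 0; dev < pci_bus_maxdevs(pc, bus); dev++) {
                for (func = 0; func < 8; func++) {
                        tag = pci_make_tag(pc, bus, dev, func);
                        id = pci_conf_read(pc, tag, PCI_ID_REG);
                        if (PCI_VENDOR(id) == PCI_VENDOR_INVALID ||
                            PCI_VENDOR(id) == 0)
                                continue;       /* a hidden function 0:
                                                 * keep scanning */
                        /* ... match and attach this function ... */
                }
        }
}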
Ceri Storey to port-xen, with some additional changes by me:
- include bus_dma.c, bus_space.c and pci_machdep.c if pci is defined,
instead of dom0ops
- make various initialisations, and probe/attach pci busses, based on NPCI
instead of DOM0OPS
- in conf/files.xen, move Xen-specific devices before non-Xen devices
so that the Xen-specific match function is called first, to avoid false
attachment from overly liberal match functions in non-Xen code.
#define clockframe somethingelse
to:
struct clockframe {
        struct somethingelse cf_se;
};
and change access macros accordingly.
That means that, at least for that very issue, things will not go
ka-boomy if you don't have the actual definition of struct clockframe
before including systm.h.
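On i386, for instance, the change could look like this (hedged: assuming
the wrapped structure is intrframe and the CLKF_PC() accessor; the commit
may differ in detail):

/* before: clockframe was only an alias, so anything including
 * systm.h needed the full definition of the underlying struct */
#define clockframe intrframe
#define CLKF_PC(frame)          ((frame)->if_eip)

/* after: a genuine struct type; a forward declaration suffices */
struct clockframe {
        struct intrframe cf_if;
};
#define CLKF_PC(frame)          ((frame)->cf_if.if_eip)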
disks to other domains) from Jed Davis, <jld@panix.com>:
* Issue multiple requests when necessary rather than
assuming that arbitrary requests can be mapped into single
contiguous virtual address ranges.
* Don't assume that all data for a request is consecutive
in memory. With some client OSes, it's not.
The above two changes fix data corruption issues with Linux
clients with certain filesystem block sizes.
* Gracefully handle memory or pool allocation failures after
beginning to handle a request from the ring.
* Merge contiguous requests to avoid the "64K turns into 44K + 20K
and doubles the transactions per second at the disk" problem
caused by the 11-page limit imposed by the structure of Xen
ring entries. This causes a very slight performance decrease
for sequential 64K I/O if the disk is not already saturated with
requests (about 1%) but halves the transactions per second we
hit the disk with -- or better. It even compensates for bizarre
Linux behaviour like breaking long requests up into 5.5K pieces
(sketched below).
* Probably some stuff I forgot to mention.
Disk throughput (though not latency) is now much, much closer to the
"raw hardware" case than it was before.
for xen_microtime(). This matches more closely what is done on a real i386
(where we read the RTC), and seems to fix gettimeofday() sometimes going
backward by several seconds for me.