sync that descriptor with PREREAD to make sure that it is evicted
from the data cache. From Allen Briggs.
* With the above bug fixed, stop using BUS_DMA_COHERENT, resulting in
a fairly decent performance improvement on systems where BUS_DMA_COHERENT
causes descriptors to be accessed uncached (most painful in wm_start()).
Change the bus_dmamap_sync() macro to test the ops argument against pre-
and post- constants. The compiler will optimize out dead code because
of the constants. Since post- operations are not needed on ARM (except
for ISA bounce buffers), this eliminates a large number of function calls
which are no-ops, each of which costs at least 6 cycles just in the call
and return overhead (not to mention whatever other useless work the
compiler decides to do in the callee).
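
A minimal sketch of the macro technique described above, not the actual
ARM bus_dma code. All SKETCH_/sketch_ names and flag values are
hypothetical, and the ISA bounce buffer exception is ignored. When the
ops argument is a compile-time constant, the test in the macro folds
away and post-only syncs generate no call at all.

```c
#include <stddef.h>

/* Hypothetical op flags, modelled loosely on the bus_dma(9) names. */
#define SKETCH_SYNC_PREREAD   0x01
#define SKETCH_SYNC_POSTREAD  0x02
#define SKETCH_SYNC_PREWRITE  0x04
#define SKETCH_SYNC_POSTWRITE 0x08

/* Only pre- operations need real work in this sketch. */
#define SKETCH_SYNC_PRE_MASK  (SKETCH_SYNC_PREREAD | SKETCH_SYNC_PREWRITE)

/* Counts how often the out-of-line function is actually entered. */
static int sketch_sync_calls;

/* Stand-in for the real sync routine (cache write-back/invalidate). */
static void
sketch_dmamap_sync_real(void *map, size_t off, size_t len, int ops)
{
	(void)map; (void)off; (void)len; (void)ops;
	sketch_sync_calls++;
}

/*
 * The macro tests ops before calling out.  With a constant ops the
 * compiler evaluates the test at compile time and drops the dead call.
 */
#define sketch_dmamap_sync(map, off, len, ops)				\
	do {								\
		if (((ops) & SKETCH_SYNC_PRE_MASK) != 0)		\
			sketch_dmamap_sync_real((map), (off), (len), (ops)); \
	} while (0)

/* Issues one pre-only, one post-only and one mixed sync. */
static int
sketch_demo(void)
{
	sketch_sync_calls = 0;
	sketch_dmamap_sync(NULL, 0, 64, SKETCH_SYNC_PREREAD);
	sketch_dmamap_sync(NULL, 0, 64, SKETCH_SYNC_POSTREAD);
	sketch_dmamap_sync(NULL, 0, 64,
	    SKETCH_SYNC_PREWRITE | SKETCH_SYNC_POSTWRITE);
	return sketch_sync_calls;
}
```

In sketch_demo() only the first and third syncs reach the out-of-line
function; the post-only sync costs nothing.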
load f/w images > 0x7fff words), set ISP_FW_ATTR_SCCLUN. We explicitly
don't believe we can find attributes if f/w is < 1.17.0, so we have to
set SCCLUN manually for the 1.15.37 f/w we're using; otherwise every
target will replicate itself across all 16 supported luns for non-SCCLUN
f/w.
* Save an instruction in the transition from idle to have-process-to-
switch-to, and eliminate two instructions that cause datadep-stalls
on StrongARM and XScale (one in each idle block).
* Rearrange some other instructions to avoid datadep-stalls on StrongARM
and XScale.
* Since cpu_do_powersave == 0 is by far the common case, avoid a
pipeline flush by reordering the two idle blocks.
it in pmap_activate(). Instead, let's leave it empty and let pages be
faulted into it on demand. This improves the context switch latency
somewhat, at least for small processes.
need to mess with the referenced and modified flags, since they're only
called when a page is being initialised, and is about to have them cleared.
Make this so.
the CPU's "sleep" function in the idle loop.
* Default all CPUs to not use powersave, except for the PDA processors
(SA11x0 and PXA2x0).
This significantly reduces interrupt latency in high-performance
applications (and was good to squeeze another ~10% out of an XScale
IOP on a Gig-E benchmark).
they contain. IRQ information for these has been removed from the
kernel configuration file. GSC bus chips now choose an available CPU
IRQ for themselves, and know IRQ information for all of the devices
they may contain. Minor autoconfiguration changes support this.
Renamed the old-style vmstat interrupt counters to say "ipl" and not
"irq", since they've been disconnected from irq numbers. Also provide
a function to allocate an irq bit from an interrupt register, and a
function to report the next ipl bit that will be allocated.
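
A pair of functions like the two mentioned above might look like this
sketch. It is not the port's code: the structure, the names, and the
allocation order (highest free bit first) are all assumptions, and a
single 32-bit mask stands in for both the irq and ipl bookkeeping.

```c
#include <stdint.h>

/* Hypothetical bookkeeping for one 32-bit interrupt register. */
struct sketch_int_reg {
	uint32_t allocated;	/* bits already handed out */
};

/* Report the next bit that would be allocated, or -1 if none is free. */
static int
sketch_next_bit(const struct sketch_int_reg *reg)
{
	for (int bit = 31; bit >= 0; bit--) {
		if ((reg->allocated & (UINT32_C(1) << bit)) == 0)
			return bit;
	}
	return -1;
}

/* Allocate that bit and mark it used; returns -1 if the register is full. */
static int
sketch_alloc_bit(struct sketch_int_reg *reg)
{
	int bit = sketch_next_bit(reg);

	if (bit >= 0)
		reg->allocated |= UINT32_C(1) << bit;
	return bit;
}
```

Keeping the "report" function separate from the "allocate" function lets
a caller learn which bit it is about to receive without changing state.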