by making p_estcpu fixpt_t. PR/31542.
1. schedcpu() decreases p_estcpu of all processes
every second, by at least 1, regardless of load average.
2. schedclock() increases p_estcpu of curproc by 1,
at about 16 hz.
as a consequence, if a system has >16 processes
with runnable lwps, their p_estcpu values are unlikely to be increased.
by making p_estcpu fixpt_t, we can decay it more slowly
when loadavg is high (i.e. solve #1).
i left kinfo_proc2::p_estcpu (i.e. ps -O cpu) scaled because i know of
no use for its absolute value other than debugging,
for which raw values are more valuable.
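for reference, a sketch of the classic 4.4BSD-style decay this builds on
(illustrative only; the exact fixpt_t arithmetic of the new code may differ):

    /*
     * loadfac is 2*loadavg in FSCALE fixed point; the higher the load,
     * the closer loadfac/(loadfac+FSCALE) is to 1 and the slower
     * p_estcpu decays.  keeping p_estcpu itself in fixed point avoids
     * the integer truncation that used to knock off at least 1 per
     * schedcpu() run.
     */
    #define loadfactor(loadav)      (2 * (loadav))
    #define decay_cpu(loadfac, cpu) (((loadfac) * (cpu)) / ((loadfac) + FSCALE))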
in some drivers including wd and scsi.
- physio: if a caller provided a buf, stick to using it
because some drivers use it as an identifier.
- sprinkle simple_locks.
- scsistrategy: rather than issuing an async request and
waiting for its completion, simply issue a sync request.
the code that waited for the completion assumed that
B_CALL is never used, which is no longer the case after the recent
physio() changes.
pointed/analyzed/tested by Martin Husemann.
blindly assuming MCLBYTES will DTRT.
- Use bus_dmamap_load_mbuf() instead of bus_dmamap_load() where
appropriate.
- If we have to coalesce a Tx mbuf chain comprised of more than IWI_MAX_NSEG
segments, allocate a cluster iff the payload won't fit in the header.
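A minimal sketch of that coalesce path (variable names are assumptions, not
the actual iwi(4) code; pkthdr fields would also be carried over):

    MGETHDR(m, M_DONTWAIT, MT_DATA);
    if (m == NULL)
        return ENOBUFS;
    if (m0->m_pkthdr.len > MHLEN) {
        /* payload won't fit in the header mbuf: grab a cluster */
        MCLGET(m, M_DONTWAIT);
        if ((m->m_flags & M_EXT) == 0) {
            m_freem(m);
            return ENOBUFS;
        }
    }
    /* flatten the > IWI_MAX_NSEG segment chain into one contiguous mbuf */
    m_copydata(m0, 0, m0->m_pkthdr.len, mtod(m, void *));
    m->m_pkthdr.len = m->m_len = m0->m_pkthdr.len;
    m_freem(m0);
    m0 = m;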
- remove bogus multicast handling [pointed out by thorpej]
and don't reset the chip on ENETRESET; ENETRESET is a sign
that only the multicast filter needs changing.
- make a few functions static
- introduce gem_bitwait() to factor out some of the register wait code.
- add gem_stop() in attach
- some DEBUG should be GEM_DEBUG
- handle underrun, packet too long, and overflow errors by resetting the chip
- add handler in ioctl for add/del multi (see the sketch below)
- fix typo
Also:
- add a shadow sc_if_flags member so that we don't reset the chip if we
don't need to.
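Roughly, the add/del multi handling follows the usual pattern below (a
sketch; gem_setladrf() is the filter setup routine, the committed code may
differ in detail):

    case SIOCADDMULTI:
    case SIOCDELMULTI:
        error = ether_ioctl(ifp, cmd, data);
        if (error == ENETRESET) {
            /*
             * The multicast list changed: reprogram only the
             * hardware filter, do not reset the chip.
             */
            if (ifp->if_flags & IFF_RUNNING)
                gem_setladrf(sc);
            error = 0;
        }
        break;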
we can't do an SA context switch after all, we need to clear the sau from
the LWP's arg. sa_switch() frees the sau in this case, but if we don't
reset the LWP's state and the process exits, then the exiting LWP will
try to free the sau again.
also, change the sadebug printf stuff to use printf_nolog(), since
otherwise we deadlock because we're already holding sched_lock and
the normal printf() will try to wakeup the log reader.
we even have that documented in cache.h). Initialize c_nlines for
split caches.
This should fix the srmmu cache flush routines for split I/D caches, which
have been broken since cache.c revision 1.79 started using c_nlines.
not running on the same CPU as the swapper. l_stat is protected by
sched_lock, which isn't held here, so we can race with that lwp
starting to run and see its l_cpu not updated yet, as in PR 31870.
we check l_stat again in uvm_swapout() while holding sched_lock,
so the race itself is harmless.
code.
- To achieve COMPAT_NETBSD32 compatibility, introduce a parameter to
kevent1 that points to functions that do the actual copyin/copyout
operations (see the sketch below). This is similar to what was done in
FreeBSD by Paul Saab.
- Add the COMPAT_NETBSD32 definitions and hooks.
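Sketched, the idea is a table of copy routines handed to kevent1() (struct
and member names here are illustrative, not necessarily what was committed):

    /*
     * The native and COMPAT_NETBSD32 entry points each supply routines
     * that know the userland layout of struct kevent.
     */
    struct kevent_copyops {
        int (*k_fetch_changes)(void *ctx, const void *changelist,
                struct kevent *changes, size_t index, int n);
        int (*k_put_events)(void *ctx, struct kevent *events,
                void *eventlist, size_t index, int n);
    };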
discussed in the PR.
- introduce sys/timevar.h to hold kernel-specific stuff relevant to
sys/time.h. Ideally, timevar.h would contain all (or almost) of the
#ifdef _KERNEL part of time.h, but that's a pretty big and tedious
change to make. For now, it will contain only the prototypes I
introduced when working on COMPAT_NETBSD32.
- split copyinout_t into copyin_t and copyout_t; it makes prototypes more
explicit about the meaning of a given argument (see the sketch after this
list). Suggested by yamt@.
- move the copyinout_t definition from sys/time.h to systm.h, as copyin_t
and copyout_t
- make everything use the new types and include the proper headers in
the proper places.
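The new types are essentially (modulo formatting in sys/systm.h):

    typedef int (*copyin_t)(const void *uaddr, void *kaddr, size_t len);
    typedef int (*copyout_t)(const void *kaddr, void *uaddr, size_t len);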
introducing OMIT_DKTYPENUMS and OMIT_FSTYPENUMS, which omit the
declaration of the enums. Use this in the vax boot code to make
it build again. Also use __CONCAT() instead of ##.
m0. But m0 may be freed later, so trying to use sip6 at the end of this
function is wrong. My guess is that we want to reference the data area
of m (the mbuf about to be sent) instead at this point.
Fix a panic on Xen (where the data area of an mbuf may be unmapped when the
mbuf is freed), and probably potential data/pool corruption in other cases.
The change adopts the idea of fxp to drop the incoming packet and panic
if the old mbuf cannot be reloaded. Since the bus_dmamap is allocated
during attach, this is not supposed to happen. Since a lot of code moves
anyway, factor out the allocation of RX ring elements, which is shared
between the init path and the RX interrupt path.
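Sketch of the fxp-style recovery path (add_rxbuf() and the variable names
are hypothetical, not the driver's actual code):

    /* try to get a fresh cluster for this ring slot */
    MGETHDR(m, M_DONTWAIT, MT_DATA);
    if (m != NULL) {
        MCLGET(m, M_DONTWAIT);
        if ((m->m_flags & M_EXT) == 0) {
            m_freem(m);
            m = NULL;
        }
    }
    if (m == NULL) {
        /* no replacement: drop the packet and reload the old mbuf */
        if (add_rxbuf(sc, idx, old_m) != 0)
            panic("can't reload rx buffer"); /* dmamap allocated at attach */
        ifp->if_ierrors++;
        continue;
    }
    /* pass old_m up the stack and map the fresh cluster in its place */
    if (add_rxbuf(sc, idx, m) != 0)
        panic("can't load rx buffer");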
XXX A better fix might be to borrow the mbuf from the logical end of the
XXX ring buffer, but that needs more involved driver changes.
Reviewed by dyoung@ and nick@
build when EXEC_AOUT is not defined, the syscalls.master entry has to be
conditionalized. Alpha did so already, so let the other archs catch up
with it.
Go-on: christos
the original code since if fullgroups was empty and partgroups wasn't, we
would not clean up partgroups (pointed out by yamt). Well, this one has
different semantics from the original; they are the correct ones, I think.
expensive, and pointless. As elsewhere in the kernel (and as approved
under FIPS-140-2 by multiple test labs, incidentally) we use arc4 to
generate IVs here; there is no benefit to their being cryptographically
strong so long as there is a sufficient Hamming distance between them.
- rather than embedding bufq_state in the driver softc,
have a pointer to it (see the sketch after this list).
- move bufq related functions from kern/subr_disk.c to kern/subr_bufq.c.
- rename method to strategy for consistency.
- move some definitions which don't need to be exposed to the rest of the kernel
from sys/bufq.h to sys/bufq_impl.h.
(is it better to move it to kern/ or somewhere?)
- fix some obvious breakage in dev/qbus/ts.c. (not tested)
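in driver terms, the first item means the softc now carries a pointer that
bufq_alloc() fills in, rather than an embedded struct (foo_softc is
hypothetical; see sys/bufq.h for the exact allocation interface):

    struct foo_softc {
        /* ... other driver state ... */
        struct bufq_state *sc_bufq;   /* was: struct bufq_state sc_bufq; */
    };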
and 795x. This was present in the driver before, but disabled due to
problems with the actual randomness of generated numbers on the
ubiquitous 7900-series parts. The code here is far, far more
conservative than anyone else's driver for this RNG is -- but I
believe that conservatism is called for, because the 79xx RNG
design is missing a number of pieces from Hifn's "reference" 6500
RNG, and thus the numbers it generates must be treated with some
care.
Support for the 7811 RNG (which is a full-fledged 6500 type
generator) is pretty much the same here as in other variants of
this driver, except that it uses Hifn's "worst case" estimate of
actual entropy per output bit, so it will accumulate bits much
more slowly. The 7811 support is untested.
on (e.g. cardbus), write the wep keys to the card. Fixes problem of
receiving gibberish if one has set wep keys before bringing the
interface up, and also after ifconfig ath0 down; ifconfig ath0 up.
(Tested with current from 20050926 with local mods, and discussed with
sam@ and dyoung@. This is not 100% right, but significantly better
than before. Really we should have an iterator in net80211 to repush
all key state, including per-node keys.)
ptyfs_write() rather than setting a flag and updating these times
through ptyfs_itimes() at some indeterminate time in the future.
However, just use the "time" variable to set the times instead of
using a potentially expensive call to nanotime(). HZ resolution
on these timestamps is more than enough.
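For instance (a sketch; the field names only approximate the ptyfs node):

    struct timespec ts;

    TIMEVAL_TO_TIMESPEC(&time, &ts);    /* "time" global, HZ resolution */
    ptyfs->ptyfs_mtime = ts;            /* in ptyfs_write() */
    ptyfs->ptyfs_atime = ts;            /* in ptyfs_read() */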
(Possibly incomplete) fix for PR kern/31430.
OK'd by christos@.
xennetback instances. To support this, move more fields from xni_page to
xni_pkt. This would also have the effect of moving more code into the
while (!SLIST_EMPTY(&pkt_page->xni_pkt_head)) loop in xennetback_tx_free().
By passing xni_pkt instead of xni_page to xennetback_tx_free() we can
avoid the loop, and the xni_pkt_head list, completely. There is even a slight
performance increase if the domU deals with xni_txring->event properly.
Based on comments from YAMAMOTO Takashi.
Framebuffer drivers use their own minor device number to find their device
entry in fbdriver methods (like cgsix uses cgsix_cd[minor]), but fb passed
the fb device's unit number, which works only as long as all registered fb
devices use the same underlying driver. With different fb drivers, like a
cgsix and an ffb, both will have (native) unit number 0, so with the old
code /dev/fb1 would be unusable.
COMPAT_16 and earlier that results in a current shared linker running at
address 0 (and thus allows NULL pointer derefs to work).
As noted by Matthias Drochner, this "fix" just checks the first psection
and not the first loadable psection. This isn't a problem with the
binutils up to now, but might be in the future.
This closes a hole pointed out by Thor Lancelot Simon on tech-kern ~3
years ago.
The problem was with running binaries from remote storage, where our
kernel (and Veriexec) has no control over any changes to files.
An attacker could, after the fingerprint has been verified and
program loaded to memory, inject malicious code into the backing
store on the remote storage, followed by a forced flush, causing
a page-in of the malicious data from backing store, bypassing
integrity checks.
Initial implementation by Brett Lymn.
- fix bus_space_read_1 -> bus_space_read_2 since revision 1.27 changed
the bit defines to assume 2 byte reads.
- Increment oerrors in case of collisions
- Clamp the success counter to 100, instead of letting it rotate freely.
In theory mbufs can have an infinite life time and could block the transmit
ring (as slots are released when the mbuf external storage is freed). To
avoid this, when we're processing the last slot of the ring, copy the buffer
and release the slot immediately.
options MCLSHIFT=12
in order to avoid a copy when a mbuf cluster has to be sent to a domU
(which is the common case when the packet comes from a physical network
interface).
receive system:
- on the receive side, attach the mapped buffer as external storage
instead of copying it. As a mapped buffer may now live much longer, we
have to deal with the fact that one page of buffer may contain several
packets, and it's not possible to map them several times. Use a hashed list
to keep track of mapped pages, and use reference counters (see the sketch
after this list).
- on the transmit side, when MCLBYTES == PAGE_SIZE, give away the mbuf
cluster page when possible instead of copying it. Keep a pool of physical
pages to map in place of the page we give away. When copying, use a
pool_cache(9) to manage copy buffers (use mclpool_cache when
MCLBYTES == PAGE_SIZE, otherwise use a private pool/pool_cache) instead
of a local list. This should reduce the number of hypercalls and MMU
operations in the copy case as well.
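The receive-side idea, sketched (names hypothetical, not the actual
xennetback code):

    /*
     * Attach the mapped page as external storage; the free callback
     * drops the per-page reference count, and the page is unmapped
     * only once the last packet sharing it has been freed.
     */
    pg->refcount++;
    MEXTADD(m, pktbuf, pktlen, M_DEVBUF, xni_pg_free, pg);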
send queue. This gives the upper layer an opportunity to queue up all
available packets before starting to process them. This significantly
reduces the number of interrupts generated on the backend, and the time
spent doing hypercalls.
since both pool_get() and pool_put() can call wakeup().
instead, allocate the struct sadata_upcall before taking
sched_lock in mi_switch() and free it after releasing sched_lock.
clean up some modularity warts by adding a callback to
struct sadata_upcall for freeing sa_arg.
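a simplified sketch of the ordering (the surrounding code is more involved):

    struct sadata_upcall *sau;
    int s;

    sau = sadata_upcall_alloc(1);   /* pool_get() may wakeup(): no sched_lock yet */
    SCHED_LOCK(s);
    /* ... queue the upcall, mi_switch(), ... */
    SCHED_UNLOCK(s);
    if (sau != NULL)                /* not consumed during the switch */
        sadata_upcall_free(sau);    /* pool_put() is safe again here */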
split the single list of pool cache groups into three lists:
completely full, partially full, and completely empty.
use LIST instead of TAILQ where appropriate.
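roughly (member names may differ slightly from the committed code):

    /* in struct pool_cache: three group lists instead of one */
    LIST_HEAD(, pool_cache_group)   pc_emptygroups; /* no cached objects */
    LIST_HEAD(, pool_cache_group)   pc_partgroups;  /* partially filled */
    LIST_HEAD(, pool_cache_group)   pc_fullgroups;  /* every slot holds an object */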