for openat() and friends) to ni_atdir to avoid confusion with a
previously existing (and, alas, still documented) ni_startdir field
that meant something else entirely.
This uglifies the interface: several operations now need to be passed
the namei flags, and for the time being cache_lookup also needs to be
passed cnp->cn_nameiop. Nonetheless, it's a net benefit.
The glop should be able to go away eventually but requires structural
cleanup elsewhere first.
This change requires a kernel bump.
- Move the namecache's hash computation to inside the namecache code,
instead of being spread out all over the place. Remove cn_hash from
struct componentname and delete all uses of it.
- It is no longer necessary (if it ever was) for cache_lookup and
cache_lookup_raw to clear MAKEENTRY from cnp->cn_flags for the cases
that cache_enter already checks for.
- Rearrange the interface of cache_lookup (and cache_lookup_raw) to
make it somewhat simpler, to exclude certain nonexistent error
conditions, and (most importantly) to make it not require write access
to cnp->cn_flags.
This change requires a kernel bump.
Depending on how the pty had been opened, t_dev could previously have
been set to NODEV. This was probably harmless before, but it caused the
compatibility handler for the COMPAT_60_TIOCPTSNAME ioctl to fail for
ptys that were allocated by screen(1), but only if this was the first
time that the pty had ever been used.
Elide the fs-wide rename lock for single-directory renames. This
required changing the order of lookups, so that we know what the
directories are before we lock the nodes.
Clean up error branches, explain why various nonsense happens and
what it does and doesn't do, and note some of what needs to change.
stale value can be returned and this causes a diagnostic panic in
namei.
In relookup(), clear *vpp before calling VOP_LOOKUP, as is done in
lookup_once(), as an additional precautionary measure.
(In theory, the two fixes should not both be needed.)
Should fix PR 47040.
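A minimal sketch of the precaution described above, using the usual
VOP_LOOKUP(9) calling convention (error handling omitted):

    *vpp = NULL;                            /* no stale value can survive */
    error = VOP_LOOKUP(dvp, vpp, cnp);      /* fs fills in *vpp on success */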
by calling NDAT(&nd, dirvp) after NDINIT().
Right now the implementation is vile and unspeakable to avoid changing
the kernel ABI; this way we can get openat() and friends into 6.1. I
will rectify the mess and bump the kernel once things are working.
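For reference, a hedged sketch of how a caller can direct a lookup to start
at a directory vnode with NDAT(); the pathbuf handling here is the usual
namei(9) pattern, not code from this change:

    struct pathbuf *pb;
    struct nameidata nd;
    int error;

    pb = pathbuf_create(path);
    if (pb == NULL)
        return ENAMETOOLONG;        /* error handling simplified */
    NDINIT(&nd, LOOKUP, FOLLOW, pb);
    NDAT(&nd, dirvp);               /* start the lookup at dirvp */
    error = namei(&nd);
    pathbuf_destroy(pb);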
caused by the sequence of passing two fd's with two sendmsg()'s,
then doing a read() and a recvmsg(). The read() calls dom_dispose()
which discards both messages in the mbuf, and sets the fp's in the
array to NULL. Linux dequeues only one message per read() so the
second recvmsg() gets the fd from the second message. This fix
just avoids the NULL pointer dereference, making the second
recvmsg() fail. It is dubious to pass fd's over stream sockets
and expect mixing read() and recvmsg() to work. Also, processing
one control message per read() would change the current semantics
and should be examined before being applied. In addition there is
a race between
dom_externalize() and dom_dispose(): what happens in a multi-threaded
network stack when one thread disposes where the other externalizes
the same array?
NB: Pullup to 6.
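A hedged sketch of the shape of the guard; the variable names are
illustrative, not the actual unp_externalize() code:

    /* dom_dispose() may already have discarded the descriptors and
     * NULLed out the file pointers; fail instead of dereferencing. */
    for (i = 0; i < nfds; i++) {
        if (fpp[i] == NULL)
            return EINVAL;          /* the second recvmsg() fails */
    }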
in a case of process exit. It is necessary to re-flag all LWPs for exit, as
their state might have changed or new LWPs might have been spawned.
Should fix PR/46168 and PR/46402.
insufficient entropy at boot: key it as soon as it makes any request after
we hit the minimum entropy threshold.
This too should help avoid predictable output at boot time.
as early as possible. On systems with no cycle counter (or very very
predictable cycle counts early in boot) at least this will cause some
difference across boots.
Unlike the AMD implementation, it doesn't use xcalls to distribute
the update to all CPUs but relies on cpuctl(8) to bind itself to the
right CPU -- to keep it simple and avoid possible problems with
hyperthreading.
Also, it doesn't parse the vendor supplied file to pick the right
part for the present CPU model but relies on userland to prepare
files with specific filenames. I'll commit a pkg for this in a minute
(pkgsrc/sysutils/intel-microcode).
The ioctl interface changed; compatibility is provided (should be
limited to COMPAT_NETBSD6 as soon as this is available).
changes required to support these options. The -e option was
requested by martin@ in private chat in order to make writing tests
easier (i.e. don't bother testing MODULAR functionality if it
doesn't exist). While there, I added -A and -a since those were
quite similar.
-A Tells you whether or not modules can be autoloaded at the moment.
This option does take into consideration the sysctl
kern.module.autoload.
-a Tells you whether or not modules can be autoloaded at the moment.
This option does not take into consideration the sysctl
kern.module.autoload.
-e Tells you whether or not you may load a module at the moment.
Now I can follow which process is which in this routine.
If I jiggle the whitespace so line numbers don't change, there is no
change in the output of `objdump -d kern_exit.o' for amd64.
ok abp
Print a diagnostic message if we ever get ERESTART out of fd_close
and convert it to EINTR instead.
Even if fd_close fails, it has already closed the file descriptor, so
restarting the system call is a mistake, with dangerous consequences
for multithreaded programs.
Should probably turn the message into a kassert eventually, and maybe
add one deeper in fd_close in order to more easily debug it before
all the data structures are destroyed.
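A minimal sketch of the conversion, assuming the close(2) path calls
fd_close() directly (the diagnostic text is illustrative):

    error = fd_close(fd);
    if (error == ERESTART) {
        /* The descriptor is already gone; restarting would be wrong. */
        printf("%s: fd_close returned ERESTART\n", __func__);
        error = EINTR;
    }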
unwanted side effects in the RB_ASKNAME case. This fixes PR/46732.
No longer wrap MD cpu_rootconf(), as the hp300 port stores reboot information
as a side effect. Instead call MI rootconf() from MD code, which makes
rootconf() a wrapper around setroot().
Adjust several MD routines to set the global booted_device and
booted_partition variables instead of passing partial information to setroot().
Make cpu_rootconf(9) describe the calling order.
making the allocator more flexible for allocations larger than 4kB
move the encoded "size" under DEBUG back to the beginning of the allocated chunk
no objections on tech-kern@
is a global sysctl kern.maxlwp to control this, which defaults to 2048.
Neither the first LWP of each process nor kernel threads are counted against
the limit. To show the current resource usage per user, I added a new sysctl
that dumps the uidinfo structure fields.
- cpu_load_pmap: use atomic kcpuset(9) operations; fixes rare crashes.
- Add kcpuset_copybits(9) and replace xen_kcpuset2bits(). Avoids incorrect
ncpu problem in early boot. Also, micro-optimises xen_mcast_invlpg() and
xen_mcast_tlbflush() routines.
Tested by chs@.
caches, merge together pool_drain_start() and pool_drain_end() into
bool pool_drain(struct pool **ppp);
"bool" value indicates whether reclaiming was fully done (true) or not (false)
"ppp" will contain a pointer to the pool that was drained (optional).
See http://mail-index.netbsd.org/tech-kern/2012/06/04/msg013287.html
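Hedged usage sketch of the merged interface (caller-side names are
illustrative):

    struct pool *pp = NULL;
    bool done;

    done = pool_drain(&pp);         /* drains one pool per call */
    if (!done && pp != NULL)
        printf("pool \"%s\" not fully drained\n", pp->pr_wchan);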
context, re-enable the portion of code that allows invalidation of CPU-bound
pool caches.
Two reasons:
- Since CPU-cached objects are now invalidated as well, the probability of
  fetching an obsolete object from the pool_cache(9) is greatly reduced. This
  speeds up pool_cache_get() quite a bit, as it does not have to keep
  destroying objects until it finds an up-to-date one while an invalidation
  is in progress.
- for situations where we have to ensure that no obsolete object remains
after a state transition (canonical example: pmap mappings across a Xen VM
restoration), invalidating all pool_cache(9) is the safest way to go.
As it uses xcall(9) to broadcast the execution of pool_cache_transfer(),
pool_cache_invalidate() cannot be called from interrupt or softint context
(scheduling an xcall(9) can put an LWP to sleep).
pool_cache_xcall() => pool_cache_transfer() to reflect its use.
Invalidation being a costly process (thousands of objects may be destroyed),
all places where pool_cache_invalidate() may be called from
interrupt/softint context will now get caught by the proper KASSERT(), and
fixed. Ping me when you see one.
Tested under i386 and amd64 by running ATF suite within 64MiB HVM
domains (tried triggering pgdaemon a few times).
No objection on tech-kern@.
XXX a similar fix has to be pulled up to NetBSD-6, but with a more
conservative approach.
See http://mail-index.netbsd.org/tech-kern/2012/05/29/msg013245.html
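A hedged sketch of the kind of assertion meant here; the exact placement
inside pool_cache_invalidate() is not shown:

    /* Invalidation schedules an xcall(9), which may sleep, so it must
     * not run in interrupt or soft-interrupt context. */
    KASSERT(!cpu_intr_p());
    KASSERT(!cpu_softintr_p());
    pool_cache_invalidate(pc);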
This value seems to never have been used anywhere.
This makes it consistent with its cousin p_vm_msize (which is in pages as
well and has several uses).
The values are 'u_long' so copying them into an 'int' temporary
(to avoid writing an out-of-range value into the actual variable)
doesn't work well at all.
Shows up on amd64 now that the sysctl values are marked as 64-bit.
sparc64 must have been badly broken for ages.
passed to sysctl_createv() actually matches the declared type for
the item itself.
In the places where the caller specifies a function and a structure
address (typically the 'softc'), an explicit (void *) cast is now needed.
Fixes bugs in sys/dev/acpi/asus_acpi.c sys/dev/bluetooth/bcsp.c
sys/kern/vfs_bio.c sys/miscfs/syncfs/sync_subr.c and setting
AcpiGbl_EnableAmlDebugObject.
(mostly passing the address of a uint64_t when typed as CTLTYPE_INT).
I've test built quite a few kernels, but there may be some unfixed MD
fallout, most likely the address of a char[] being passed as a char *.
Also add CTLFLAG_UNSIGNED for unsigned decimals - not set yet.
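For illustration, a hedged sketch of the now-required cast when a helper
function plus a softc pointer are registered; the node name, helper and
sc_log member are made up:

    error = sysctl_createv(&sc->sc_log, 0, NULL, NULL,
        CTLFLAG_READWRITE, CTLTYPE_INT, "example",
        SYSCTL_DESCR("illustrative node"),
        example_sysctl_helper, 0, (void *)sc, 0,
        CTL_HW, CTL_CREATE, CTL_EOL);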
- Restructure the code to do the checking in the appropriate note type,
and harmonize all the checks to be positive.
- Print only the tag data, being careful not to overrun the allocated buffer.
assertion failure (and thus a crash in DIAGNOSTIC kernels). Independently
discovered by YAMAMOTO Takashi and Joel Sing.
To avoid this, introduce a cpu_mcontext_validate() function and move all
sanity checks from cpu_setmcontext() there. Also untangle the netbsd32
compat mess slightly and add a cpu_mcontext32_validate() cousin there.
Add an exhaustive atf test case, based partly on code from Joel Sing.
Should finally fix the remaining open part of PR kern/43903.
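Hedged sketch of the resulting MI call pattern (variable names assumed):

    error = cpu_mcontext_validate(l, mcp);
    if (error != 0)
        return error;               /* reject the bad context, don't panic */
    error = cpu_setmcontext(l, mcp, flags);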
fs/puffs/t_fuzz:mountfuzz7, fs/puffs/t_fuzz:mountfuzz8,
and fs/zfs/t_zpool:create tests pass again. Patch from
manu, discussed on tech-kern and committed at his request.
all we decided to adopt the Linux API, so there is good reason
to stick to it.
No standard tells us what to do, and our extended attribute API has not
been used in a release, so we do not break anything, and we become more
easily compatible with programs using the Linux extended attribute API.
Note that FreeBSD and MacOS X return ENOATTR. FreeBSD has its own API
and MacOS X has a Linux-like API. How did the world get so complicated?
- Remove copy-pasting in error paths, use execve_free_{vmspace,data}().
- Move some code (both in the init and exit paths) out of the locks.
- Slightly simplify do_posix_spawn() callers.
- Add a few asserts and comments.
translate FSYNC_LAZY into PGO_LAZY for VOP_PUTPAGES() so that
genfs_do_io() can set the appropriate io priority for the I/O.
this is the first part of addressing PR 46325.
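A hedged sketch of the translation in a filesystem's fsync path (locking
around VOP_PUTPAGES() omitted):

    int pflags = PGO_CLEANIT | PGO_ALLPAGES;

    if (flags & FSYNC_LAZY)
        pflags |= PGO_LAZY;         /* genfs_do_io() picks a lazy I/O priority */
    /* v_interlock must be held across VOP_PUTPAGES() */
    error = VOP_PUTPAGES(vp, 0, 0, pflags);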
To make code in 'external' (etc) still compile, MALLOC_DECLARE() still
has to generate something of type 'struct malloc_type *'; with normal
optimisation gcc generates a compile-time 0.
MALLOC_DEFINE() and friends have no effect.
Fix one or two places where the code would no longer compile.
The M_xxx arg is left on the calls to malloc() and free(); maybe
maybe they could be converted to an enumeration and just saved in
the malloc header (for deep diag use).
Remove the malloc_type from mbuf extension.
Fixes rump build as well.
Welcome to 6.99.6
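A hedged guess at the shape of the MALLOC_DECLARE()/MALLOC_DEFINE() stubs
described above; this is an assumption, not the committed definition:

    /* assumption: keep the declared type so external code still compiles,
     * while the optimiser folds the value to a compile-time 0/NULL. */
    #define MALLOC_DECLARE(type) \
        static struct malloc_type * const type __unused = NULL
    #define MALLOC_DEFINE(type, shortdesc, longdesc)    /* nothing */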
kcpuset(9) and thus replace hardcoded CPU bitmasks. This removes the
limit on the maximum number of CPUs.
- Support up to 256 CPUs on the amd64 architecture by default.
Bug fixes, improvements, completion of Xen part and testing on 64-core
AMD Opteron(tm) Processor 6282 SE (also, as Xen HVM domU with 128 CPUs)
by Manuel Bouyer.
1) Add a per-cpu CPRNG to handle short reads from /dev/urandom so that
programs like perl don't drain the entropy pool dry by repeatedly
opening, reading 4 bytes, closing.
2) Really fix the locking around reseeds and destroys.
3) Fix the opportunistic-reseed strategy so it actually works, reseeding
existing RNGs once each (as they are used, so idle RNGs don't get
reseeded) until the pool is half empty or newly full again.
1) Lock ordering in cprng_strong_destroy had us take a spin mutex then
an adaptive mutex. Can't do that. Reordering this requires changing
cprng_strong_reseed to tryenter the cprng's own mutex and skip the
reseed on failure, or we could deadlock.
2) Can't free memory with a valid mutex in it.
- always provide a vmspace for the new proc, initially borrowing from proc0
(this part fixes PR 46286)
- increase parallelism between parent and child if arguments allow this,
avoiding a potential deadlock on exec_lock
- add a new flag for userland to request the old (lockstepped) behaviour for
  better error reporting (see the sketch after this list)
- adapt test cases to the previous two and add a new variant to test the
diagnostics flag
- fix a few memory (and lock) leaks
- provide netbsd32 compat
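The userland side of the new flag might look like the following hedged
sketch, assuming the flag in question is the non-portable
POSIX_SPAWN_RETURNERROR attribute flag (error checking omitted):

    #include <spawn.h>

    extern char **environ;
    char *args[] = { "echo", "hi", NULL };
    posix_spawnattr_t attr;
    pid_t pid;
    int error;

    posix_spawnattr_init(&attr);
    /* assumption: POSIX_SPAWN_RETURNERROR requests the old behaviour */
    posix_spawnattr_setflags(&attr, POSIX_SPAWN_RETURNERROR);
    error = posix_spawn(&pid, "/bin/echo", NULL, &attr, args, environ);
    posix_spawnattr_destroy(&attr);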
may change during loops. Fixes the macppc build, which previously
died with:
src/sys/arch/macppc/dev/dbdma.c:62:6: error: assuming signed overflow does not occur when assuming that (X + c) < X is always false
- split the file actions allocation and freeing into separate functions
- use pnbuf
- don't play games with pointers (partially freeing stuff etc), only check
fa and sa and free as needed using the same code.
- use copyinstr properly
- KM_SLEEP allocation can't fail
- if path allocation failed midway, we would possibly be freeing
  userland strings.
- use sizeof(*var) instead of sizeof(type)
Fix a kmem_alloc() call with zero size (PR kern/46038), allow file actions
to be passed, even if empty.
Rearrange p_reflock locking for the child, avoid a double free in an
error case, avoid a memory leak in another error case - all pointed out
by yamt.
During blocking operations early in the child borrow the kernel vmspace
(as suggested by yamt).
- to reduce cpu consumption for a long queue, maintain the sorted lists of
  buffers with rbtree instead of TAILQ. i vaguely remember the problem
  being pointed out by someone on a public mailing list a while ago.
  sorry, i can't remember who or where.
- add some #ifdef'ed out experimental code.
- bump kernel version for struct buf change.
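A hedged sketch of keeping a buffer queue sorted with rbtree(3) rather than
a TAILQ; the bq_tree/b_rbnode members and the choice of b_rawblkno as the
key are assumptions for illustration:

    static int
    bufq_compare_nodes(void *ctx, const void *n1, const void *n2)
    {
        const struct buf *b1 = n1, *b2 = n2;

        if (b1->b_rawblkno < b2->b_rawblkno)
            return -1;
        return b1->b_rawblkno > b2->b_rawblkno;
    }

    static int
    bufq_compare_key(void *ctx, const void *n, const void *key)
    {
        const struct buf *b = n;
        const daddr_t *blkno = key;

        if (b->b_rawblkno < *blkno)
            return -1;
        return b->b_rawblkno > *blkno;
    }

    static const rb_tree_ops_t bufq_rb_ops = {
        .rbto_compare_nodes = bufq_compare_nodes,
        .rbto_compare_key = bufq_compare_key,
        .rbto_node_offset = offsetof(struct buf, b_rbnode), /* assumed member */
        .rbto_context = NULL,
    };

    rb_tree_init(&bq->bq_tree, &bufq_rb_ops);   /* instead of TAILQ_INIT() */
    rb_tree_insert_node(&bq->bq_tree, bp);      /* kept sorted by b_rawblkno */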