name to start up as init (rather than just cycling through initpaths[]
and panicking when out of options). if RB_ASKNAME isn't set, the old
behaviour remains. inspired by changes in der Mouse's patchtree.
resolves [kern/18027] from me.
counters. These counters do not exist on all CPUs, but where they
do exist, they can be used for counting events such as dcache misses that
would otherwise be difficult or impossible to instrument by code
inspection or hardware simulation.
pmc(9) is meant to be a general interface. Initially, the Intel XScale
counters are the only ones supported.
- avoid race conditions by having seqno in ioctl
- better uid/gid tracking
- "replace" policy to replace args
- fewer diffs, as many of our local changes were already fed back to openbsd
due to the 1st item, it was impossible for us to provide backward compatibility
(new kernel + old bin/systrace won't work). upgrade both.
* In pool_prime_page(), assert that the object being placed onto the
free list meets the alignment constraints (that "ioff" within the
object is aligned to "align").
* In pool_init(), round up the object size to the alignment value (or
ALIGN(1), if no special alignment is needed) so that the above invariant
holds true (see the sketch below).
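A minimal sketch of the two changes (KASSERT(), roundup() and ALIGN()
are the usual kernel macros; "cp" is the object address within the page):

    /* pool_init(): round the object size up to the alignment */
    if (align == 0)
            align = ALIGN(1);
    size = roundup(size, align);    /* invariant: size % align == 0 */

    /* pool_prime_page(): each object must meet the constraint */
    KASSERT((((vaddr_t)cp + ioff) & (align - 1)) == 0);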
gets reset properly when the old parent exits before the child. A flag
is set in the old parent process when the child is reparented in ptrace(2).
If it's set when the process is exiting, all running processes have their
'old parent process' pointer checked and reset if appropriate. Also
change to use 'struct proc *' pointer directly, rather than pid_t.
This fixes security/14444 by David Sainty.
Reviewed by Christos Zoulas.
One basic struct, a function to set up a queue with a specific strategy, and
three macros to put bufs into the queue, get and remove the next buf, or
get the next buf without removal.
The BUFQ_XXX interface will be removed in the future.
The B_ORDERED flag is no longer supported.
Approved by: Jason R. Thorpe <thorpej@wasabisystems.com>
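Sketch of the intended usage in a disk driver (identifiers follow the
description above; the exact strategy flag is an assumption):

    struct bufq_state sc_bufq;

    /* set up the queue with a specific strategy */
    bufq_alloc(&sc_bufq, BUFQ_DISKSORT);

    BUFQ_PUT(&sc_bufq, bp);         /* queue a buf */
    bp = BUFQ_PEEK(&sc_bufq);       /* next buf, not removed */
    bp = BUFQ_GET(&sc_bufq);        /* get and remove the next buf */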
* struct sigacts gets a new sigact_sigdesc structure, which has the
sigaction and the trampoline/version. Version 0 means "legacy kernel
provided trampoline". Other versions are coordinated with machine-
dependent code in libc.
* sigaction1() grows two more arguments -- the trampoline pointer and
the trampoline version.
* A new __sigaction_sigtramp() system call is provided to register a
trampoline along with a signal handler (see the sketch below).
* The handler is no longer passed to sendsig() functions. Instead,
sendsig() looks up the handler by peeking in the sigacts for the
process getting the signal (since it has to look in there for the
trampoline anyway).
* Native sendsig() functions now select the appropriate trampoline and
its arguments based on the trampoline version in the sigacts.
Changes to libc to use the new facility will be checked in later. Kernel
version not bumped; we will ride the 1.6C bump made recently.
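A sketch of the intended use from libc's sigaction(); the syscall
prototype and the trampoline symbol are assumptions consistent with
the description above:

    /* register handler and trampoline in one call */
    int __sigaction_sigtramp(int sig, const struct sigaction *nsa,
        struct sigaction *osa, const void *tramp, int vers);

    extern const char __sigtramp_sigcontext_1[];    /* MD, hypothetical */

    int
    sigaction(int sig, const struct sigaction *act, struct sigaction *oact)
    {

            /* version 0 = legacy kernel-provided trampoline */
            return __sigaction_sigtramp(sig, act, oact,
                __sigtramp_sigcontext_1, 1);
    }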
* Keep pointers to the first and last mbufs of the last record in the
socket buffer.
* Use the sb_lastrecord pointer in the sbappend*() family of functions
to avoid traversing the packet chain to find the last record.
* Add a new sbappend_stream() function for stream protocols which
guarantee that there will never be more than one record in the
socket buffer. This function uses the sb_mbtail pointer to perform
the data insertion. Make TCP use sbappend_stream() (sketched below).
On a profiling run, this makes sbappend of a TCP transmission using
a 1M socket buffer go from 50% of the time to 0.02% of the time.
Thanks to Bill Sommerfeld and YAMAMOTO Takashi for their debugging
assistance!
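The core of sbappend_stream(), sketched (byte accounting and the debug
checks are omitted):

    void
    sbappend_stream(struct sockbuf *sb, struct mbuf *m)
    {

            KDASSERT(m->m_nextpkt == NULL);
            KDASSERT(sb->sb_mb == sb->sb_lastrecord);

            if (sb->sb_mbtail != NULL)
                    sb->sb_mbtail->m_next = m;      /* O(1) append */
            else
                    sb->sb_mb = sb->sb_lastrecord = m;

            /* advance sb_mbtail to the new last mbuf in the chain */
            while (m->m_next != NULL)
                    m = m->m_next;
            sb->sb_mbtail = m;
    }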
as necessary:
* Implement a new mbuf utility routine, m_copyup(), which is like
m_pullup(), except that it always prepends and copies, rather
than only doing so if the desired length is larger than m->m_len.
m_copyup() also allows an offset into the destination mbuf, which
allows space for packet headers, in the forwarding case.
* Add *_HDR_ALIGNED_P() macros for IP, IPv6, ICMP, and IGMP. These
macros expand to 1 if __NO_STRICT_ALIGNMENT is defined, so that
architectures which do not have strict alignment constraints don't
pay for the test or visit the new align-if-needed path.
* Use the new macros to check if a header needs to be aligned, or to
assert that it already is, as appropriate (see the sketch below).
Note: This code is still somewhat experimental. However, the new
code path won't be visited if individual device drivers continue
to guarantee that packets are delivered to layer 3 already properly
aligned (which is the rule already in effect).
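The check-then-fix pattern in the IPv4 input path, sketched (the macro
body and the m_copyup() call mirror the description; exact lengths are
illustrative):

    #ifdef __NO_STRICT_ALIGNMENT
    #define IP_HDR_ALIGNED_P(ip)    1
    #else
    #define IP_HDR_ALIGNED_P(ip)    ((((vaddr_t)(ip)) & 3) == 0)
    #endif

            if (!IP_HDR_ALIGNED_P(mtod(m, caddr_t))) {
                    /* prepend+copy into aligned storage, leaving
                     * room for the link-level header */
                    if ((m = m_copyup(m, sizeof(struct ip),
                        (max_linkhdr + 3) & ~3)) == NULL)
                            goto bad;       /* drop the packet */
            }
            ip = mtod(m, struct ip *);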
still in the chroot. If not, teleport the lookup to the chroot
and log. Closes an assisted-jail escape method pointed out by
xs@kittenz.org. Patch from xs@kittenz.org and myself.
It's not even built if the option isn't present.
* Use cdev_decl() to generate prototypes for the devsw functions.
* Minor whitespace cleanup.
* Nuke the SYSTR_CLONE ioctl from orbit; instead, just clone it in
systraceopen(), like we do with svr4_net.
trying to MSG_PEEK for more than the socket can hold. The second is
that, before sleeping to wait for more data, we upcall the protocol to
tell it that data has just been received, so it can kick itself to
refill the just-drained socket buffer.
and an error occurs, make sure the socket doesn't retain a partial
copy by dropping the rest of the record.
This would otherwise trigger a panic("receive 1a") under DIAGNOSTIC.
Fixes PR#16990, suggested fix adapted.
Reviewed by Matt Thomas.
- implement SIMPLEQ_REMOVE(head, elm, type, field). whilst it's O(n),
this mirrors the functionality of SLIST_REMOVE() (the other
singly-linked list type) and FreeBSD's STAILQ_REMOVE(); example below
- remove the unnecessary elm arg from SIMPLEQ_REMOVE_HEAD().
this mirrors the functionality of SLIST_REMOVE_HEAD() (the other
singly-linked list type) and FreeBSD's STAILQ_REMOVE_HEAD()
- remove notes about SIMPLEQ not supporting arbitrary element removal
- use SIMPLEQ_FOREACH() instead of home-grown for loops
- use SIMPLEQ_EMPTY() appropriately
- use SIMPLEQ_*() instead of accessing sqh_first,sqh_last,sqe_next directly
- reorder manual page; be consistent about how the types are listed
- other minor cleanups
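The changed macros in use (example only; "struct foo" is hypothetical):

    struct foo {
            int val;
            SIMPLEQ_ENTRY(foo) f_next;
    };
    SIMPLEQ_HEAD(, foo) foolist = SIMPLEQ_HEAD_INITIALIZER(foolist);

    struct foo *f;

    SIMPLEQ_FOREACH(f, &foolist, f_next)    /* no home-grown loop */
            printf("%d\n", f->val);

    /* O(n) removal of an arbitrary element */
    f = SIMPLEQ_FIRST(&foolist);
    if (f != NULL)
            SIMPLEQ_REMOVE(&foolist, f, foo, f_next);

    if (!SIMPLEQ_EMPTY(&foolist))
            SIMPLEQ_REMOVE_HEAD(&foolist, f_next);  /* no elm arg */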
enough to be useful, and broadening it so that it did would have meant
that operations possibly requiring synchronous disk activity would have
to be done in splbio(). This clearly was not going to work.
Worked around this in the LFS case by having lfs_cluster_callback put an
extra hold on the vnode before calling biodone(), and taking the hold
off without HOLDRELE's problematic list swapping. lfs_vunref() will take
care of that---in thread context---on the next write if need be.
Also, ensure that the list walking in lfs_{writevnodes,segunlock,gather}
takes into account the possibility that the list may change
underneath it (possibly because it itself deleted an element).
Tested on i386, test-compiled on alpha.
by default, and can be enabled by adding the SOSEND_LOAN option to your
kernel config. The SOSEND_COUNTERS option can be used to provide some
instrumentation.
Use of this option, combined with an application that does large enough
writes, gets us zero-copy on the TCP and UDP transmit path.
closed, open those fds to /dev/null.
XXX: This needs to be fixed in a better way. The kernel should not need to
know about /dev/null or special case 0, 1, 2.
This generates more useful information for a process that catches SIGINFO,
rather than always printing "runnable" (the process is marked runnable
because of the signal).
Inspired by the behavior of BSD/OS.
indicating an unhandled "command". ERESTART is -1, which can lead to
confusion. ERESTART has been moved to -3 and EPASSTHROUGH has been
placed at -4. No ioctl code should now return -1 anywhere. The
ioctl() system call is now properly restartable.
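Sketched effect in a driver (the command name is hypothetical):

    /* sys/errno.h, kernel-only values */
    #define ERESTART        -3      /* restart syscall */
    #define EPASSTHROUGH    -4      /* ioctl not handled here */

    int
    foo_ioctl(dev_t dev, u_long cmd, caddr_t data, int flag, struct proc *p)
    {

            switch (cmd) {
            case FOOIOCBAR:                 /* hypothetical */
                    /* ... */
                    return 0;
            default:
                    /* not ours: pass to a generic layer, never -1 */
                    return EPASSTHROUGH;
            }
    }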
on the <bsd-api-discuss@wasabisystems.com> mailing list. PT_IO
is a more general inferior I/D space I/O mechanism. FreeBSD and
OpenBSD have also added PT_IO.
From lha@stacken.kth.se, kern/15945.
back in rev. 1.51, bread() and breadn() were changed to assume that
if B_DONE is set on a buffer returned by bio_doread(), that the buffer
must have already been in the cache, and thus the overall bread() should
return success. but if the requested buffer is not in the cache and
is past the end of the device, bounds_check_with_label() will set B_ERROR
on the buffer and the caller will call biodone(), which will cause bread()
to think the buffer was already in the cache and thus return success.
to fix this, undo rev. 1.51 and instead have biowait() treat both B_DONE
and B_DELWRI as indicators that it doesn't need to sleep waiting for an
i/o to complete.
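The biowait() test now reads, roughly (a sketch, not the literal diff):

    int
    biowait(struct buf *bp)
    {
            int s;

            s = splbio();
            /* B_DELWRI also means no i/o is outstanding */
            while ((bp->b_flags & (B_DONE | B_DELWRI)) == 0)
                    tsleep(bp, PRIBIO + 1, "biowait", 0);
            splx(s);

            if (bp->b_flags & B_ERROR)
                    return (bp->b_error != 0 ? bp->b_error : EIO);
            return (0);
    }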
Changes:
* MP locking changes (mostly FreeBSD specific)
XXXSMP the MP locking macros are noops on NetBSD for now
* kevent fix (FreeBSD rev. 1.87): when the last reader/writer
disconnects, ensure that anybody who is waiting for the kevent
on the other end of the pipe gets EV_EOF
* kill __P
first. This is necessary to avoid warnings with -fshort-enums. Casting
to an int really should be enough, but turns out not to be.
This change will be documented in doc/HACKS.
m_reclaim() to match the drain hook signature. This allows us to
delete m_retry() and m_retryhdr(), as the pool allocator will now
perform the reclamation step for us.
From art@openbsd.org.
and the latter, while there was some code that tested the bit, was woefully
incomplete and also unused by anything. Besides, PR_STATIC functionality
could be better handled by backend allocators anyhow.
From art@openbsd.org.
pool_set_drain_hook(). This hook is called in three cases:
* When a pool has hit the hard limit, just before either erroring
out or sleeping.
* When a backend allocator fails to allocate memory.
* Just before trying to reclaim pages in pool_reclaim().
This hook requests the client to try and free some items back to
the pool.
From art@openbsd.org.
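Sketch of a client registering a hook for the mbuf pool (the hook type
matches the m_reclaim() signature change above):

    /* called with the pool_get() flags, e.g. PR_WAITOK */
    void
    m_reclaim(void *arg, int flags)
    {
            /* release cached mbufs/clusters back to the pool */
    }

    pool_set_drain_hook(&mbpool, m_reclaim, NULL);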
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:
* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields (sketched below).
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot allocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.
From art@openbsd.org.
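Sketch of the grouped structure and the adjusted pool_init() call
(field names are assumptions matching the description):

    struct pool_allocator {
            void    *(*pa_alloc)(struct pool *, int);
            void    (*pa_free)(struct pool *, void *);
            unsigned int pa_pagesz;
            /* all pools sharing this allocator, for draining */
            TAILQ_HEAD(, pool) pa_list;
    };

    /* callers now pass the allocator, not alloc/free functions */
    pool_init(&foopool, sizeof(struct foo), 0, 0, 0, "foopl",
        &pool_allocator_nointr);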
header to distinguish between o32, n32 and n64 ABIs. We now use this.
This removes the need for the mips_option test, which had some false
positives. It also removes the mandatory ordering of n32 vs. o32 in
the exec switch (exec_conf.c).
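Sketch of the e_flags test (assuming the standard EF_MIPS_ABI2 flag;
n64 binaries are ELF64 and are picked up by the 64-bit entries):

    /* in the n32 probe: */
    if ((eh->e_flags & EF_MIPS_ABI2) == 0)
            return ENOEXEC;         /* o32, not n32 */

    /* in the o32 probe: */
    if ((eh->e_flags & EF_MIPS_ABI2) != 0)
            return ENOEXEC;         /* n32, not o32 */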
since bounds_check_with_label() will truncate a buffer that crosses
the end of the partition. adjust the assertion to account for this.
fixes PRs 7938, 12156, 12698, 13076, 13210 and 13288.
rounded up to respect boundary limits, adjust newstart and last before
skipping to the next region. Otherwise we may check the same candidate
region against the start of the next region, not the one immediately
following the hole, leading to a corrupted map.
This fixes the panic seen on sparc64 with scsi drivers, and probably fixes
PR 15489.
easy, convenient dropping into DDB at the "root device: " prompt.
Useful if your console can't do it w/o actually taking an interrupt
and you want to, say, look at the boot messages.
as an added measure to make sure that we can execute a binary.
These default to (1) if elf_machdep.h does not override them.
On Sun2, ELF32_EHDR_FLAGS_OK() checks for the presence of EF_M68000,
since the 68010 cannot run binaries for the 68020-and-up.
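For example (sketch), sun2's elf_machdep.h could override the default:

    /* default when elf_machdep.h does not define them */
    #ifndef ELF32_EHDR_FLAGS_OK
    #define ELF32_EHDR_FLAGS_OK(eh) 1
    #endif

    /* sun2: require EF_M68000, since the 68010 cannot run
     * binaries built for the 68020 and up */
    #define ELF32_EHDR_FLAGS_OK(eh) \
            (((eh)->e_flags & EF_M68000) != 0)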
was successfully changed. previously, successfully viewing the
current value would flush the cache :-/
- similarly, don't change hostid and sb_max unless the value was
successfully changed
ltsleep() calls CURSIG(), which can call issignal(), and issignal()
could not deal with being called from a locked context. This happens
when a process receives SIGTTIN, and issignal() calls psignal() to
post SIGCHLD to the parent.
XXX: It is really messy to have issignal() handle the job control
functionality and the whole signal interlocking protocol needs to
be re-designed. For now this fix (provided by enami) does the trick.
I've been running with this fix for weeks, and atatat has stress-tested
the kernel running ~30 make kernels...
expecting pmap_kenter_pa() to be used to replace an existing mapping,
plus it just seems like a bad idea to keep around mappings of pages
that may be freed and reused.
uint32_t namei_hash(const char *p, const char **ep)
which determines the equivalent MI hash32_str() hash for p.
If *ep != NULL, calculate the hash to the character before ep.
If *ep == NULL, calculate the hash to the first / or NUL found, and
point *ep to that location.
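A sketch matching that description, assuming hash32_str() is the
classic "times 33" string hash seeded with HASH32_STR_INIT:

    uint32_t
    namei_hash(const char *p, const char **ep)
    {
            uint32_t hash = HASH32_STR_INIT;

            if (*ep != NULL) {
                    for (; p < *ep; p++)
                            hash = hash * 33 + *(const uint8_t *)p;
            } else {
                    for (; *p != '/' && *p != '\0'; p++)
                            hash = hash * 33 + *(const uint8_t *)p;
                    *ep = p;        /* point at the '/' or NUL */
            }
            return hash;    /* the value hash32_str() would give */
    }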
- Use namei_hash() to calculate cn_hash in lookup() and relookup().
Hash distribution goes from 35-40% to 55-70%, with similar profiled
time spent in cache_lookup() and cache_enter() on my P3-600.
- Use namei_hash() to calculate cn_hash in nfs_readdirplusrpc(),
instead of homegrown code (which differed from that in lookup()!)
namei_hash() has better spread and is faster than the previous code
(which used a non-constant multiplication).