would have been much easier if up to and including 5.0 we wouldn't
silently cap the nfds argument to poll(!!!).
Makes things like socket(1) work out-of-the-box, and pretty much
every other decidedly prehistoric select() user.
(netcat is a slight exception since it sets FD_SETSIZE, a.k.a.
interface-of-the-year, to 16)
to convince non-rumped applications to communicate with a rump
kernel instead of the host kernel. The precision of what goes
where is not exactly surgical, but for example when wanting to
debug a web server's TCP/IP stack interaction, it might be enough.
When all you have is a hand grenade, all problems look like a ....
hmm?
There's still plenty to figure out. For example, I'm not sure what
the user interface will be like. Now it just attempts to hijack
network communication. It also needs to sync with symbol renaming
in libc, and maybe autogenerate the non-schizophrenic wrappers
where the communication is heading to exactly one destination, lest
I'll be a mummmy by the time I finish writing them all. As a fun
example of a non-non-schizophrenic one, consider poll().
Work in progress, but I managed to get two non-rumped netcats
talking to each other or fetching the index from a non-rumped
thttpd. telnet works in one direction (i can read the data from
netcat, but anything i send back is not printed). bozohttpd uses
dup2() which i haven't bothered to address yet, etcetc.
(not hooking this up the build for now)
dlsym(RTLD_NEXT) to lookup a host_syscall() function pointer which
is used instead of syscall() to communicate with the kernel server.
WARNING: popular opinion classifies this as "ugly code". if you
have a weak heart/mind/soul/sole meuniere, read max. 1 line of the
diff per day, preferably with food.
process receives SIGINFO. Additionally, dump vnode status if the
process gets SIGUSR1 (can be quite quite verbose, therefore not
displayed with SIGINFO).
time, setting EOVERFLOW at the inmost level will unfortunately persist,
even if later calls to those functions succeed. Move the EOVERFLOW setting
to the top level calls.
violent disconnect. Fixes some race conditions (maybe the one
occasionally showing up on tests/rump/rumpkern/t_stress).
thanks to schmonz for some discussion
However, because of a protocol deficiency puffs relies on being able
to keep track of VOP_LOOKUP calls by inspecting their contents, and
this at least allows it to use something vaguely principled instead of
making wild guesses based on whether SAVESTART is set.
Update libp2k to use INRELOOKUP instead of SAVESTART.
objects, and the RTLD_NODELETE and RTLD_NOLOAD flags to dlopen(3).
Mark libpthread as DF_1_NOOPEN and use it to test the functionality.
Somewhat taken from FreeBSD.
Fixes PR 42029.
OK from christos and joerg.
the situation, I decided to commit it. There is an inherent problem
with ASLR and the way the pthread library is using the thread stack.
Our pthread library chooses that stack for each thread strategically
so that it can locate the location of the pthread struct for each
thread by masking the stack pointer and looking just below the red
zone it creates. Unfortunately with ASLR you get many random values
for the initial stack, and there are situations where the masked
stack base ends up below the base of the stack. (this happens on
x86 when the stack base happens to be 0x???02000 for example and
your mask is stackmask is 0xffe00000). To fix this, we detect the
pathological cases (this happens only in the main thread), allocate
more stack, and mprotect it appropriately. Then we stash the main
base and the main struct, so that when we look for the pthread
struct in pthread__id, we can special case the main thread.
Another way to work around the problem is unlimiting stacksize,
but the proper way is to use TLS to find the thread structure and
not to play games with the thread stacks.
don't require locking and can operate on user-specified timezones
as opposed to having to alter the environment to change a timezone.
This work was presented to the tzcode folks and it was generally
accepted, but there seems to be a lot of inertia.
It's pretty much a placeholder for now. One plan for the future
is to require some sort of authentication for superuser clients.
The code will need a little massage then, though, to prevent DoS
attacks.
Fix multiple issues:
- Remove bogus 2038 check and add overflow checks in the appropriate places.
- Correct incomplete leap year calculation that broke things after 2100.
- Check localtime return values
- Change int calculations to time_t to avoid oveflow.
- Consistently check/return -1 and remove bogus comment about not being
able to return -1.
Now:
$ date -d 20991201
Tue Dec 1 00:00:00 EST 2099
$ date -d 40991201
Tue Dec 1 00:00:00 EST 4099
$ date -d 10000000991201
Tue Dec 1 00:00:00 EST 1000000099
TIME=0:04.48 CPU=117.8% (5.288u 0.000s) SWAPS=0 (0+95)pf (0i+0o) (0Kc+0Kd)
$ date -d 100000000991201
date: Cannot parse `100000000991201'
TIME=0:53.48 CPU=99.2% (53.086u 0.000s) SWAPS=0 (0+96)pf (0i+0o) (0Kc+0Kd)
Exit 1
llrint() and llrintf(). Code copied from round(), roundf() and
rint() and modified for return values. Its possible this may not
do the right things in edge cases, but if so its likely to have
the same issues as the existing round(), roundf() and rint().
All this used by vax (only), and should allow xnest to complete
build.
exposed function. (But note that this function is neither documented
nor declared in any installed header file, and it probably should not
be externally exposed.) Related to PR 44183, closes PR 44186.
external/lib/Makefile and crypto/external/lib/Makefile, replacing
them all with SUBDIRs directly from lib/Makefile.
compat/compatsubdirs.mk becomes simpler now, as everything is built
from lib/Makefile, meaning all the libraries will now be built under
compat so update the set lists to account for that.
skip entries, print (null), run off the end of the array, or occasionally
receive SIGSEGV, and now will, hopefully at least, do none of that.
Based in part on the patch in PR 44183 from Sergio Acereda; I also
did some tidyup and fixed it to print top-to-bottom first like ls(1).
is done in rumpuser for simplicity, since on the kernel side things
we assume we have only one pointer of space). As a side-effect,
we can no longer know if the current thread is holding on to a
mutex locked without curlwp context (basically all mutexes inited
outside of mutex_init()). The only thing that called rumpuser_mutex_held()
for a non-kmutex was the giant lock. So, instead implement recursive
locking for the giant lock in the rump kernel and get rid of the
now-unused recursive pthread mutex in the hypercall interface.
Also, add rump_daemonize_begin() / rump_daemonize_end() to help
with the "can't daemon() after pthread_create()" problem. Applications
could accomplish the same, but since it's such a common operation,
provide a little help.
limits. This improves syscall throughput about 2x for non-userio
syscalls (no copyin/out, e.g. getpid()) and almost 1.5x even for
things like __sysctl().
(measured for cases where the remote process is on the local machine)
XXX: if the pthread deadqueue sucks for anything which cares about
performance, why does it exist? Nuking it would make supporting
variable stack size easier.
==> add support for remote vmspace vmapbuf/vunmapbuf
==> add proper support for copyin/out_vmspace
==> add support for remote vmspace uvm_io
==> add support for non-curproc rumpuser_sp_copyin/out
==> store remote context in vm_map->pmap instead of
pthread_specificdata
In short, makes read/write of most (all?) block devices work from
a remote rump client via rump syscalls.
basics are there, but a few more tweaks are needed. The reason
I'm committing it now is that the code was mindnumbingly boring to
write (no wonder it took me almost 3 years to get it done), and I
might burn it if it's not in a safe place.
on "current-users" mailing list. Garbage collection is performed if:
1.) We previously allocated memory for the environment array which
is no longer used because the application overwrote "environ".
2.) We find a non-NULL pointer in the allocated environment array after
the end of the environment. This happens if the applications attempts
to clear the environment with something like "environ[0] = NULL;".
2.) Add a wrapper function __findenv() which implements the previous
*internal* interface. It turns out that ld.elf_so(1) and pthread(3)
both use it.
Stripping e.g. "LD_LIBRARY_PATH" from the environment while running
setuid binaries works again now.
- Use RB tree to keep track of memory allocated via setenv(3) as
suggested by Enami Tsugutomo in private e-mail.
This simplifies the code a lot as we no longer need to keep the size
of "environ" in sync with an array of allocated environment variables.
It also makes it possible to free environment variables in unsetenv(3)
if something has changed the order of the "environ" array.
- Fix a bug in getenv(3) and getenv_r(3) which would return bogus
results e.g. for " getenv("A=B") " if an environment variable "A"
with value "B=C" exists.
- Clean up the internal functions:
- Don't expose the read/write lock for the environment to other parts
of "libc". Provide locking functions instead.
- Use "bool" to report success or failure.
- Use "ssize_t" or "size_t" instead of "int" for indexes.
- Provide internal functions with simpler interfaces e.g. don't
combine return values and reference arguments.
- Don't copy "environ" into an allocated block unless we really need
to grow it.
Code reviewed by Joerg Sonnenberger and Christos Zoulas, tested by
Joerg Sonnenberger and me. These changes also fix problems in
zsh 4.3.* and pam_ssh according to Joerg.
This incarnation is written in the user namespace as opposed to
the previous one which was done in kernel namespace. Also, rump
does all the handshaking now instead of excepting an application
to come up with the user namespace socket.
There's still a lot to do, including making code "a bit" more
robust, actually running different clients in a different process
inside the kernel and splitting the client side library from librump.
I'm committing this now so that I don't lose it, plus it generally
works as long as you don't use it in unexcepted ways: i've tested
ifconfig(8), route(8), envstat(8) and sysctl(8).
now fails with EINVAL errno when variable is NULL, empty or contains
an `=' character; or value is NULL.
Adjust the man page accordingly, and exercize them in the existing
environment testcase.
character, and PUFFS do not want it. This fixes this bug, that returned
stat the informations for x instead of reporting ENOENT:
mkdir x && ln x z && stat -x z/whatever/you/want
- add kvm_i386pae.c (used for PAE memory translations), and update Makefile
for libkvm build.
- in pdppaddr: pass a flag to indicate PAE mode. Use a bit ignored
by the MMU. Mask address with PG_FRAME to avoid side effects.
Tested with vmstat(1)/netstat(1) to debug core files of PAE and !PAE
kernels. Older kernel dumps will default to native i386 (!PAE) mode.
XXX Currently, savecore(8) will fail to dump a PAE kernel in a !PAE
environment (and reciprocally). So you need to sync and reboot
with a kernel of the same mode as the one that crashed. Once the dump
is successful, this does not matter anymore.
- Keep track of file name to avoid lookups when we can. This makes sure we
do not have two cookies for the same inode, a situation that cause wreak
havoc when we come to remove or rename a node.
- Do not use PUFFS_FLAG_BUILDPATH at all, since we now track file names
- In open, queue requests after checking for access, as there is no merit
to queue a will-be-denied request while we can deny it immediatly
- request reclaim of removed nodes at inactive stage
on constant strings (e.g. postdrop(1)):
- Don't write to the environment string passed to putenv(3).
- Don't overwrite the value of an existing environment string
unless the memory was actually allocated by setenv(3).
variables: only free memory if the current value points to the same
memory area as the allocated block. This will prevent crashes if an
application changes the order of the environment array.
Unfortunately this is still not enough to stop zsh 4.2.* from crashing.
zsh 4.3.* works fine before and after this change.
- Restore open on our own in fsycn and readdir, as the node may not already
be open, and FUSE really wants it to be. No need to close immediatly, it
can be done at inactive time.
= Write operations =
- fix a nasty bug that corrupted files on write (written added twice)
- Keep track of file size in order to honour PUFFS_IO_APPEND
= many fixes in rename =
- handler overwritten nodes correctly
- wait for all operations on the node to drain before doing rename, as
filesystems may not cope with operations on a moving file.
- setback PUFFS_SETBACK_INACT_N1 cannot be used from rename, we therefore
miss the inactive time for an overwritten node. This bounds us to give up
PUFFS_KFLAG_IAONDEMAND.
= Removed files =
- forbid most operations on a removed node, return ENOENT
- setback PUFFS_SETBACK_NOREF_N1 at inactive stage to cause removed
file reclaim
= Misc =
- Update outdated ARGSUSED for lint
- Fix a memory leak (puffs_pn_remove instead of puffs_pn_put)
- Do not use PUFFS_FLAG_BUILDPATH except for debug output. It makes the
lookup code much simplier.