Move the function+line printing into GRE_DPRINTF().
Retire gre_closef(). Retire gre_join(). Constify gre_reconf(),
and don't pass it an LWP any longer.
Make this work in the new file descriptor regime. Add a kernel
thread per gre(4) instance whose purpose is to install the socket
into proc0's file descriptor table. Add gre_fp_send() and
gre_fp_recv() for passing file_t pointers to proc0.
Fix locking: don't solock() in the socket upcall, where it is
already held. Do solock() before calling soconnect().
Simplify reconfiguration.
Update a comment that mentions finding a less specific route, since
we don't do that any more.
- Socket layer becomes MP safe.
- Unix protocols become MP safe.
- Allows protocol processing interrupts to safely block on locks.
- Fixes a number of race conditions.
With much feedback from matt@ and plunky@.
(rev 1.125): correct the check for fd_getsock() failure in
gre_socreate().
The second bug is more complicated to fix. Since rev 1.125,
gre_reconf() is using the file descriptor table of the current
process instead of the process 0's (the kernel's).
the opportunity to handle an ioctl before generic ifioctl handling
occurs. This will ease extending the kernel and sharing of code
between drivers.
First steps: Make the signature of ifioctl_common() match struct
ifinet->if_ioctl. Convert SIOCSIFCAP and SIOCSIFMTU to the new
ifioctl() regime, throughout the kernel.
instead of adding/subtracting our own IPv4 header.
There are many benefits: gre(4) needn't grok the outer encapsulation
header any longer, so this simplifies the gre(4) code. The IP
stack needn't grok GRE, so it is simplified, too. gre(4) will
benefit from optimizations in the socket code. Eventually, gre(4)
will gain an IPv6 encapsulation with very few new lines of code.
There is a small performance loss. A 133 MHz, 486-class AMD Elan
sinks/sources a TCP stream over GRE with about 93% the throughput
of the old code. TCP throughput on a 266 MHz, 586-class AMD Geode
is about 96% the throughput of the old code. A 175-MHz ADM5120
(MIPS) only sinks a TCP stream over GRE at about 90% of the old
code; I am still investigating that.
I produced stripped-down versions of sosend() and soreceive() for
gre(4) to use. They are guaranteed not to block, so they can be
called from a software interrupt and from a socket upcall,
respectively.
A kernel thread is no longer necessary for socket transmit/receive,
but I didn't get around to removing it, yet.
Thanks to Matt Thomas for suggesting the use of stripped-down socket
code and software interrupts, and to Andrew Doran for advice and
answers concerning software interrupts, threads, and performance.
acquisition and final release out into gre_thread(). This will
fix a locking bug that LOCKDEBUG exposed: holding a spinlock over
an sosend() call is a no-no.
Cosmetic: join some lines, remove some unnecessary curly braces.
* Create the kernel thread in gre_clone_create() instead of trying
to create it in gre_ioctl(). (Thanks ad@ for suggesting it, and
pointing out that I can't kthread_create while I hold a spin
lock.) Run the thread always, but put it to sleep while the
gre(4) is not in UDP mode.
* Use sockaddr_in_init().
* Move some thread state off of the stack and into the softc.
* Extract subroutines gre_do_recv(), gre_do_send(), and gre_reconf()
from gre_thread1(), making the code more readable.
compatibility with the older ioctls. This avoids stack smashing and
abuse of "struct sockaddr" when ioctls placed "struct sockaddr_foo's" that
were longer than "struct sockaddr".
XXX: Some of the emulations might be broken; I tried to add code for
them but I did not test them.
Fix a defect in the locking of file descriptors as we delegate a
UDP socket from userland to the kernel. Move sc_fp out of sc_soparm.
Synchronize access to sc_fp by gre_ioctl() and the kernel thread
using a condition variable. For simplicity's sake, make it the
kernel helper thread's responsibility to close its UDP socket.
route_in6, struct route_iso), replacing all caches with a struct
route.
The principle benefit of this change is that all of the protocol
families can benefit from route cache-invalidation, which is
necessary for correct routing. Route-cache invalidation fixes an
ancient PR, kern/3508, at long last; it fixes various other PRs,
also.
Discussions with and ideas from Joerg Sonnenberger influenced this
work tremendously. Of course, all design oversights and bugs are
mine.
DETAILS
1 I added to each address family a pool of sockaddrs. I have
introduced routines for allocating, copying, and duplicating,
and freeing sockaddrs:
struct sockaddr *sockaddr_alloc(sa_family_t af, int flags);
struct sockaddr *sockaddr_copy(struct sockaddr *dst,
const struct sockaddr *src);
struct sockaddr *sockaddr_dup(const struct sockaddr *src, int flags);
void sockaddr_free(struct sockaddr *sa);
sockaddr_alloc() returns either a sockaddr from the pool belonging
to the specified family, or NULL if the pool is exhausted. The
returned sockaddr has the right size for that family; sa_family
and sa_len fields are initialized to the family and sockaddr
length---e.g., sa_family = AF_INET and sa_len = sizeof(struct
sockaddr_in). sockaddr_free() puts the given sockaddr back into
its family's pool.
sockaddr_dup() and sockaddr_copy() work analogously to strdup()
and strcpy(), respectively. sockaddr_copy() KASSERTs that the
family of the destination and source sockaddrs are alike.
The 'flags' argumet for sockaddr_alloc() and sockaddr_dup() is
passed directly to pool_get(9).
2 I added routines for initializing sockaddrs in each address
family, sockaddr_in_init(), sockaddr_in6_init(), sockaddr_iso_init(),
etc. They are fairly self-explanatory.
3 structs route_in6 and route_iso are no more. All protocol families
use struct route. I have changed the route cache, 'struct route',
so that it does not contain storage space for a sockaddr. Instead,
struct route points to a sockaddr coming from the pool the sockaddr
belongs to. I added a new method to struct route, rtcache_setdst(),
for setting the cache destination:
int rtcache_setdst(struct route *, const struct sockaddr *);
rtcache_setdst() returns 0 on success, or ENOMEM if no memory is
available to create the sockaddr storage.
It is now possible for rtcache_getdst() to return NULL if, say,
rtcache_setdst() failed. I check the return value for NULL
everywhere in the kernel.
4 Each routing domain (struct domain) has a list of live route
caches, dom_rtcache. rtflushall(sa_family_t af) looks up the
domain indicated by 'af', walks the domain's list of route caches
and invalidates each one.
because if_detach() may cause us to transmit a packet, which
ordinarily entails reloading the route cache. This fixes a bug
where the kernel would panic later in rtflush(). Thanks Michael
Earnhart for reporting the bug.
In gre_output(), do not leak mbufs.
increase ifi_noproto. If the GRE header contains routing options,
increase the input-error count, ifi_ierrors.
While I am here, make some cosmetic changes: remove unnecessary
'proto' argument from gre_input3(). Shorten some staircases.