pr_{accept,sockname,peername} nam parameter type from mbuf * to sockaddr *.
* retained use of mbuftypes[MT_SONAME] for now.
* bump to netbsd version 7.99.12 for parameter type change.
patch posted to tech-net@ 2015/04/19
* update protocol bind implementations to use/expect sockaddr *
instead of mbuf *
* introduce sockaddr_big struct for storage of addr data passed via
sys_bind; sockaddr_big is of sufficient size and alignment to
accommodate all addr data sizes received.
* modify sys_bind to allocate sockaddr_big instead of using an mbuf.
* bump kernel version to 7.99.9 for change to pr_bind() parameter type.
Patch posted to tech-net@
http://mail-index.netbsd.org/tech-net/2015/03/15/msg005004.html
The choice to use a new structure sockaddr_big has been retained since
changing sockaddr_storage size would lead to unnecessary ABI change. The
use of the new structure does not preclude future work that increases
the size of sockaddr_storage and at that time sockaddr_big may be
trivially replaced.
Tested by mrg@ and myself, discussed with rmind@, posted to tech-net@
(v4 multicast options off v4 mapped v6 socket) on interface destruction. The
code to clean this up in a true v4 socket was moved to its own function
which is now also called in the corresponding place for v6 sockets on
interface destruction.
- allow binding to mapped v4 multicast addresses
- define v4moptions, allow setting it via ioctl, pass it to ip_output,
free it when killing the pcb.
Ideally we would allow the IPV6 multicast setsockopts work on mapped addresses
too, but this is a lot more work and linux does not do it either.
switches and put into separate functions
xxx_bind(struct socket *, struct mbuf *)
xxx_listen(struct socket *)
- always KASSERT(solocked(so)) even if not implemented
- replace calls to pr_generic() with req = PRU_BIND with call to
pr_bind()
- replace calls to pr_generic() with req = PRU_LISTEN with call to
pr_listen()
- drop struct lwp * parameter from at_pcbsetaddr(), in_pcbbind() and
unp_bind() and always use curlwp.
rename existing functions that operate on PCB for consistency (and to
free up their names for xxx_{bind,listen}() PRUs
- l2cap_{bind,listen}() -> l2cap_{bind,listen}_pcb()
- sco_{bind,listen}() -> sco_{bind,listen}_pcb()
- rfcomm_{bind,listen}() -> rfcomm_{bind,listen}_pcb()
patch reviewed by rmind
welcome to netbsd 6.99.48
KAME_IPSEC, and make IPSEC define it so that existing kernel
config files work as before
Now the default can be easily be changed to FAST_IPSEC just by
setting the IPSEC alias to FAST_IPSEC.
methods called Vestigial Time-Wait (VTW) and Maximum Segment Lifetime
Truncation (MSLT).
MSLT and VTW were contributed by Coyote Point Systems, Inc.
Even after a TCP session enters the TIME_WAIT state, its corresponding
socket and protocol control blocks (PCBs) stick around until the TCP
Maximum Segment Lifetime (MSL) expires. On a host whose workload
necessarily creates and closes down many TCP sockets, the sockets & PCBs
for TCP sessions in TIME_WAIT state amount to many megabytes of dead
weight in RAM.
Maximum Segment Lifetimes Truncation (MSLT) assigns each TCP session to
a class based on the nearness of the peer. Corresponding to each class
is an MSL, and a session uses the MSL of its class. The classes are
loopback (local host equals remote host), local (local host and remote
host are on the same link/subnet), and remote (local host and remote
host communicate via one or more gateways). Classes corresponding to
nearer peers have lower MSLs by default: 2 seconds for loopback, 10
seconds for local, 60 seconds for remote. Loopback and local sessions
expire more quickly when MSLT is used.
Vestigial Time-Wait (VTW) replaces a TIME_WAIT session's PCB/socket
dead weight with a compact representation of the session, called a
"vestigial PCB". VTW data structures are designed to be very fast and
memory-efficient: for fast insertion and lookup of vestigial PCBs,
the PCBs are stored in a hash table that is designed to minimize the
number of cacheline visits per lookup/insertion. The memory both
for vestigial PCBs and for elements of the PCB hashtable come from
fixed-size pools, and linked data structures exploit this to conserve
memory by representing references with a narrow index/offset from the
start of a pool instead of a pointer. When space for new vestigial PCBs
runs out, VTW makes room by discarding old vestigial PCBs, oldest first.
VTW cooperates with MSLT.
It may help to think of VTW as a "FIN cache" by analogy to the SYN
cache.
A 2.8-GHz Pentium 4 running a test workload that creates TIME_WAIT
sessions as fast as it can is approximately 17% idle when VTW is active
versus 0% idle when VTW is inactive. It has 103 megabytes more free RAM
when VTW is active (approximately 64k vestigial PCBs are created) than
when it is inactive.
- Properly authorize port binding in in_pcbsetport() and in6_pcbsetport()
- Pass struct sockaddr_in6 to in6_pcbsetport() instead of just the address,
so that we have a more complete context
- Adjust udp6_output() to craft a sockaddr_in6 as it calls in6_pcbsetport()
- Fix an issue in in_pcbbind() where we used the "dom_sa_any" pointer and
not a copy of it, pointed out by bouyer@, thanks!
Mailing list reference:
http://mail-index.netbsd.org/tech-net/2009/04/29/msg001259.html
in6_pcbbind_port(), used for binding to an address and a port respectively.
While here, fix a possible "leak" of an in6pcb when binding to an address
succeeded but binding to an auto-assigned port failed.
Proposed and received no objections on tech-net@:
http://mail-index.netbsd.org/tech-net/2009/04/15/msg001223.html
- Socket layer becomes MP safe.
- Unix protocols become MP safe.
- Allows protocol processing interrupts to safely block on locks.
- Fixes a number of race conditions.
With much feedback from matt@ and plunky@.
to _ro_rt. Use rtcache_getrt() to access a route cache's struct
rtentry *.
Introduce struct ifnet->if_dl that always points at the interface
identifier/link-layer address. Make code that treated the first
ifaddr on struct ifnet->if_addrlist as the interface address use
if_dl, instead.
Remove stale debugging code from net/route.c. Move the rtflush()
code into rtcache_clear() and delete rtflush(). Delete rtalloc(),
because nothing uses it any more.
Make ND6_HINT an inline, lowercase subroutine, nd6_hint.
I've done my best to convert IP Filter, the ISO stack, and the
AppleTalk stack to rtcache_getrt(). They compile, but I have not
tested them. I have given the changes to PF, GRE, IPv4 and IPv6
stacks a lot of exercise.
don't assume that the cached route is a sockaddr_in6, and do the
right comparisions so that no out-of-bounds memory is accessed.
btw, the use of "#ifdef INET" throughout the source doesn't look clean
to me: There are 2 cases -- whether AF_INET is usable by userland
programs, and whether IPv4 is supported as on-wire protocol.