NetBSD/sys/netinet
dyoung c2e43be1c5 Reduces the resources demanded by TCP sessions in TIME_WAIT-state using
methods called Vestigial Time-Wait (VTW) and Maximum Segment Lifetime
Truncation (MSLT).

MSLT and VTW were contributed by Coyote Point Systems, Inc.

Even after a TCP session enters the TIME_WAIT state, its corresponding
socket and protocol control blocks (PCBs) stick around until the TCP
Maximum Segment Lifetime (MSL) expires.  On a host whose workload
necessarily creates and closes down many TCP sockets, the sockets and PCBs
for TCP sessions in TIME_WAIT state amount to many megabytes of dead
weight in RAM.

Maximum Segment Lifetime Truncation (MSLT) assigns each TCP session to
a class based on the nearness of the peer.  Corresponding to each class
is an MSL, and a session uses the MSL of its class.  The classes are
loopback (local host equals remote host), local (local host and remote
host are on the same link/subnet), and remote (local host and remote
host communicate via one or more gateways).  Classes corresponding to
nearer peers have lower MSLs by default: 2 seconds for loopback, 10
seconds for local, 60 seconds for remote.  Loopback and local sessions
expire more quickly when MSLT is used.
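
The classification above can be sketched roughly as follows.  This is a
minimal illustration, not the code in this commit: the names mslt_class,
mslt_classify and mslt_msl are hypothetical, it handles IPv4 only, and it
assumes "same link/subnet" can be decided with a single netmask.  The
default MSL values are the ones quoted above.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names; not the identifiers used in the NetBSD sources. */
enum mslt_class { MSLT_LOOPBACK, MSLT_LOCAL, MSLT_REMOTE };

/* Default MSL per class, in seconds, as stated in the commit message. */
static const unsigned mslt_default_msl[] = {
	[MSLT_LOOPBACK] = 2,
	[MSLT_LOCAL]    = 10,
	[MSLT_REMOTE]   = 60,
};

/*
 * Classify a peer by nearness.  Addresses and netmask are IPv4 values
 * in host byte order.  Assumption: one netmask describes the local link.
 */
static enum mslt_class
mslt_classify(uint32_t laddr, uint32_t faddr, uint32_t netmask)
{
	if (laddr == faddr)
		return MSLT_LOOPBACK;	/* local host equals remote host */
	if ((laddr & netmask) == (faddr & netmask))
		return MSLT_LOCAL;	/* same subnet */
	return MSLT_REMOTE;		/* reached via a gateway */
}

/* Return the MSL, in seconds, that a session to this peer would use. */
static unsigned
mslt_msl(uint32_t laddr, uint32_t faddr, uint32_t netmask)
{
	enum mslt_class c = mslt_classify(laddr, faddr, netmask);

	assert(c <= MSLT_REMOTE);	/* guard the table lookup */
	return mslt_default_msl[c];
}
```

For example, with a local address of 192.168.1.2/24, a peer at
192.168.1.99 falls in the local class, while a peer at 8.8.8.8 falls in
the remote class.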

Vestigial Time-Wait (VTW) replaces a TIME_WAIT session's PCB/socket
dead weight with a compact representation of the session, called a
"vestigial PCB".  VTW data structures are designed to be very fast and
memory-efficient: for fast insertion and lookup of vestigial PCBs,
the PCBs are stored in a hash table that is designed to minimize the
number of cacheline visits per lookup/insertion.  The memory both
for vestigial PCBs and for elements of the PCB hash table comes from
fixed-size pools, and linked data structures exploit this to conserve
memory by representing references with a narrow index/offset from the
start of a pool instead of a pointer.  When space for new vestigial PCBs
runs out, VTW makes room by discarding old vestigial PCBs, oldest first.
VTW cooperates with MSLT.
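
The pool-and-index idea can be illustrated with a small sketch.  It is
not the data structure in tcp_vtw.c: every name here (vpcb, vpool,
vp_insert, vp_lookup) is hypothetical, the sizes are tiny, and it omits
MSL expiry and the cacheline-aware layout.  It shows only two points
from the description above: entries refer to each other by a 16-bit
pool index instead of a pointer, and when the fixed-size pool is full
the oldest entry is discarded first.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define VP_POOL 8		/* pool entries; hypothetical, kept tiny */
#define VP_HASH 4		/* hash buckets; must be a power of two */
#define VP_NIL  UINT16_MAX	/* "no entry" index */

struct vpcb {
	uint32_t faddr, laddr;
	uint16_t fport, lport;
	uint16_t next;		/* pool index of next entry in the chain */
	uint16_t used;
};

struct vpool {
	struct vpcb ent[VP_POOL];
	uint16_t bucket[VP_HASH];	/* chain heads, as pool indices */
	uint16_t head;			/* next slot to use; oldest when full */
};

static void
vp_init(struct vpool *p)
{
	memset(p->ent, 0, sizeof(p->ent));
	for (unsigned i = 0; i < VP_HASH; i++)
		p->bucket[i] = VP_NIL;
	p->head = 0;
}

static unsigned
vp_hash(uint32_t faddr, uint32_t laddr, uint16_t fport, uint16_t lport)
{
	uint32_t h = faddr ^ laddr ^ (((uint32_t)fport << 16) | lport);

	h ^= h >> 16;
	return h & (VP_HASH - 1);
}

/* Remove entry idx from its hash chain and mark the slot free. */
static void
vp_unlink(struct vpool *p, uint16_t idx)
{
	struct vpcb *e = &p->ent[idx];
	uint16_t *link =
	    &p->bucket[vp_hash(e->faddr, e->laddr, e->fport, e->lport)];

	while (*link != VP_NIL && *link != idx)
		link = &p->ent[*link].next;
	assert(*link == idx);
	*link = e->next;
	e->used = 0;
}

/* Insert a session; slots are reused in ring order, so oldest goes first. */
static uint16_t
vp_insert(struct vpool *p, uint32_t faddr, uint32_t laddr,
    uint16_t fport, uint16_t lport)
{
	uint16_t idx = p->head;
	struct vpcb *e = &p->ent[idx];
	uint16_t *b = &p->bucket[vp_hash(faddr, laddr, fport, lport)];

	if (e->used)
		vp_unlink(p, idx);	/* pool full: evict the oldest */
	e->faddr = faddr; e->laddr = laddr;
	e->fport = fport; e->lport = lport;
	e->next = *b;
	e->used = 1;
	*b = idx;
	p->head = (uint16_t)((idx + 1) % VP_POOL);
	return idx;
}

/* Return the pool index of a matching session, or -1 if none. */
static int
vp_lookup(const struct vpool *p, uint32_t faddr, uint32_t laddr,
    uint16_t fport, uint16_t lport)
{
	uint16_t i = p->bucket[vp_hash(faddr, laddr, fport, lport)];

	for (; i != VP_NIL; i = p->ent[i].next) {
		const struct vpcb *e = &p->ent[i];
		if (e->faddr == faddr && e->laddr == laddr &&
		    e->fport == fport && e->lport == lport)
			return i;
	}
	return -1;
}
```

A 16-bit index is a quarter the size of a 64-bit pointer, which is where
the memory saving in the links comes from; the cost is that the pool
cannot grow past the index range.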

It may help to think of VTW as a "FIN cache" by analogy to the SYN
cache.

A 2.8-GHz Pentium 4 running a test workload that creates TIME_WAIT
sessions as fast as it can is approximately 17% idle when VTW is active
versus 0% idle when VTW is inactive.  It has 103 megabytes more free RAM
when VTW is active (approximately 64k vestigial PCBs are created) than
when it is inactive.
2011-05-03 18:28:44 +00:00
..
accept_filter.h
accf_data.c
accf_http.c
cpu_in_cksum.c
files.ipfilter Defopt the rest of the Ipfilter options and tunables. 2010-10-02 20:07:39 +00:00
files.netinet Reduces the resources demanded by TCP sessions in TIME_WAIT-state using 2011-05-03 18:28:44 +00:00
icmp6.h
icmp_private.h
icmp_var.h
if_arp.c arp_drain() may be called with locks held, so instead of doing any work 2011-05-03 16:00:29 +00:00
if_atm.c update license clauses on my code to match the new-style BSD licenses. 2011-02-01 19:40:24 +00:00
if_atm.h update license clauses on my code to match the new-style BSD licenses. 2011-02-01 19:40:24 +00:00
if_ether.h
if_inarp.h
igmp_var.h
igmp.c
igmp.h
in4_cksum.c fix assertions 2011-04-25 22:04:32 +00:00
in_cksum.c
in_gif.c
in_gif.h
in_ifattach.h
in_offload.c ip_undefer_csum: 2011-04-25 22:11:31 +00:00
in_offload.h undefer csum in looutput. 2011-04-25 22:20:59 +00:00
in_pcb_hdr.h Reduces the resources demanded by TCP sessions in TIME_WAIT-state using 2011-05-03 18:28:44 +00:00
in_pcb.c Reduces the resources demanded by TCP sessions in TIME_WAIT-state using 2011-05-03 18:28:44 +00:00
in_pcb.h Reduces the resources demanded by TCP sessions in TIME_WAIT-state using 2011-05-03 18:28:44 +00:00
in_proto.c *_drain() routines may be called with locks held, so instead of doing 2011-05-03 17:44:30 +00:00
in_proto.h
in_selsrc.c
in_selsrc.h
in_systm.h
in_var.h ip_randomid: make mechanism MP-safe and more modular. 2010-11-05 01:35:57 +00:00
in.c Backout rev.1.137. It causes troubles, see PR kern/43294. 2010-05-15 05:02:46 +00:00
in.h
ip6.h
ip_carp.c ahem, min -> max in previous 2010-08-11 11:06:42 +00:00
ip_carp.h
ip_ecn.c
ip_ecn.h
ip_encap.c
ip_encap.h
ip_etherip.c Push the bpf_ops usage back into bpf.h. Push the common ifp->if_bpf 2010-04-05 07:19:28 +00:00
ip_etherip.h
ip_flow.c After discussion with ad@: it appears that KERNEL_LOCK also protects 2010-04-01 00:24:41 +00:00
ip_icmp.c manually adjust m_data and m_len so it can later be prepended with a 2010-07-02 07:02:00 +00:00
ip_icmp.h Add MPLS support, proposed on tech-net@ a couple of days ago 2010-06-26 14:24:27 +00:00
ip_id.c ip_randomid: make mechanism MP-safe and more modular. 2010-11-05 01:35:57 +00:00
ip_input.c *_drain() routines may be called with locks held, so instead of doing 2011-05-03 17:44:30 +00:00
ip_mroute.c
ip_mroute.h
ip_output.c after ip_input.c rev.1.285 and 1.286, restore kernel_lock for if_output. 2011-04-14 15:53:36 +00:00
ip_private.h
ip_reass.c ip_reass_packet: finish abstraction; some clean-up. 2010-11-05 00:21:51 +00:00
ip_var.h *_drain() routines may be called with locks held, so instead of doing 2011-05-03 17:44:30 +00:00
ip.h
Makefile Reduces the resources demanded by TCP sessions in TIME_WAIT-state using 2011-05-03 18:28:44 +00:00
pim_var.h
pim.h
raw_ip.c
tcp_congctl.c simplify code a little. no functional changes. 2011-04-08 11:15:11 +00:00
tcp_congctl.h - comments 2011-04-14 15:57:02 +00:00
tcp_debug.c
tcp_debug.h
tcp_fsm.h
tcp_input.c Reduces the resources demanded by TCP sessions in TIME_WAIT-state using 2011-05-03 18:28:44 +00:00
tcp_output.c simplify a compile-time assertion 2011-04-14 16:08:53 +00:00
tcp_private.h
tcp_sack.c - comments 2011-04-14 15:54:31 +00:00
tcp_seq.h
tcp_subr.c Reduces the resources demanded by TCP sessions in TIME_WAIT-state using 2011-05-03 18:28:44 +00:00
tcp_timer.c Rewrite comments about TCP RTO calculations. 2011-04-20 13:35:51 +00:00
tcp_timer.h Rewrite comments about TCP RTO calculations. 2011-04-20 13:35:51 +00:00
tcp_usrreq.c Reduces the resources demanded by TCP sessions in TIME_WAIT-state using 2011-05-03 18:28:44 +00:00
tcp_var.h Reduces the resources demanded by TCP sessions in TIME_WAIT-state using 2011-05-03 18:28:44 +00:00
tcp_vtw.c Reduces the resources demanded by TCP sessions in TIME_WAIT-state using 2011-05-03 18:28:44 +00:00
tcp_vtw.h Reduces the resources demanded by TCP sessions in TIME_WAIT-state using 2011-05-03 18:28:44 +00:00
tcp.h
tcpip.h
udp_private.h
udp_usrreq.c Reduces the resources demanded by TCP sessions in TIME_WAIT-state using 2011-05-03 18:28:44 +00:00
udp_var.h
udp.h