c2e43be1c5
methods called Vestigial Time-Wait (VTW) and Maximum Segment Lifetime Truncation (MSLT). MSLT and VTW were contributed by Coyote Point Systems, Inc.

Even after a TCP session enters the TIME_WAIT state, its corresponding socket and protocol control blocks (PCBs) stick around until the TCP Maximum Segment Lifetime (MSL) expires. On a host whose workload necessarily creates and closes down many TCP sockets, the sockets and PCBs for TCP sessions in TIME_WAIT state amount to many megabytes of dead weight in RAM.

Maximum Segment Lifetime Truncation (MSLT) assigns each TCP session to a class based on the nearness of the peer. Each class has its own MSL, and a session uses the MSL of its class. The classes are loopback (local host equals remote host), local (local host and remote host are on the same link/subnet), and remote (local host and remote host communicate via one or more gateways). Classes corresponding to nearer peers have lower MSLs by default: 2 seconds for loopback, 10 seconds for local, 60 seconds for remote. Loopback and local sessions expire more quickly when MSLT is used.

Vestigial Time-Wait (VTW) replaces a TIME_WAIT session's PCB/socket dead weight with a compact representation of the session, called a "vestigial PCB". VTW data structures are designed to be very fast and memory-efficient: for fast insertion and lookup, the vestigial PCBs are stored in a hash table that is designed to minimize the number of cacheline visits per lookup or insertion. The memory both for vestigial PCBs and for elements of the PCB hash table comes from fixed-size pools, and linked data structures exploit this to conserve memory by representing references with a narrow index/offset from the start of a pool instead of a pointer. When space for new vestigial PCBs runs out, VTW makes room by discarding old vestigial PCBs, oldest first.

VTW cooperates with MSLT. It may help to think of VTW as a "FIN cache" by analogy to the SYN cache.
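The MSLT classification described above can be sketched in C roughly as follows. This is an illustration only, not the code in tcp_vtw.c: every identifier (`msl_class`, `msl_classify`, `msl_for_session`) is hypothetical, addresses are assumed to be IPv4 values in host byte order, and "same link/subnet" is approximated by comparing masked addresses. Only the three classes and their default MSLs come from the commit message.

```c
#include <stdint.h>

/* Hypothetical names; the real identifiers in the kernel may differ. */
enum msl_class { MSL_LOOPBACK, MSL_LOCAL, MSL_REMOTE };

/* Default MSL per class, in seconds, as quoted in the commit message. */
const unsigned msl_default_secs[] = {
    [MSL_LOOPBACK] = 2,
    [MSL_LOCAL]    = 10,
    [MSL_REMOTE]   = 60,
};

/*
 * Classify a session by the nearness of its peer.
 * Assumption: laddr, faddr and netmask are IPv4 values in host byte
 * order, and two hosts share a link when their masked addresses match.
 */
enum msl_class
msl_classify(uint32_t laddr, uint32_t faddr, uint32_t netmask)
{
    if (laddr == faddr)
        return MSL_LOOPBACK;    /* local host equals remote host */
    if ((laddr & netmask) == (faddr & netmask))
        return MSL_LOCAL;       /* same subnet */
    return MSL_REMOTE;          /* reached via one or more gateways */
}

/* A session uses the MSL of the class it was assigned to. */
unsigned
msl_for_session(uint32_t laddr, uint32_t faddr, uint32_t netmask)
{
    return msl_default_secs[msl_classify(laddr, faddr, netmask)];
}
```

With a /24 netmask, a session between 192.168.1.5 and 192.168.1.99 falls in the local class under these assumptions, while a session from 192.168.1.5 to 8.8.8.8 falls in the remote class.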
A 2.8-GHz Pentium 4 running a test workload that creates TIME_WAIT sessions as fast as it can is approximately 17% idle when VTW is active versus 0% idle when VTW is inactive. It has 103 megabytes more free RAM when VTW is active (approximately 64k vestigial PCBs are created) than when it is inactive.
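The storage scheme the commit message describes for vestigial PCBs (a fixed-size pool, hash chains linked by narrow indices instead of pointers, and oldest-first eviction) can be illustrated with the minimal sketch below. It is not the layout used in tcp_vtw.c: all names (`vtw_table`, `vtw_insert`, `vtw_lookup`, and so on) are hypothetical, the pool is deliberately tiny, and the sketch makes no attempt at the cacheline-conscious layout of the real code.

```c
#include <stdint.h>
#include <string.h>

#define VTW_POOL_SIZE 4          /* tiny on purpose, so eviction is visible */
#define VTW_NBUCKETS  8
#define VTW_NIL       UINT16_MAX /* "no entry" marker for index links */

struct vtw_key {                 /* the TCP 4-tuple identifying a session */
    uint32_t laddr, faddr;
    uint16_t lport, fport;
};

struct vtw_entry {
    struct vtw_key key;
    uint16_t next;               /* pool index of next chain entry, not a pointer */
    uint8_t  in_use;
};

struct vtw_table {
    struct vtw_entry pool[VTW_POOL_SIZE]; /* fixed-size pool */
    uint16_t bucket[VTW_NBUCKETS];        /* chain heads, as pool indices */
    uint16_t oldest;                      /* ring cursor: next slot to (re)use */
};

void vtw_init(struct vtw_table *t)
{
    memset(t, 0, sizeof(*t));
    for (int i = 0; i < VTW_NBUCKETS; i++)
        t->bucket[i] = VTW_NIL;
}

static unsigned vtw_hash(const struct vtw_key *k)
{
    uint32_t h = k->laddr ^ k->faddr ^ (((uint32_t)k->lport << 16) | k->fport);
    h ^= h >> 16;
    return h % VTW_NBUCKETS;
}

static int vtw_key_eq(const struct vtw_key *a, const struct vtw_key *b)
{
    return a->laddr == b->laddr && a->faddr == b->faddr &&
           a->lport == b->lport && a->fport == b->fport;
}

/* Remove pool slot idx from its hash chain and mark it free. */
static void vtw_unlink(struct vtw_table *t, uint16_t idx)
{
    uint16_t *link = &t->bucket[vtw_hash(&t->pool[idx].key)];

    while (*link != VTW_NIL && *link != idx)
        link = &t->pool[*link].next;
    if (*link == idx)
        *link = t->pool[idx].next;
    t->pool[idx].in_use = 0;
}

/* Insert a vestigial entry; when the pool is full, the oldest entry is discarded. */
void vtw_insert(struct vtw_table *t, const struct vtw_key *k)
{
    uint16_t idx = t->oldest;
    unsigned b = vtw_hash(k);

    if (t->pool[idx].in_use)
        vtw_unlink(t, idx);      /* slots are reused in ring order, so idx is oldest */
    t->pool[idx].key = *k;
    t->pool[idx].in_use = 1;
    t->pool[idx].next = t->bucket[b];
    t->bucket[b] = idx;
    t->oldest = (uint16_t)((idx + 1) % VTW_POOL_SIZE);
}

/* Return 1 if a vestigial entry exists for this 4-tuple, else 0. */
int vtw_lookup(const struct vtw_table *t, const struct vtw_key *k)
{
    for (uint16_t i = t->bucket[vtw_hash(k)]; i != VTW_NIL; i = t->pool[i].next)
        if (vtw_key_eq(&t->pool[i].key, k))
            return 1;
    return 0;
}
```

Because every link is a 16-bit index into the pool, each reference costs two bytes rather than a full pointer, which is the memory saving the message alludes to. Allocating slots in ring order makes "oldest first" eviction free: once the pool has wrapped around, the next slot to be reused always holds the oldest surviving entry.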
| File |
|---|
| accept_filter.h |
| accf_data.c |
| accf_http.c |
| cpu_in_cksum.c |
| files.ipfilter |
| files.netinet |
| icmp6.h |
| icmp_private.h |
| icmp_var.h |
| if_arp.c |
| if_atm.c |
| if_atm.h |
| if_ether.h |
| if_inarp.h |
| igmp_var.h |
| igmp.c |
| igmp.h |
| in4_cksum.c |
| in_cksum.c |
| in_gif.c |
| in_gif.h |
| in_ifattach.h |
| in_offload.c |
| in_offload.h |
| in_pcb_hdr.h |
| in_pcb.c |
| in_pcb.h |
| in_proto.c |
| in_proto.h |
| in_selsrc.c |
| in_selsrc.h |
| in_systm.h |
| in_var.h |
| in.c |
| in.h |
| ip6.h |
| ip_carp.c |
| ip_carp.h |
| ip_ecn.c |
| ip_ecn.h |
| ip_encap.c |
| ip_encap.h |
| ip_etherip.c |
| ip_etherip.h |
| ip_flow.c |
| ip_icmp.c |
| ip_icmp.h |
| ip_id.c |
| ip_input.c |
| ip_mroute.c |
| ip_mroute.h |
| ip_output.c |
| ip_private.h |
| ip_reass.c |
| ip_var.h |
| ip.h |
| Makefile |
| pim_var.h |
| pim.h |
| raw_ip.c |
| tcp_congctl.c |
| tcp_congctl.h |
| tcp_debug.c |
| tcp_debug.h |
| tcp_fsm.h |
| tcp_input.c |
| tcp_output.c |
| tcp_private.h |
| tcp_sack.c |
| tcp_seq.h |
| tcp_subr.c |
| tcp_timer.c |
| tcp_timer.h |
| tcp_usrreq.c |
| tcp_var.h |
| tcp_vtw.c |
| tcp_vtw.h |
| tcp.h |
| tcpip.h |
| udp_private.h |
| udp_usrreq.c |
| udp_var.h |
| udp.h |