NetBSD

Author	SHA1	Message	Date
christos	526a1efeb9	Limit the tcp initial window setting to 10, leaving it by default to 4 and simplifying the code in process. Per draft-ietf-initcwnd-08.txt.	2013-04-10 00:16:03 +00:00
tls	7b0b7dedd9	Entropy-pool implementation move and cleanup. 1) Move core entropy-pool code and source/sink/sample management code to sys/kern from sys/dev. 2) Remove use of NRND as test for presence of entropy-pool code throughout source tree. 3) Remove use of RND_ENABLED in device drivers as microoptimization to avoid expensive operations on disabled entropy sources; make the rnd_add calls do this directly so all callers benefit. 4) Fix bug in recent rnd_add_data()/rnd_add_uint32() changes that might have led to slight entropy overestimation for some sources. 5) Add new source types for environmental sensors, power sensors, VM system events, and skew between clocks, with a sample implementation for each. ok releng to go in before the branch due to the difficulty of later pullup (widespread #ifdef removal and moved files). Tested with release builds on amd64 and evbarm and live testing on amd64.	2012-02-02 19:42:57 +00:00
yamt	bf52753ac3	tcp_reass_unlock: assertion	2011-10-31 12:52:19 +00:00
gdt	2377e629f8	Add comment urging a separation of TCP_RTT_SHIFT into separate defines describing the EWMA calculation and the storage representation. (No code change.)	2011-05-25 23:17:44 +00:00
dyoung	c2e43be1c5	Reduces the resources demanded by TCP sessions in TIME_WAIT-state using methods called Vestigial Time-Wait (VTW) and Maximum Segment Lifetime Truncation (MSLT). MSLT and VTW were contributed by Coyote Point Systems, Inc. Even after a TCP session enters the TIME_WAIT state, its corresponding socket and protocol control blocks (PCBs) stick around until the TCP Maximum Segment Lifetime (MSL) expires. On a host whose workload necessarily creates and closes down many TCP sockets, the sockets & PCBs for TCP sessions in TIME_WAIT state amount to many megabytes of dead weight in RAM. Maximum Segment Lifetimes Truncation (MSLT) assigns each TCP session to a class based on the nearness of the peer. Corresponding to each class is an MSL, and a session uses the MSL of its class. The classes are loopback (local host equals remote host), local (local host and remote host are on the same link/subnet), and remote (local host and remote host communicate via one or more gateways). Classes corresponding to nearer peers have lower MSLs by default: 2 seconds for loopback, 10 seconds for local, 60 seconds for remote. Loopback and local sessions expire more quickly when MSLT is used. Vestigial Time-Wait (VTW) replaces a TIME_WAIT session's PCB/socket dead weight with a compact representation of the session, called a "vestigial PCB". VTW data structures are designed to be very fast and memory-efficient: for fast insertion and lookup of vestigial PCBs, the PCBs are stored in a hash table that is designed to minimize the number of cacheline visits per lookup/insertion. The memory both for vestigial PCBs and for elements of the PCB hashtable come from fixed-size pools, and linked data structures exploit this to conserve memory by representing references with a narrow index/offset from the start of a pool instead of a pointer. When space for new vestigial PCBs runs out, VTW makes room by discarding old vestigial PCBs, oldest first. VTW cooperates with MSLT. It may help to think of VTW as a "FIN cache" by analogy to the SYN cache. A 2.8-GHz Pentium 4 running a test workload that creates TIME_WAIT sessions as fast as it can is approximately 17% idle when VTW is active versus 0% idle when VTW is inactive. It has 103 megabytes more free RAM when VTW is active (approximately 64k vestigial PCBs are created) than when it is inactive.	2011-05-03 18:28:44 +00:00
dyoung	ac162b774b	_drain() routines may be called with locks held, so instead of doing any work in _drain(), set a drain-needed flag. Do the work in the fasttimo handler. Contributed by Coyote Point Systems, Inc.	2011-05-03 17:44:30 +00:00
gdt	f641bea548	Rewrite comments about TCP RTO calculations. Long ago, the storage representations of srtt and rttvar were changed from the 4.4BSD scheme, and the comments are out of sync with the code. This commit rewrites most of the comments that explain the RTO calculations, and points out some issues in the code. Joint work with Bev Schwartz of BBN (original analysis and comments), but I have rewritten and extended them, so errors are mine. This material is based upon work supported by the Defense Advanced Research Projects Agency and Space and Naval Warfare Systems Center, Pacific, under Contract No. N66001-09-C-2073. Approved for Public Release, Distribution Unlimited	2011-04-20 13:35:51 +00:00
yamt	37494bba21	comments	2011-04-14 15:55:46 +00:00
pooka	11281f01a0	Replace a large number of link set based sysctl node creations with calls from subsystem constructors. Benefits both future kernel modules and rump. no change to sysctl nodes on i386/MONOLITHIC & build tested i386/ALL	2009-09-16 15:23:04 +00:00
darran	ddd44491c6	Make tcp msl (max segment life) tunable via sysctl net.inet.tcp.msl. Okayed by tls@.	2009-09-09 22:41:28 +00:00
pooka	9d2101a249	POOL_INIT -> pool_init	2009-05-27 17:41:03 +00:00
pooka	c7a407f862	stinkset purge: POOL_INIT -> pool_init also, make the syncache pool static in scope	2009-01-29 20:38:22 +00:00
plunky	fd7356a917	Convert socket options code to use a sockopt structure instead of laying everything into an mbuf. approved by core	2008-08-06 15:01:23 +00:00
martin	ce099b4099	Remove clause 3 and 4 from TNF licenses	2008-04-28 20:22:51 +00:00
ad	15e29e981b	Merge the socket locking patch: - Socket layer becomes MP safe. - Unix protocols become MP safe. - Allows protocol processing interrupts to safely block on locks. - Fixes a number of race conditions. With much feedback from matt@ and plunky@.	2008-04-24 11:38:36 +00:00
thorpej	7ff8d08aae	Make IP, TCP, UDP, and ICMP statistics per-CPU. The stats are collated when the user requests them via sysctl.	2008-04-12 05:58:22 +00:00
thorpej	f5c68c0b9f	Change TCP stats from a structure to an array of uint64_t's. Note: This is ABI-compatible with the old tcpstat structure; old netstat binaries will continue to work properly.	2008-04-08 01:03:58 +00:00
matt	a34217b8de	Rework tcp congctl selection code so that the congctl entries can be const. Don't access tcp_congctl stuff outside of tcp_congctl.c, use routines to update t_congctl. This code is now slightly more complicated.	2008-02-29 07:39:17 +00:00
matt	a4a1e5ce55	Convert stragglers to ansi definitions from old-style definitions. Remember that func() is not ansi, func(void) is.	2008-02-27 19:41:51 +00:00
perry	b6a2ef7569	Convert many of the uses of __attribute__ to equivalent __packed, __unused and __dead macros from cdefs.h	2007-12-25 18:33:32 +00:00
rmind	4175f8693b	TCP socket buffers automatic sizing - ported from FreeBSD. http://mail-index.netbsd.org/tech-net/2007/02/04/0006.html ! Disabled by default, marked as experimental. Testers are very needed. ! Someone should thoroughly test this, and improve if possible. Discussed on <tech-net>: http://mail-index.netbsd.org/tech-net/2007/07/12/0002.html Thanks Greg Troxel for comments. OK by the long silence on <tech-net>.	2007-08-02 02:42:40 +00:00
ad	88ab7da936	Merge some of the less invasive changes from the vmlocking branch: - kthread, callout, devsw API changes - select()/poll() improvements - miscellaneous MT safety improvements	2007-07-09 20:51:58 +00:00
christos	0a36551606	tcpdrop kernel bits (from anon ymous)	2007-06-25 23:35:12 +00:00
christos	eeff189533	- per socket keepalive settings - settable connection establishment timeout	2007-06-20 15:29:17 +00:00
dyoung	72f0a6dfb0	Eliminate address family-specific route caches (struct route, struct route_in6, struct route_iso), replacing all caches with a struct route. The principal benefit of this change is that all of the protocol families can benefit from route cache-invalidation, which is necessary for correct routing. Route-cache invalidation fixes an ancient PR, kern/3508, at long last; it fixes various other PRs, also. Discussions with and ideas from Joerg Sonnenberger influenced this work tremendously. Of course, all design oversights and bugs are mine. DETAILS 1 I added to each address family a pool of sockaddrs. I have introduced routines for allocating, copying, and duplicating, and freeing sockaddrs: struct sockaddr *sockaddr_alloc(sa_family_t af, int flags); struct sockaddr *sockaddr_copy(struct sockaddr *dst, const struct sockaddr *src); struct sockaddr *sockaddr_dup(const struct sockaddr *src, int flags); void sockaddr_free(struct sockaddr *sa); sockaddr_alloc() returns either a sockaddr from the pool belonging to the specified family, or NULL if the pool is exhausted. The returned sockaddr has the right size for that family; sa_family and sa_len fields are initialized to the family and sockaddr length---e.g., sa_family = AF_INET and sa_len = sizeof(struct sockaddr_in). sockaddr_free() puts the given sockaddr back into its family's pool. sockaddr_dup() and sockaddr_copy() work analogously to strdup() and strcpy(), respectively. sockaddr_copy() KASSERTs that the family of the destination and source sockaddrs are alike. The 'flags' argument for sockaddr_alloc() and sockaddr_dup() is passed directly to pool_get(9). 2 I added routines for initializing sockaddrs in each address family, sockaddr_in_init(), sockaddr_in6_init(), sockaddr_iso_init(), etc. They are fairly self-explanatory. 3 structs route_in6 and route_iso are no more. All protocol families use struct route. I have changed the route cache, 'struct route', so that it does not contain storage space for a sockaddr. Instead, struct route points to a sockaddr coming from the pool the sockaddr belongs to. I added a new method to struct route, rtcache_setdst(), for setting the cache destination: int rtcache_setdst(struct route *, const struct sockaddr *); rtcache_setdst() returns 0 on success, or ENOMEM if no memory is available to create the sockaddr storage. It is now possible for rtcache_getdst() to return NULL if, say, rtcache_setdst() failed. I check the return value for NULL everywhere in the kernel. 4 Each routing domain (struct domain) has a list of live route caches, dom_rtcache. rtflushall(sa_family_t af) looks up the domain indicated by 'af', walks the domain's list of route caches and invalidates each one.	2007-05-02 20:40:22 +00:00
christos	53524e44ef	Kill caddr_t; there will be some MI fallout, but it will be fixed shortly.	2007-03-04 05:59:00 +00:00
dyoung	5493f188c7	KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous parentheses in return statements. Cosmetic: don't open-code TAILQ_FOREACH(). Cosmetic: change types of variables to avoid oodles of casts: in in6_src.c, avoid casts by changing several route_in6 pointers to struct route pointers. Remove unnecessary casts to caddr_t elsewhere. Pave the way for eliminating address family-specific route caches: soon, struct route will not embed a sockaddr, but it will hold a reference to an external sockaddr, instead. We will set the destination sockaddr using rtcache_setdst(). (I created a stub for it, but it isn't used anywhere, yet.) rtcache_free() will free the sockaddr. I have extracted from rtcache_free() a helper subroutine, rtcache_clear(). rtcache_clear() will "forget" a cached route, but it will not forget the destination by releasing the sockaddr. I use rtcache_clear() instead of rtcache_free() in rtcache_update(), because rtcache_update() is not supposed to forget the destination. Constify: 1 Introduce const accessor for route->ro_dst, rtcache_getdst(). 2 Constify the 'dst' argument to ifnet->if_output(). This led me to constify a lot of code called by output routines. 3 Constify the sockaddr argument to protosw->pr_ctlinput. This led me to constify a lot of code called by ctlinput routines. 4 Introduce const macros for converting from a generic sockaddr to family-specific sockaddrs, e.g., sockaddr_in: satocsin6, satocsin, et cetera.	2007-02-17 22:34:07 +00:00
yamt	8836e5995d	add some more tcp mowners.	2006-12-06 09:10:45 +00:00
yamt	f5830ee995	- make tcp_reass static. - constify.	2006-12-06 09:08:27 +00:00
yamt	c31e22237d	- constify. - make tcp_dooptions and tcpipqent_pool static.	2006-10-21 10:08:54 +00:00
yamt	81463c93c7	implement RFC3465 appropriate byte counting. from Kentaro A. Kurahone, with minor adjustments by me. the ack prediction part of the original patch was omitted because it's a separate change. reviewed by Rui Paulo.	2006-10-19 11:40:51 +00:00
rpaulo	21df8206df	Export the tcp_do_rfc1948 variable to userland via sysctl. The code to generate an ISS via an MD5 hash has been present in the NetBSD kernel since 2001, but it wasn't even exported to userland at that time. It was agreed on tech-net with the original author <thorpej> that we should let the user decide if he wants to enable it or not. Not enabled by default.	2006-10-16 18:13:56 +00:00
rpaulo	f3330397f0	Modular (I tried ;-) TCP congestion control API. Whenever certain conditions happen in the TCP stack, this interface calls the specified callback to handle the situation according to the currently selected congestion control algorithm. A new sysctl node was created: net.inet.tcp.congctl.{available,selected} with obvious meanings. The old net.inet.tcp.newreno MIB was removed. The API is discussed in tcp_congctl(9). In the near future, it will be possible to select a congestion control algorithm on a per-socket basis. Discussed on tech-net and reviewed by <yamt>.	2006-10-09 16:27:07 +00:00
rpaulo	2fb2ae3251	Import of TCP ECN algorithm for congestion control. Both available for IPv4 and IPv6. Basic implementation test results are available at http://netbsd-soc.sourceforge.net/projects/ecn/testresults.html. Work sponsored by the Google Summer of Code project 2006. Special thanks to Kentaro Kurahone, Allen Briggs and Matt Thomas for their help, comments and support during the project.	2006-09-05 00:29:35 +00:00
rpaulo	25ec6d007f	revert stuff that shouldn't have gone in.	2006-07-22 17:45:03 +00:00
rpaulo	f5f6aa2ed3	TCP RFC is 793, not 783.	2006-07-22 17:39:48 +00:00
perry	fbae48b901	Change "inline" back to "__inline" in .h files -- C99 is still too new, and some apps compile things in C89 mode. C89 keywords stay. As per core@.	2006-02-16 20:17:12 +00:00
perry	0f0296d88a	Remove leading __ from __(const|inline|signed|volatile) -- it is obsolete.	2005-12-24 20:45:08 +00:00
christos	95e1ffb156	merge ktrace-lwp.	2005-12-11 12:16:03 +00:00
elad	9702e98730	Multiple inclusion protection, as suggested by christos@ on tech-kern@ few days ago.	2005-12-10 23:31:41 +00:00
rpaulo	37cbe61e67	Implement tcp.inet{,6}.tcp{,6}.(debug|debx) when TCP_DEBUG is set. They can be used to ``transliterate protocol trace'' like trpt(8) does.	2005-09-06 02:41:14 +00:00
yamt	f02551ec2d	move {tcp,udp}_do_loopback_cksum back to tcp/udp so that they can be referenced by ipv6.	2005-08-10 13:06:49 +00:00
elad	6439f2618f	Add sysctls for IP, ICMP, TCP, and UDP statistics.	2005-08-05 09:21:25 +00:00
christos	89940190d0	Implement PMTU checks from: http://www.gont.com.ar/drafts/icmp-attacks-against-tcp.html 1. Don't act on ICMP-need-frag immediately if adhoc checks on the advertised MTU fail. The MTU update is delayed until a TCP retransmit happens. 2. Ignore ICMP Source Quench messages meant for TCP connections. From OpenBSD.	2005-07-19 17:00:02 +00:00
christos	ea2d4204b6	- add const - remove bogus casts - avoid nested variables	2005-05-29 21:41:23 +00:00
kurahone	f7707899c1	Added sysctl tunable limits for the number of maximum SACK holes per connection and per system. Idea taken from FreeBSD.	2005-04-05 01:07:17 +00:00
yamt	8b0967ff45	protect tcpipqent with splvm.	2005-03-29 20:10:16 +00:00
yamt	df05ca7085	simplify data receiver side sack processing. - introduce t_segqlen, the number of segments in segq/timeq. the name is from freebsd. - rather than maintaining a copy of sack blocks (rcv_sack_block[]), build it directly from the segment list when needed.	2005-03-16 00:39:56 +00:00
yamt	0446b7c3e3	- use full sized segments unless we actually have SACKs to send. - avoid TSO duplicate D-SACK. - send SACKs regardless of TF_ACKNOW. - don't clear rcv_sack_num when transmitting. discussed on tech-net@.	2005-03-16 00:38:27 +00:00
atatat	76a9013c25	gc the tcp_sysctl() prototype since it's completely vestigial	2005-03-09 04:51:56 +00:00

170 Commits