* IP_PORTRANGE socket option, which controls how the ephemeral ports
are allocated. it takes the following settings:
IP_PORTRANGE_DEFAULT use anonportmin (49152) -> anonportmax (65535)
IP_PORTRANGE_HIGH as IP_PORTRANGE_DEFAULT (retained for FreeBSD
compat reasons, where these are separate)
IP_PORTRANGE_LOW use 600 -> 1023. only works if uid==0.
* in_pcb flag INP_ANONPORT. set if port was allocated ephmerally
* support sysctl net.inet.ip.anonportmin (lowest ephemeral port)
and net.inet.ip.anonportmax (highest ephemeral port).
these can't be set to >65535, < IPPORT_RESERVED (unless IPNOPRIVPORTS
is defined), and anonportmin has to be < anonportmax.
* use a cleaner way of only cycling through the available set once;
this will be useful for when a random allocation scheme is used
* define IPPORT_ANON{MIN,MAX} instead of IPPORT_USER{LOW,HIGH}
so_linger is used as an argument to tsleep(), so was stuffed with
clockticks for the TCP linger time. However, so_linger is set directly from
l_linger if the linger time is specified, and l_linger is seconds (although
this is not currently documented anywhere). Fix this to set the TCP
linger time in seconds, and multiply so_linger by hz when tsleep() is
called to actually perform the linger.
- When running the slow timers, skip PCBs in LISTEN state.
- When processing the persist timer, drop the connection if the connection
idle time exceeds the maximum backoff for retransmit. Part of
kern/2335 (pete@daemon.net).
- If we fail to allocate mbufs for the outgoing segment, free the header
and abort.
From Stevens:
- Ensure the persist timer is running if the send window reaches zero.
Part of the fix for kern/2335 (pete@daemon.net).
The sysctl'able variable "tcp_init_win", when set to 0, selects an
auto-tuning algorithm for selecting the initial window, based on transmit
segment size, per discussion in the IETF tcpimpl working group.
Default initial window is still 1 segment, but will soon become 2 segments,
per discussion in tcpimpl.
in tcp_output(), and it will only be cleared in tcp_output() if the ACK was
transmitted sucessfully. Also, don't count delayed ACKs here, let tcp_output()
count them.
case. Sending an RST to ourselves is a little silly, considering that
we'll just attempt to remove a non-existent compressed state entry and
then drop the packet anyway.
socket:
- If we received a SYN,ACK, send an RST.
- If we received a SYN, and the connection attempt appears to come from
itself, send an RST, since it cannot possibly be valid.
pseudo-device rnd # /dev/random and in-kernel generator
in config files.
o Add declaration to all architectures.
o Clean up copyright message in rnd.c, rnd.h, and rndpool.c to include
that this code is derived in part from Ted Tyso's linux code.
Basically, in silly window avoidance, don't use the raw MSS we advertised
to the peer. What we really want here is the _expected_ size of received
segments, so we need to account for the path MTU (eventually; right now,
the interface MTU for "local" addresses and loopback or tcp_mssdflt for
non-local addresses). Without this, silly window avoidance would never
kick in if we advertised a very large (e.g. ~64k) MSS to the peer.
- Don't overload t_maxseg. Previous behavior was to set it to the min
of the peer's advertised MSS, our advertised MSS, and tcp_mssdflt
(for non-local networks). This breaks PMTU discovery running on
either host. Instead, remember the MSS we advertise, and use it
as appropriate (in silly window avoidance).
- Per last bullet, split tcp_mss() into several functions for handling
MSS (ours and peer's), and performing various tasks when a connection
becomes ESTABLISHED.
- Introduce a new function, tcp_segsize(), which computes the max size
for every segment transmitted in tcp_output(). This will eventually
be used to hook in PMTU discovery.
where it is now, and adding the specialized for Ethernet version of the ARP
structure, for the benefit of programs which are externally (to us) maintained
and not (yet) ported.
XXX This should NOT be used inside the kernel.
which happen to have a TCB in TIME_WAIT, where an mbuf which had been
advanced past the IP+TCP headers and TCP options would be reused as if
it had not been advanced. Problem found by Juergen Hannken-Illjes, who
also suggested a work-around on which this fix is based.
fixed in FreeBSD by John Polstra:
Fix a bug (apparently very old) that can cause a TCP connection to
be dropped when it has an unusual traffic pattern. For full details
as well as a test case that demonstrates the failure, see the
referenced PR (FreeBSD's kern/3998).
Under certain circumstances involving the persist state, it is
possible for the receive side's tp->rcv_nxt to advance beyond its
tp->rcv_adv. This causes (tp->rcv_adv - tp->rcv_nxt) to become
negative. However, in the code affected by this fix, that difference
was interpreted as an unsigned number by max(). Since it was
negative, it was taken as a huge unsigned number. The effect was
to cause the receiver to believe that its receive window had negative
size, thereby rejecting all received segments including ACKs. As
the test case shows, this led to fruitless retransmissions and
eventually to a dropped connection. Even connections using the
loopback interface could be dropped. The fix substitutes the signed
imax() for the unsigned max() function.
Bill informs me that his research indicates this bug appeared in Reno.
use of mbuf external storage and increasing performance (by eliminating
an m_pullup() for clusters in the IP reassembly code).
Changes from Koji Imada <koji@math.human.nagoya-u.ac.jp>, in PR #3628
and #3480, with ever-so-slight integration changes by me.
a socket, just calling tcp_disconnect() on the tcpcb will do the right thing.
From Thorsten Frueauf <frueauf@ira.uka.de> and W. Richard Stevens in PR/3738
resp. TCP/IP Illustrated, Vol. 2.
operation, since:
- in ipl_enable(), "fr_savep = fr_checkp" is not conditionalized
in the same way (not at all), and
- without this change, it was not possible to enable, disable,
and reenable ipfilter.
silly differences between the NetBSD copy of the code and the
vendor branch, keeping only those which are necessary. Of those
differences that currently exist, several "portability to NetBSD"
issues, which will be fed back to the ipfilter author.
- Fix a really obvious error: ipl_enable() disappeared, but the guts of
the function were scrunched into the "no-op" BSD pseudo-device attach
routine. Would not compile, because of non-void return from a void
function. Fixed by reincarnating ipl_enable(), and reimplementing
the no-op pseudo-device attach.
- #ifdef as appropriate to remove unused variable warnings.
- Call ipl_enable() in iplinit(), rather than the no-op ipfilterattach().
Description:
- A BSD pseudo-device initialization routine is declared as
void <pseudo-device name>attach __P((int count));
in ioconf.c by config(8). main() calls these functions
from a table.
- IP Filter has functions iplattach() and ipldetach() (or,
in the NetBSD case, were erroneously renamed ipfilterattach()
and ipfilterdetach()). These functions are used to establish
and disestablish the IP Filter "filter rule check" hook in
the IP input/output stream. They are declared:
int iplattach __P((void));
int ipldetach __P((void));
..and are expected to return a value by iplioctl().
- When main() calls (by sheer coincidence!) iplattach(),
the filter hook is established, and the IP Filter machinery
labeled as "initialized". This causes all packets, whether or
not the user intents to use filter rules, to be passed to
the filter rule checker if "ipfilter" is configured into the
kernel.
- As a result of the above, a kludge existed to default to
passing all packets (I can only assume that when this was
originally committed, the symptom of the bug was noticed by
the integrator, but the bug not actually found/fixed).
- In iplioctl(), if the SIOCFRENB ioctl is issued with an
argument of "enable" (i.e. user executed "ipf -E"), iplattach()
will notice that the machinery is already initialized and
return EBUSY.
Fix:
- Rename iplattach()/ipldetach() to ipl_enable() and ipl_disable().
- Create a pseudo-device entry stub named ipfilterattach()
(NetBSD case) or iplattach() (all other). This is a noop; none
of the machinery should be initialized until the caller expicitly
enables the filter with ipf -E. Add a comment to note that.
XXX !!! XXX !!!
I noticed a few semi-serious bugs while doing this merge, one of which
has existed for a fairly long time. Some of them are addressed in this
commit (because they caused the kernel to not compile), and are annoted
by "XXX" and "--thorpej". The other one will be addressed shortly in
a future commit, and, as far as I can tell, affects all operating systems
which IP Filter supports.
Among other, add ARPHRD_ARCNET definition, make sure the hardware type is
set on outgoing ARP packets, make sure we dont send out replies as broadcasts.
Some of the stuff (e.g., rarpd, bootpd, dhcpd etc., libsa) still will
only support Ethernet. Tcpdump itself should be ok, but libpcap needs
lot of work.
For the detailed change history, look at the commit log entries for
the is-newarp branch.
routed packets. This currently defaults to `drop,' but once we
verify that all applications that rely on determining remote IP
addresses for authentication are dropping the connection when they
see a source route option (not just disabling the source route
option), we can turn this back on and conform with the host
requirements.
interface using a sockaddr_dl in a control mbuf.
Implement SO_TIMESTAMP for IP datagrams.
Move packet information option processing into a generic function
so that they work with multicast UDP and raw IP as well as unicast UDP.
Contributed by Bill Fenner <fenner@parc.xerox.com>.
Scenario: If ip_insertoptions() prepends a new mbuf to the chain, the
bad: label's m_freem(m0) still would free only the original mbuf chain
if the transmission failed for, e.g., no route to host; resulting in
one lost mbuf per failed packet. (The original posting included a
demonstration program).
Original report of this bug was by jinmei@isl.rdc.toshiba.co.jp
(JINMEI Tatuya) on comp.bugs.4bsd.
[1] if user tries to enter in a bogus PVC don't leave it in the routing
table ... remove it
[2] change ioctl arg to include rxso for lower layer
[3] add hooks (inside "NATM" ifdef) for native mode atm sockets so that
they don't clash with IP PVCs. [i am still debugging the native
mode atm socket protosw code]
device and a printable "external name" (name + unit number), thus eliminating
if_name and if_unit. Updated interface to (*if_watchdog)() and (*if_reset)()
to take a struct ifnet *, rather than a unit number.