Commit Graph

131 Commits

Author SHA1 Message Date
ozaki-r
7ccd75e01e Remove dead codes and make if_free_sadl static
No functional change.
2014-11-28 08:29:00 +00:00
rmind
0cf67a7dfd sppp_input: handle pktqueue case correctly (fix for the previous). 2014-06-06 22:15:32 +00:00
rmind
60d350cf6d - Implement pktqueue interface for lockless IP input queue.
- Replace ipintrq and ip6intrq with the pktqueue mechanism.
- Eliminate kernel-lock from ipintr() and ip6intr().
- Some preparation work to push softnet_lock out of ipintr().

Discussed on tech-net.
2014-06-05 23:48:16 +00:00
msaitoh
e59c0b8f0f Save a NETISR_* value in a variable and call schednetisr() after enqueue
a packet for readability and future modification.
2014-05-15 09:23:03 +00:00
rmind
f04a92b1d6 - Rewrite parts of pfil(9): use array to store hooks and thus be more cache
friendly (there are only few hooks in the system).  Make the structures
  opaque and the interface more strict.
- Remove PFIL_HOOKS option by making pfil(9) mandatory.
2013-06-29 21:06:57 +00:00
joerg
e240adbd0b Retire OSI network stack. OK core@ 2013-03-01 18:25:13 +00:00
tls
6e1dd068e9 Separate /dev/random pseudodevice implemenation from kernel entropy pool
implementation.  Rewrite pseudodevice code to use cprng_strong(9).

The new pseudodevice is cloning, so each caller gets bits from a stream
generated with its own key.  Users of /dev/urandom get their generators
keyed on a "best effort" basis -- the kernel will rekey generators
whenever the entropy pool hits the high water mark -- while users of
/dev/random get their generators rekeyed every time key-length bits
are output.

The underlying cprng_strong API can use AES-256 or AES-128, but we use
AES-128 because of concerns about related-key attacks on AES-256.  This
improves performance (and reduces entropy pool depletion) significantly
for users of /dev/urandom but does cause users of /dev/random to rekey
twice as often.

Also fixes various bugs (including some missing locking and a reseed-counter
overflow in the CTR_DRBG code) found while testing this.

For long reads, this generator is approximately 20 times as fast as the
old generator (dd with bs=64K yields 53MB/sec on 2Ghz Core2 instead of
2.5MB/sec) and also uses a separate mutex per instance so concurrency
is greatly improved.  For reads of typical key sizes for modern
cryptosystems (16-32 bytes) performance is about the same as the old
code: a little better for 32 bytes, a little worse for 16 bytes.
2011-12-17 20:05:38 +00:00
tls
3afd44cf08 First step of random number subsystem rework described in
<20111022023242.BA26F14A158@mail.netbsd.org>.  This change includes
the following:

	An initial cleanup and minor reorganization of the entropy pool
	code in sys/dev/rnd.c and sys/dev/rndpool.c.  Several bugs are
	fixed.  Some effort is made to accumulate entropy more quickly at
	boot time.

	A generic interface, "rndsink", is added, for stream generators to
	request that they be re-keyed with good quality entropy from the pool
	as soon as it is available.

	The arc4random()/arc4randbytes() implementation in libkern is
	adjusted to use the rndsink interface for rekeying, which helps
	address the problem of low-quality keys at boot time.

	An implementation of the FIPS 140-2 statistical tests for random
	number generator quality is provided (libkern/rngtest.c).  This
	is based on Greg Rose's implementation from Qualcomm.

	A new random stream generator, nist_ctr_drbg, is provided.  It is
	based on an implementation of the NIST SP800-90 CTR_DRBG by
	Henric Jungheim.  This generator users AES in a modified counter
	mode to generate a backtracking-resistant random stream.

	An abstraction layer, "cprng", is provided for in-kernel consumers
	of randomness.  The arc4random/arc4randbytes API is deprecated for
	in-kernel use.  It is replaced by "cprng_strong".  The current
	cprng_fast implementation wraps the existing arc4random
	implementation.  The current cprng_strong implementation wraps the
	new CTR_DRBG implementation.  Both interfaces are rekeyed from
	the entropy pool automatically at intervals justifiable from best
	current cryptographic practice.

	In some quick tests, cprng_fast() is about the same speed as
	the old arc4randbytes(), and cprng_strong() is about 20% faster
	than rnd_extract_data().  Performance is expected to improve.

	The AES code in src/crypto/rijndael is no longer an optional
	kernel component, as it is required by cprng_strong, which is
	not an optional kernel component.

	The entropy pool output is subjected to the rngtest tests at
	startup time; if it fails, the system will reboot.  There is
	approximately a 3/10000 chance of a false positive from these
	tests.  Entropy pool _input_ from hardware random numbers is
	subjected to the rngtest tests at attach time, as well as the
	FIPS continuous-output test, to detect bad or stuck hardware
	RNGs; if any are detected, they are detached, but the system
	continues to run.

	A problem with rndctl(8) is fixed -- datastructures with
	pointers in arrays are no longer passed to userspace (this
	was not a security problem, but rather a major issue for
	compat32).  A new kernel will require a new rndctl.

	The sysctl kern.arandom() and kern.urandom() nodes are hooked
	up to the new generators, but the /dev/*random pseudodevices
	are not, yet.

	Manual pages for the new kernel interfaces are forthcoming.
2011-11-19 22:51:18 +00:00
dyoung
53c8737e53 For these interfaces, the implementation of SIOCSIFDSTADDR is identical
to SIOCINITIFADDR, and SIOCSIFDSTADDR callers always fall back to
SIOCINITIFADDR, so just get rid of the SIOCSIFDSTADDR case.
2011-10-28 22:08:14 +00:00
rjs
66914c95f9 Add support for RFC 4638 to pppoe(4).
The change to if_spppsubr.c moves the test for whether LCP should
request a mru change until after the pppoe device has picked up the
mtu of the underlying ethernet device.
2011-09-05 12:19:09 +00:00
joerg
3eb244d801 Retire varargs.h support. Move machine/stdarg.h logic into MI
sys/stdarg.h and expect compiler to provide proper builtins, defaulting
to the GCC interface. lint still has a special fallback.
Reduce abuse of _BSD_VA_LIST_ by defining __va_list by default and
derive va_list as required by standards.
2011-07-17 20:54:30 +00:00
jmcneill
ce4300c675 COMPAT_50 support for SPPP[GS]ETIDLETO and SPPP[GS]ETKEEPALIVE, ok martin@ 2010-04-20 14:32:03 +00:00
snj
ccaf1e96be Fight the ever-increasing size of src checkouts by spelling "useful"
without an extra l.
2010-02-28 15:52:16 +00:00
tsutsui
d779b85d3e Remove extra whitespace added by a stupid tool.
XXX: more in src/sys/arch
2009-04-18 14:58:02 +00:00
cegger
e2cb85904d bcopy -> memcpy 2009-03-18 17:06:41 +00:00
martin
e6d2ca8863 Pass SIOCAIFADDR to ifioctl_common, fixes PR kern/39900. 2008-11-13 22:22:24 +00:00
dyoung
de87fe677d *** Summary ***
When a link-layer address changes (e.g., ifconfig ex0 link
02🇩🇪ad:be:ef:02 active), send a gratuitous ARP and/or a Neighbor
Advertisement to update the network-/link-layer address bindings
on our LAN peers.

Refuse a change of ethernet address to the address 00:00:00:00:00:00
or to any multicast/broadcast address.  (Thanks matt@.)

Reorder ifnet ioctl operations so that driver ioctls may inherit
the functions of their "class"---ether_ioctl(), fddi_ioctl(), et
cetera---and the class ioctls may inherit from the generic ioctl,
ifioctl_common(), but both driver- and class-ioctls may override
the generic behavior.  Make network drivers share more code.

Distinguish a "factory" link-layer address from others for the
purposes of both protecting that address from deletion and computing
EUI64.

Return consistent, appropriate error codes from network drivers.

Improve readability.  KNF.

*** Details ***

In if_attach(), always initialize the interface ioctl routine,
ifnet->if_ioctl, if the driver has not already initialized it.
Delete if_ioctl == NULL tests everywhere else, because it cannot
happen.

In the ioctl routines of network interfaces, inherit common ioctl
behaviors by calling either ifioctl_common() or whichever ioctl
routine is appropriate for the class of interface---e.g., ether_ioctl()
for ethernets.

Stop (ab)using SIOCSIFADDR and start to use SIOCINITIFADDR.  In
the user->kernel interface, SIOCSIFADDR's argument was an ifreq,
but on the protocol->ifnet interface, SIOCSIFADDR's argument was
an ifaddr.  That was confusing, and it would work against me as I
make it possible for a network interface to overload most ioctls.
On the protocol->ifnet interface, replace SIOCSIFADDR with
SIOCINITIFADDR.  In ifioctl(), return EPERM if userland tries to
invoke SIOCINITIFADDR.

In ifioctl(), give the interface the first shot at handling most
interface ioctls, and give the protocol the second shot, instead
of the other way around. Finally, let compatibility code (COMPAT_OSOCK)
take a shot.

Pull device initialization out of switch statements under
SIOCINITIFADDR.  For example, pull ..._init() out of any switch
statement that looks like this:

        switch (...->sa_family) {
        case ...:
                ..._init();
                ...
                break;
        ...
        default:
                ..._init();
                ...
                break;
        }

Rewrite many if-else clauses that handle all permutations of IFF_UP
and IFF_RUNNING to use a switch statement,

        switch (x & (IFF_UP|IFF_RUNNING)) {
        case 0:
                ...
                break;
        case IFF_RUNNING:
                ...
                break;
        case IFF_UP:
                ...
                break;
        case IFF_UP|IFF_RUNNING:
                ...
                break;
        }

unifdef lots of code containing #ifdef FreeBSD, #ifdef NetBSD, and
#ifdef SIOCSIFMTU, especially in fwip(4) and in ndis(4).

In ipw(4), remove an if_set_sadl() call that is out of place.

In nfe(4), reuse the jumbo MTU logic in ether_ioctl().

Let ethernets register a callback for setting h/w state such as
promiscuous mode and the multicast filter in accord with a change
in the if_flags: ether_set_ifflags_cb() registers a callback that
returns ENETRESET if the caller should reset the ethernet by calling
if_init(), 0 on success, != 0 on failure.  Pull common code from
ex(4), gem(4), nfe(4), sip(4), tlp(4), vge(4) into ether_ioctl(),
and register if_flags callbacks for those drivers.

Return ENOTTY instead of EINVAL for inappropriate ioctls.  In
zyd(4), use ENXIO instead of ENOTTY to indicate that the device is
not any longer attached.

Add to if_set_sadl() a boolean 'factory' argument that indicates
whether a link-layer address was assigned by the factory or some
other source.  In a comment, recommend using the factory address
for generating an EUI64, and update in6_get_hw_ifid() to prefer a
factory address to any other link-layer address.

Add a routing message, RTM_LLINFO_UPD, that tells protocols to
update the binding of network-layer addresses to link-layer addresses.
Implement this message in IPv4 and IPv6 by sending a gratuitous
ARP or a neighbor advertisement, respectively.  Generate RTM_LLINFO_UPD
messages on a change of an interface's link-layer address.

In ether_ioctl(), do not let SIOCALIFADDR set a link-layer address
that is broadcast/multicast or equal to 00:00:00:00:00:00.

Make ether_ioctl() call ifioctl_common() to handle ioctls that it
does not understand.

In gif(4), initialize if_softc and use it, instead of assuming that
the gif_softc and ifp overlap.

Let ifioctl_common() handle SIOCGIFADDR.

Sprinkle rtcache_invariants(), which checks on DIAGNOSTIC kernels
that certain invariants on a struct route are satisfied.

In agr(4), rewrite agr_ioctl_filter() to be a bit more explicit
about the ioctls that we do not allow on an agr(4) member interface.

bzero -> memset.  Delete unnecessary casts to void *.  Use
sockaddr_in_init() and sockaddr_in6_init().  Compare pointers with
NULL instead of "testing truth".  Replace some instances of (type
*)0 with NULL.  Change some K&R prototypes to ANSI C, and join
lines.
2008-11-07 00:20:01 +00:00
pooka
224186c110 Fix pointer size typo - affects only debug output.
Henning Petersen, PR lib/39689
2008-10-03 18:33:06 +00:00
martin
b0f7c9ec03 Backout previous/restore initial fix for PR kern/39280.
The later changes were only cosmetic, cause problems in IPv6-only-
connections (reported by Wolfgang Solfrank in private mail), as well
as reintroducing the original bug again.
2008-08-22 12:13:18 +00:00
christos
fd2004ecfd keep the loop, but arrange IDX_COUNT to be correct. 2008-08-04 12:03:14 +00:00
martin
2cfed21969 PR kern/39280: Uninitialized callout stopped in if_spppsubr layer
in kernels without options INET6.
2008-08-04 10:17:33 +00:00
gmcgarry
ccd9038096 ioctl commands are unsigned long. 2008-06-24 10:32:14 +00:00
matt
2b028087f5 s/u_\(int[0-9]*_t\)/u\1/g
(change u_int*_t to uint*_t)
2008-02-20 17:05:52 +00:00
dyoung
2ccede0a9c Start patching up the kernel so that a network driver always has
the opportunity to handle an ioctl before generic ifioctl handling
occurs.  This will ease extending the kernel and sharing of code
between drivers.

First steps:  Make the signature of ifioctl_common() match struct
ifinet->if_ioctl.  Convert SIOCSIFCAP and SIOCSIFMTU to the new
ifioctl() regime, throughout the kernel.
2008-02-07 01:21:52 +00:00
perry
b6a2ef7569 Convert many of the uses of __attribute__ to equivalent
__packed, __unused and __dead macros from cdefs.h
2007-12-25 18:33:32 +00:00
ad
88ab7da936 Merge some of the less invasive changes from the vmlocking branch:
- kthread, callout, devsw API changes
- select()/poll() improvements
- miscellaneous MT safety improvements
2007-07-09 20:51:58 +00:00
scw
2346befecc If the underlying link's MTU is less than PP_MTU (e.g. PPPoE), set our
MRU to the link's MTU and initiate an MRU negotiation with the peer.

This is useful when the PPP session is bridged from Ethernet to ATM
by an ADSL modem (such as the Linksys AM200). Unless we negotiate the
lower MRU, the peer is unaware that 1500-byte packets will not make
it umolested across the link (the Linksys AM200 silently truncates them
to 1498 bytes, creating a nice PMTU blackhole).

Note that the PPP RFC says peers MUST accept 1500 byte packets,
regardless of the negotiated MRU, so most ISPs which use PPPoA will
probably still send 1500-byte packets. However, I persuaded my ISP
(Andrews and Arnold) to modify their software to generate an ICMP error
"fragment needed" for packets with IP.DF set which are larger than the
negotiated MRU. They will still forward non-IP.DF packets, with the
associated truncation, but at least my PMTU troubles have gone.
2007-06-23 08:45:25 +00:00
christos
53524e44ef Kill caddr_t; there will be some MI fallout, but it will be fixed shortly. 2007-03-04 05:59:00 +00:00
dyoung
5493f188c7 KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
   in6_src.c, avoid casts by changing several route_in6 pointers
   to struct route pointers.  Remove unnecessary casts to caddr_t
   elsewhere.

Pave the way for eliminating address family-specific route caches:
   soon, struct route will not embed a sockaddr, but it will hold
   a reference to an external sockaddr, instead.  We will set the
   destination sockaddr using rtcache_setdst().  (I created a stub
   for it, but it isn't used anywhere, yet.)  rtcache_free() will
   free the sockaddr.  I have extracted from rtcache_free() a helper
   subroutine, rtcache_clear().  rtcache_clear() will "forget" a
   cached route, but it will not forget the destination by releasing
   the sockaddr.  I use rtcache_clear() instead of rtcache_free()
   in rtcache_update(), because rtcache_update() is not supposed
   to forget the destination.

Constify:

   1 Introduce const accessor for route->ro_dst, rtcache_getdst().

   2 Constify the 'dst' argument to ifnet->if_output().  This
     led me to constify a lot of code called by output routines.

   3 Constify the sockaddr argument to protosw->pr_ctlinput.  This
     led me to constify a lot of code called by ctlinput routines.

   4 Introduce const macros for converting from a generic sockaddr
     to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
     satocsin, et cetera.
2007-02-17 22:34:07 +00:00
wiz
fa34b615d2 Correct spelling of "immediate(ly)". From Zafer. 2006-11-24 21:23:07 +00:00
christos
168cd830d2 __unused removal on arguments; approved by core. 2006-11-16 01:32:37 +00:00
elad
74482de29f Kill a couple of KAUTH_GENERIC_ISSUSER usages.
I had to refactor the code a bit, I hope it's okay.
2006-10-26 15:11:22 +00:00
dogcow
2023789a40 More -Wunused fallout. sprinkle __unused when possible; otherwise, use the
do { if (&x) {} } while (/* CONSTCOND */ 0);
construct as suggested by uwe in <20061012224845.GA9449@snark.ptc.spbu.ru>.
2006-10-13 16:53:35 +00:00
christos
4d595fd7b1 - sprinkle __unused on function decls.
- fix a couple of unused bugs
- no more -Wno-unused for i386
2006-10-12 01:30:41 +00:00
dogcow
f2d329dca0 remove more vestiges of CCITT, LLC, HDLC, NS, and NSIP. 2006-09-07 02:40:31 +00:00
adrianp
3d8cbc06ba A problem has been identified in the in-kernel PPP code shared by ISDN PPP
interfaces ippp(4) and pppoe(4). Insufficient checking of options presented
by the peer may cause writing of copies of the malicious input beyond the
end of a buffer allocated for that purpose.

Issue found by pavel@
Fix from martin@

This is SA2006-019 (CVE-2006-4304)
2006-08-23 20:02:23 +00:00
ad
f474dceb13 Use the LWP cached credentials where sane. 2006-07-23 22:06:03 +00:00
martin
dee43775e6 Small simplification, pointed out by Christian Hattemer in private mail. 2006-07-13 23:43:13 +00:00
martin
81b2f47532 Do not automagically UP the interface when setting the address.
Together with previous ifconfig changes, this fixes PR 30694, at
least for pppoe (and other sppp based) interfaces.
2006-07-13 14:04:50 +00:00
kardel
de4337ab21 merge FreeBSD timecounters from branch simonb-timecounters
- struct timeval time is gone
  time.tv_sec -> time_second
- struct timeval mono_time is gone
  mono_time.tv_sec -> time_uptime
- access to time via
	{get,}{micro,nano,bin}time()
	get* versions are fast but less precise
- support NTP nanokernel implementation (NTP API 4)
- further reading:
  Timecounter Paper: http://phk.freebsd.dk/pubs/timecounter.pdf
  NTP Nanokernel: http://www.eecis.udel.edu/~mills/ntp/html/kern.html
2006-06-07 22:33:33 +00:00
christos
c52ff7f9d5 Fixes from David Boggs; in his words:
/sys/net/if_spppvar.h says:

	"Lower layer drivers that are always ready to communicate
	(like hardware HDLC) can shortcut pp_up from pp_tls,
	and pp_down from pp_tlf."

	When I follow those instructions, I get a kernel stack
	overflow as soon as I open the HDLC device.

	Here is the loop:
	 sppp_ioctl calls sppp_lcp_open
	 sppp_lcp_open calls sppp_open_event
	 sppp_open_event calls sppp_lcp_tls
	 sppp_lcp_tls calls pp_tls
	 pp_tls is the SHORTCUT to sppp_lcp_up
	 sppp_lcp_up calls spp_lcp_open
	 ...and around we go until the stack overflows.

	The fix is to reverse the order of the action (tls)
	and the state change (from INITIAL to STARTING) in
	sppp_open_event.

	There is a similar loop during closing:
	 sppp_ioctl calls sppp_lcp_close
	 sppp_lcp_close calls sppp_close_event
	 spp_close_event calls sppp_lcp_tlf
	 sppp_lcp_tlf calls pp_tlf
	 pp_tlf is the SHORTCUT to sppp_lcp_down
	 sppp_lcp_down calls sppp_lcp_close
	 ...and around we go until the stack overflows.

	The fix is to reverse the order of the action (tlf)
	and the state change (from STARTING to INITIAL) in
	sppp_close_event.

	Separately, while I was discovering this, I noticed
	that pp_tlf was being called unconditionally rather
	than first checking to see if it is NULL.  pp_tlf
	is a callout from sppp to the hdlc device driver.
	Elsewhere in sppp, this is always checked for NULL
	before calling it, and the comments in if_spppvar.h
	imply that filling it in is optional.

	From spppvar.h:
	"These functions need to be filled in by the lower layer
	(hardware) drivers if they request notification from the
	PPP layer whether the link is actually required."
	This clearly says that pp_tlf and pp_tls are optional
	and so sppp must check before calling them.
2006-05-21 05:09:13 +00:00
elad
874fef3711 integrate kauth. 2006-05-14 21:19:33 +00:00
christos
103d2f520c XXX: GCC uninitialized. 2006-05-14 05:30:31 +00:00
christos
667e91e30f Add an empty attach function. Reported by David Boggs 2006-04-20 17:03:35 +00:00
rpaulo
78678b130a Better support of IPv6 scoped addresses.
- most of the kernel code will not care about the actual encoding of
  scope zone IDs and won't touch "s6_addr16[1]" directly.
- similarly, most of the kernel code will not care about link-local
  scoped addresses as a special case.
- scope boundary check will be stricter.  For example, the current
  *BSD code allows a packet with src=::1 and dst=(some global IPv6
  address) to be sent outside of the node, if the application do:
    s = socket(AF_INET6);
    bind(s, "::1");
    sendto(s, some_global_IPv6_addr);
  This is clearly wrong, since ::1 is only meaningful within a single
  node, but the current implementation of the *BSD kernel cannot
  reject this attempt.
- and, while there, don't try to remove the ff02::/32 interface route
  entry in in6_ifdetach() as it's already gone.

This also includes some level of support for the standard source
address selection algorithm defined in RFC3484, which will be
completed on in the future.

From the KAME project via JINMEI Tatuya.
Approved by core@.
2006-01-21 00:15:35 +00:00
christos
95e1ffb156 merge ktrace-lwp. 2005-12-11 12:16:03 +00:00
christos
333e176687 - sprinkle const
- remove unneeded casts
- use more mem*() instead of b*() funcs.
2005-05-29 21:22:52 +00:00
martin
3cde73812d Fix typo, from C. Plasschaert in PR kern/30069. 2005-04-27 07:48:02 +00:00
christos
d7ec95d370 factor out the interface queueing code into two functions. One used by
the non point-to-point interfaces that has one queue, and one used by
the point to point interfaces that has two queues. No functional changes.
XXX: The ALTQ stuff makes the code ugly.
XXX: More cleanup to come
2005-03-31 15:48:13 +00:00
perry
f07677dd81 nuke trailing whitespace 2005-02-26 22:45:09 +00:00