Commit Graph

101 Commits

Author SHA1 Message Date
ozaki-r 9e214c7fd5 inet6: reduce silent packet discards 2020-08-28 06:32:24 +00:00
ozaki-r 4c639cc739 inet6: pass rcvif to ip6_forward to avoid extra psref_acquire 2020-08-28 06:28:58 +00:00
ozaki-r c1e00d7df1 inet, inet6: count packets dropped by IPsec
The counters count packets dropped due to security policy checks.
2020-08-28 06:19:13 +00:00
roy b05648aa26 Remove in-kernel handling of Router Advertisements
This is much better handled by a user-land tool.
Proposed on tech-net here:
https://mail-index.netbsd.org/tech-net/2020/04/22/msg007766.html

Note that the ioctl SIOCGIFINFO_IN6 no longer sets flags. That now
needs to be done using the pre-existing SIOCSIFINFO_FLAGS ioctl.

Compat is fully provided where it makes sense, but trying to turn on
RA handling will obviously throw an error as it no longer exists.

Note that if you use IPv6 temporary addresses, this now needs to be
turned on in dhcpcd.conf(5) rather than in sysctl.conf(5).
2020-06-12 11:04:44 +00:00
knakahara c535599f70 Fix ipsecif(4): IPV6_MINMTU does not work correctly. 2019-11-01 04:23:21 +00:00
ozaki-r e524fb36a1 Avoid having a rtcache directly in a percpu storage
percpu(9) keeps a memory storage area for each CPU and hands it out to users
piece by piece.  If the storage runs short, percpu(9) enlarges it by allocating
new, larger memory areas, replacing the old ones with them, and destroying the
old ones.
A percpu storage referenced by a pointer gotten via percpu_getref can be
destroyed by the mechanism after a running thread sleeps even if percpu_putref
has not been called.

Using rtcache, i.e., packet processing, typically involves sleepable operations
such as rwlock so we must avoid dereferencing a rtcache that is directly stored
in a percpu storage during packet processing.  Address this situation by having
just a pointer to a rtcache in a percpu storage instead.

Reviewed by knakahara@ and yamaguchi@
2019-09-19 04:08:29 +00:00
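A minimal userland sketch of the indirection described in the commit above; it is not the kernel code. The names rtcache_sketch, cpu_area, area_init, area_grow, area_set_cache and area_get_cache are hypothetical, and malloc-based storage stands in for percpu(9). It shows why a pointer kept in the resizable area stays usable after the area is replaced, while the object it points to never moves.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a route cache; not the kernel's struct route. */
struct rtcache_sketch {
	int valid;
};

/* Toy model of one CPU's storage area, which is replaced when it grows. */
struct cpu_area {
	void *base;
	size_t size;
};

/* Allocate the initial zeroed area; returns 0 on success, -1 on failure. */
int
area_init(struct cpu_area *a, size_t size)
{
	a->base = calloc(1, size);
	if (a->base == NULL)
		return -1;
	a->size = size;
	return 0;
}

/*
 * Enlarge the area: allocate a new, larger area, copy the contents over and
 * destroy the old one.  Any pointer INTO the old area dangles afterwards.
 */
int
area_grow(struct cpu_area *a, size_t newsize)
{
	void *n;

	if (newsize < a->size)
		return -1;
	n = calloc(1, newsize);
	if (n == NULL)
		return -1;
	memcpy(n, a->base, a->size);
	free(a->base);
	a->base = n;
	a->size = newsize;
	return 0;
}

/* The area holds only a pointer at offset 0; the cache lives on the heap. */
void
area_set_cache(struct cpu_area *a, struct rtcache_sketch *rc)
{
	assert(a->size >= sizeof(rc));
	memcpy(a->base, &rc, sizeof(rc));
}

/* Read the cache pointer back out of the (possibly replaced) area. */
struct rtcache_sketch *
area_get_cache(const struct cpu_area *a)
{
	struct rtcache_sketch *rc;

	assert(a->size >= sizeof(rc));
	memcpy(&rc, a->base, sizeof(rc));
	return rc;
}
```

A caller that fetched the cache pointer before area_grow() can keep using it afterwards, because only the slot holding the pointer was reallocated.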
ozaki-r 6d8eb4f9d2 Count packets dropped by pfil 2019-05-13 07:47:59 +00:00
maxv 86ac125b49 Remove now unused net_osdep.h includes, the other BSDs did the same. 2018-05-01 07:21:39 +00:00
maxv eee3723d53 Stop using m_copy(), use m_copym() directly. m_copy is useless,
undocumented and confusing.
2018-04-26 19:50:09 +00:00
maxv e62bbe6865 Remove unused netipsec/xform.h includes. 2018-04-18 07:17:49 +00:00
maxv 90dd9967f8 style 2018-01-29 08:17:18 +00:00
maxv a7c056383d Fix two pretty bad mistakes. If ipsec6_check_policy fails m is not freed,
and a 'goto out' is missing after ipsec6_process_packet.
2018-01-29 08:14:54 +00:00
ozaki-r 8c09e9f90b Fix use-after-free of mbuf by ip6flow_create (one more)
XXX need pullup-[678]
2018-01-09 04:41:19 +00:00
ozaki-r a29d76a139 Fix use-after-free of mbuf by ip6flow_create
This fixes recent failures of some ATF tests such as t_ipsec_tunnel_odd.

XXX need pullup-[678]
2018-01-09 04:21:26 +00:00
ozaki-r 0c084e85e9 Make IPsec SPD MP-safe
We use localcount(9), not psref(9), to make the sptree and secpolicy (SP)
entries MP-safe because SPs need to be referenced over opencrypto
processing that executes a callback in a different context.

SPs on sockets aren't managed by the sptree and can be destroyed in softint.
localcount_drain cannot be used in softint so we delay the destruction of
such SPs to a thread context. To do so, a list to manage such SPs is added
(key_socksplist) and key_timehandler_spd deletes dead SPs in the list.

For more details please read the locking notes in key.c.

Proposed on tech-kern@ and tech-net@
2017-08-02 01:28:02 +00:00
ozaki-r 808b116a48 Add missing KEY_FREESP to ip6_forward 2017-05-09 04:24:10 +00:00
ozaki-r 3f909d1769 Do ND in L2_output in the same manner as arpresolve
The benefits of this change are:
- The flow is consistent with IPv4 (and FreeBSD and OpenBSD)
  - old: ip6_output => nd6_output (do ND if needed) => L2_output (lookup a stored cache)
  - new: ip6_output => L2_output (lookup a cache. Do ND if cache not found)
- We can remove some workarounds in nd6_output
- We can move L2 specific operations to their own place
- The performance slightly improves because one cache lookup is reduced
2017-02-14 03:05:06 +00:00
christos 35561f6b22 ip6_sprintf -> IN6_PRINT so that we pass the size. 2017-01-16 15:44:46 +00:00
ryo 28f4c24cc2 Make ip6_sprintf(), in_fmtaddr(), lla_snprintf() and icmp6_redirect_diag() mpsafe.
Reviewed by ozaki-r@
2017-01-16 07:33:36 +00:00
ozaki-r 2b82ef9b8f Get rid of unnecessary header inclusions 2017-01-11 13:08:29 +00:00
ozaki-r 4c25fb2f83 Add rtcache_unref to release points of rtentry stemming from rtcache
In the MP-safe world, a rtentry stemming from a rtcache can be freed at any
point. So we need to protect rtentries somehow, say by reference counting or
passive references. Regardless of the method, we need to call some release
function of a rtentry after using it.

The change adds a new function rtcache_unref to release a rtentry. At this
point, this function does nothing because for now we don't add a reference
to a rtentry when we get one from a rtcache. We will add something useful
in a further commit.

This change is part of the changes for an MP-safe routing table. It is separated
to avoid one big change that is difficult to debug by bisecting.
2016-12-08 05:16:33 +00:00
ozaki-r 543e39c0d3 Make ipforward_rt and ip6_forward_rt percpu
Sharing one rtcache between CPUs is just a bad idea.

Reviewed by knakahara@
2016-08-31 09:14:47 +00:00
ozaki-r ca4ea29d93 Add missing NULL checks for m_get_rcvif_psref 2016-06-28 02:02:56 +00:00
ozaki-r fe6d427551 Avoid storing a pointer of an interface in a mbuf
Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored in a mbuf instead of a
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, which is NOT MP-safe and exists only for the
transition period, i.e., it is intended to be used in places that are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.
2016-06-10 13:31:43 +00:00
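A minimal userland sketch of the idea in the commit above: the mbuf carries an interface index, and every use goes through a table lookup that can fail. All names here (ifnet_sketch, mbuf_sketch, ifindex_table, if_attach_sketch, if_detach_sketch, mbuf_get_rcvif_sketch, MAX_IFINDEX) are hypothetical, and the psref(9)/pserialize(9) protection that the real m_{get,put}_rcvif* APIs provide is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_IFINDEX 8	/* hypothetical table size for this sketch */

/* Hypothetical stand-ins; not the kernel's struct ifnet or struct mbuf. */
struct ifnet_sketch {
	unsigned int if_index;
	const char *name;
};

struct mbuf_sketch {
	unsigned int rcvif_index;	/* an index, not a pointer */
};

/* Index-to-interface collection; index 0 is reserved as "no interface". */
struct ifnet_sketch *ifindex_table[MAX_IFINDEX];

/* Register an interface under its index; returns 0 or -1 on a bad index. */
int
if_attach_sketch(struct ifnet_sketch *ifp)
{
	assert(ifp != NULL);
	if (ifp->if_index == 0 || ifp->if_index >= MAX_IFINDEX)
		return -1;
	ifindex_table[ifp->if_index] = ifp;
	return 0;
}

/* Remove an interface; later lookups by its index return NULL. */
void
if_detach_sketch(struct ifnet_sketch *ifp)
{
	assert(ifp != NULL);
	if (ifp->if_index < MAX_IFINDEX &&
	    ifindex_table[ifp->if_index] == ifp)
		ifindex_table[ifp->if_index] = NULL;
}

/* Resolve the receiving interface; callers must handle a NULL result. */
struct ifnet_sketch *
mbuf_get_rcvif_sketch(const struct mbuf_sketch *m)
{
	if (m->rcvif_index >= MAX_IFINDEX)
		return NULL;
	return ifindex_table[m->rcvif_index];
}
```

Because the lookup can return NULL once the interface is gone, every caller needs a NULL check, which is the kind of check the follow-up commit ca4ea29d93 adds.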
pooka 1c4a50f192 sprinkle _KERNEL_OPT 2015-08-24 22:21:26 +00:00
ozaki-r 55140c1926 Use time_uptime instead of time_second to avoid time leaps
Some code in sys/net* uses time_second to manage time periods such as
cache expirations. However, time_second doesn't increase monotonically
and can leap, say by settimeofday(2), according to time_second(9). We
should use time_uptime instead to avoid such time leaps.

This change replaces time_second with time_uptime. Additionally it
converts a time based on time_uptime to a time based on time_second
when the kernel passes the time to userland programs that expect
the latter, and vice versa.

Note that we shouldn't leak time_uptime to other hosts over the
network. My investigation shows there is no such leak:
http://mail-index.netbsd.org/tech-net/2015/08/06/msg005332.html

Discussed on tech-kern and tech-net.
2015-08-07 08:11:33 +00:00
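A minimal sketch of the conversion the commit above describes, assuming the offset between the two clocks is sampled at the moment of conversion. The type secs_t and the functions wall_offset, uptime_to_wall, wall_to_uptime and cache_expired are hypothetical names for illustration; the kernel's actual helpers may differ.

```c
#include <assert.h>

typedef long long secs_t;	/* seconds; stand-in for time_t in this sketch */

/* Offset between the wall clock and the uptime clock, sampled right now. */
secs_t
wall_offset(secs_t now_uptime, secs_t now_wall)
{
	assert(now_uptime >= 0);	/* uptime counts up from boot */
	return now_wall - now_uptime;
}

/* Kernel to userland: express an uptime-based stamp as wall-clock time. */
secs_t
uptime_to_wall(secs_t stamp, secs_t now_uptime, secs_t now_wall)
{
	return stamp + wall_offset(now_uptime, now_wall);
}

/* Userland to kernel: express a wall-clock stamp on the uptime clock. */
secs_t
wall_to_uptime(secs_t stamp, secs_t now_uptime, secs_t now_wall)
{
	return stamp - wall_offset(now_uptime, now_wall);
}

/* Expiry is decided on the monotonic clock only, so clock steps are moot. */
int
cache_expired(secs_t expire_uptime, secs_t now_uptime)
{
	return now_uptime >= expire_uptime;
}
```

With this split, stepping the wall clock changes only what userland is shown; the expiry decision itself never sees the jump.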
christos e0b4678125 call vsnprintf instead of snprintf; provide more detail 2014-12-10 01:10:14 +00:00
christos cb7e0235f1 Merge some common code in the failed forwarding case, while providing better
diagnostics, and fixing leaks.
2014-12-08 00:19:37 +00:00
maxv 833172a8e0 Do not uselessly include <sys/malloc.h>. 2014-11-14 17:34:23 +00:00
christos 5d61e6c015 Introduce 2 new variables: ipsec_enabled and ipsec_used.
ipsec_enabled is controlled by sysctl and determines if IPsec is allowed.
ipsec_used is set automatically based on ipsec being enabled, and
rules existing.
2014-05-30 01:39:03 +00:00
rmind f04a92b1d6 - Rewrite parts of pfil(9): use array to store hooks and thus be more cache
  friendly (there are only a few hooks in the system).  Make the structures
  opaque and the interface more strict.
- Remove PFIL_HOOKS option by making pfil(9) mandatory.
2013-06-29 21:06:57 +00:00
christos 27fe772ddc IPSEC has not come in two speeds for a long time now (IPSEC == kame,
FAST_IPSEC). Make everything refer to IPSEC to avoid confusion.
2013-06-05 19:01:26 +00:00
drochner 364a06bb29 remove KAME IPSEC, replaced by FAST_IPSEC 2012-03-22 20:34:37 +00:00
drochner 23e5beaef1 rename the IPSEC in-kernel CPP variable and config(8) option to
KAME_IPSEC, and make IPSEC define it so that existing kernel
config files work as before
Now the default can easily be changed to FAST_IPSEC just by
setting the IPSEC alias to FAST_IPSEC.
2011-12-19 11:59:56 +00:00
joerg 3d7916e198 Explicitly include opt_gateway.h when depending on GATEWAY. 2010-02-04 21:48:11 +00:00
joerg 1a57a79dcb Clear cksum flags before any further processing like ip_forward does.
Many drivers set the UDP/TCP v4 flags even for v6 traffic and if the
packet is encapsulated with gif, the IPv6 header would get corrupted by
ip_output. Patch suggested by bad@
2009-11-11 22:19:22 +00:00
cegger c363a9cb62 bzero -> memset 2009-03-18 16:00:08 +00:00
thorpej caf49ea572 Make IPSEC and FAST_IPSEC stats per-cpu. Use <net/net_stats.h> and
netstat_sysctl().
2008-04-23 06:09:04 +00:00
thorpej 0dd41b37de Make ip6 and icmp6 stats per-cpu. 2008-04-15 03:57:04 +00:00
thorpej 3f466bce48 Change IPv6 stats from a structure to an array of uint64_t's.
Note: This is ABI-compatible with the old ip6stat structure; old netstat
binaries will continue to work properly.
2008-04-08 23:37:43 +00:00
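A minimal sketch of why an array of uint64_t counters can stay ABI-compatible with an older all-uint64_t structure, as the commit above notes. The structure old_stats_sketch, the STAT_* indices and the functions stat_inc and stats_export are hypothetical and far smaller than the real ip6stat; per-CPU storage is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical old-style layout: nothing but uint64_t members. */
struct old_stats_sketch {
	uint64_t s_total;
	uint64_t s_forward;
	uint64_t s_cantforward;
};

/* New-style indices, listed in the same order as the old members. */
enum {
	STAT_TOTAL = 0,
	STAT_FORWARD = 1,
	STAT_CANTFORWARD = 2,
	STAT_NSTATS = 3
};

/* Compile-time checks that the array and the structure line up. */
static_assert(sizeof(struct old_stats_sketch) ==
    STAT_NSTATS * sizeof(uint64_t), "size must match");
static_assert(offsetof(struct old_stats_sketch, s_forward) ==
    STAT_FORWARD * sizeof(uint64_t), "s_forward offset must match");
static_assert(offsetof(struct old_stats_sketch, s_cantforward) ==
    STAT_CANTFORWARD * sizeof(uint64_t), "s_cantforward offset must match");

uint64_t stats_sketch[STAT_NSTATS];

/* Bump one counter by index. */
void
stat_inc(unsigned int idx)
{
	assert(idx < STAT_NSTATS);
	stats_sketch[idx]++;
}

/* Hand the counters to a consumer that still expects the old structure. */
void
stats_export(struct old_stats_sketch *out)
{
	memcpy(out, stats_sketch, sizeof(*out));
}
```

An old consumer reading the exported bytes through the structure sees the same values at the same offsets, which is the property that lets old binaries keep working.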
dyoung 19dd9ed4a7 Use rtcache_validate() instead of rtcache_getrt(). Shorten staircase
in in6_losing().
2008-01-14 04:16:45 +00:00
dyoung 1386ee4adf Good-bye, rtcache_check(). Call both rtcache_validate() and
rtcache_update(,1) instead of rtcache_check().
2008-01-12 02:58:58 +00:00
dyoung 45485bd0b7 Save some rtcache_getrt() calls. 2008-01-10 08:06:11 +00:00
dyoung 72fa642a86 Poison struct route->ro_rt uses in the kernel by changing the name
to _ro_rt.  Use rtcache_getrt() to access a route cache's struct
rtentry *.

Introduce struct ifnet->if_dl that always points at the interface
identifier/link-layer address.  Make code that treated the first
ifaddr on struct ifnet->if_addrlist as the interface address use
if_dl, instead.

Remove stale debugging code from net/route.c.  Move the rtflush()
code into rtcache_clear() and delete rtflush().  Delete rtalloc(),
because nothing uses it any more.

Make ND6_HINT an inline, lowercase subroutine, nd6_hint.

I've done my best to convert IP Filter, the ISO stack, and the
AppleTalk stack to rtcache_getrt().  They compile, but I have not
tested them.  I have given the changes to PF, GRE, IPv4 and IPv6
stacks a lot of exercise.
2007-12-20 19:53:29 +00:00
christos 72cfe7327b Ansify + add a few comments, from Karl Sjödahl 2007-05-23 17:14:59 +00:00
dyoung 72f0a6dfb0 Eliminate address family-specific route caches (struct route, struct
route_in6, struct route_iso), replacing all caches with a struct
route.

The principal benefit of this change is that all of the protocol
families can benefit from route cache-invalidation, which is
necessary for correct routing.  Route-cache invalidation fixes an
ancient PR, kern/3508, at long last; it fixes various other PRs,
also.

Discussions with and ideas from Joerg Sonnenberger influenced this
work tremendously.  Of course, all design oversights and bugs are
mine.

DETAILS

1 I added to each address family a pool of sockaddrs.  I have
  introduced routines for allocating, copying, duplicating, and
  freeing sockaddrs:

        struct sockaddr *sockaddr_alloc(sa_family_t af, int flags);
        struct sockaddr *sockaddr_copy(struct sockaddr *dst,
                                       const struct sockaddr *src);
        struct sockaddr *sockaddr_dup(const struct sockaddr *src, int flags);
        void sockaddr_free(struct sockaddr *sa);

  sockaddr_alloc() returns either a sockaddr from the pool belonging
  to the specified family, or NULL if the pool is exhausted.  The
  returned sockaddr has the right size for that family; sa_family
  and sa_len fields are initialized to the family and sockaddr
  length---e.g., sa_family = AF_INET and sa_len = sizeof(struct
  sockaddr_in).  sockaddr_free() puts the given sockaddr back into
  its family's pool.

  sockaddr_dup() and sockaddr_copy() work analogously to strdup()
  and strcpy(), respectively.  sockaddr_copy() KASSERTs that the
  family of the destination and source sockaddrs are alike.

  The 'flags' argument for sockaddr_alloc() and sockaddr_dup() is
  passed directly to pool_get(9).

2 I added routines for initializing sockaddrs in each address
  family, sockaddr_in_init(), sockaddr_in6_init(), sockaddr_iso_init(),
  etc.  They are fairly self-explanatory.

3 structs route_in6 and route_iso are no more.  All protocol families
  use struct route.  I have changed the route cache, 'struct route',
  so that it does not contain storage space for a sockaddr.  Instead,
  struct route points to a sockaddr coming from the pool the sockaddr
  belongs to.  I added a new method to struct route, rtcache_setdst(),
  for setting the cache destination:

        int rtcache_setdst(struct route *, const struct sockaddr *);

  rtcache_setdst() returns 0 on success, or ENOMEM if no memory is
  available to create the sockaddr storage.

  It is now possible for rtcache_getdst() to return NULL if, say,
  rtcache_setdst() failed.  I check the return value for NULL
  everywhere in the kernel.

4 Each routing domain (struct domain) has a list of live route
  caches, dom_rtcache.  rtflushall(sa_family_t af) looks up the
  domain indicated by 'af', walks the domain's list of route caches
  and invalidates each one.
2007-05-02 20:40:22 +00:00
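A minimal userland sketch of the allocate/copy/duplicate/free behaviour described in item 1 of the commit above. It is an illustration under assumptions, not the kernel routines: the types sa_sketch, sa_in_sketch and sa_in6_sketch, the AF_SK_* constants and the sa_* functions are hypothetical, malloc replaces the per-family pools, and the 'flags' argument is omitted.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum { AF_SK_INET = 1, AF_SK_INET6 = 2 };	/* hypothetical family ids */

/* Hypothetical generic header shared by every family-specific address. */
struct sa_sketch {
	unsigned char sa_len;
	unsigned char sa_family;
};

struct sa_in_sketch {
	struct sa_sketch hdr;
	unsigned short port;
	unsigned char addr[4];
};

struct sa_in6_sketch {
	struct sa_sketch hdr;
	unsigned short port;
	unsigned char addr[16];
};

/* Size of the family-specific address, or 0 for an unknown family. */
size_t
sa_family_len(unsigned char af)
{
	switch (af) {
	case AF_SK_INET:
		return sizeof(struct sa_in_sketch);
	case AF_SK_INET6:
		return sizeof(struct sa_in6_sketch);
	default:
		return 0;
	}
}

/* Return a zeroed address with sa_len and sa_family set, or NULL. */
struct sa_sketch *
sa_alloc(unsigned char af)
{
	size_t len = sa_family_len(af);
	struct sa_sketch *sa;

	if (len == 0 || (sa = calloc(1, len)) == NULL)
		return NULL;
	sa->sa_len = (unsigned char)len;
	sa->sa_family = af;
	return sa;
}

/* Like strcpy(): copy src over dst; both must be of the same family. */
struct sa_sketch *
sa_copy(struct sa_sketch *dst, const struct sa_sketch *src)
{
	assert(dst->sa_family == src->sa_family);
	memcpy(dst, src, src->sa_len);
	return dst;
}

/* Like strdup(): allocate a fresh address and copy src into it. */
struct sa_sketch *
sa_dup(const struct sa_sketch *src)
{
	struct sa_sketch *dst = sa_alloc(src->sa_family);

	return dst == NULL ? NULL : sa_copy(dst, src);
}

/* Give the address back; a pool-backed version would return it to its pool. */
void
sa_free(struct sa_sketch *sa)
{
	free(sa);
}
```

The family check in sa_copy() mirrors the KASSERT the commit message mentions; an allocation failure surfaces as NULL, which is why callers of the cache-destination routines must check for it.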
liamjfoy 8aa640dadd Add IPv6 Fast Forward - the IPv4 counterpart:
If ip6_forward successfully forwards a packet, a cache, in this case a
ip6flow struct entry, will be created. ether_input and friends will
then be able to call ip6flow_fastforward with the packet which will then
be passed to if_output (unless an issue is found - in that case the packet
is passed back to ip6_input).

ok matt@ christos@ dyoung@ and joerg@
2007-03-07 22:20:04 +00:00
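A minimal sketch of the fast-forward idea in the commit above: the slow path records a flow after a successful forward, and the fast path uses that record or hands the packet back. The flow_entry layout, the one-entry-per-bucket table, and the names flow_hash, flow_create and flow_fastforward are assumptions made for illustration; the real ip6flow code keys on more than the destination and manages entry lifetimes.

```c
#include <assert.h>

#define FLOW_TABLE_SIZE 16	/* hypothetical size for this sketch */

/* Hypothetical cached flow: destination key and chosen output interface. */
struct flow_entry {
	unsigned int dst;
	int out_if;
	int in_use;
};

struct flow_entry flow_table[FLOW_TABLE_SIZE];

/* Trivial bucket selection; a real implementation would hash properly. */
unsigned int
flow_hash(unsigned int dst)
{
	return dst % FLOW_TABLE_SIZE;
}

/* Slow path: called after a packet was forwarded successfully. */
void
flow_create(unsigned int dst, int out_if)
{
	struct flow_entry *e = &flow_table[flow_hash(dst)];

	assert(out_if >= 0);
	e->dst = dst;
	e->out_if = out_if;
	e->in_use = 1;
}

/*
 * Fast path: return the cached output interface, or -1 to tell the caller
 * to pass the packet back to the normal input path.
 */
int
flow_fastforward(unsigned int dst)
{
	const struct flow_entry *e = &flow_table[flow_hash(dst)];

	if (e->in_use && e->dst == dst)
		return e->out_if;
	return -1;
}
```

A miss is not an error: it simply means the packet takes the ordinary path, which may then create the flow for later packets.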
dyoung 5493f188c7 KNF: de-__P, bzero -> memset, bcmp -> memcmp. Remove extraneous
parentheses in return statements.

Cosmetic: don't open-code TAILQ_FOREACH().

Cosmetic: change types of variables to avoid oodles of casts: in
   in6_src.c, avoid casts by changing several route_in6 pointers
   to struct route pointers.  Remove unnecessary casts to caddr_t
   elsewhere.

Pave the way for eliminating address family-specific route caches:
   soon, struct route will not embed a sockaddr, but it will hold
   a reference to an external sockaddr, instead.  We will set the
   destination sockaddr using rtcache_setdst().  (I created a stub
   for it, but it isn't used anywhere, yet.)  rtcache_free() will
   free the sockaddr.  I have extracted from rtcache_free() a helper
   subroutine, rtcache_clear().  rtcache_clear() will "forget" a
   cached route, but it will not forget the destination by releasing
   the sockaddr.  I use rtcache_clear() instead of rtcache_free()
   in rtcache_update(), because rtcache_update() is not supposed
   to forget the destination.

Constify:

   1 Introduce const accessor for route->ro_dst, rtcache_getdst().

   2 Constify the 'dst' argument to ifnet->if_output().  This
     led me to constify a lot of code called by output routines.

   3 Constify the sockaddr argument to protosw->pr_ctlinput.  This
     led me to constify a lot of code called by ctlinput routines.

   4 Introduce const macros for converting from a generic sockaddr
     to family-specific sockaddrs, e.g., sockaddr_in: satocsin6,
     satocsin, et cetera.
2007-02-17 22:34:07 +00:00
degroote e2211411a4 Commit my SoC work
Add ipv6 support for fast_ipsec
Note that currently, packets with extension headers are not correctly
supported
Change the ipcomp logic
2007-02-10 09:43:05 +00:00
dyoung 2539c85ea4 bzero -> memset 2007-01-26 19:20:15 +00:00