Commit Graph

40 Commits

Author SHA1 Message Date
ozaki-r 2c87fbb8b4 Shorten the name of a workqueue instance to fit to the limit (15) 2018-02-06 03:37:00 +00:00
maxv 6163bade20 Style, and use __cacheline_aligned.
By the way, it would be nice to revisit the use of 'ip6flow_lock' in
ip6flow_fastforward(): it is taken right away because of 'ip6flow_inuse',
but then we perform several checks that do not require it.
2018-01-29 08:27:10 +00:00
knakahara d88bfb301b Committed debugging logs by mistake, sorry. Revert cryoto.c:r.1.103 and ip6_flow.c:r.1.37. 2018-01-08 23:33:40 +00:00
knakahara f2516b4ae6 Fix PR kern/52910. Reported and implemented a patch by Sevan Janiyan, thanks. 2018-01-08 23:23:25 +00:00
maxv df4b74a91d Fix use-after-free: if m_pullup fails the (freed) mbuf is pushed on the
ip6_pktq queue and re-processed later. Return 1 to say "processed and
freed".
2017-12-10 09:06:46 +00:00
ozaki-r cead3b8854 Provide macros for softnet_lock and KERNEL_LOCK hiding NET_MPSAFE switch
It reduces C&P codes such as "#ifndef NET_MPSAFE KERNEL_LOCK(1, NULL); ..."
scattered all over the source code and makes it easy to identify remaining
KERNEL_LOCK and/or softnet_lock that are held even if NET_MPSAFE.

No functional change
2017-11-17 07:37:12 +00:00
ozaki-r 2b82ef9b8f Get rid of unnecessary header inclusions 2017-01-11 13:08:29 +00:00
ozaki-r 4c25fb2f83 Add rtcache_unref to release points of rtentry stemming from rtcache
In the MP-safe world, a rtentry stemming from a rtcache can be freed at any
points. So we need to protect rtentries somehow say by reference couting or
passive references. Regardless of the method, we need to call some release
function of a rtentry after using it.

The change adds a new function rtcache_unref to release a rtentry. At this
point, this function does nothing because for now we don't add a reference
to a rtentry when we get one from a rtcache. We will add something useful
in a further commit.

This change is a part of changes for MP-safe routing table. It is separated
to avoid one big change that makes difficult to debug by bisecting.
2016-12-08 05:16:33 +00:00
ozaki-r 3be3142886 Don't hold global locks if NET_MPSAFE is enabled
If NET_MPSAFE is enabled, don't hold KERNEL_LOCK and softnet_lock in
part of the network stack such as IP forwarding paths. The aim of the
change is to make it easy to test the network stack without the locks
and reduce our local diffs.

By default (i.e., if NET_MPSAFE isn't enabled), the locks are held
as they used to be.

Reviewed by knakahara@
2016-10-18 07:30:30 +00:00
knakahara 74c24413b3 improve fast-forward performance when the number of flows exceeds ip6_maxflows.
This is porting of ip_flow.c:r1.76

In ip6flow case, the before degradation is about 45%, the after degradation is
bout 55%.
2016-08-23 09:59:20 +00:00
knakahara 48235e8230 ip6flow refactor like ipflow.
- move ip6flow sysctls into ip6_flow.c like ip_flow.c:r1.64
    - build ip6_flow.c only if GATEWAY kernel option is enabled
2016-08-02 04:50:16 +00:00
ozaki-r e449cc85bc Simplify by using atomic_swap instead of mutex
Suggested by kefren@
2016-07-26 05:53:30 +00:00
ozaki-r dca032f9f4 Run timers in workqueue
Timers (such as nd6_timer) typically free/destroy some data in callout
(softint). If we apply psz/psref for such data, we cannot do free/destroy
process in there because synchronization of psz/psref cannot be used in
softint. So run timer callbacks in workqueue works (normal LWP context).

Doing workqueue_enqueue a work twice (i.e., call workqueue_enqueue before
a previous task is scheduled) isn't allowed. For nd6_timer and
rt_timer_timer, this doesn't happen because callout_reset is called only
from workqueue's work. OTOH, ip{,6}flow_slowtimo's callout can be called
before its work starts and completes because the callout is periodically
called regardless of completion of the work. To avoid such a situation,
add a flag for each protocol; the flag is set true when a work is
enqueued and set false after the work finished. workqueue_enqueue is
called only if the flag is false.

Proposed on tech-net and tech-kern.
2016-07-11 07:37:00 +00:00
knakahara 95fc145695 apply if_output_lock() to L3 callers which call ifp->if_output() of L2(or L3 tunneling). 2016-06-20 06:46:37 +00:00
knakahara a6f4292e65 eliminate unnecessary splnet 2016-06-13 08:37:15 +00:00
knakahara e4ff09f05d MP-ify fastforward to support GATEWAY kernel option.
I add "ipflow_lock" mutex in ip_flow.c and "ip6flow_lock" mutex in ip6_flow.c
to protect all data in each file. Of course, this is not MP-scalable. However,
it is sufficient as tentative workaround. We should make it scalable somehow
in the future.

ok by ozaki-r@n.o.
2016-06-13 08:34:23 +00:00
roy a37502b2b6 Add RTF_BROADCAST to mark routes used for the broadcast address when
they are created on the fly. This makes it clear what the route is for
and allows an optimisation in ip_output() by avoiding a call to
in_broadcast() because most of the time we do talk to a host.
It also avoids a needless allocation for the storage of llinfo_arp and
thus vanishes from arp(8) - it showed as incomplete anyway so this
is a nice side effect.

Guard against this and routes marked with RTF_BLACKHOLE in
ip_fastforward().
While here, guard against routes marked with RTF_BLACKHOLE in
ip6_fastforward().
RTF_BROADCAST is IPv4 only, so don't bother checking that here.
2015-03-23 18:33:17 +00:00
bouyer 8ec9289dda Sync with the ipv4 code and call ifp->if_output() with KERNEL_LOCK
held.
Problem reported and fix tested by njoly@ on current-users@
2014-05-20 20:23:56 +00:00
pooka d610791504 Wrap ipflow_create() & ip6flow_create() in kernel lock. Prevents the
interrupt side on another core from seeing the situation while the ipflow
is being modified.
2014-04-01 13:11:44 +00:00
msaitoh c259649f35 Clear mbuf's csum_flags in ip6flow_fastforward(). Fixes PR#47849. 2013-05-23 16:49:46 +00:00
christos 202952fb98 PR/47058: Antti Kantee: If the ipv6 flow code modifies the mbuf, pass the
change up to the caller.
2012-10-11 20:05:50 +00:00
liamjfoy b723329891 Remove ip6f_start from ip6f struct 2012-01-19 13:19:34 +00:00
liamjfoy 29f894919e Init ip6flow pool dynamically instead of using a linkset. 2009-03-23 18:43:20 +00:00
martin ce099b4099 Remove clause 3 and 4 from TNF licenses 2008-04-28 20:22:51 +00:00
ad 15e29e981b Merge the socket locking patch:
- Socket layer becomes MP safe.
- Unix protocols become MP safe.
- Allows protocol processing interrupts to safely block on locks.
- Fixes a number of race conditions.

With much feedback from matt@ and plunky@.
2008-04-24 11:38:36 +00:00
thorpej 0dd41b37de Make ip6 and icmp6 stats per-cpu. 2008-04-15 03:57:04 +00:00
thorpej 3f466bce48 Change IPv6 stats from a structure to an array of uint64_t's.
Note: This is ABI-compatible with the old ip6stat structure; old netstat
binaries will continue to work properly.
2008-04-08 23:37:43 +00:00
dyoung 1c6cf449e1 Constify. 2008-01-04 23:35:00 +00:00
dyoung a4455600d4 Replace rtcache_down() with rtcache_validate() and update rtcache_down()
uses.
2008-01-04 23:26:44 +00:00
dyoung 72fa642a86 Poison struct route->ro_rt uses in the kernel by changing the name
to _ro_rt.  Use rtcache_getrt() to access a route cache's struct
rtentry *.

Introduce struct ifnet->if_dl that always points at the interface
identifier/link-layer address.  Make code that treated the first
ifaddr on struct ifnet->if_addrlist as the interface address use
if_dl, instead.

Remove stale debugging code from net/route.c.  Move the rtflush()
code into rtcache_clear() and delete rtflush().  Delete rtalloc(),
because nothing uses it any more.

Make ND6_HINT an inline, lowercase subroutine, nd6_hint.

I've done my best to convert IP Filter, the ISO stack, and the
AppleTalk stack to rtcache_getrt().  They compile, but I have not
tested them.  I have given the changes to PF, GRE, IPv4 and IPv6
stacks a lot of exercise.
2007-12-20 19:53:29 +00:00
lukem 456279df8f use __KERNEL_RCSID() 2007-12-11 12:29:11 +00:00
dyoung bd98464c6f Don't call rtcache_check() from the fast-forward code, which runs
at IPL_NET, because rtcache_check() may read the forwarding table.
Elsewhere, the kernel only blocks interrupts at priority IPL_SOFTNET
and below while it modifies the forwarding table, so rtcache_check()
could be reading the table in an inconsistent state.  Use
rtcache_done(), instead.

XXX netinet/ip_flow.c and netinet6/ip6_flow.c are virtually identical.
XXX They should share code.
2007-08-20 19:42:34 +00:00
dyoung 8b646d9bb9 Remove obsolete files netinet/in_route.[ch]. 2007-05-02 22:39:03 +00:00
dyoung 72f0a6dfb0 Eliminate address family-specific route caches (struct route, struct
route_in6, struct route_iso), replacing all caches with a struct
route.

The principle benefit of this change is that all of the protocol
families can benefit from route cache-invalidation, which is
necessary for correct routing.  Route-cache invalidation fixes an
ancient PR, kern/3508, at long last; it fixes various other PRs,
also.

Discussions with and ideas from Joerg Sonnenberger influenced this
work tremendously.  Of course, all design oversights and bugs are
mine.

DETAILS

1 I added to each address family a pool of sockaddrs.  I have
  introduced routines for allocating, copying, and duplicating,
  and freeing sockaddrs:

        struct sockaddr *sockaddr_alloc(sa_family_t af, int flags);
        struct sockaddr *sockaddr_copy(struct sockaddr *dst,
                                       const struct sockaddr *src);
        struct sockaddr *sockaddr_dup(const struct sockaddr *src, int flags);
        void sockaddr_free(struct sockaddr *sa);

  sockaddr_alloc() returns either a sockaddr from the pool belonging
  to the specified family, or NULL if the pool is exhausted.  The
  returned sockaddr has the right size for that family; sa_family
  and sa_len fields are initialized to the family and sockaddr
  length---e.g., sa_family = AF_INET and sa_len = sizeof(struct
  sockaddr_in).  sockaddr_free() puts the given sockaddr back into
  its family's pool.

  sockaddr_dup() and sockaddr_copy() work analogously to strdup()
  and strcpy(), respectively.  sockaddr_copy() KASSERTs that the
  family of the destination and source sockaddrs are alike.

  The 'flags' argumet for sockaddr_alloc() and sockaddr_dup() is
  passed directly to pool_get(9).

2 I added routines for initializing sockaddrs in each address
  family, sockaddr_in_init(), sockaddr_in6_init(), sockaddr_iso_init(),
  etc.  They are fairly self-explanatory.

3 structs route_in6 and route_iso are no more.  All protocol families
  use struct route.  I have changed the route cache, 'struct route',
  so that it does not contain storage space for a sockaddr.  Instead,
  struct route points to a sockaddr coming from the pool the sockaddr
  belongs to.  I added a new method to struct route, rtcache_setdst(),
  for setting the cache destination:

        int rtcache_setdst(struct route *, const struct sockaddr *);

  rtcache_setdst() returns 0 on success, or ENOMEM if no memory is
  available to create the sockaddr storage.

  It is now possible for rtcache_getdst() to return NULL if, say,
  rtcache_setdst() failed.  I check the return value for NULL
  everywhere in the kernel.

4 Each routing domain (struct domain) has a list of live route
  caches, dom_rtcache.  rtflushall(sa_family_t af) looks up the
  domain indicated by 'af', walks the domain's list of route caches
  and invalidates each one.
2007-05-02 20:40:22 +00:00
liamjfoy 72a3be8fc7 use size_t for indexes
ok christos@
2007-04-05 18:12:49 +00:00
macallan cc085574cb caddr_t -> void * 2007-03-23 17:35:02 +00:00
liamjfoy a3580ff06f Add a new sysctl net.inet6.ip6.hashsize to control the hash table size.
The sysctl handler will ensure this value is a power of 2

ok dyoung@
2007-03-23 14:24:22 +00:00
ad 59d979c5f1 Pass an ipl argument to pool_init/POOL_INIT to be used when initializing
the pool's lock.
2007-03-12 18:18:22 +00:00
liamjfoy d0d904ff73 Use ip6flowtable when looking up 2007-03-08 17:09:15 +00:00
liamjfoy 8aa640dadd Add IPv6 Fast Forward - the IPv4 counterpart:
If ip6_forward successfully forwards a packet, a cache, in this case a
ip6flow struct entry, will be created. ether_input and friends will
then be able to call ip6flow_fastforward with the packet which will then
be passed to if_output (unless an issue is found - in that case the packet
is passed back to ip6_input).

ok matt@ christos@ dyoung@ and joerg@
2007-03-07 22:20:04 +00:00