3105 Commits

Author SHA1 Message Date
ozaki-r 8d757e7177 Disable rt_update mechanism by default
This is a workaround for PR kern/51877. Enable again once the issue
is fixed.
2017-01-19 06:58:55 +00:00
ozaki-r 6e752b2732 Fix typo in comments 2017-01-17 07:53:06 +00:00
christos 35561f6b22 ip6_sprintf -> IN6_PRINT so that we pass the size. 2017-01-16 15:44:46 +00:00
ryo 28df50d7bb Make pfil(9) MP-safe (applying psref(9)) 2017-01-16 09:28:40 +00:00
ryo 28f4c24cc2 Make ip6_sprintf(), in_fmtaddr(), lla_snprintf() and icmp6_redirect_diag() mpsafe.
Reviewed by ozaki-r@
2017-01-16 07:33:36 +00:00
maya fe2925feed appease coverity by using strlcpy instead of strncpy
ok riastradh
2017-01-14 16:34:44 +00:00
msaitoh 98ff4b5f9f Fix a bug where the parent interface's callback wasn't called when the vlan
interface is configured. The callback function uses the VLAN_ATTACHED() function,
which checks ec->ec_nvlans, so the value should be incremented before calling the
callback. This bug was added in if_vlan.c rev. 1.83 (2015/11/19).
2017-01-13 06:11:56 +00:00
ryo de1b0d7b6e * pfil_add_hook() no longer treats PFIL_IFADDR and PFIL_IFNET. delete them from pfil_flag_cases[].
* add/fix KASSERT
* fix comment
2017-01-12 17:19:17 +00:00
ozaki-r 2b82ef9b8f Get rid of unnecessary header inclusions 2017-01-11 13:08:29 +00:00
ozaki-r 93472e9c29 Don't call ifa_remove while holding psref 2017-01-11 07:03:59 +00:00
ozaki-r 7f1388a5d2 Add softnet_lock to if_link_state_change_si
Fix
  panic: lock error: Mutex: mutex_vector_exit: assertion failed:
  MUTEX_OWNER(mtx->mtx_owner) == curthread
at callout_halt <= arp_dad_stop <= in_if_link_down.
2017-01-10 08:45:45 +00:00
ozaki-r a94a205118 Enable some sysctl knobs on rump kernels for ifmcstat 2017-01-10 05:42:34 +00:00
ozaki-r 9ac3cb6dea Replace adaptive mutex for ethercom with spin one
Unfortunately even wm(4) doesn't allow an adaptive mutex because wm(4)
tries to take it while holding its own spin mutex.
2017-01-10 05:40:59 +00:00
ryo 7c72d62b0f Don't use ph_inout[2]. dir (= PFIL_IN or PFIL_OUT) is 1 or 2, not 0 or 1. 2017-01-04 13:03:41 +00:00
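The indexing problem described in the commit above can be sketched as follows. This is a minimal illustration, not the pfil(9) code: `struct hook_list`, `struct hook_head` and `hook_list_for()` are hypothetical names, and only the direction values (1 and 2) are taken from the commit message.

```c
#include <assert.h>
#include <stddef.h>

/* Direction values as stated in the commit message: 1 and 2, not 0 and 1. */
#define PFIL_IN  1
#define PFIL_OUT 2

/* Hypothetical stand-in for a hook list; not the real pfil(9) structure. */
struct hook_list {
	const char *name;
};

/* One named list per direction instead of a two-element array. */
struct hook_head {
	struct hook_list in;
	struct hook_list out;
};

/*
 * Select the list by comparing against the named constants. Indexing a
 * two-element array with dir would read element 2 for PFIL_OUT, which is
 * past the end of the array.
 */
struct hook_list *
hook_list_for(struct hook_head *hh, int dir)
{
	switch (dir) {
	case PFIL_IN:
		return &hh->in;
	case PFIL_OUT:
		return &hh->out;
	default:
		return NULL;	/* unknown direction */
	}
}
```

A caller would pass the direction flag unchanged and check for NULL before walking the returned list.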
rmind c65c0a1d00 NPF: fix the interface table initialisation on load. 2017-01-03 00:58:05 +00:00
christos 2cc136e9e2 make this compile as a module. 2017-01-02 23:02:04 +00:00
rmind 0b635e7c1d NPF: implement dynamic handling of interface addresses (the kernel part). 2017-01-02 21:49:51 +00:00
ozaki-r 0e2548254a Use kmem_intr_alloc instead of kmem_alloc
ether_addmulti still can be called in softint.

Fix PR kern/51755
2016-12-31 15:07:02 +00:00
christos 1913804a7a export rprocs too so we don't lose them. 2016-12-28 21:55:04 +00:00
ozaki-r bf5ce79b5b Protect ec_multi* with mutex
The data can be accessed from sysctl, ioctl, interface watchdog
(if_slowtimo) and interrupt handlers. We need to protect the data against
parallel accesses from them.

Currently the mutex is applied to only some drivers; we need to apply it to all
drivers in the future.

Note that the mutex is an adaptive one for ease of implementation, but some
drivers access the data in interrupt context so we cannot apply the mutex
to every driver as is. We have two options: one is to replace the mutex
with a spin one, which requires some additional work (see
ether_multicast_sysctl), and the other is to modify the drivers so that they
do not access the data in interrupt context.
2016-12-28 07:32:16 +00:00
ozaki-r b79bd95d27 Use ether_ifattach in carp_clone_create instead of C&P code
carp_clone_destroy calls ether_ifdetach so not calling ether_ifattach is
inconsistent. If we add a matching pair of initialization and destruction
to ether_ifattach and ether_ifdetach (e.g., mutex_init/mutex_destroy),
ether_ifdetach of carp_clone_destroy won't work. So use ether_ifattach.

In order to do so, make ether_ifattach accept the 2nd argument (lla) as
NULL to allow carp to initialize its link level address by itself.
2016-12-28 07:26:24 +00:00
christos e3864be530 Another missed patch 2016-12-27 13:49:58 +00:00
christos d687232cdf fix merge conflict. 2016-12-27 01:31:06 +00:00
rmind f3628a9718 Convert NPF to the latest pfil(9) changes. 2016-12-26 23:59:47 +00:00
rmind ce5a9d0b35 Bump NPF_VERSION to 19. 2016-12-26 23:39:18 +00:00
christos 8dd9914047 pfil(9) improvements to handle address changes:
Add:
  PFIL_IFADDR     call on interface reconfig (mbuf is ioctl #)
  PFIL_IFNET      call on interface attach/detach (mbuf is PFIL_IFNET_*)

from rmind@
2016-12-26 23:21:49 +00:00
rmind 40a029b661 npf_tcp_fsm: fix for the NPF_TCPS_SYN_RECEIVED state.
SYN re-transmission after SYN-ACK was seen by NPF should not terminate
the connection.  Thanks to: Alexander Kiselev <kiselev99 at gmail com>
2016-12-26 23:10:46 +00:00
christos f75d79eb69 Sync NPF with the version on github: backport standalone NPF changes,
which allow us to create and run separate NPF instances. Minor fixes.
(from rmind@)
2016-12-26 23:05:05 +00:00
rmind e8a8032d56 Fix kmem_free() in hashmap_remove(). 2016-12-26 21:16:06 +00:00
rmind 50c1611937 Fix kmem_free() sizes in hashmap_rehash() and lpm_clear(). 2016-12-26 12:44:10 +00:00
ozaki-r 78a5509e4b Use psz/psref to hold ifa 2016-12-26 07:25:00 +00:00
ozaki-r ec260ed075 Remove assertion that the lock isn't held
It's useless in this case: even without it we find out whether
the lock is held on the next lock acquisition. Moreover,
if LOCKDEBUG is enabled, a failure on the acquisition provides
useful information for debugging, while an assertion failure
provides just the fact that the assertion failed.
2016-12-22 03:46:51 +00:00
ozaki-r 6261537b3d Fix deadlock between llentry timers and destruction of llentry
llentry timer (of nd6) holds both llentry's lock and softnet_lock.
A caller also holds them and calls callout_halt to wait for the
timer to quit. However we can pass only one lock to callout_halt,
so passing either of them can cause a deadlock. Fix it by avoid
calling callout_halt without holding llentry's lock.

BTW in the first place we cannot pass llentry's lock to callout_halt
because it's a rwlock...
2016-12-21 08:47:02 +00:00
ozaki-r 8be8a178cb Don't call psref_target_destroy unless NET_MPSAFE
We don't need it if NET_MPSAFE is off, and it also sometimes causes a lockup
because it is called while holding softnet_lock.
2016-12-21 04:01:57 +00:00
ozaki-r eceb88a68b Fix kernel build with RT_DEBUG and !NET_MPSAFE 2016-12-21 00:33:49 +00:00
roy 996a7c47cf Fix gcc complaining about int to unsigned long conversion issues by
explicitly marking as unsigned in RT_ROUNDUP2.
2016-12-19 11:17:00 +00:00
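A rough sketch of the kind of change described above, keeping the round-up arithmetic unsigned by using unsigned constants. `ROUNDUP2_SKETCH` and `roundup_len()` are hypothetical names for illustration; this is not claimed to be the definition of RT_ROUNDUP2 in the tree.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical round-up macro (not the definition from the kernel headers).
 * Rounds a up to a multiple of n, where n is a power of two; a length of 0
 * yields n. The 1U constants keep the arithmetic unsigned, so no signed
 * int is mixed into an unsigned long sized expression.
 */
#define ROUNDUP2_SKETCH(a, n) \
	((a) > 0 ? (1U + (((a) - 1U) | ((n) - 1U))) : (n))

/* Round a length up to the size of an unsigned long. */
size_t
roundup_len(size_t len)
{
	return ROUNDUP2_SKETCH(len, sizeof(unsigned long));
}
```

With this form every operand is already unsigned, so the compiler has no signed-to-unsigned conversion to warn about.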
christos fb7054ae79 Can't hide stuff from userland, because struct route is embedded in other
structures (like inpcb) and things like fstat stop working.
2016-12-16 20:11:52 +00:00
knakahara 7ae52b382e fix unlock and splx inversion. Currently, this doesn't cause a problem because only one of them is used. 2016-12-16 08:47:36 +00:00
ozaki-r dd8638eea5 Move bpf_mtap and if_ipackets++ on Rx of each driver to percpuq if_input
The benefits of the change are:
- We can reduce code
- We can provide the same behavior between drivers
  - Where/When if_ipackets is counted up
  - Note that some drivers still update packet statistics in their own
    way (periodical update)
- The moved bpf_mtap runs in softint
  - This makes it easy to MP-ify bpf

Proposed on tech-kern and tech-net
2016-12-15 09:28:02 +00:00
knakahara 237f476937 fix race of gif_softc->gif_ro when we send multiple flows over gif on NET_MPSAFE enabled kernel.
make gif_softc->gif_ro percpu as well as ipforward_rt to resolve this race.
and add future TODO comment for etherip(4).
2016-12-14 11:19:15 +00:00
ozaki-r 81f2534aaa Constify ifp of if_is_deactivated 2016-12-13 02:05:48 +00:00
knakahara 22bec9c1a0 MP-safe pppoe(4).
Nearly all parts are implemented by Shoichi YAMAGUCHI<s-yamaguchi@IIJ>, thanks.
2016-12-13 00:35:11 +00:00
maya 21cc7f1b6b acknowleg -> acknowledg, proceedure -> procedure.
only comments were changed.

from miod
2016-12-12 15:58:44 +00:00
ozaki-r 6fb8880601 Make the routing table and rtcaches MP-safe
See the following descriptions for details.

Proposed on tech-kern and tech-net


Overview
--------

We protect the routing table with a rwlock and protect
rtcaches with another rwlock. Each rtentry is protected
from being freed or updated via reference counting and psref.

Global rwlocks
--------------

There are two rwlocks; one for the routing table (rt_lock) and
the other for rtcaches (rtcache_lock). rtcache_lock covers
all existing rtcaches; there may be room for optimizations
(future work).

The locking order is rtcache_lock first and rt_lock is next.

rtentry references
------------------

References to an rtentry are managed with reference counting
and psref. Either of the two mechanisms is used depending on
where a rtentry is obtained. Reference counting is used when
we obtain a rtentry from the routing table directly via
rtalloc1 and rtrequest{,1} while psref is used when we obtain
a rtentry from a rtcache via rtcache_* APIs. In both cases,
a caller can sleep/block while holding an obtained rtentry.

The reasons why we use two different mechanisms are (i) only
using reference counting hurts the performance due to atomic
instructions (rtcache case) and (ii) ease of implementation;
applying psref to APIs such as rtalloc1 and rtrequest{,1} requires
additional work (adding a local variable and an argument).

We will finally migrate to use only psref but we can do it
when we have a lockless routing table alternative.

Reference counting for rtentry
------------------------------

rt_refcnt now doesn't count permanent references such as for
rt_timers and rtcaches, instead it is used only for temporary
references when obtaining a rtentry via rtalloc1 and rtrequest{,1}.
We can do so because destroying a rtentry always involves
removing references of rt_timers and rtcaches to the rtentry
and we don't need to track such references. This also makes
it easy to wait for readers to release references on deleting
or updating a rtentry, i.e., we can simply wait until the
reference counter is 0 or 1. (If there are permanent references
the counter can be arbitrary.)

rt_ref increments a reference counter of a rtentry and rt_unref
decrements it. rt_ref is called inside APIs (rtalloc1 and
rtrequest{,1}) so users don't need to care about it, while
users must call rt_unref on an obtained rtentry after using it.

rtfree is removed and we use rt_unref and rt_free instead.
rt_unref now just decrements the counter of a given rtentry
and rt_free just tries to destroy a given rtentry.

See the next section for destructions of rtentries by rt_free.

Destructions of rtentries
-------------------------

We destroy a rtentry only when we call rtrequest{,1}(RTM_DELETE);
the original implementation can destroy in any rtfree where it's
the last reference. Whether we use reference counting or psref, the
code is easier to understand if the place where a rtentry is
destroyed is fixed.

rt_free waits for references to a given rtentry to be released
before actually destroying the rtentry. rt_free uses a condition
variable (cv_wait) (and psref_target_destroy for psref) to wait.

Unfortunately rtrequest{,1}(RTM_DELETE) can be called in softint,
where we cannot use cv_wait. In that case, we have to defer the
destruction to a workqueue.

rtentry#rt_cv, rtentry#rt_psref and global variables
(see rt_free_global) are added to conduct the procedure.

Updates of rtentries
--------------------

One difficulty of using refcnt/psref instead of rwlock for rtentry
is updates of rtentries. We need an additional mechanism to
prevent readers from seeing inconsistency of a rtentry being
updated.

We introduce the RTF_UPDATING flag for rtentries that are being
updated. While the flag is set on a rtentry, users cannot acquire
the rtentry. By doing so, we prevent users from seeing inconsistent
rtentries.

There are two cases when a user tries to acquire a rtentry
with the RTF_UPDATING flag: if the user runs in softint context,
the user fails to acquire the rtentry (NULL is returned).
Otherwise the user waits until the update completes by waiting
on cv.

The procedure of an updater is similar to the destruction of
a rtentry: wait on cv (and psref) and, after all readers have left,
proceed with the update.

Global variables (see rt_update_global) are added to conduct
the procedure.

Currently we apply the mechanism only to RTM_CHANGE in
rtsock.c. We would have to apply it to other code. See the
"Known issues" section.

psref for rtentry
-----------------

When we obtain a rtentry from a rtcache via rtcache_* APIs,
psref is used to hold a reference to the rtentry.

rtcache_ref acquires a reference to a rtentry with psref
and rtcache_unref releases the reference after using it.
rtcache_ref is called inside rtcache_* APIs and users don't
need to take care of it while users must call rtcache_unref
to release the reference.

The struct psref and int bound that are needed for psref are
embedded into struct route. By doing so we don't need to
add local variables and additional arguments to APIs.

However this adds a constraint to psref that reference
counting does not have: only one caller at a time may hold
a reference to an rtentry via a given rtcache.
So we must not acquire a rtentry via a rtcache twice and
must avoid recursive use of a rtcache. Also, a rtcache must
somehow be arranged to be used by only one LWP/softint at a
time. For the IP forwarding case, we have per-CPU rtcaches
used in softint so the constraint is guaranteed. For the
rtcache of a PCB, the constraint is guaranteed by the
solock of each PCB. Any other cases (pf, ipf, stf and ipsec)
are currently guaranteed only by the existence of the global
locks (softnet_lock and/or KERNEL_LOCK). If we find
cases where we cannot guarantee the constraint, we will need
to introduce other rtcache APIs that use simple reference
counting.

psref of rtcache is created with IPL_SOFTNET and so rtcache
shouldn't be used at an IPL higher than IPL_SOFTNET.

Note that rtcache_free is used to invalidate a given rtcache.
Existing calls need no additional care with this change; just
keep them as they are.

Performance impact
------------------

When NET_MPSAFE is disabled the performance drop is 3% while
when it's enabled the drop increases to 11%. The difference
arises because currently we don't take any global locks and
don't use psref if NET_MPSAFE is disabled.

We can optimize the performance of the NET_MPSAFE
case by reducing lookups of rtcache that use psref;
currently we do two lookups but we should be able to trim
one of the two. This is future work.

Known issues
------------

There are two known issues to be solved; one is that
a caller of rtrequest(RTM_ADD) may change rtentry (see rtinit).
We need to prevent new references during the update. Or
we may be able to remove the code (perhaps, need more
investigations).

The other is rtredirect that updates a rtentry. We need
to apply our update mechanism, however it's not easy because
rtredirect is called in softint and we cannot apply our
mechanism simply. One solution is to defer rtredirect to
a workqueue but it requires some code restructuring.
2016-12-12 03:55:57 +00:00
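The interplay of temporary references and the RTF_UPDATING flag described in the commit above can be modelled in a few lines. This is a single-threaded sketch under stated assumptions, not the kernel implementation: all names ending in `_sketch` are hypothetical, waiting on the condition variable is represented only by a return value, the updater's own reference is not modelled (so "no readers" means a count of 0), and psref and the workqueue deferral are left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define RTF_UPDATING_SKETCH 0x1	/* stand-in for the RTF_UPDATING flag */

/* Simplified model of a routing entry; not the kernel's struct rtentry. */
struct rtentry_sketch {
	int refcnt;	/* temporary references only */
	int flags;
};

enum rt_acquire_result {
	RT_ACQUIRED,		/* reference taken */
	RT_FAIL_SOFTINT,	/* softint caller: give up (NULL in the kernel) */
	RT_MUST_WAIT		/* thread caller: would wait on the cv */
};

/* Try to take a temporary reference on an entry. */
enum rt_acquire_result
rt_acquire_sketch(struct rtentry_sketch *rt, bool in_softint)
{
	if (rt->flags & RTF_UPDATING_SKETCH)
		return in_softint ? RT_FAIL_SOFTINT : RT_MUST_WAIT;
	rt->refcnt++;
	return RT_ACQUIRED;
}

/* Drop a temporary reference taken by rt_acquire_sketch(). */
void
rt_unref_sketch(struct rtentry_sketch *rt)
{
	assert(rt->refcnt > 0);
	rt->refcnt--;
}

/*
 * Mark the entry as being updated. Returns true if no readers remain and
 * the update may proceed, false if the updater has to wait for them.
 */
bool
rt_update_begin_sketch(struct rtentry_sketch *rt)
{
	rt->flags |= RTF_UPDATING_SKETCH;
	return rt->refcnt == 0;
}

/* Finish the update and let readers acquire the entry again. */
void
rt_update_end_sketch(struct rtentry_sketch *rt)
{
	rt->flags &= ~RTF_UPDATING_SKETCH;
}
```

In this model a reader that arrives between `rt_update_begin_sketch()` and `rt_update_end_sketch()` never obtains the entry, which is the property the flag is meant to provide.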
christos 8a52904684 revert dir hack. 2016-12-10 22:09:49 +00:00
christos 9543d3ecd6 Welcome to version 18:
- Connection state keys are not stored and loaded using the logical key
  contents.
- connection finder key is stored in a map that contains the key and the
  direction.
2016-12-10 19:05:45 +00:00
christos 8feed355cb Add missing extcalls array. This is currently a no-op, but this is what
userland does too. Allows npfctl save; npfctl load to work again.
2016-12-10 19:02:18 +00:00
kre 27d85bae08 Remove what looks like remnant (partly removed already) debug code,
which could not possibly compile as it was.
2016-12-10 09:26:16 +00:00
christos 0ce32297f1 add functionality to lookup a nat entry from the connection list. 2016-12-10 05:41:10 +00:00
christos 8f6d079f97 This patch ditches the ptree(3) library, because it is broken (you
can get missing entries!).  Instead, as a temporary solution, we switch
to a simple linear scan of the hash tables for the longest-prefix-match
(lpm.c lpm.h) algorithm. In fact, with few unique prefixes in the set,
on modern hardware this simple algorithm is pretty fast anyway!
2016-12-09 02:40:38 +00:00
christos 9230257046 This spams 100's of times during boot! 2016-12-09 02:38:14 +00:00
christos 2ca960b3b3 make this compile again 2016-12-09 02:26:36 +00:00
rmind f453fec4c6 NPF: adjust the 'stateful-ends' mechanism to tag the packets and thus
pass-through them on other interfaces.  Per discussion with christos@.
2016-12-08 23:07:11 +00:00
ozaki-r 4c25fb2f83 Add rtcache_unref to release points of rtentry stemming from rtcache
In the MP-safe world, a rtentry stemming from a rtcache can be freed at any
point. So we need to protect rtentries somehow, say by reference counting or
passive references. Regardless of the method, we need to call some release
function of a rtentry after using it.

The change adds a new function rtcache_unref to release a rtentry. At this
point, this function does nothing because for now we don't add a reference
to a rtentry when we get one from a rtcache. We will add something useful
in a further commit.

This change is a part of changes for MP-safe routing table. It is separated
to avoid one big change that is difficult to debug by bisecting.
2016-12-08 05:16:33 +00:00
ozaki-r 0cfa5e16fd Introduce deferred if_start framework
The framework provides a means to schedule if_start that will be executed
in softint later. It intends to be used to avoid calling if_start,
especially bpf_mtap, in hardware interrupt.

It adds a dedicated softint to a driver if the driver requests to use the
framework via if_deferred_start_init. The driver can schedule deferred
if_start by if_schedule_deferred_start.

Proposed and discussed on tech-kern and tech-net
2016-12-08 01:06:35 +00:00
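The idea of the deferred if_start framework can be sketched as a pending flag that is set from the interrupt path and consumed later by a softint-like handler. This is a user-space model only: `struct ifnet_sketch` and every function name below are hypothetical, and the real if_deferred_start_init and if_schedule_deferred_start are not reproduced here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified interface model; not the kernel's struct ifnet. */
struct ifnet_sketch {
	void (*if_start)(struct ifnet_sketch *);
	bool start_pending;	/* set in "hardware interrupt" context */
	int start_calls;	/* how often if_start actually ran */
};

/* Example if_start routine that only counts its invocations. */
void
count_start(struct ifnet_sketch *ifp)
{
	ifp->start_calls++;
}

/* Opt in to deferred start by registering the routine to run later. */
void
deferred_start_init_sketch(struct ifnet_sketch *ifp,
    void (*start)(struct ifnet_sketch *))
{
	ifp->if_start = start;
	ifp->start_pending = false;
	ifp->start_calls = 0;
}

/* Called from the interrupt path: only record that work is wanted. */
void
schedule_deferred_start_sketch(struct ifnet_sketch *ifp)
{
	ifp->start_pending = true;
}

/* Called later in a softint-like context: run if_start if requested. */
void
softint_run_sketch(struct ifnet_sketch *ifp)
{
	if (!ifp->start_pending)
		return;
	ifp->start_pending = false;
	ifp->if_start(ifp);
}
```

Several schedule requests that arrive before the handler runs collapse into a single if_start call in this model.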
knakahara ec7a5d403a add API to manipulate ifa->ia_hash and ia_hash_pslist_entry, and fix ia_hash_pslist_entry race by using them.
in_ifaddr_lock is required before writing ifa->ia_hash and
ia_hash_pslist_entry to serialize writer processings.

reviewed by ozaki-r@n.o.
2016-12-06 07:01:47 +00:00
ozaki-r 8bc1ab68b3 Fix memory leak of struct if_percpuq on interface destruction 2016-12-06 01:23:01 +00:00
knakahara 4348961022 fix two races between set_ip_addrs and clear_ip_addrs.
    (1) if set_ip_addrs and clear_ip_addrs run in parallel, they can call
        IN_ADDRHASH_WRITER_REMOVE on the same ifa in parallel.
    (2) if set_ip_addrs's workqueue is separated from clear_ip_addrs's one,
        the workers can run in the reverse of the order in which they were enqueued.
2016-12-01 02:30:54 +00:00
knakahara a98d843d1f fix CID 1396600: Null pointer dereferences 2016-12-01 02:15:20 +00:00
joerg cf9fc2baf8 Don't check parent capabilities when a parent interface hasn't been
assigned.
2016-11-28 00:39:03 +00:00
knakahara ca3627eb6c make workqueue sppp_{set,clear}_ip_addrs to be able to call pserialize_perform. 2016-11-25 05:03:12 +00:00
knakahara cc9ef345db refactor sppp_{set,clear}_ip_addrs(). reduce iterating if_addr_pslist. 2016-11-25 05:00:29 +00:00
ozaki-r e099415fd1 Make lortrequest static and rename it to loop_rtrequest
No functional change.
2016-11-22 02:06:00 +00:00
njoly 4c2d33e4a7 Make fstat(2) work on AF_LINK socket descriptors. 2016-11-19 14:44:00 +00:00
knakahara 92613f0abe We must use PSLIST_ENTRY_DESTROY after PSLIST_WRITER_REMOVE and after waiting for all readers to finish.
And then, if we want to re-insert the removed pslist element, we need to
call PSLIST_ENTRY_INIT again.

advised by riastradh@n.o and reviewed by ozaki-r@n.o, thanks.
2016-11-18 10:38:55 +00:00
knakahara b93b57bdab if_register() must be called after ifp->if_dl is initialized.
There may be similar problems. I will fix them step by step...
2016-11-18 08:13:02 +00:00
ozaki-r 5879478f65 Don't use rt_walktree to delete routes
Some functions use rt_walktree to scan the routing table and delete
matched routes. However, we shouldn't use rt_walktree to delete
routes because rt_walktree is recursive to the routing table (radix
tree) and isn't friendly to MP-ification. rt_walktree allows a caller
to pass a callback function to delete a matched entry. The callback
function is called from an API of the radix tree (rn_walktree) but
also calls an API of the radix tree to delete an entry.

This change adds a new API of the radix tree, rn_search_matched,
which returns a matched entry that is selected by a callback
function passed by a caller and the caller itself deletes the
entry. By using the API, we can avoid the recursive form.
2016-11-15 01:50:06 +00:00
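The search-then-delete pattern described above can be illustrated on a plain linked list. This is a sketch of the pattern only: the list, `search_matched()`, `remove_node()` and `delete_matched()` are hypothetical stand-ins, and nothing here reflects the actual signature of rn_search_matched or the radix tree code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy container standing in for the routing table. */
struct node {
	int key;
	struct node *next;
};

typedef bool (*match_fn)(const struct node *, void *);

/* Return the first node accepted by the callback; never modifies the list. */
struct node *
search_matched(struct node *head, match_fn match, void *arg)
{
	for (struct node *n = head; n != NULL; n = n->next) {
		if (match(n, arg))
			return n;
	}
	return NULL;
}

/* Unlink one node; called by the caller, not from inside the search. */
void
remove_node(struct node **headp, struct node *victim)
{
	for (struct node **pp = headp; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == victim) {
			*pp = victim->next;
			victim->next = NULL;
			return;
		}
	}
}

/* Delete every matching node by repeating search, then delete. */
int
delete_matched(struct node **headp, match_fn match, void *arg)
{
	int deleted = 0;
	struct node *victim;

	while ((victim = search_matched(*headp, match, arg)) != NULL) {
		remove_node(headp, victim);
		deleted++;
	}
	return deleted;
}

/* Example predicate: select nodes with an even key. */
bool
key_is_even(const struct node *n, void *arg)
{
	(void)arg;
	return n->key % 2 == 0;
}
```

Because the callback only selects an entry and the caller performs the removal, the container's own traversal never re-enters its deletion routine.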
jnemeth 1b10164966 fixup misplaced #endif 2016-11-07 18:16:07 +00:00
pgoyette 94bd54f794 Move if_43.c back into the shared Makefile.sysio where it really
belongs.

Update the code to invoke the two routines compat_cvtcmd() and
compat_ifioctl() through indirect pointers.  Initialize those
pointers in sys/net/if.c and update them in the compat module's
initialization code.

Addresses the issue pointed out in PR kern/51598
2016-11-05 23:30:22 +00:00
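Calling optional code through indirect pointers, as described above, might look like the following sketch. The hook type, the stub, and the module init/fini functions are hypothetical, and the single hook shown here does not reproduce the real prototypes of compat_cvtcmd() or compat_ifioctl().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical hook type; the real routines have different prototypes. */
typedef int (*compat_ioctl_fn)(unsigned long cmd, void *data);

/* Default installed by the core code: the compat module is not loaded. */
int
compat_ioctl_stub(unsigned long cmd, void *data)
{
	(void)cmd;
	(void)data;
	return ENOSYS;
}

/* The indirect pointer, initialized by the core code. */
compat_ioctl_fn compat_ioctl_hook = compat_ioctl_stub;

/* Implementation that lives in the optional compat module. */
int
compat_ioctl_real(unsigned long cmd, void *data)
{
	(void)cmd;
	(void)data;
	return 0;
}

/* Module initialization points the hook at the real routine. */
void
compat_module_init_sketch(void)
{
	compat_ioctl_hook = compat_ioctl_real;
}

/* Module teardown restores the stub so callers never see a stale pointer. */
void
compat_module_fini_sketch(void)
{
	compat_ioctl_hook = compat_ioctl_stub;
}

/* Core code always goes through the pointer. */
int
do_ioctl_sketch(unsigned long cmd, void *data)
{
	return (*compat_ioctl_hook)(cmd, data);
}
```

The core file then has no link-time dependency on the module, which is what allows it to be shared between builds with and without the compat code.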
ozaki-r 83d656c8d0 Fix the position of IFADDR_ENTRY_DESTROY
It must be called after all readers left, i.e, after pserialize_perform.
2016-10-28 05:52:05 +00:00
ozaki-r c0732db998 Pull RTM_CHANGE code out of route_output to make further changes easy
No functional change.
2016-10-26 06:49:10 +00:00
ozaki-r cf96c34d79 Remove unnecessary argument
No functional change.
2016-10-25 02:45:09 +00:00
ozaki-r d5985ea7e4 Revert v1.157
We need to hold the rtentry over rtrequest1 for info that dereferences
member variables of the rtentry after rtrequest1.
2016-10-24 03:19:07 +00:00
ozaki-r 01d48c0fd7 Delete rt_timers on RTM_DELETE surely
We want to ensure that a rtentry is referenced by nobody after
RTM_DELETE (except for the caller). However, rt_timer could
have a reference to the rtentry after that.
2016-10-21 10:56:35 +00:00
ozaki-r 951f676f30 Remove unnecessary argument
No functional change.
2016-10-21 10:52:47 +00:00
ozaki-r 302ac4ae0e Make some rt_timer functions and variables static
No functional change.
2016-10-21 09:01:44 +00:00
ozaki-r e07d22aae6 Avoid temporary dangling reference 2016-10-21 03:04:33 +00:00
ozaki-r df2616c199 Remove unused rtcache_lookup_noclone 2016-10-18 09:43:20 +00:00
ozaki-r 3be3142886 Don't hold global locks if NET_MPSAFE is enabled
If NET_MPSAFE is enabled, don't hold KERNEL_LOCK and softnet_lock in
part of the network stack such as IP forwarding paths. The aim of the
change is to make it easy to test the network stack without the locks
and reduce our local diffs.

By default (i.e., if NET_MPSAFE isn't enabled), the locks are held
as they used to be.

Reviewed by knakahara@
2016-10-18 07:30:30 +00:00
roy 103ec7fade Mark arprequest static and introduce arpannounce so that gratuitous
ARP requests are only sent from valid addresses.
2016-10-11 12:32:30 +00:00
joerg 3cc071a817 Since IFF_MULTICAST's value can't be represented without implicit cast
as signed short, make if_flags unsigned.
2016-10-08 17:40:12 +00:00
joerg fce2ad3141 Use uint8_t for opt as some of the values don't fit into the (positive)
range of a signed char.
2016-10-08 17:37:32 +00:00
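The two type changes above share one cause: a value outside the positive range of a signed type. A small range check makes this concrete. It assumes a 16-bit short and an 8-bit char, as on common platforms; 0x8000 is the multicast flag value used for illustration, and the helper names are hypothetical.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

/* Flag value used for illustration; it needs bit 15. */
#define IFF_MULTICAST_SKETCH 0x8000

/* Each helper returns nonzero when v is representable in the named type. */
int
fits_in_short(long v)
{
	return v >= SHRT_MIN && v <= SHRT_MAX;
}

int
fits_in_ushort(long v)
{
	return v >= 0 && v <= USHRT_MAX;
}

int
fits_in_schar(long v)
{
	return v >= SCHAR_MIN && v <= SCHAR_MAX;
}

int
fits_in_uint8(long v)
{
	return v >= 0 && v <= UINT8_MAX;
}
```

Under those assumptions 0x8000 fits in an unsigned short but not in a short, and an option value such as 200 fits in a uint8_t but not in a signed char, so the unsigned types avoid the implicit conversion.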
ozaki-r 8f4376cb6f Fix race condition on ifqueue used by traditional netisr
If an underlying network device driver supports MSI/MSI-X, RX interrupts
can be delivered to arbitrary CPUs. This means that Layer 2 subroutines
such as ether_input (softint) and subsequent Layer 3 subroutines (softint)
which are called via traditional netisr can be dispatched on an arbitrary
CPU. Layer 2 subroutines now run without any locks (expected) and so a
Layer 2 subroutine and a Layer 3 subroutine can run in parallel.

There is shared data between a Layer 2 routine and a Layer 3 routine,
namely ifqueue, and IF_ENQUEUE (from L2) and IF_DEQUEUE (from L3) on it
are racy now.

To fix the race condition, use ifqueue#ifq_lock to protect ifqueue
instead of splnet, which is meaningless now.

The same race condition exists in route_intr. Fix it as well.

Reviewed by knakahara@
2016-10-03 11:06:06 +00:00
ozaki-r 14c6c81f32 Add missing return 2016-10-03 07:13:29 +00:00
christos 9c7db92f68 MFREE -> m_free 2016-10-02 14:16:02 +00:00
roy 9288933cf3 Set dstaddr in in_ifinit so that sppp consumers announce the correct
dstaddr in routing messages.
2016-09-29 15:04:17 +00:00
roy fb8ac61d3a Ensure we only call pfil_run_hooks if if_init succeeded.
While here, improve some logging.
2016-09-29 14:08:40 +00:00
roy 98b0d70fff Add ifam_pid and ifam_addrflags to ifa_msghdr.
Re-version RTM_NEWADDR, RTM_DELADDR, RTM_CHGADDR and NET_RT_IFLIST.
Add compat code for old version.
2016-09-21 10:50:22 +00:00
roy 70c02d276f Drop hostIsNew from in_ifinit, let the function work out if the address
has changed.
Sync address flag setup with the IPv6 counterpart.
When scrubbing the address, or setting up the address fails, restore the
old address flags as well as the old address.
2016-09-16 14:17:23 +00:00
pgoyette 06402e0a42 Move kern_ctf.c into the dtrace_fbt module (the only place it is used)
rather than including in kernels with KDTRACE_HOOKS defined.  Update
the dtrace_fbt module to depend on the zlib module.

Bump kernel version to avoid module mismatch.

Welcome to 7.99.38 !
2016-09-16 03:10:45 +00:00
christos 9faa331084 Always do the mbuf checks. The packet filters (npf) expect the mbuf to be
pulled-up. (Krists Krilovs)
2016-09-15 14:40:43 +00:00
knakahara feed793fff kmem_alloc(size, KM_SLEEP) return value NULL check is not required any more.
kmem_alloc(size, KM_SLEEP) is already fixed, that is, it never returns NULL.
see: sys/kern/subr_kmem.c:r1.62
2016-09-15 06:59:32 +00:00
roy 2a96904518 Call ifmedia_delete_instance() for safety. 2016-09-14 11:54:42 +00:00
roy 169f562155 Introduce IFM_GENERIC.
This allows use of the media interface, but without media as such.
Its sole purpose is to facilitate the reporting of the link status.
2016-09-14 11:43:08 +00:00
roy 64cd8217dd Add interface media for sppp consumers.
While there is no actual media to select,
the ioctl is used to query link status from userland.
2016-09-14 10:58:38 +00:00
joerg 12114a9bee Report link state changes for sppp consumers. The link is considered up,
if the current phase is SPPP_PHASE_NETWORK, otherwise it is down. Useful
when using dhcpcd for DHCPv6 PD.
2016-09-13 19:51:12 +00:00
pgoyette 9a575d933d Move tun.c into the module's own directory, since it is specific to the
module subsystem.
2016-09-10 03:26:10 +00:00
pgoyette eb2b2a3e77 Add a dummy "tun" module, whose only job is to trigger an autoload of
required module "if_tun".  This allows access to /dev/tunN to autoload
the required interface module.

XXX There might be a better place/name for net/tun.c
2016-09-10 02:20:10 +00:00
christos 6324edf045 PR/51464: Shoichi YAMAGUCHI: chap authenticator of pppoe does not work 2016-09-09 12:41:14 +00:00
ozaki-r e50076cac8 Fix tun_enable
Before the rearrangement of ifaddr initializations (in.c,v 1.169),
when we called tun_enable via ioctl(SIOCINITIFADDR), an ifaddr
in question was inserted in the interface address list. However,
after the change the ifaddr isn't in the list at that point. So
we shouldn't rely on being able to find the ifaddr via
IFADDR_READER_FOREACH. Instead simply use the ifaddr passed by
ioctl(SIOCINITIFADDR).
2016-09-07 10:27:44 +00:00
ozaki-r 86bbab733a Rename tuncreate to tun_enable
The new name is more appropriate.
2016-09-07 10:24:57 +00:00
ozaki-r 6dc1297521 Support tun devices on rump kernels 2016-09-05 02:25:37 +00:00
ozaki-r 586dc438d1 Fix typo in a comment 2016-09-05 01:57:54 +00:00
roy 5ffab45fad Split out sysctl_iflist into sysctl_iflist_if and sysctl_iflist_addr.
Setup a command and function pointer in one case statement
instead of having a secondary case statement within a loop.
This makes the code much easier to follow, and possibly to add more compat
in the future.

Don't panic when running an old binary without compat support.
2016-09-01 19:04:30 +00:00
knakahara 4ba8ad0bcb gif(4)'s if_output() is already MP-safe. It should enable IFEF_OUTPUT_MPSAFE. 2016-09-01 06:50:09 +00:00
ozaki-r 60ae5732ab KNF; replace white spaces with hard tabs
No functional change.
2016-08-29 03:31:59 +00:00
knakahara 7b53554dfc fix: failed to create sysctl entries for the module version of gif(4).
The affected sysctl entries are the following two:
    - net.inet.ip.gifttl
    - net.inet6.ip6.gifhlim
2016-08-18 11:44:22 +00:00
knakahara 950010ff93 eliminate stf(4)'s dependency on gif(4).
stf(4) depends not on gif(4) but on ip_encap.
2016-08-18 11:38:58 +00:00
maxv 3b447e1ea5 Memory leak, found by brainy; not tested, but obvious enough 2016-08-15 09:14:12 +00:00
christos bbc7b97ded remove MODULAR/COMPAT_40 ifdef. 2016-08-15 05:10:33 +00:00
christos a74f222e94 fix rump tests. 2016-08-14 11:03:21 +00:00
christos dc521af48a kill unknown sessions ifdef, link set for sysctl. 2016-08-11 15:16:07 +00:00
kre 58cdd27b4a Avoid init'ing lo0 twice ... which rump kernels do without this hack.
If rump gets fixed, this could be removed (though it is harmless in
any case.)

This should fix several more of the currently failing ATF tests.
2016-08-11 13:57:02 +00:00
kre d6b671c40b On the first day (that being the eighth day of the eighth month,) the
building was completed only to discover that within there lay havoc.

On the second day all just groaned and moaned, and it must be someone
else's problem.

On the third day, St. Martin stepped in and traced the culprit, which
provided inspiration, and a correction was made.

Forevermore all were agog at just how such a trivial thing could do
so much damage...


OK...   to be a little less vague.   The loopback interface is a truly
"special" thing, and rump knew that - and treated it very specially.
Unfortunately, when the loopback interface is changed, and rump does
not keep up, bad things happen.

This (overall) might, or might not, be the correct fix - but for now
it appears to work.   If someone, sometime, finds a better way to
deal with the issues of the loopback interface's true majesty, feel
free to revert this and do it another way.
2016-08-10 10:09:42 +00:00
knakahara 79536286e1 follow renaming ifmpls to mpls.
This fixes i386 ALL build.
2016-08-10 05:56:30 +00:00
kre 73176b8121 create++, destroy-- 2016-08-08 16:40:39 +00:00
pgoyette 822a1852a7 Typo (missing ampersand) 2016-08-08 09:51:39 +00:00
pgoyette 2bdbc91cbc Final part of fixing if_tap. The module needs to attach its cdevsw (and
detach it later).
2016-08-08 09:42:33 +00:00
pgoyette 1a2474cc13 Add the devsw_attach stuff, since the tap device can be accessed via
/dev/tap

This is a partial fix for the build.  The rump tap component will be
fixed shortly.
2016-08-08 09:23:13 +00:00
pgoyette b70b5f48f4 Partial fix - restore creation of our sysctl subtree for _MODULE
builds (it's already handled for built-in builds via registration
in a link-set).

XXX The build is still broken in rump...
2016-08-08 07:35:12 +00:00
roy 5f2c1f90c4 Fix compile without modules. 2016-08-08 07:23:27 +00:00
pgoyette 093a61346d Don't try to set-up our sysctl sub-tree if we're built-in - this will
happen automatically (via "registration" of the setup function in a
link-set), and if we're not a module, the SYSCTL_SETUP_PROTO() will
not have declared a function prototype!
2016-08-08 02:50:05 +00:00
christos 1d8e08d4c8 modularize some more drivers and merge the module glue 2016-08-07 17:38:33 +00:00
pgoyette 69aa6fadc2 For modular configurations, always build with PPPOE_TERM_UNKNOWN_SESSIONS
defined, and provide a sysctl variable for enabling/disabling the option.

Update man page accordingly.
2016-08-07 01:59:43 +00:00
pgoyette 5328fd5944 Modularize the pppoe driver 2016-08-06 23:46:30 +00:00
pgoyette abe8e5ebff Destroy the mutex when detaching ppp. Otherwise on a re-attach (ie,
module reload) we can end up with a panic "lock already initialized"
2016-08-06 22:54:34 +00:00
pgoyette 5dd5da5fa0 Catch up with the renaming of module ppp --> if_ppp and avoid warning
messages at boot (or module load) time.
2016-08-06 22:38:18 +00:00
pgoyette c075b7e43f Modularize the sppp_subr stuff so it can be shared by pppoe and lmc
drivers as they get modularized.
2016-08-06 22:03:45 +00:00
christos c20d3604bf make strip and slip modular, and cosmetic for ppp. 2016-08-06 12:48:23 +00:00
pgoyette 57989e45da Change the internal name of the module to match its external (file
system) name.  Otherwise "bad things" can happen, such as modload(8)
being able to load a second copy!
2016-08-06 12:42:40 +00:00
pgoyette e7e9717270 Modularize the ppp driver, and adjust dependencies of the compressor
modules.

For now, this is still included as a built-in module in GENERIC kernels.
2016-08-06 02:35:05 +00:00
pgoyette 87cc8eeb68 Actually commit the changes for making this into a loadable module. The
module infrastructure was committed earlier, but the "guts" of the commit
were somehow missed.
2016-08-05 08:56:36 +00:00
ozaki-r 9b97df78c1 CID 1364759: fix using uninitialized value 2016-08-05 00:52:02 +00:00
ozaki-r a403cbd4f5 Apply pserialize and psref to struct ifaddr and its variants
This change makes struct ifaddr and its variants (in_ifaddr and in6_ifaddr)
MP-safe by using pserialize and psref. At this moment, pserialize_perform
and psref_target_destroy are disabled because (1) we don't need them
because of softnet_lock (2) they cause a deadlock because of softnet_lock.
So we'll enable them when we remove softnet_lock in the future.
2016-08-01 03:15:30 +00:00
ozaki-r 74fbff1628 Revert "Revert part of "Switch the address list of intefaces to pslist(9)" (r1.220)"
netstat now uses sysctl instead of kvm(3) to get address information from
the kernel. So we can avoid the issue introduced by the reverted commit
(PR kern/51325) by updating netstat with the latest source code.
2016-08-01 02:50:03 +00:00
alnsn db4395c55a Don't trigger BJ_ASSERT(false) on invalid BPF_Jxxx opcode in jmp_to_op().
This change helps survive AFL fuzzing without calling bpf_validate() first.

Also change alu_to_op() function to have a similar interface.
2016-07-29 20:29:38 +00:00
martin 1982ce327f PR kern/51371: avoid shifting negative values 2016-07-28 07:54:31 +00:00
rjs 0dd1bf859c Restore correct test for return value from aarpresolve(). 2016-07-25 23:46:09 +00:00
knakahara ef38d1c0f4 Reduce KERNEL_LOCK use now that ifq_lock is used by default.
if_snd is now always protected by ifq_lock, so the KERNEL_LOCK in if_transmit()
that serializes packet output processing is no longer needed.
2016-07-22 07:13:56 +00:00
knakahara b14a26cee3 Toward enabling NET_MPSAFE in the future, if_snd uses if_snd->ifq_lock by default.
This reduces the confusing differences between NET_MPSAFE on and off.
2016-07-22 07:09:40 +00:00
ozaki-r 60f4a9a871 Make complex RTM_CHANGE code understandable
Tests for route change added recently would reduce the possibility of
regressions.

Reviewed by ryo@
2016-07-21 03:45:56 +00:00
ozaki-r 4f21a42704 Apply pserialize to some iterations of IP address lists 2016-07-20 07:37:51 +00:00
pgoyette 7c20c5d3bb Fix regression introduced in tests/net/bpf and tests/net/bpfilter
The rump code needs to call devsw_attach() in order to assign a dev_major
for bpf; it then uses this to create rump's /dev/bpf node.  Unfortunately,
this leaves the devsw attached, so when the bpf module tries to initialize
itself, it gets an EEXIST error and fails.

So, once rump has figured out what the dev_major should be, call devsw_detach()
to remove the devsw.  Then, when the module initialization code calls
devsw_attach() it will succeed.
2016-07-19 02:47:45 +00:00
pgoyette b380080ebc Now that we're only calling devsw_attach() in the modular driver, it
is not ok for the driver/module to already exist.  So don't ignore
EEXIST.
2016-07-17 02:48:07 +00:00
pgoyette 3c6a976d2d Don't initialize variables that no longer exist in built-in module. 2016-07-17 01:16:30 +00:00
pgoyette 5233aa279b Don't try to call devsw_attach() for built-in driver code. 2016-07-17 01:03:46 +00:00
martin 17f84ba4fd Mark the rt_timer callout MPSAFE and move the first reset a few lines
down so that the workqueue is properly prepared (the latter being more of
a cosmetic change). Ok: ozaki-r@
2016-07-15 09:25:47 +00:00
hannken da7d165fe0 rtcache_clear_rtentry: use LIST_FOREACH_SAFE as the element gets
removed from the list.
2016-07-13 09:56:20 +00:00
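
The commit above concerns removing elements from a list while iterating over it. The following is a minimal sketch of that idea in portable C, not the actual rtcache_clear_rtentry code: it uses a hand-rolled singly linked list, and the names `struct node`, `push` and `remove_matching` are hypothetical. The key point is that the successor pointer is saved before the current element may be freed, which is what a "safe" traversal macro does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical list element; stands in for whatever the real list holds. */
struct node {
	int key;
	struct node *next;
};

/* Insert a new element at the head of the list. */
static void
push(struct node **headp, int key)
{
	struct node *n = malloc(sizeof(*n));

	if (n == NULL)
		abort();
	n->key = key;
	n->next = *headp;
	*headp = n;
}

/*
 * Remove and free every element whose key matches.
 * Returns the number of elements removed.
 */
static int
remove_matching(struct node **headp, int key)
{
	struct node **linkp = headp;	/* link that points at the current element */
	struct node *n, *next;
	int removed = 0;

	for (n = *headp; n != NULL; n = next) {
		next = n->next;		/* saved before n may be freed */
		if (n->key == key) {
			*linkp = next;	/* unlink n */
			free(n);
			removed++;
		} else {
			linkp = &n->next;
		}
	}
	return removed;
}
```

A plain traversal would read `n->next` after `n` had already been freed; saving `next` first avoids that use-after-free.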
msaitoh 71fbb921c3 KNF. No functional change. 2016-07-11 11:31:49 +00:00
ozaki-r dca032f9f4 Run timers in workqueue
Timers (such as nd6_timer) typically free/destroy some data in callout
(softint) context. If we apply psz/psref to such data, we cannot free/destroy
it there because psz/psref synchronization cannot be used in
softint. So run timer callbacks as workqueue works (normal LWP context).

Enqueueing a work twice with workqueue_enqueue (i.e., calling
workqueue_enqueue before a previous task is scheduled) isn't allowed. For
nd6_timer and rt_timer_timer, this doesn't happen because callout_reset is
called only from the workqueue's work. OTOH, ip{,6}flow_slowtimo's callout
can be called before its work starts and completes because the callout is
called periodically regardless of completion of the work. To avoid such a
situation, add a flag for each protocol; the flag is set to true when a work
is enqueued and set to false after the work has finished. workqueue_enqueue
is called only if the flag is false.

Proposed on tech-net and tech-kern.
2016-07-11 07:37:00 +00:00
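
The per-protocol flag described in the commit above can be illustrated with a small stand-alone sketch. This is not the NetBSD code: `struct slowtimo_state`, `slowtimo_callout` and `slowtimo_work_done` are hypothetical names, the counter merely stands in for a call to workqueue_enqueue, and the sketch is single-threaded, so it ignores the synchronization a real kernel would need around the flag.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-protocol state; not a NetBSD structure. */
struct slowtimo_state {
	bool work_enqueued;	/* true while a work is pending or running */
	int enqueue_count;	/* how many times the work was actually queued */
};

/*
 * Periodic callout handler: enqueue the work only if no earlier
 * work is still outstanding. Returns true if the work was queued.
 */
static bool
slowtimo_callout(struct slowtimo_state *st)
{
	if (st->work_enqueued)
		return false;	/* previous work not finished; skip this tick */
	st->work_enqueued = true;
	st->enqueue_count++;	/* stands in for workqueue_enqueue() */
	return true;
}

/* Called at the end of the work handler to allow the next enqueue. */
static void
slowtimo_work_done(struct slowtimo_state *st)
{
	st->work_enqueued = false;
}
```

With this guard, a callout that fires again before the work has completed skips the enqueue instead of queueing the same work twice.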