This flag was only set for virtual interfaces.
All virtual interfaces have a means of knowing if they are going to work
or not and as such now support link state changes.
If we want this flag back, it should be used as an indicator that
the interfaces does not support link state changes that userland can use
so it can make a decision on what to do when the link state is UNKNOWN.
The exact flow is detatch addresses, join bridge and then mark detached
addresses as tentative.
This ensures that Duplicate Address Detection for the joining interface
are performed across all members of the bridge.
The vether interface simulates a normal Ethernet interface by encapsulating
standard network frames with an Ethernet header, specifically for use as
a member in a bridge(4).
To use vether the administrator needs to configure an address onto the
interface so that packets can be routed to it. An Ethernet header will
be prepended and, if the vether interface is a member of a bridge(4),
the frame will show up there.
Taken from OpenBSD.
If any member is LINK_STATE_UP then it's LINK_STATE_UP.
Otherwise if any member is LINK_STATE_UNKNOWN then it's LINK_STATE_UNKNOWN.
Otherwise it's LINK_STATE_DOWN.
Link state changes are not dependant on the interface being up, but we also
need to guard against more link state changes being scheduled when the
interface is being detached.
We do this by clearing the link queue but keeping if_link_sheduled = true.
We can check for this in both if_link_state_change() and
if_link_state_change_work() to abort early as there is no point in doing
anything if the interface is being detached because if_down() is called
in if_detach() after the workqueue has been drained to the same overall
effect.
For AF_LINK addrs from getifaddrs(2), ifa_data is struct if_data.
This in turn holds ifi_link_state which we can use to report
link status if the interface does not support media where it's normally
reported.
Based on OpenBSD.
RFC 7048 Section 3 says in the UNREACHABLE state packets continue to be
sent to the link-layer address and then backoff exponentially.
We adjust this slightly and move to the INCOMPLETE state after
`nd_mmaxtries` probes and then start backing off.
This results in simpler code whilst providing a more robust model which
doubles the time to failure over what we did before.
We don't want to be back to the old ARP model where no unreachability
errors are returned because very few applications would look at
unreachability hints provided such as ND_LLINFO_UNREACHABLE or RTM_MISS.
While here, remove the IFQ_CLASSIFY bottleneck (takes the ifq lock,
so it would serialize all transmission to all peers on a single wg(4)
interface).
altq can be disabled at compile-time or at run-time; even if included
at comple-time the run-time impact should be negligible if disabled.
Should really fix workqueue(9) so workqueue_create can be done before
CPUs have been detected in configure, but this will serve as a stop-
gap measure.
This brings the benefit of Neighbour Unreachability Detection which is
something ARP sorely lacks.
The new timings mirror those of IPv6 and are adjustable via sysctl(8).
Unlike IPv6 ND, these are global and not per interface.
Otherwise pktqueues can't be created before all CPUs are detected --
they will have a queue only for the primary CPU, not for others.
This will also be necessary if we want to add CPU hotplug (still need
some way to block hotplug during pktq_set_maxlen but it's a start).
- This is safe because wgp_endpoint_changing locks out any attempts
to change the endpoint until the draining is complete.
- This is necessary to avoid a deadlock where the handshake thread
holds a psref and awaits mutex_enter(wgp->wgp_lock).
XXX The same deadlock may occur in wg_destroy_session. Not clear
that it's safe to just release wgp_lock there; may need to create a
new session state, say WGS_STATE_DRAINING, while we wait for
psref_target_destroy. But this needs a little more thought; a new
state may not be necessary, and would be nice to avoid if not
necessary.
- Using threadpool(9) job per interface to receive incoming handshake
messages gives the same concurrency for active interfaces but
doesn't waste kthreads for inactive ones.
=> Can't really do this with a global workqueue(9) because there's
no bound on the amount of time wg_receive_packets() might run
for; we really need separate threads or threadpool jobs in order
to avoid having one interface starve all the others.
- Using a global workqueue(9) for asynchronous peer tasks avoids
creating unnecessary kthreads.
=> Each task does a more or less bounded amount of work, so it's OK
to share a global workqueue -- there's no advantage to adding
concurrency for what is almost certainly going to be CPU-bound
asymmetric crypto.
=> This way we don't need a thread per peer or iteration over a
list of all peers, so the task mechanism should no longer be a
bottleneck to scaling to thousands of peers.
XXX This doesn't distribute the load across CPUs -- it keeps it on
the same CPU where the packet came in. Should consider doing
something to balance the load -- maybe note if the current CPU is
loaded, and if so, sort CPUs by queue length or some other measure of
load and pick the least loaded one or something.
- Improves scalability -- won't hit limit on softints no matter how
many peers there are.
- Improves parallelism -- softint was kernel-locked to serialize
access to the pcq.
- Requires per-peer queue on handshake init to avoid dropping first
packet.
. Per-peer queue is currently a single packet -- should serve well
enough for pings, dns queries, tcp connections, &c.
Summary: Access to a stable established session is still allowed via
psref; all other access to peer and session state is now serialized
by struct wg_peer::wgp_lock, with no dancing around a per-session
lock. This way, the handshake paths are locked, while the data
transmission paths are pserialized.
- Eliminate struct wg_session::wgs_lock.
- Eliminate wg_get_unstable_session -- access to the unstable session
is allowed only with struct wgp_peer::wgp_lock held.
- Push INIT_PASSIVE->ESTABLISHED transition down into a thread task.
- Push rekey down into a thread task.
- Allocate session indices only on transition from UNKNOWN and free
them only on transition back to UNKNOWN.
- Be a little more explicit about allowed state transitions, and
reject some nonsensical ones.
- Sprinkle assertions and comments.
- Reduce atomic r/m/w swap operations that can just as well be
store-release.
- Ensure all access to struct wg_peer::wgp_endpoint happens while
holding a psref.
- Simplify internalize/externalize logic and be more careful about
verifying it before printing anything.