ifnet_lock is a dedicated method to safely destroy an interface over running
ioctl operations. Replace it with a more generic mechanism using psref(9).
The new API enables to obtain an ifnet object with protected by psref(9).
It is intended to be used where an obtained ifnet object is used over
sleepable operations.
The change ensures that ifnet objects in the ifnet list aren't freed during
list iterations by using pserialize(9) and psref(9).
Note that the change adds a pslist(9) for ifnet but doesn't remove the
original ifnet list (ifnet_list) to avoid breaking kvm(3) users. We
shouldn't use the original list in the kernel anymore.
We no longer need to change rtentry below if_output.
The change makes it clear where rtentries are changed (or not)
and helps forthcoming locking (os psrefing) rtentries.
rt_gwroute of rtentry is a reference to a rtentry of the gateway
for a rtentry with RTF_GATEWAY. That was used by L2 (arp and ndp)
to look up L2 addresses. By separating L2 nexthop caches, we don't
need a route for the purpose and we can stop using rt_gwroute.
By doing so, we can reduce referencing and modifying rtentries,
which makes it easy to apply a lock (and/or psref) to the
routing table and rtentries.
One issue to do this is to keep RTF_REJECT behavior. It seems it
was broken when we moved rtalloc1 things from L2 output routines
(e.g., ether_output) to ip_hresolv_output, but (fortunately?)
it works unexpectedly. What we mistook are:
- RTF_REJECT was checked for any routes in L2 output routines,
but in ip_hresolv_output it is checked only when the route
is RTF_GATEWAY
- The RTF_REJECT check wasn't copied to IPv6 (nd6_output)
It seems that rt_gwroute checks hid the mistakes and it looked
work (unexpectedly) and removing rt_gwroute checks unveil the
issue. So we need to fix RTF_REJECT checks in ip_hresolv_output
and also add them to nd6_output.
One more point we have to care is returning an errno; we need
to mimic looutput behavior. Originally RTF_REJECT check was
done either in L2 output routines or in looutput. The latter is
applied when a reject route directs to a loopback interface.
However, now RTF_REJECT check is done before looutput so to keep
the original behavior we need to return an errno which looutput
chooses. Added rt_check_reject_route does such tweaks.
Note that there is an issue that ioctls for an interface and a destruction
of the interface can run in parallel and it causes race conditions on
bridge as well (it rarely happens). The issue will be addressed in the
interface common code (if.c).
We need to enable it by default because bridge_input now runs
in softint, but bridge_input w/o BRIDGE_MPSAFE was designed as
it runs in hardware interrupt.
Note that there remains a racy code in bridge_output; it will be
solved in the upcoming change (applying psref(9)).
show arptab command of ddb is now inappropriate because it actually dumps
routes but arp entries aren't routes anymore. So rename it to show routes
and move the code from if_arp.c to route.c.
ok christos@