Without KERNEL_LOCK, rt_timer_work and rt_free_work can run in parallel
with other LWPs running in the network stack, which can eventually result
in, say, a use-after-free of a deleted route.
Following changes in r. 1.249 "bpf: support sending packets on loopback
interfaces", also allow for this to succeed when the "header complete"
flag is set, which is the practice of some tools, e.g., tcpreplay and
Scapy. With this change, both of those example tools now work, e.g.,
Scapy passes "L3bpfSocket - send and sniff on loopback" in its test
suite.
There are several ways of addressing this issue; this commit is
intended to be the most conservative and consistent with the previous
changes. (E.g., FreeBSD instead has special handling of this condition
in its if_loop.c.)
It is forbidden to hold a spin lock around copyout, and t_lock is a
spin lock.
We need t_lock in order to iterate over the list of entries.
However, during copyout itself, we only need to ensure that the
object we're copying out isn't freed by npf_table_remove or
npf_table_gc.
Fortunately, the only caller of npf_table_list, npf_table_remove, and
npf_table_gc is npfctl_table, and it serializes all of them by the
npf config lock. So we can safely drop t_lock across copyout.
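The locking pattern described above can be sketched in userland C. This is only an illustration, not the NPF code: the spin lock is modeled as a boolean, fake_copyout() stands in for copyout(9), and every struct and function name here is hypothetical. The sketch is single-threaded, so the outer serialization (the npf config lock in the text) is assumed rather than implemented.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

struct entry {
	int value;
	struct entry *next;
};

struct table {
	bool t_lock_held;	/* stand-in for the t_lock spin lock */
	struct entry *head;
};

static void
table_lock(struct table *t)
{
	assert(!t->t_lock_held);
	t->t_lock_held = true;
}

static void
table_unlock(struct table *t)
{
	assert(t->t_lock_held);
	t->t_lock_held = false;
}

/* Stand-in for copyout(9): asserts that the spin lock is not held. */
static int
fake_copyout(const struct table *t, const void *src, void *dst, size_t len)
{
	assert(!t->t_lock_held);
	memcpy(dst, src, len);
	return 0;
}

/*
 * Walk the list under the lock, but drop the lock around each copy.
 * The entry stays valid while unlocked only because removal is assumed
 * to be serialized by an outer lock that the caller already holds.
 */
static size_t
table_list(struct table *t, int *out, size_t max)
{
	size_t n = 0;

	table_lock(t);
	for (struct entry *e = t->head; e != NULL && n < max; e = e->next) {
		table_unlock(t);
		if (fake_copyout(t, &e->value, &out[n], sizeof(out[n])) != 0)
			return n;	/* lock is already dropped here */
		n++;
		table_lock(t);
	}
	table_unlock(t);
	return n;
}
```

The assertion in fake_copyout() encodes the rule from the text: any attempt to copy out while the lock is held aborts immediately.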
PR kern/57136
PR kern/57181
Previously sending packets on a loopback interface via bpf failed
because the packets are treated as AF_UNSPEC by bpf and the loopback
interface couldn't handle such packets.
This fix enables user programs to prepend a protocol family (AF_INET or
AF_INET6) to a payload. bpf interprets it and treats the packet accordingly,
rather than as AF_UNSPEC. The protocol family is encoded as 4 bytes in host
byte order, as per DLT_NULL in the specification(*).
(*) https://www.tcpdump.org/linktypes.html
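As an illustration of the framing described above, the following sketch builds a buffer consisting of a 4-byte, host-byte-order protocol family followed by the payload. The helper name build_null_frame() and its signature are hypothetical; only the layout comes from the text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>		/* AF_INET, AF_INET6 */

/*
 * Hypothetical helper: write the protocol family (host byte order,
 * 4 bytes) followed by the payload into frame.
 * Returns the total frame length, or 0 if the buffer is too small.
 */
static size_t
build_null_frame(uint8_t *frame, size_t framelen, uint32_t family,
    const uint8_t *payload, size_t paylen)
{
	assert(family == (uint32_t)AF_INET || family == (uint32_t)AF_INET6);

	if (framelen < sizeof(family) || framelen - sizeof(family) < paylen)
		return 0;

	/* memcpy keeps the value in host byte order, as DLT_NULL expects. */
	memcpy(frame, &family, sizeof(family));
	memcpy(frame + sizeof(family), payload, paylen);
	return sizeof(family) + paylen;
}
```

A user program would then presumably write(2) the resulting buffer to a bpf descriptor bound to the loopback interface; that step is omitted here.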
Proposed on tech-net and tech-kern
Currently, NetBSD supports implicit unnumbered interfaces by setting
the same IP address on two interfaces. However, such an interface is not
treated as unnumbered while the IP address of one of the interfaces is
being changed, or after it has been changed. That behavior can be harmful
for some routing daemons.
RFC 5227 section 1.1 states that for a DaD ARP probe the sender hardware
address must match the hardware address of the interface sending the
packet.
We can now verify this by checking the mbuf tag PACKET_TAG_ETHERNET_SRC.
This fixes an obscure issue where an old router was sending out bogus
ARP probes.
Thanks to Ryo Shimizu <ryo@nerv.org> for the re-implementation.
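The check itself amounts to comparing two 6-byte addresses. A minimal sketch follows, assuming a simplified, hypothetical view of the packet (struct arp_probe_view) in which the Ethernet source address has already been recovered, e.g. from the tag mentioned above. It does not use the kernel's mbuf tag API.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_ETHER_ADDR_LEN 6

/* Hypothetical, simplified view of a received ARP probe. */
struct arp_probe_view {
	uint8_t frame_src[SKETCH_ETHER_ADDR_LEN]; /* Ethernet header source */
	uint8_t arp_sha[SKETCH_ETHER_ADDR_LEN];   /* ARP sender hardware address */
};

/*
 * Return true if the ARP sender hardware address matches the address
 * the frame was actually sent from; a mismatch marks a bogus probe.
 */
static bool
arp_probe_sha_matches(const struct arp_probe_view *p)
{
	assert(p != NULL);
	return memcmp(p->frame_src, p->arp_sha, SKETCH_ETHER_ADDR_LEN) == 0;
}
```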
The BSD networking stack is designed around passing a mbuf down the chain
and each layer removes the part it's interested in before passing it to
the next. This makes it easy for each layer to do its work,
but nontrivial to work backwards.
As such we now store a pointer to the sender's hardware address in the
mbuf packet header so that protocols can perform any required validation.
COMPAT_9 is not required.
- The getifaddrs(3) function has no problem. The routing message has no
problem because struct rt_msghdr has rtm_msglen and we can get the next
message using it. No kernel data structure contains a
struct sockaddr_dl foobadr[xxx] array.
- Data passed from userland and kernel data are compared with
sockaddr_cmp(). The return value is used to check whether the size is
adequate.
- In the kernel, sdl_len is not used directly as the length for memcpy();
sockaddr_dl_measure() is used instead.
This extension (struct sadb_x_policy) is *not* defined by RFC 2367.
OpenBSD does not have reserved fields in struct sadb_x_policy.
Linux does not use this field yet.
FreeBSD uses this field as "sadb_x_policy_scope"; the value range is
from 0x00 to 0x04.
We use bits starting from the most significant bit to avoid conflicting
with the above usage.
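The reasoning can be checked with a few lines of C. This sketch assumes the field is 8 bits wide and uses made-up names (SKETCH_SCOPE_MAX, SKETCH_FLAG_TOP_BIT); neither the width nor the flag value is stated in the text above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define SKETCH_SCOPE_MAX	0x04u	/* highest value FreeBSD uses, per the text */
#define SKETCH_FLAG_TOP_BIT	0x80u	/* hypothetical flag in the top bit */

/* True if the hypothetical top-bit flag is set in the field. */
static bool
sketch_flag_is_set(uint8_t field)
{
	return (field & SKETCH_FLAG_TOP_BIT) != 0;
}

/* Set the flag on top of a scope-style value without disturbing it. */
static uint8_t
sketch_set_flag(uint8_t scope)
{
	assert(scope <= SKETCH_SCOPE_MAX);
	return (uint8_t)(scope | SKETCH_FLAG_TOP_BIT);
}
```

Values 0x00 through 0x04 only occupy the three low bits, so a flag in the top bit can never be mistaken for one of them.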
To use a fixed SP reqid for ipsecif(4), set
net.ipsecif.use_fixed_reqid=1. The default (=0) is the same as before.
net.ipsecif.use_fixed_reqid can be changed only if there is no ipsecif(4) yet.
To change the range of ipsecif(4) SP reqids,
set net.ipsecif.reqid_base and net.ipsecif.reqid_last.
These can also be changed only if there is no ipsecif(4) yet.
A route whose gateway is on a connected route can become invalid if the
connected route is deleted, i.e., an associated address is removed.
Traditionally NetBSD doesn't sweep such a route on address removal. Sending
packets over the route fails with "No route to host". Also, the route holds an
orphan ifaddr as rt_ifa that has been destroyed, say, by in_purgeaddr.
If the same address is assigned again in such a state, there can be two
different ifaddr objects with the same address. Until recently this was not a
big problem because we could send packets anyway. However, after MP-ification
of the network stack, we can't send packets because we strictly check whether
rt_ifa (i.e., the (old) ifaddr) is valid.
This change automatically removes such routes on a removal of an associated
address to avoid keeping inconsistent routes.
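The idea can be sketched with a toy routing table. All names below (struct route_sketch, purge_routes_for_ifa(), and so on) are hypothetical, and the real change operates on the kernel routing table rather than an array.

```c
#include <assert.h>
#include <stddef.h>

struct ifaddr_sketch {
	int id;
};

struct route_sketch {
	const struct ifaddr_sketch *rt_ifa;	/* address the route depends on */
	int in_use;
};

/*
 * On removal of ifa, drop every route that still references it so that
 * no route is left holding an orphan ifaddr.
 * Returns the number of routes removed.
 */
static size_t
purge_routes_for_ifa(struct route_sketch *rt, size_t nrt,
    const struct ifaddr_sketch *ifa)
{
	size_t removed = 0;

	assert(ifa != NULL);
	for (size_t i = 0; i < nrt; i++) {
		if (rt[i].in_use && rt[i].rt_ifa == ifa) {
			rt[i].in_use = 0;
			rt[i].rt_ifa = NULL;
			removed++;
		}
	}
	return removed;
}
```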
- Run a dummy softint at IPL_SOFTNET on all CPUs to ensure that the
ISR for this pktqueue is not running (addresses a pre-existing XXX).
- Hold the barrier lock around the critical section to ensure that
implicit pktq_barrier() calls via pktq_ifdetach() are held off during
the critical section.
- Ensure the critical section completes in minimal time by not freeing
memory during the critical section; instead, just build a list of the
packets pulled out of the per-CPU queues and free them after the critical
section is over.
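The last point (collect inside the critical section, free afterwards) can be sketched in userland C. This is a single-threaded illustration with hypothetical names: the barrier lock is modeled as a boolean, the per-CPU queues as a small array, and packets as malloc'ed nodes. The softint step is not modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

#define SKETCH_NQUEUES 4	/* stand-in for the per-CPU queues */

struct pkt {
	struct pkt *next;
};

struct pktq_sketch {
	bool barrier_held;	/* stand-in for the barrier lock */
	struct pkt *queue[SKETCH_NQUEUES];
};

/* Push a freshly allocated packet on queue i; returns false on failure. */
static bool
pktq_enqueue_sketch(struct pktq_sketch *pq, size_t i)
{
	struct pkt *p = malloc(sizeof(*p));

	if (p == NULL || i >= SKETCH_NQUEUES) {
		free(p);
		return false;
	}
	p->next = pq->queue[i];
	pq->queue[i] = p;
	return true;
}

/* Freeing is only allowed outside the critical section. */
static void
pkt_free_sketch(const struct pktq_sketch *pq, struct pkt *p)
{
	assert(!pq->barrier_held);
	free(p);
}

/* Drain all queues; returns the number of packets freed. */
static size_t
pktq_flush_sketch(struct pktq_sketch *pq)
{
	struct pkt *pending = NULL, *p;
	size_t nfreed = 0;

	/* Critical section: only unlink packets and chain them up. */
	pq->barrier_held = true;
	for (size_t i = 0; i < SKETCH_NQUEUES; i++) {
		while ((p = pq->queue[i]) != NULL) {
			pq->queue[i] = p->next;
			p->next = pending;
			pending = p;
		}
	}
	pq->barrier_held = false;

	/* Free the collected packets after the critical section. */
	while ((p = pending) != NULL) {
		pending = p->next;
		pkt_free_sketch(pq, p);
		nfreed++;
	}
	return nfreed;
}
```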