Commit Graph

133 Commits

Author SHA1 Message Date
roy f217990197 CTASSERT -> __CTASSERT to unbreak userland build.
While here move __packed in tcp_debug.h back to where it was and
note removal warrants more investigation.
2021-02-03 18:13:13 +00:00
roy 7a849d032c Sprinkle CTASSERT to enforce on-wire layout without __packed 2021-02-03 11:53:43 +00:00
roy 1ca39e8727 Remove __packed from various network structures
They are already network aligned and adding the __packed attribute
just causes needless compiler warnings about accssing members of packed
objects.
2021-02-03 05:51:40 +00:00
ozaki-r d511e529cc inet: reduce silent packet discards 2020-08-28 06:31:42 +00:00
ozaki-r c1e00d7df1 inet, inet6: count packets dropped by IPsec
The counters count packets dropped due to security policy checks.
2020-08-28 06:19:13 +00:00
ozaki-r 6d8eb4f9d2 Count packets dropped by pfil 2019-05-13 07:47:59 +00:00
maxv 15652348f3 Use non-variadic function pointer in protosw::pr_input. 2018-09-14 05:09:51 +00:00
maxv 41fcd1f412 Remove the second argument from ip_reass_packet(). We want the IP header
on the mbuf, not elsewhere. Simplifies the NPF reassembly code a little.
No real functional change.
2018-07-10 15:46:58 +00:00
maxv 95ba030c14 Remove the ipre_mlast field and the TRAVERSE macro.
The goal was to store in ipre_mlast the last mbuf of the chain, so that
m_cat could be called on it. But it's not needed, since m_cat already
does the equivalent of TRAVERSE itself.

If it were needed, there would be a bug, since we don't call TRAVERSE on
ipre_mlast when creating a new reassembly entry.
2018-04-08 12:18:06 +00:00
maxv 3b1b66cce9 Remove unused field, and sync comment with reality. 2018-04-08 11:50:46 +00:00
maxv 9e4ad71de9 Remove unused fields and outdated comment. 2018-04-03 08:46:01 +00:00
knakahara 4ab3af3e3e add ipsec(4) interface, which is used for route-based VPN.
man and ATF are added later, please see man for details.

reviewed by christos@n.o, joerg@n.o and ozaki-r@n.o, thanks.
https://mail-index.netbsd.org/tech-net/2017/12/18/msg006557.html
2018-01-10 10:56:30 +00:00
ryo c2f85cb6e9 As is the case with IPV6_PKTINFO, IP_PKTINFO can be sent without EADDRINUSE
even if the UDP address:port in use is specified.
2017-12-11 05:47:18 +00:00
ryo 1581658c21 Add support IP_PKTINFO for sendmsg(2).
The source address or output interface can be specified by adding IP_PKTINFO
to the control part of the message on a SOCK_DGRAM or SOCK_RAW socket.

Reviewed by ozaki-r@ and christos@. thanks.
2017-08-10 04:31:58 +00:00
ozaki-r 67c047d165 Don't use a single global variable to store source route information for multiple incoming packets
It's not MP-safe. So use a m_tag to store the information instead.

Pointed out by knakahara@
The fix is from OpenBSD (originally fixed in FreeBSD)
2017-03-31 06:49:44 +00:00
ozaki-r 2495e7a0c7 Pass inpcb/in6pcb instead of socket to ip_output/ip6_output
- Passing a socket to Layer 3 is layer violation and even unnecessary
- The change makes codes of callers and IPsec a bit simple
2017-03-03 07:13:06 +00:00
knakahara 939a415a7d add l2tp(4) L2TPv3 interface.
originally implemented by IIJ SEIL team.
2017-02-16 08:12:43 +00:00
ozaki-r 4c25fb2f83 Add rtcache_unref to release points of rtentry stemming from rtcache
In the MP-safe world, a rtentry stemming from a rtcache can be freed at any
points. So we need to protect rtentries somehow say by reference couting or
passive references. Regardless of the method, we need to call some release
function of a rtentry after using it.

The change adds a new function rtcache_unref to release a rtentry. At this
point, this function does nothing because for now we don't add a reference
to a rtentry when we get one from a rtcache. We will add something useful
in a further commit.

This change is a part of changes for MP-safe routing table. It is separated
to avoid one big change that makes difficult to debug by bisecting.
2016-12-08 05:16:33 +00:00
knakahara 56995fecb7 improve fast-forward performance when the number of flows exceeds IPFLOW_MAX.
In the fast-forward case, when the number of flows exceeds IPFLOW_MAX, the
performmance degraded to about 50% compared to the case less than IPFLOW_MAX
flows. This modification suppresses the degradation to 65%. Furthermore,
the modified kernel is about the same performance as the original kernel
when the number of flows is less than IPFLOW_MAX.

The original patch is implemented by ryo@n.o. Thanks.
2016-08-01 10:22:53 +00:00
ozaki-r 43c5ab376f Replace ifp of ip_moptions and ip6_moptions with if_index
The motivation is the same as the mbuf's rcvif case; avoid having a pointer
of an ifnet object in ip_moptions and ip6_moptions, which is not MP-safe.

ip_moptions and ip6_moptions can be stored in a PCB for inet or inet6
that's life time is different from ifnet one and so an ifnet object can be
disappeared anytime we get it via them. Thus we need to look up an ifnet
object by if_index every time for safe.
2016-06-21 03:28:27 +00:00
knakahara 14ea9af5f7 make ipflow_reap() static function. 2016-06-13 08:29:55 +00:00
ozaki-r 2cf7873b92 Constify rtentry of if_output
We no longer need to change rtentry below if_output.

The change makes it clear where rtentries are changed (or not)
and helps forthcoming locking (os psrefing) rtentries.
2016-04-28 00:16:56 +00:00
ozaki-r 9e0f6c5e36 Stop using rt_gwroute on packet sending paths
rt_gwroute of rtentry is a reference to a rtentry of the gateway
for a rtentry with RTF_GATEWAY. That was used by L2 (arp and ndp)
to look up L2 addresses. By separating L2 nexthop caches, we don't
need a route for the purpose and we can stop using rt_gwroute.
By doing so, we can reduce referencing and modifying rtentries,
which makes it easy to apply a lock (and/or psref) to the
routing table and rtentries.

One issue to do this is to keep RTF_REJECT behavior. It seems it
was broken when we moved rtalloc1 things from L2 output routines
(e.g., ether_output) to ip_hresolv_output, but (fortunately?)
it works unexpectedly. What we mistook are:
- RTF_REJECT was checked for any routes in L2 output routines,
  but in ip_hresolv_output it is checked only when the route
  is RTF_GATEWAY
- The RTF_REJECT check wasn't copied to IPv6 (nd6_output)

It seems that rt_gwroute checks hid the mistakes and it looked
work (unexpectedly) and removing rt_gwroute checks unveil the
issue. So we need to fix RTF_REJECT checks in ip_hresolv_output
and also add them to nd6_output.

One more point we have to care is returning an errno; we need
to mimic looutput behavior. Originally RTF_REJECT check was
done either in L2 output routines or in looutput. The latter is
applied when a reject route directs to a loopback interface.
However, now RTF_REJECT check is done before looutput so to keep
the original behavior we need to return an errno which looutput
chooses. Added rt_check_reject_route does such tweaks.
2016-04-26 09:30:01 +00:00
riastradh f8b0ac1cb4 Give proper prototype to ip_output. 2016-01-20 22:12:22 +00:00
riastradh 6439c4109a Give proper prototype to rip_output. 2016-01-20 22:02:54 +00:00
ozaki-r 6ea8c2e666 Pull out route lookups from L2 output routines
Route lookups for routes of RTF_GATEWAY were done in L2 output
routines such as ether_output, but they should be done in L3
i.e., before L2 output routines. This change places the lookups
between L3 output routines (say ip_output) and the L2 output
routines.

The change is based on dyoung's patch submitted in the thread:
https://mail-index.netbsd.org/tech-net/2013/02/01/msg003847.html
You can find out detailed investigations by dyoung about the
issue in there.

Note that the change introduces a workaround for MPLS. ether_output
knew that it needs to fill the ethertype of a frame as MPLS,
based on a tag of an original route (rtentry), but now we don't
pass it to ehter_output. So we have to tell that in another way.
We use mtag to do so for now, which introduces some overhead.
We should fix it somehow in the future.

Discussed on tech-kern and tech-net.
2015-06-04 09:19:59 +00:00
christos 07d9441357 exposet multicast option functions which are used by the v6 code now. 2014-10-11 21:12:51 +00:00
rmind 60d350cf6d - Implement pktqueue interface for lockless IP input queue.
- Replace ipintrq and ip6intrq with the pktqueue mechanism.
- Eliminate kernel-lock from ipintr() and ip6intr().
- Some preparation work to push softnet_lock out of ipintr().

Discussed on tech-net.
2014-06-05 23:48:16 +00:00
rmind 8e2c3f0d1c Use __CTASSERT() in the header. 2014-05-30 02:17:01 +00:00
rmind d49a7f6aed Make IGMP and multicast group management code MP-safe. Use a read-write
lock to protect the hash table of multicast address records; also, make it
private and eliminate some macros.  In the long term, the lookup path ought
to be optimised.
2014-05-29 23:02:48 +00:00
rmind 28e5c2a160 Make ip_forward() static, there is no need to expose it. 2014-05-23 19:35:24 +00:00
rmind 8086b5d55a - Make ip_setmoptions(), ip_getmoptions() and ip_pcbopts() static.
- ip_output: eliminate 7th variadic argument; IP_RETURNMTU is flag
  always used to store MTU size into struct inpcb::inp_errormtu.
- Clean up these routines: reduce #ifdefs, variable scopes, etc.
2014-05-22 23:42:53 +00:00
rmind f499e20dfc - Add in_init() and move some functions, variables and sysctls into in.c
where they belong to.  Make some functions and variables static.
- ip_input.c: reduce some #ifdefs, cleanup a little.
- Move some sysctls into ip_flow.c as they belong there.

No functional change.
2014-05-22 22:01:12 +00:00
rmind 39bd8dee77 Add struct pr_usrreqs with a pr_generic function and prepare for the
dismantling of pr_usrreq in the protocols; no functional change intended.
PRU_ATTACH/PRU_DETACH changes will follow soon.

Bump for struct protosw.  Welcome to 6.99.62!
2014-05-18 14:46:15 +00:00
liamjfoy 6ea5422df7 Move ipflow into ip_var.h and fix confliction 2014-03-19 10:54:20 +00:00
liamjfoy e45b308c2a Remove ipflow_prune and replace with ipflow_reap. ok rmind@ 2014-03-19 08:27:21 +00:00
dyoung ac162b774b *_drain() routines may be called with locks held, so instead of doing
any work in *_drain(), set a drain-needed flag.  Do the work in the
fasttimo handler.

Contributed by Coyote Point Systems, Inc.
2011-05-03 17:44:30 +00:00
rmind aa7dc4aa25 ip_reass_packet: finish abstraction; some clean-up.
Discussed some time ago with matt@.
2010-11-05 00:21:51 +00:00
rmind 574e8cee41 Use own IPv4 reassembly queue entry structure and leave struct ipqent only
for TCP.  Now both struct ipfr_qent, struct ipfr_queue and hashed fragment
queue are abstracted and no longer public.
2010-08-25 00:05:14 +00:00
rmind 7b5ee09e0b Revert previous change of making struct ipqent invisible to userland. 2010-07-19 19:16:45 +00:00
rmind 2f196e2fd9 Abstract IP reassembly into single generic routine - ip_reass_packet().
Make struct ipq private and struct ipqent not visible to userland.
Push ip_len adjustment into reassembly layer.

OK matt@
2010-07-19 14:09:44 +00:00
rmind bcc65ff09f Split-off IPv4 re-assembly mechanism into a separate module. Abstract
into ip_reass_init(), ip_reass_lookup(), etc (note: abstraction is not
yet complete).  No functional changes to the actual mechanism.

OK matt@
2010-07-13 22:16:10 +00:00
pooka b660d07d87 Init ipflow pool dynamically instead of using a linkset. 2009-02-01 17:04:11 +00:00
plunky d2fcfe2b55 update ip_pcbopts() to use sockopt(9) API.
cleans up function and one small fix is that we now stop copying user
options to the mbuf when the _EOL is given, previously this function
would continue to copy options.
2008-10-12 11:15:54 +00:00
plunky 8094317b1b constify sockopt in the PRCO_SETOPT path 2008-08-16 21:51:43 +00:00
plunky fd7356a917 Convert socket options code to use a sockopt structure
instead of laying everything into an mbuf.

approved by core
2008-08-06 15:01:23 +00:00
thorpej 7ff8d08aae Make IP, TCP, UDP, and ICMP statistics per-CPU. The stats are collated
when the user requests them via sysctl.
2008-04-12 05:58:22 +00:00
thorpej 04e54b2ef5 - ipflow is not used outside ip_flow.c; move its definition there.
- Make ipflow_reap() private to ip_flow.c, and introduce ipflow_prune()
  for external callers to use (avoids returning an ipflow * that is never
  actually used anyway).
2008-04-09 05:14:20 +00:00
thorpej 88d65e9212 Change IP stats from a structure to an array of uint64_t's.
Note: This is ABI-compatible with the old ipstat structure; old netstat
binaries will continue to work properly.
2008-04-07 06:31:27 +00:00
matt fb71901dbc Add a new ip_id generation scheme based on a Fisher-Yates shuffle over a
sliding window.  XXX replace use of arc4random RSN.
2008-02-06 03:20:50 +00:00