Commit Graph

48 Commits

Author SHA1 Message Date
thorpej dd8687cce0 Convert MPLS from a legacy netisr to pktqueue. 2022-09-03 02:24:59 +00:00
martin dc194ae5f0 Fix memory leaks pointed out by Ilja Van Sprundel: all
sendoob() functions are expted to free both passed
mbuf chains.
2019-01-28 12:53:01 +00:00
maxv 506a6ab369 Remove M_COPY_PKTHDR, M_MOVE_PKTHDR, M_ALIGN and MH_ALIGN. 2018-12-27 14:03:54 +00:00
riastradh d1579b2d70 Rename min/max -> uimin/uimax for better honesty.
These functions are defined on unsigned int.  The generic name
min/max should not silently truncate to 32 bits on 64-bit systems.
This is purely a name change -- no functional change intended.

HOWEVER!  Some subsystems have

	#define min(a, b)	((a) < (b) ? (a) : (b))
	#define max(a, b)	((a) > (b) ? (a) : (b))

even though our standard name for that is MIN/MAX.  Although these
may invite multiple evaluation bugs, these do _not_ cause integer
truncation.

To avoid `fixing' these cases, I first changed the name in libkern,
and then compile-tested every file where min/max occurred in order to
confirm that it failed -- and thus confirm that nothing shadowed
min/max -- before changing it.

I have left a handful of bootloaders that are too annoying to
compile-test, and some dead code:

cobalt ews4800mips hp300 hppa ia64 luna68k vax
acorn32/if_ie.c (not included in any kernels)
macppc/if_gm.c (superseded by gem(4))

It should be easy to fix the fallout once identified -- this way of
doing things fails safe, and the goal here, after all, is to _avoid_
silent integer truncations, not introduce them.

Maybe one day we can reintroduce min/max as type-generic things that
never silently truncate.  But we should avoid doing that for a while,
so that existing code has a chance to be detected by the compiler for
conversion to uimin/uimax without changing the semantics until we can
properly audit it all.  (Who knows, maybe in some cases integer
truncation is actually intended!)
2018-09-03 16:29:22 +00:00
maxv 65f0aceba1 Retire ICMPPRINTFS, it's annoying and it doesn't build. 2018-05-11 14:38:28 +00:00
maxv db5deb0877 Add one more XXX in the list. 2018-04-11 06:37:32 +00:00
maxv 7fdfbc6917 Add XXX. 2018-01-19 14:15:35 +00:00
maxv e265d7d557 Move the ICMP Extension structures from mpls_ttl.c to ip_icmp.h; that's
part of the ICMP protocol (per RFC4884), and not specific to MPLS. Also
add ih_exthdr in struct icmp, the 'length' field appeared.

While here, style in MPLS.
2018-01-19 10:54:31 +00:00
maxv 55408dcfdb Style, and fix several bugs:
- ip4_check(), mpls_unlabel_inet() and mpls_unlabel_inet6() perform
   pullups, so we need to pass the updated pointers back
 - in mpls_lse() the route is not always freed
Looks a little better now.
2017-12-08 17:49:54 +00:00
ozaki-r 0092eb7df6 Invalidate rtcache based on a global generation counter
The change introduces a global generation counter that is incremented when any
routes have been added or deleted. When a rtcache caches a rtentry into itself,
it also stores a snapshot of the generation counter. If the snapshot equals to
the global counter, the cache is still valid, otherwise invalidated.

One drawback of the change is that all rtcaches of all protocol families are
invalidated when any routes of any protocol families are added or deleted.
If that matters, we should have separate generation counters based on
protocol families.

This change removes LIST_ENTRY from struct route, which fixes a part of
PR kern/52515.
2017-09-21 07:15:34 +00:00
joerg f22328325c Assert size for sockaddr_mpls, but don't require it to be packed. 2016-10-08 20:19:37 +00:00
ozaki-r 8f4376cb6f Fix race condition on ifqueue used by traditional netisr
If a underlying network device driver supports MSI/MSI-X, RX interrupts
can be delivered to arbitrary CPUs. This means that Layer 2 subroutines
such as ether_input (softint) and subsequent Layer 3 subroutines (softint)
which are called via traditional netisr can be dispatched on an arbitrary
CPU. Layer 2 subroutines now run without any locks (expected) and so a
Layer 2 subroutine and a Layer 3 subroutine can run in parallel.

There is a shared data between a Layer 2 routine and a Layer 3 routine,
that is ifqueue and IF_ENQUEUE (from L2) and IF_DEQUEUE (from L3) on it
are racy now.

To fix the race condition, use ifqueue#ifq_lock to protect ifqueue
instead of splnet that is meaningless now.

The same race condition exists in route_intr. Fix it as well.

Reviewed by knakahara@
2016-10-03 11:06:06 +00:00
ozaki-r fe6d427551 Avoid storing a pointer of an interface in a mbuf
Having a pointer of an interface in a mbuf isn't safe if we remove big
kernel locks; an interface object (ifnet) can be destroyed anytime in any
packet processing and accessing such object via a pointer is racy. Instead
we have to get an object from the interface collection (ifindex2ifnet) via
an interface index (if_index) that is stored to a mbuf instead of an
pointer.

The change provides two APIs: m_{get,put}_rcvif_psref that use psref(9)
for sleep-able critical sections and m_{get,put}_rcvif that use
pserialize(9) for other critical sections. The change also adds another
API called m_get_rcvif_NOMPSAFE, that is NOT MP-safe and for transition
moratorium, i.e., it is intended to be used for places where are not
planned to be MP-ified soon.

The change adds some overhead due to psref to performance sensitive paths,
however the overhead is not serious, 2% down at worst.

Proposed on tech-kern and tech-net.
2016-06-10 13:31:43 +00:00
ozaki-r d938d837b3 Introduce m_set_rcvif and m_reset_rcvif
The API is used to set (or reset) a received interface of a mbuf.
They are counterpart of m_get_rcvif, which will come in another
commit, hide internal of rcvif operation, and reduce the diff of
the upcoming change.

No functional change.
2016-06-10 13:27:10 +00:00
ozaki-r a79dfa5db0 Sweep unnecessary route.h inclusions 2016-04-26 08:44:44 +00:00
pooka 1c4a50f192 sprinkle _KERNEL_OPT 2015-08-24 22:21:26 +00:00
rtr fd12cf39ee make connect syscall use sockaddr_big and modify pr_{send,connect}
nam parameter type from buf * to sockaddr *.

final commit for parameter type changes to protocol user requests

* bump kernel version to 7.99.15 for parameter type changes to pr_{send,connect}
2015-05-02 17:18:03 +00:00
rtr d2aa9dd71f remove pr_generic from struct pr_usrreqs and all implementations of
pr_generic in protocols.

bump to 7.99.13

approved by rmind@
2015-04-26 21:40:48 +00:00
rtr eddf3af3c6 make accept, getsockname and getpeername syscalls use sockaddr_big and modify
pr_{accept,sockname,peername} nam parameter type from mbuf * to sockaddr *.

* retained use of mbuftypes[MT_SONAME] for now.
* bump to netbsd version 7.99.12 for parameter type change.

patch posted to tech-net@ 2015/04/19
2015-04-24 22:32:37 +00:00
rtr a2ba5e69ab * change pr_bind to accept struct sockaddr * instead of struct mbuf *
* update protocol bind implementations to use/expect sockaddr *
  instead of mbuf *
* introduce sockaddr_big struct for storage of addr data passed via
  sys_bind; sockaddr_big is of sufficient size and alignment to
  accommodate all addr data sizes received.
* modify sys_bind to allocate sockaddr_big instead of using an mbuf.
* bump kernel version to 7.99.9 for change to pr_bind() parameter type.

Patch posted to tech-net@
  http://mail-index.netbsd.org/tech-net/2015/03/15/msg005004.html

The choice to use a new structure sockaddr_big has been retained since
changing sockaddr_storage size would lead to unnecessary ABI change. The
use of the new structure does not preclude future work that increases
the size of sockaddr_storage and at that time sockaddr_big may be
trivially replaced.

Tested by mrg@ and myself, discussed with rmind@, posted to tech-net@
2015-04-03 20:01:07 +00:00
rtr 8cf67cc6d5 split PRU_CONNECT2 & PRU_PURGEIF function out of pr_generic() usrreq
switches and put into separate functions

  - always KASSERT(solocked(so)) even if not implemented
    (for PRU_CONNECT2 only)

  - replace calls to pr_generic() with req = PRU_CONNECT2 with calls to
    pr_connect2()

  - replace calls to pr_generic() with req = PRU_PURGEIF with calls to
    pr_purgeif()

put common code from unp_connect2() (used by unp_connect() into
unp_connect1() and call out to it when needed

patch only briefly reviewed by rmind@
2014-08-09 05:33:00 +00:00
rtr 822872eada split PRU_RCVD function out of pr_generic() usrreq switches and put into
separate functions

  - always KASSERT(solocked(so)) even if not implemented

  - replace calls to pr_generic() with req = PRU_RCVD with calls to
    pr_rcvd()
2014-08-08 03:05:44 +00:00
rtr 651e5bd3f8 split PRU_SEND function out of pr_generic() usrreq switches and put into
separate functions

   xxx_send(struct socket *, struct mbuf *, struct mbuf *,
       struct mbuf *, struct lwp *)

  - always KASSERT(solocked(so)) even if not implemented

  - replace calls to pr_generic() with req = PRU_SEND with calls to
    pr_send()

rename existing functions that operate on PCB for consistency (and to
free up their names for xxx_send() PRUs

  - l2cap_send() -> l2cap_send_pcb()
  - sco_send() -> sco_send_pcb()
  - rfcomm_send() -> rfcomm_send_pcb()

patch reviewed by rmind
2014-08-05 07:55:31 +00:00
rtr ce6a5ff64f revert the removal of struct lwp * parameter from bind, listen and connect
user requests.

this should resolve the issue relating to nfs client hangs presented
recently by wiz on current-users@
2014-08-05 05:24:26 +00:00
rtr 81c514052c add missing KASSERT(req != PRU_XXX) to mpls_usrreq() for PRUs that have
already been split.
2014-07-31 05:37:00 +00:00
rtr 892163b8e9 split PRU_DISCONNECT, PRU_SHUTDOWN and PRU_ABORT function out of
pr_generic() usrreq switches and put into separate functions

   xxx_disconnect(struct socket *)
   xxx_shutdown(struct socket *)
   xxx_abort(struct socket *)

   - always KASSERT(solocked(so)) even if not implemented
   - replace calls to pr_generic() with req =
PRU_{DISCONNECT,SHUTDOWN,ABORT}
     with calls to pr_{disconnect,shutdown,abort}() respectively

rename existing internal functions used to implement above functionality
to permit use of the names for xxx_{disconnect,shutdown,abort}().

   - {l2cap,sco,rfcomm}_disconnect() ->
{l2cap,sco,rfcomm}_disconnect_pcb()
   - {unp,rip,tcp}_disconnect() -> {unp,rip,tcp}_disconnect1()
   - unp_shutdown() -> unp_shutdown1()

patch reviewed by rmind
2014-07-31 03:39:35 +00:00
rtr ad6ae402db split PRU_CONNECT function out of pr_generic() usrreq switches and put
into seaparate functions

  xxx_listen(struct socket *, struct mbuf *)

  - always KASSERT(solocked(so)) and KASSERT(nam != NULL)
  - replace calls to pr_generic() with req = PRU_CONNECT with
    pr_connect()
  - rename existin {l2cap,sco,rfcomm}_connect() to
    {l2cap,sco,rfcomm}_connect_pcb() respectively to permit
    naming consistency with other protocols functions.
  - drop struct lwp * parameter from unp_connect() and at_pcbconnect()
    and use curlwp instead where appropriate.

patch reviewed by rmind
2014-07-30 10:04:25 +00:00
rtr 6dd8eef044 split PRU_BIND and PRU_LISTEN function out of pr_generic() usrreq
switches and put into separate functions
  xxx_bind(struct socket *, struct mbuf *)
  xxx_listen(struct socket *)

  - always KASSERT(solocked(so)) even if not implemented

  - replace calls to pr_generic() with req = PRU_BIND with call to
    pr_bind()

  - replace calls to pr_generic() with req = PRU_LISTEN with call to
    pr_listen()

  - drop struct lwp * parameter from at_pcbsetaddr(), in_pcbbind() and
    unp_bind() and always use curlwp.

rename existing functions that operate on PCB for consistency (and to
free up their names for xxx_{bind,listen}() PRUs

  - l2cap_{bind,listen}() -> l2cap_{bind,listen}_pcb()
  - sco_{bind,listen}() -> sco_{bind,listen}_pcb()
  - rfcomm_{bind,listen}() -> rfcomm_{bind,listen}_pcb()

patch reviewed by rmind

welcome to netbsd 6.99.48
2014-07-24 15:12:03 +00:00
rtr 35b22fa96a split PRU_SENDOOB and PRU_RCVOOB function out of pr_generic() usrreq
switches and put into separate functions
  xxx_sendoob(struct socket *, struct mbuf *, struct mbuf *)
  xxx_recvoob(struct socket *, struct mbuf *, int)

  - always KASSERT(solocked(so)) even if request is not implemented

  - replace calls to pr_generic() with req = PRU_{SEND,RCV}OOB with
    calls to pr_{send,recv}oob() respectively.

there is still some tweaking of m_freem(m) and m_freem(control) to come
for consistency.  not performed with this commit for clarity.

reviewed by rmind
2014-07-23 13:17:18 +00:00
rtr d27b133d27 * split PRU_ACCEPT function out of pr_generic() usrreq switches and put
into a separate function xxx_accept(struct socket *, struct mbuf *)

note: future cleanup will take place to remove struct mbuf parameter
type and replace it with a more appropriate type.

patch reviewed by rmind
2014-07-09 14:41:42 +00:00
rtr d575eb5454 * split PRU_PEERADDR and PRU_SOCKADDR function out of pr_generic()
usrreq switches and put into separate functions
  xxx_{peer,sock}addr(struct socket *, struct mbuf *).

    - KASSERT(solocked(so)) always in new functions even if request
      is not implemented

    - KASSERT(pcb != NULL) and KASSERT(nam) if the request is
      implemented and not for tcp.

* for tcp roll #ifdef KPROF and #ifdef DEBUG code from tcp_usrreq() into
  easier to cut & paste functions tcp_debug_capture() and
tcp_debug_trace()

    - functions provided by rmind
    - remaining use of PRU_{PEER,SOCK}ADDR #define to be removed in a
      future commit.

* rename netbt functions to permit consistency of pru function names
  (as has been done with other requests already split out).

    - l2cap_{peer,sock}addr()  -> l2cap_{peer,sock}_addr_pcb()
    - rfcomm_{peer,sock}addr() -> rfcomm_{peer,sock}_addr_pcb()
    - sco_{peer,sock}addr()    -> sco_{peer,sock}_addr_pcb()

* split/refactor do_sys_getsockname(lwp, fd, which, nam) into
  two functions do_sys_get{peer,sock}name(fd, nam).

    - move PRU_PEERADDR handling into do_sys_getpeername() from
      do_sys_getsockname()
    - have svr4_stream directly call do_sys_get{sock,peer}name()
      respectively instead of providing `which' & fix a DPRINTF string
      that incorrectly wrote "getpeername" when it meant "getsockname"
    - fix sys_getpeername() and sys_getsockname() to call
      do_sys_get{sock,peer}name() without `which' and `lwp' & adjust
      comments
    - bump kernel version for removal of lwp & which parameters from
      do_sys_getsockname()

note: future cleanup to remove struct mbuf * abuse in
xxx_{peer,sock}name()
still to come, not done in this commit since it is easier to do post
split.

patch reviewed by rmind

welcome to 6.99.47
2014-07-09 04:54:03 +00:00
rtr ff90c29d04 * sprinkle KASSERT(solocked(so)); in all pr_stat() functions.
* fix remaining inconsistent struct socket parameter names.
2014-07-07 17:13:56 +00:00
rtr c8a2e91d5f * split PRU_SENSE functionality out of mpls_usrreq() and place into
separate mpls_stat(struct socket *, struct stat *) function

missed this in previous commit, fixes build of ALL kernel.
2014-07-06 04:47:26 +00:00
rtr 0dedd9772f fix parameter types in pr_ioctl, called xx_control() functions and remove
abuse of pointer to struct mbuf type.

param2 changed to u_long type and uses parameter name 'cmd' (ioctl command)
param3 changed to void * type and uses parameter name 'data'
param4 changed to struct ifnet * and uses parameter name 'ifp'
param5 has been removed (formerly struct lwp *) and uses of 'l' have been
       replaced with curlwp from curproc(9).

callers have had (now unnecessary) casts to struct mbuf * removed, called
code has had (now unnecessary) casts to u_long, void * and struct ifnet *
respectively removed.

reviewed by rmind@
2014-07-01 05:49:18 +00:00
rtr d54d7ab24a * split PRU_CONTROL functionality out of xxx_userreq() switches and place
into separate xxx_ioctl() functions.
* place KASSERT(req != PRU_CONTROL) inside xxx_userreq() as it is now
  inappropriate for req = PRU_CONTROL in xxx_userreq().
* replace calls to pr_generic() with req = PRU_CONTROL with pr_ioctl().
* remove & fixup references to PRU_CONTROL xxx_userreq() function comments.
* fix various comments references for xxx_userreq() that mentioned
  PRU_CONTROL as xxx_userreq() no longer handles the request.

a further change will follow to fix parameter and naming inconsistencies
retained from original code.

Reviewed by rmind@
2014-06-22 08:10:18 +00:00
rmind e401453f3f Adjust PR_WRAP_USRREQS() to include the attach/detach functions.
We still need the kernel-lock for some corner cases.
2014-05-20 19:04:00 +00:00
rmind 4ae03c1815 - Split off PRU_ATTACH and PRU_DETACH logic into separate functions.
- Replace malloc with kmem and eliminate M_PCB while here.
- Sprinkle more asserts.
2014-05-19 02:51:24 +00:00
rmind 39bd8dee77 Add struct pr_usrreqs with a pr_generic function and prepare for the
dismantling of pr_usrreq in the protocols; no functional change intended.
PRU_ATTACH/PRU_DETACH changes will follow soon.

Bump for struct protosw.  Welcome to 6.99.62!
2014-05-18 14:46:15 +00:00
pooka 4f6fb3bf35 Ensure that the top level sysctl nodes (kern, vfs, net, ...) exist before
the sysctl link sets are processed, and remove redundancy.

Shaves >13kB off of an amd64 GENERIC, not to mention >1k duplicate
lines of code.
2014-02-25 18:30:08 +00:00
kefren a502ce7a7c reserve space for ICMP header accordingly to RFC4884 2013-08-07 06:55:00 +00:00
kefren 326bf6fa4a Implement RFC4182 changes - switchable via sysctl 2013-07-23 11:11:55 +00:00
kefren 201ffb9853 explicitly call sysctl setup in init. Needed for future dynamic loading 2013-07-18 06:23:07 +00:00
christos 6c9b9f2ea5 old style def 2012-02-01 16:49:36 +00:00
dyoung 060522dec8 Hide the radix-trie implementation of the forwarding table so that we
will have an easier time replacing it with something different, even if
it is a second radix-trie implementation.

sys/net/route.c and sys/net/rtsock.c no longer operate directly on
radix_nodes or radix_node_heads.

Hopefully this will reduce the temptation to implement multipath or
source-based routing using grotty hacks to the grotty old radix-trie
code, too. :-)
2011-03-31 19:40:51 +00:00
kefren 6fc9085e84 do some rudimentary checks on ip4 header before passing packet to
mpls_icmp_error
2010-07-05 09:54:26 +00:00
kefren ec6d386e53 * correct packet size
* fix crash when cluster was involved
* extension offset is constant
* fix endianess issues in BoS loop
* free cluster if INET not defined but icmp_respond sysctl != 1
2010-07-02 12:25:54 +00:00
kefren 25133d6d8f Fix build for MPLS import: add options MPLS, changed pseudo-device mpls
to pseudo-device ifmpls
2010-06-26 15:17:56 +00:00
kefren 826653c190 Add MPLS support, proposed on tech-net@ a couple of days ago
Welcome to 5.99.33
2010-06-26 14:24:27 +00:00