$NetBSD: TODO.smpnet,v 1.16 2017/10/25 07:35:40 ozaki-r Exp $ MP-safe components ================== - Device drivers - vioif(4) - vmx(4) - wm(4) - ixg(4) - ixv(4) - Layer 2 - Ethernet (if_ethersubr.c) - bridge(4) - STP - Fast forward (ipflow) - Layer 3 - All except for items in the below section - Interfaces - gif(4) - l2tp(4) - pppoe(4) - if_spppsubr.c - tun(4) - vlan(4) - Packet filters - npf(7) - Others - bpf(4) - ipsec(4) - opencrypto(9) - pfil(9) Non MP-safe components and kernel options ========================================= - Device drivers - Most drivers other than ones listed in the above section - Layer 2 - ARCNET (if_arcsubr.c) - ATM (if_atmsubr.c) - BRIDGE_IPF - if_ecosubr.c - FDDI (if_fddisubr.c) - HIPPI (if_hippisubr.c) - IEEE 1394 (if_ieee1394subr.c) - IEEE 802.11 (ieee80211(4)) - Token ring (if_tokensubr.c) - Layer 3 - IPSELSRC - MROUTING - PIM - MPLS (mpls(4)) - Layer 4 - DCCP - SCTP - TCP - UDP - Interfaces - agr(4) - carp(4) - etherip(4) - faith(4) - gre(4) - ppp(4) - sl(4) - stf(4) - strip(4) - if_srt - tap(4) - Packet filters - ipf(4) - pf(4) - Others - AppleTalk (sys/netatalk/) - ATM (sys/netnatm/) - Bluetooth (sys/netbt/) - altq(4) - CIFS (sys/netsmb/) - ISDN (sys/netisbn/) - kttcp(4) - NFS Know issues =========== NOMPSAFE -------- We use "NOMPSAFE" as a mark that indicates that the code around it isn't MP-safe yet. We use it in comments and also use as part of function names, for example m_get_rcvif_NOMPSAFE. Let's use "NOMPSAFE" to make it easy to find non-MP-safe codes by grep. bpf --- MP-ification of bpf requires all of bpf_mtap* are called in normal LWP context or softint context, i.e., not in hardware interrupt context. For Tx, all bpf_mtap satisfy the requrement. For Rx, most of bpf_mtap are called in softint. Unfortunately some bpf_mtap on Rx are still called in hardware interrupt context. This is the list of the functions that have such bpf_mtap: - sca_frame_process() @ sys/dev/ic/hd64570.c - en_intr() @ sys/dev/ic/midway.c - rxintr_cleanup() and txintr_cleanup() @ sys/dev/pci/if_lmc.c - ipr_rx_data_rdy() @ sys/netisdn/i4b_ipr.c Ideally we should make the functions run in softint somehow, but we don't have actual devices, no time (or interest/love) to work on the task, so instead we provide a deferred bpf_mtap mechanism that forcibly runs bpf_mtap in softint context. It's a workaround and once the functions run in softint, we should use the original bpf_mtap again. Lingering obsolete variables ----------------------------- Some obsolete global variables and member variables of structures remain to avoid breaking old userland programs which directly access such variables via kvm(3). The following programs still use kvm(3) to get some information related to the network stack. - netstat(1) - vmstat(1) - fstat(1) netstat(1) accesses ifnet_list, the head of a list of interface objects (struct ifnet), and traverses each object through ifnet#if_list member variable. ifnet_list and ifnet#if_list is obsoleted by ifnet_pslist and ifnet#if_pslist_entry respectively. netstat also accesses the IP address list of an interface throught ifnet#if_addrlist. struct ifaddr, struct in_ifaddr and struct in6_ifaddr are accessed and the following obsolete member variables are stuck: ifaddr#ifa_list, in_ifaddr#ia_hash, in_ifaddr#ia_list, in6_ifaddr#ia_next and in6_ifaddr#_ia6_multiaddrs. Note that netstat already implements alternative methods to fetch the above information via sysctl(3). vmstat(1) shows statistics of hash tables created by hashinit(9) in the kernel. The statistic information is retrieved via kvm(3). The global variables in_ifaddrhash and in_ifaddrhashtbl, which are for a hash table of IPv4 addresses and obsoleted by in_ifaddrhash_pslist and in_ifaddrhashtbl_pslist, are kept for this purpose. We should provide a means to fetch statistics of hash tables via sysctl(3). fstat(1) shows information of bpf instances. Each bpf instance (struct bpf) is obtained via kvm(3). bpf_d#_bd_next, bpf_d#_bd_filter and bpf_d#_bd_list member variables are obsolete but remain. ifnet#if_xname is also accessed via struct bpf_if and obsolete ifnet#if_list is required to remain to not change the offset of ifnet#if_xname. The statistic counters (bpf#bd_rcount, bpf#bd_dcount and bpf#bd_ccount) are also victims of this restriction; for scalability the statistic counters should be per-CPU and we should stop using atomic operations for them however we have to remain the counters and atomic operations. Scalability ----------- - Per-CPU rtcaches (used in say IP forwarding) aren't scalable on multiple flows per CPU - ipsec(4) isn't scalable on the number of SA/SP; the cost of a look-up is O(n) - opencrypto(9)'s crypto_newsession()/crypto_freesession() aren't scalable as they are serialized by one mutex ec_multi* of ethercom --------------------- ec_multiaddrs and ec_multicnt of struct ethercom and items listed in ec_multiaddrs must be protected by ec_lock. The core of ethernet subsystem is already MP-safe, however, device drivers that use the data should also be fixed. A typical change should be to protect manipulations of the data via ETHER_* macros such as ETHER_FIRST_MULTI by ETHER_LOCK and ETHER_UNLOCK.