it has a bug in the backoff calculation. so,
- clip it to 1-60 sec. (suggested by Rick Macklem)
- use a constant multiplier instead of nfs_backoff, which
is already exponential.
- move some related constant definations to nfs.h from nqnfs.h and
prefix with NFS_ instead of NQ_ because they are not nqnfs-specific.
past the end of the file. This can happen when two clients are writting to
the same file.
Close PR 21696 by myself, discussed on tech-net in 2003/05 and 2003/06.
Issue raised by Chuck Silvers (commit and truncate ops needs to be serialised)
still unadressed.
* Remove the "lwp *" argument that was added to vget(). Turns out
that nothing actually used it!
* Remove the "lwp *" arguments that were added to VFS_ROOT(), VFS_VGET(),
and VFS_FHTOVP(); all they did was pass it to vget() (which, as noted
above, didn't use it).
* Remove all of the "lwp *" arguments to internal functions that were added
just to appease the above.
be inserted into ktrace records. The general change has been to replace
"struct proc *" with "struct lwp *" in various function prototypes, pass
the lwp through and use l_proc to get the process pointer when needed.
Bump the kernel rev up to 1.6V
- for READ procedure, don't send back more bytes than requested.
- don't have doubtful assumptions on mbuf chain structure.
- rename a function (nfsm_adj -> nfs_zeropad) to avoid confusion as
the semantics of the function was changed.
retransmitted mbufs can survive even after requests themselves
finished. so, before unbusy pages, make sure that mbufs referring them
go away.
pointed by enami tsugutomo on port-mips.
while our nfsd announces MAXBSIZE as wtmax for tcp,
VOP_GETPAGES of filesystems that uses genfs_getpages can't
handle >= MAX_READ_AHEAD(16) pages at once.
therefore, depending on PAGE_SIZE of the machine and file offset of
a read request, we can't VOP_GETPAGES the range at once.
instead of "struct vnode". This saves a number of pointer dereferences;
it sums up to about half a kB for me. And it paves the way for future
fixes.
While cleaning up, eliminate a write-only member of "struct nfsreq"
and a pointless assignment in the NFS_V2_ONLY case.
- Under chroot it displays only the visible filesystems with appropriate paths.
- The statfs f_mntonname gets adjusted to contain the real path from root.
- While was there, fixed a bug in ext2fs, locking problems with vfs_getfsstat(),
and factored out some of the vfsop statfs() code to copy_statfs_info(). This
fixes the problem where some filesystems forgot to set fsid.
- Made coda look more like a normal fs.
belong to us. otherwise, data will be lost on server crash.
- use b_bcount instead of b_bufsize to determine
how many pages we should deal with.
based on a patch from Chuck Silvers.
discussed on tech-kern.
Do a little mbuf rework while here. Change all uses of MGET*(*, M_WAIT, *)
to m_get*(M_WAIT, *). These are not performance critical and making them
call m_get saves considerable space. Add m_clget analogue of MCLGET and
make corresponding change for M_WAIT uses.
Modify netinet, gem, fxp, tulip, nfs to support MBUFTRACE.
Begin to change netstat to use sysctl.
(there are still some details to work out) but expect that to go
away soon. To support these basic changes (creation of lfs_putpages,
lfs_gop_write, mods to lfs_balloc) several other changes were made, to
wit:
* Create a writer daemon kernel thread whose purpose is to handle page
writes for the pagedaemon, but which also takes over some of the
functions of lfs_check(). This thread is started the first time an
LFS is mounted.
* Add a "flags" parameter to GOP_SIZE. Current values are
GOP_SIZE_READ, meaning that the call should return the size of the
in-core version of the file, and GOP_SIZE_WRITE, meaning that it
should return the on-disk size. One of GOP_SIZE_READ or
GOP_SIZE_WRITE must be specified.
* Instead of using malloc(...M_WAITOK) for everything, reserve enough
resources to get by and use malloc(...M_NOWAIT), using the reserves if
necessary. Use the pool subsystem for structures small enough that
this is feasible. This also obsoletes LFS_THROTTLE.
And a few that are not strictly necessary:
* Moves the LFS inode extensions off onto a separately allocated
structure; getting closer to LFS as an LKM. "Welcome to 1.6O."
* Unified GOP_ALLOC between FFS and LFS.
* Update LFS copyright headers to correct values.
* Actually cast to unsigned in lfs_shellsort, like the comment says.
* Keep track of which segments were empty before the previous
checkpoint; any segments that pass two checkpoints both dirty and
empty can be summarily cleaned. Do this. Right now lfs_segclean
still works, but this should be turned into an effectless
compatibility syscall.
It will never get back... it will not be found in nfs_nget, a new
nfsnode+vnode is allocated instead, which causes a node leak, and
also makes the mountpointness of the vnode to be forgotten, breaking
filesystem crossing lookups through this vnode.
into nfs_inactive, this is a better place for it.
This doesn't actually solve the actual problem, which appears to be a race
condition with unmounting and vnode recycling somewhere, but it fixes
it in the sense that nfs_reclaim will not reference a bad v_mount anymore.
malloc types into a structure, a pointer to which is passed around,
instead of an int constant. Allow the limit to be adjusted when the
malloc type is defined, or with a function call, as suggested by
Jonathan Stone.
kqueue provides a stateful and efficient event notification framework
currently supported events include socket, file, directory, fifo,
pipe, tty and device changes, and monitoring of processes and signals
kqueue is supported by all writable filesystems in NetBSD tree
(with exception of Coda) and all device drivers supporting poll(2)
based on work done by Jonathan Lemon for FreeBSD
initial NetBSD port done by Luke Mewburn and Jason Thorpe
add a flag that specify if the file can be truncated safely or not
to nfsm_loadattr and friends. when it isn't safe, just mark the nfsnode
as "should be truncated later".
ok'ed by Frank van der Linden and Chuck Silvers.
close kern/18036.
routers dropping the packet
(seems to be a problem with Cisco and its "helper-address" feature;
a Cabletron SSR I tested with didn't have this problem)
PGO_LOCKED getpages request. So, just make the lock fail and tell
the caller that there is no pages available if we can't acquire it.
The caller will call us again soon without PGO_LOCKED. Reviewed by chuq.
set via NFSV3SATTRTIME_TOSERVER and not NFSV3SATTRTIME_TOCLIENT,
add VA_UTIMES_NULL to the va_vflags. This reflects our policy
where we're much more liberal about who can set a & m times to 'now'
than we are about who can set them to a specific time.
Should close PR 15597 from Martin Husemann. Patch is based on the
one Matthias Drochner gave in the PR.
clean and without writable mappings. if we try to flush dirty pages past
EOF to the server when NMODIFIED is clear, we'll update the attrcache before
doing the write, which will try to free the pages past EOF and deadlock.
to deal with this, we write-protect pages before we send them to the server,
and restrict ourselves to creating read-only mappings if NMODIFIED isn't set.
score another one for enami.
deal with shortages of the VM maps where the backing pages are mapped
(usually kmem_map). Try to deal with this:
* Group all information about the backend allocator for a pool in a
separate structure. The pool references this structure, rather than
the individual fields.
* Change the pool_init() API accordingly, and adjust all callers.
* Link all pools using the same backend allocator on a list.
* The backend allocator is responsible for waiting for physical memory
to become available, but will still fail if it cannot callocate KVA
space for the pages. If this happens, carefully drain all pools using
the same backend allocator, so that some KVA space can be freed.
* Change pool_reclaim() to indicate if it actually succeeded in freeing
some pages, and use that information to make draining easier and more
efficient.
* Get rid of PR_URGENT. There was only one use of it, and it could be
dealt with by the caller.
From art@openbsd.org.
rather than using home-grown code to find a free reserved socket.
this also results in nfs pcb's having the INP_ANONPORT and INP_LOWPORT flags
set, which is useful for netstat(1) to know.
frank's scheme, with one new twist: don't wait until we've totally run
out of free pages before committing, but instead notice when we've built
up a largish range of uncommitted pages and commit only the older half of
the range, which is likely to already be on disk on the server.
struct nfssvc_sock.
Affected only when a recordmark of RPC over TCP is fragmented to
multiple mbufs. I do not know whether this code has ever been executed :)
and confusion about the actual filesize. From Matt Dillon's
similar change in FreeBSD.
XXX n_size is really redundant in -current and must die. This commit
XXX is more of a placeholder for a pullup into the 1.5 branch.
uint32_t namei_hash(const char *p, const char **ep)
which determines the equivalent MI hash32_str() hash for p.
If *ep != NULL, calculate the hash to the character before ep.
If *ep == NULL, calculate the has to the first / or NUL found, and
point *ep to that location.
- Use namei_hash() to calculate cn_hash in lookup() and relookup().
Hash distribution goes from 35-40% to 55-70%, with similar profiled
time spent in cache_lookup() and cache_enter() on my P3-600.
- Use namei_hash() to calculate cn_hash in nfs_readdirplusrpc(),
insetad of homegrown code (that differed from that in lookup() !)
namei_hash() has better spread and is faster than previous code
(which used a non-constant multiplication).