to initiate self destruct, i.e. unmount(MNT_FORCE). This, however,
is a semi-controlled self-destruct, since all caches are flushed
before the (possibly) violent unmount takes place.
context. This fixes a long-standing but seldomly seen deadlock,
where the kernel was holding pages busy (due to e.g. readahead
request) while waiting for the server to respond, and the server
made a callback into the kernel asking to invalidate those pages.
... or, well, theoretically fixes, since I didn't have any reliable
way of repeating the deadlock and I think I saw it only twice.
* it depended on the biglock (in a very cruel way)
* it was attached to userspace transactions rather than logical
fs operations
(If someone wants to revisit it some day, most of the stuff can be
reused from cvs history)
a valid optimization, but that's long gone and once VOP_INACTIVE is
called and the file server says that the vnode is going to be recycled,
it really is going to be recycled extra references gained or not.
based on code taken from FreeBSD.
This stops truncation of files larger than 4GB by VOP_SETATTR() which e.g.
happened when copying large files "rump_smbfs". Kudos to Antti Kantee
for diagnosing the problem in smbfs_smb_setfsize().
Change cd9660_mount, in MNT_UPDATE case, to check dev_t's for equality
instead of just vnode pointers. Fixes erroneous "Invalid argument"
errors from mount(8) with -u against cd9660 root in the presence of
mfs or tmpfs /dev prepared after initial mountroot.
Tested on QEMU running cobalt Restore CD.
operations creating a node cannot be considered outgoing operations,
since after return from userspace they modify file system state
by creating a new node. if we do not protect the file system by
holding the directory lock, a lookup operation might race us into
the kernel and create the node earlier.
* remove pnode from hashlish before sending the reclaim faf off to
userspace. also, hold pmp_lock while frobbing the list.
system generate heaps of odd allocations since the end of write request was
overwritten by the start of the second resulting in another relocation.
Also added a full flush of the file on a VOP_CLOSE(). This automatically
flushes file tails to disc.
- pcinfo = kmem_zalloc(sizeof_puffs_cacheinfo) + runsize,
+ pcinfo = kmem_zalloc(sizeof(struct puffs_cacheinfo) + runsize,
in #ifdef'ed out code, per paired kmem_free() in the same function.
Closes PR kern/41840.
complaint in the "ntfs ubc_uiomove error" (ubc_uiomove error was
not coming from ntfs but instead the "to" file system) and PR
kern/38531 (well, I assume the submitter wanted cp(1) working on
ntfs instead of mangling ntfs the way the PR title suggests). Yes,
mmap works on ntfs like it always has.
*) well, um, and in other places too ... uuuh ... no comments.
but I guess this works as long as in-kernel ntfs doesn't grow write
support.
out before automatically.
However, when dealing with faulty discs that fail to mount, system nodes are
of course not written out and thus may still be marked dirty, if only due to
access. Especially on sequential media this gave rise to panics on reading
trackinfo since the write track section had not yet been initialised.
tested with a DEBUG+DIAGNOSTIC+LOCKDEBUG kernel. To summerise NiLFS, i'll
repeat my posting to tech-kern here:
NiLFS stands for New implementation of Logging File System; LFS done
right they claim :) It is at version 2 now and is being developed by NTT, the
Japanese telecom company and recently put into the linux source tree. See
http://www.nilfs.org. The on-disc format is not completely frozen and i expect
at least one minor revision to come in time.
The benefits of NiLFS are build-in fine-grained checkpointing, persistent
snapshots, multiple mounts and very large file and media support. Every
checkpoint can be transformed into a snapshot and v.v. It is said to perform
very well on flash media since it is not overwriting pieces apart from a
incidental update of the superblock, but that might change. It is accompanied
by a cleaner to clean up the segments and recover lost space.
My work is not a port of the linux code; its a new implementation. Porting the
code would be more work since its very linux oriented and never written to be
ported outside linux. The goal is to be fully interchangable. The code is non
intrusive to other parts of the kernel. It is also very light-weight.
The current state of the code is read-only access to both clean and dirty
NiLFS partitions. On mounting a dirty partition it rolls forward the log to
the last checkpoint. Full read-write support is however planned!
Just as the linux code, mount_nilfs allows for the `head' to be mounted
read/write and allows multiple read-only snapshots/checkpoint mounts next to
it.
By allowing the RW mount at a different snapshot for read-write it should be
possible eventually to revert back to a previous state; i.e. try to upgrade a
system and being able to revert to the exact state prior to the upgrade.
Compared to other FS's its pretty light-weight, suitable for embedded use and
on flash media. The read-only code is currently 17kb object code on
NetBSD/i386. I doubt the read-write code will surpass the 50 or 60. Compared
this to FFS being 156kb, UDF being 84 kb and NFS being 130kb. Run-time memory
usage is most likely not very different from other uses though maybe a bit
higher than FFS.
This variable was not actually used uninitialised, but some compilers
(e.g. gcc-4.3.3) warned that the variable might be used uninitialised.
Inspired by PR 41255 from Kurt Lidl.
would push out elements to fillup-read only when the time had come for them.
This could then trickle feed the read queue slowly, but fast enough to prevent
it from switching state.
Benefits are significant speed improvements on node creation/insertion while
keeping the lookup times low and still allowing sequential iteration over the
nodes.
member.
XXX No idea if this is the right solution to this problem, but it does
XXX at least allow thebuild to continue. The original committed should
XXX verify that this does what was intended!
(Hello again, Elad)
Be sure that no other active vnodes remains, before trying to release
the root one. Likewise, do not destroy the smbmount specific structure
if the umount will fail (busy conditions).
No objection from pooka@.
check_console, veriexecclose, veriexec_delete, veriexec_file_add,
emul_find_root, coff_load_shlib (sh3 version), coff_load_shlib,
compat_20_sys_statfs, compat_20_netbsd32_statfs,
ELFNAME2(netbsd32,probe_noteless), darwin_sys_statfs,
ibcs2_sys_statfs, ibcs2_sys_statvfs, linux_sys_uselib,
osf1_sys_statfs, sunos_sys_statfs, sunos32_sys_statfs,
ultrix_sys_statfs, do_sys_mount, fss_create_files (3 of 4),
adosfs_mount, cd9660_mount, coda_ioctl, coda_mount, ext2fs_mount,
ffs_mount, filecore_mount, hfs_mount, lfs_mount, msdosfs_mount,
ntfs_mount, sysvbfs_mount, udf_mount, union_mount, sys_chflags,
sys_lchflags, sys_chmod, sys_lchmod, sys_chown, sys_lchown,
sys___posix_chown, sys___posix_lchown, sys_link, do_sys_pstatvfs,
sys_quotactl, sys_revoke, sys_truncate, do_sys_utimes, sys_extattrctl,
sys_extattr_set_file, sys_extattr_set_link, sys_extattr_get_file,
sys_extattr_get_link, sys_extattr_delete_file,
sys_extattr_delete_link, sys_extattr_list_file, sys_extattr_list_link,
sys_setxattr, sys_lsetxattr, sys_getxattr, sys_lgetxattr,
sys_listxattr, sys_llistxattr, sys_removexattr, sys_lremovexattr
All have been scrutinized (several times, in fact) and compile-tested,
but not all have been explicitly tested in action.
XXX: While I haven't (intentionally) changed the use or nonuse of
XXX: TRYEMULROOT in any of these places, I'm not convinced all the
XXX: uses are correct; an audit might be desirable.
of free blocks on the device and when free blocks are getting tight it tries
to readjust/recalculate that value by syncing the FS.
Second stage will be resizing the data/metadata partitions.
the other routines of the same spirit.
Adjust file-system code to use it.
Keep vaccess() for KPI compatibility and to keep element of least
surprise. A "diagnostic" message warning that vaccess() is deprecated will
be printed when it's used (obviously, only in DIAGNOSTIC kernels).
No objections on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2009/06/21/msg005310.html
the security checks when mounting a device (VOP_ACCESS() + kauth(9) call)).
Proposed with no objections on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2009/04/20/msg004859.html
The vnode is always expected to be locked, so no locking is done outside
the file-system code.
the security checks when mounting a device (VOP_ACCESS() + kauth(9) call)).
Proposed with no objections on tech-kern@:
http://mail-index.netbsd.org/tech-kern/2009/04/20/msg004859.html
The vnode is always expected to be locked, so no locking is done outside
the file-system code.
dirent is no longer cached in lookup and we do the lookup ourselves
in rename, we are most definitely not allowed to assert that it
matches the source vnode passed as an argument. In case the source
node does not exist or has been replaced, punt with ENOENT.
Also, nuke some misleading prehistoric comments which haven't been
valid in over a year.
Fixes PR kern/41128 by Nicolas Joly
There are still about 1600 left, but they have ',' or /* ... */
in the actual variable definitions - which my awk script doesn't handle.
There are also many that need () -> (void).
(The script does handle misordered arguments.)
allow CD-ROM/DVD-ROM/DB-ROM drives to read the media while still allowing them
to be appended later. It can also be seen as a way to make mountable
snapshots.
its currently still misbehaving. If supported later, it would allow one or a
series of sessions on a sequential recordable media to be ignored as if they
never were created.
Also fix a small comment: its not the direct but the bootstrap disc strategy
that we close down.
writable tracks on session opening. Note that this an optionally implemented
feature and thus no error will be generated if it fails; the drive will most
likely autorepair it.
of when working deep into the directory tree it can reliably be reget and
marked correctly as the FS root.
Fixed pwd(1) lock panic and possible endless loop in other tools.
#include "opt_quota.h" which do exactly nothing. Speeds up kernel
compilation by 1.375*10^-20001 seconds. But leave the most moxious
comment in msdosfs_vfsops untouched.
1) Enhance write speed significantly on RMW media like CD-RW, DVD-RW but also
on the DVD+RW and all other ECC blocked media. Significant speedups of access
to the device for say compilation on the DVD. Streaming copy is also still at
maximum speed though vast amounts of directory copy work can show side effects
that appear it to slow down but are actually logical when you consider that
most small files are embedded into the descriptors itself.
2) explicit wait for the created RMW thread to spinup
for now since its a lot slower than `rmw' access.
For archs that have trouble with `rmw' for whatever reason can so use it as a
scapegoat to allways mount savely rdonly though slower.
likely due to the file system being full).
Otherwise we'd fail in VOP_PUTPAGES(), which might not happen during
VOP_WRITE(), thus giving the caller the wrong impression that
writing was succesful.
default case of ptyfs mounted under /dev/pts as any chroot would get
/%d as slave names. This allows null mounts of ptyfs to work.
To allow pty allocation from within chroots, either no ptyfs must be
mounted or a null mount exist.
old somewhat naive selection scheme that didn't allow different allocation
settings for nodes, directory information (FIDs) and data.
Also fix some curious side-effects of atime updates on RMW devices.
it could result in theory result in descriptor trashing.
On the performance side, it would try to fixup *every* descriptor even if
it wasn't an internally allocated one. Performance loss wasn't that big but
every bit helps.
preliminary Metadata partition write support but its disabled still since
its not finished yet and not functioning correctly. All other formats are
checked and should work fine.
improvements of at least 4 times in untarring and roughly 100 to 500 times
on file creation in big directories. Lookup of files was O(n*n) and is now
O(1) even for file creation. Free spaces in the directory are kept in a
seperate list for fast file creation.
The postmark benchmark gives:
UDF old:
pm>set transactions 2000
pm>set number 3000
pm>run
Creating files...Done
Performing transactions..........Done
Deleting files...Done
Time:
1593 seconds total
681 seconds of transactions (2 per second)
Files:
3956 created (2 per second)
Creation alone: 3000 files (4 per second)
Mixed with transactions: 956 files (1 per second)
990 read (1 per second)
1010 appended (1 per second)
3956 deleted (2 per second)
Deletion alone: 2912 files (9 per second)
Mixed with transactions: 1044 files (1 per second)
Data:
5.26 megabytes read (3.38 kilobytes per second)
21.93 megabytes written (14.10 kilobytes per second)
pm>
UDF new:
pm>set transactions 2000
pm>set number 3000
pm>run
Creating files...Done
Performing transactions..........Done
Deleting files...Done
Time:
19 seconds total
3 seconds of transactions (666 per second)
Files:
3956 created (208 per second)
Creation alone: 3000 files (230 per second)
Mixed with transactions: 956 files (318 per second)
990 read (330 per second)
1010 appended (336 per second)
3956 deleted (208 per second)
Deletion alone: 2912 files (970 per second)
Mixed with transactions: 1044 files (348 per second)
Data:
5.26 megabytes read (283.66 kilobytes per second)
21.93 megabytes written (1.15 megabytes per second)
unlock the source directory again on exit. The stub that doesn't allow
cross directory renames for now jumped to the wrong exit point and thus
left a locked directory node that paniced on next locking.
heavily fragmented files.
Also fixing some (rare) allocation bugs and function name streamlining.
Tested on harddisc, CD-RW and CD-R i.e. all three basic backend classes.
corruptions that could take place when overwriting sparse files.
Still one rare corruption possible where blocks are accidentally marked
free, but the cause is not yet found and looking at the pattern it won't
happen in every day use.
the name ".." on a parent path component. To prevent other similar errors,
name length checking is not done but the passed name that shouldn't be
passed is ignored.