Work around deadlock between fstchg and fstcnt

When suspending a filesystem in fstrans_setstate(), we wait on
fstcnt for threads to finish transactions. While we do this, any
thread trying to start a filesystem transaction will wait on fstchg
in fstrans_start(), a situation which can deadlock.

The wait for fstcnt in fstrans_setstate() can be interrupted by
a signal, but the wait for fstchg in fstrans_start() cannot. Once
most processes are stuck in fstchg, it is impossible to send a
signal to the thread that waits on fstcnt, because no process
responds to user input anymore.

We fix that by adding a timeout to the wait on fstcnt in
fstrans_setstate(). This means suspending a filesystem may fail,
but that was already the case when the sleep was interrupted by
a signal, so the calling function must already handle a possible
failure.

Fixes kern/53624
manu 2018-09-27 01:03:40 +00:00
parent d794b9b637
commit 83a99212b0
1 changed file with 8 additions and 3 deletions

@@ -1,4 +1,4 @@
-/* $NetBSD: vfs_trans.c,v 1.48 2017/06/18 14:00:17 hannken Exp $ */
+/* $NetBSD: vfs_trans.c,v 1.49 2018/09/27 01:03:40 manu Exp $ */
 /*-
  * Copyright (c) 2007 The NetBSD Foundation, Inc.
@@ -30,7 +30,7 @@
  */
 #include <sys/cdefs.h>
-__KERNEL_RCSID(0, "$NetBSD: vfs_trans.c,v 1.48 2017/06/18 14:00:17 hannken Exp $");
+__KERNEL_RCSID(0, "$NetBSD: vfs_trans.c,v 1.49 2018/09/27 01:03:40 manu Exp $");
 /*
  * File system transaction operations.
@@ -42,6 +42,7 @@ __KERNEL_RCSID(0, "$NetBSD: vfs_trans.c,v 1.48 2017/06/18 14:00:17 hannken Exp $
 #include <sys/param.h>
 #include <sys/systm.h>
+#include <sys/kernel.h>
 #include <sys/atomic.h>
 #include <sys/buf.h>
 #include <sys/kmem.h>
@@ -532,10 +533,14 @@ fstrans_setstate(struct mount *mp, enum fstrans_state new_state)
 	/*
 	 * All threads see the new state now.
 	 * Wait for transactions invalid at this state to leave.
+	 * We cannot wait forever because many processes would
+	 * get stuck waiting for fstcnt in fstrans_start(). This
+	 * is acute when suspending the root filesystem.
 	 */
 	error = 0;
 	while (! state_change_done(mp)) {
-		error = cv_wait_sig(&fstrans_count_cv, &fstrans_lock);
+		error = cv_timedwait_sig(&fstrans_count_cv,
+		    &fstrans_lock, hz / 4);
 		if (error) {
 			new_state = fmi->fmi_state = FSTRANS_NORMAL;
 			break;