    Handle LoR in flush_pagedep_deps(). · 8a1509e4
    Konstantin Belousov authored
    When operating in SU or SU+J mode, ffs_syncvnode() might need to
    instantiate another vnode by inode number while owning the syncing
    vnode lock.  Typically this other vnode is the parent of our vnode,
    but due to renames occurring right before fsync (or during fsync when
    we drop the syncing vnode lock, see below) it might no longer be the
    parent.
    
    Moreover, the called function flush_pagedep_deps() needs to lock the
    other vnode while owning the lock for the vnode that owns the buffer
    for which the dependencies are flushed.  This creates another
    instance of the same LoR (lock order reversal) as was fixed in
    softdep_sync().
    
    Put the generic code for safe relocking into a new SU helper
    get_parent_vp() and use it in flush_pagedep_deps().  The case of safe
    relocking of two vnodes with undefined lock order was extracted into
    the vn helper vn_lock_pair().
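
The idea of locking two objects that have no defined order between them can
be illustrated in userspace.  The following is only a sketch of the general
try-and-back-off technique using pthread mutexes; it is not the kernel code,
and the names lock_pair() and unlock_pair() are hypothetical stand-ins that
do not reflect the actual vn_lock_pair() interface.

```c
#include <assert.h>
#include <pthread.h>
#include <sched.h>

/*
 * Hypothetical userspace analogue: acquire two mutexes that have no
 * defined order.  Block on one, only *try* the other; on failure drop
 * the first and block on the contended one in the next round, so the
 * thread never sleeps while holding a lock.
 */
static void
lock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
	pthread_mutex_t *first = a, *second = b, *tmp;

	assert(a != b);		/* locking the same mutex twice would deadlock */
	for (;;) {
		pthread_mutex_lock(first);
		if (pthread_mutex_trylock(second) == 0)
			return;		/* both locks are held */
		pthread_mutex_unlock(first);
		sched_yield();
		/* Swap roles: block on the lock that was busy. */
		tmp = first;
		first = second;
		second = tmp;
	}
}

static void
unlock_pair(pthread_mutex_t *a, pthread_mutex_t *b)
{
	pthread_mutex_unlock(a);
	pthread_mutex_unlock(b);
}
```

Because a thread in this sketch never blocks while holding one of the two
locks, two threads calling lock_pair(&x, &y) and lock_pair(&y, &x)
concurrently cannot deadlock on each other.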
    
    Due to the call sequence
         ffs_syncvnode()->softdep_sync_buf()->flush_pagedep_deps(),
    ffs_syncvnode() indicates with ERELOOKUP that the passed vnode was
    unlocked in the process, and can return ENOENT if the passed vnode
    was reclaimed.  All callers of the function were inspected.
    
    Because UFS namei lookups store auxiliary information about the
    directory entry in the in-memory directory inode, and this
    information is then used by the UFS code that creates/removes the
    directory entry in the actual mutating VOPs, it is critical that the
    directory vnode lock is not dropped between lookup and VOP.  For
    softdep_prelink(), which ensures that a later link/unlink operation
    can proceed without overflowing the journal, calls were moved to the
    place where it is safe to abandon processing of the VOP because
    mutations are not yet applied.  Then, ERELOOKUP causes a restart of
    the whole VFS operation (typically a VFS syscall) at the top level,
    including the re-lookup of the involved paths.  [Note that we already
    do the same restart for failing calls to vn_start_write(), so
    formally this patch does not introduce new behavior.]
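
The restart-on-ERELOOKUP pattern described above can be sketched as a plain
retry loop.  This is a simplified userspace model, not the VFS code: the
sk_* names, the struct, and the SK_ERELOOKUP constant are all hypothetical
stand-ins introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define	SK_ERELOOKUP	(-5)	/* stand-in for the kernel-internal ERELOOKUP */

struct sk_op {
	int relocks_left;	/* times the lower layer will drop the lock */
	int lookups;		/* number of lookups performed so far */
};

/* Hypothetical lookup step: resolves the paths, locks the directory. */
static void
sk_lookup(struct sk_op *op)
{
	op->lookups++;
}

/*
 * Hypothetical mutating step.  If the directory lock had to be dropped,
 * it bails out with SK_ERELOOKUP before applying any mutation.
 */
static int
sk_vop(struct sk_op *op)
{
	if (op->relocks_left > 0) {
		op->relocks_left--;
		return (SK_ERELOOKUP);
	}
	return (0);
}

/* Top level: redo the whole operation, including the lookup. */
static int
sk_syscall(struct sk_op *op)
{
	int error;

	assert(op != NULL);
	do {
		sk_lookup(op);
		error = sk_vop(op);
	} while (error == SK_ERELOOKUP);
	return (error);
}
```

The point of the model is that the cached lookup state is never reused after
the lock was dropped: every SK_ERELOOKUP result discards it and performs a
fresh lookup before the mutating step is attempted again.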
    
    Similarly, unsafe calls to fsync in the snapshot creation code were
    plugged.  A possible view on these failures is that it does not make
    sense to continue creating the snapshot if the snapshot vnode was
    reclaimed due to a forced unmount.
    
    It is possible that the relock/ERELOOKUP situation occurs in
    ffs_truncate() called from ufs_inactive().  In this case, dropping the
    vnode lock is not safe.  Detect the situation with VI_DOINGINACT and
    reschedule inactivation by setting VI_OWEINACT.  ufs_inactive()
    rechecks VI_OWEINACT and avoids reclaiming the vnode if truncation
    failed this way.
    
    In ffs_truncate(), allocation of the EOF block for partial truncation
    is re-done after the vnode is synced, since we cannot leave the
    buffer locked through ffs_syncvnode().
    
    In collaboration with:	pho
    Reviewed by:	mckusick (previous version), markj
    Tested by:	markj (syzkaller), pho
    Sponsored by:	The FreeBSD Foundation
    Differential revision:	https://reviews.freebsd.org/D26136