    Change synchronization rules for vm_page reference counting. · fee2a2fa
    Mark Johnston authored
    There are several mechanisms by which a vm_page reference is held,
    preventing the page from being freed back to the page allocator.  In
    particular, holding the page's object lock is sufficient to prevent the
    page from being freed; holding the busy lock or a wiring is sufficient as
    well.  These references are protected by the page lock, which must
    therefore be acquired for many per-page operations.  This results in
    false sharing since the page locks are external to the vm_page
    structures themselves and each lock protects multiple structures.
    
    Transition to using an atomically updated per-page reference counter.
    The object's reference is counted using a flag bit in the counter.  A
    second flag bit is used to atomically block new references via
    pmap_extract_and_hold() while removing managed mappings of a page.
    Thus, the reference count of a page is guaranteed not to increase if the
    page is unbusied, unmapped, and the object's write lock is held.  As
    a consequence of this, the page lock no longer protects a page's
    identity; operations which move pages between objects are now
    synchronized solely by the objects' locks.
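
    As an illustration only, the counter scheme described above can be
    modelled in user space with C11 atomics.  This is a sketch under
    stated assumptions, not the code from this change: struct page, the
    REF_* bit positions and the page_*() helpers are hypothetical names
    chosen for the example.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical layout: flag names and bit positions are illustrative only. */
#define REF_OBJECT   0x80000000u	/* the owning object's reference */
#define REF_BLOCKED  0x40000000u	/* new mapping-derived references refused */
#define REF_COUNT(c) ((c) & ~(REF_OBJECT | REF_BLOCKED))

struct page {
	_Atomic uint32_t ref_count;
};

/* Take a reference through a mapping; fails while REF_BLOCKED is set. */
static bool
page_ref_mapped(struct page *p)
{
	uint32_t old = atomic_load(&p->ref_count);

	do {
		if ((old & REF_BLOCKED) != 0)
			return (false);
	} while (!atomic_compare_exchange_weak(&p->ref_count, &old, old + 1));
	return (true);
}

/* Refuse new mapping-derived references, e.g. while mappings are removed. */
static void
page_block_mapped(struct page *p)
{
	atomic_fetch_or(&p->ref_count, REF_BLOCKED);
}

/* Allow mapping-derived references again. */
static void
page_unblock_mapped(struct page *p)
{
	atomic_fetch_and(&p->ref_count, ~REF_BLOCKED);
}

/* Drop one counted reference; true if no counted or object reference is left. */
static bool
page_unref(struct page *p)
{
	uint32_t old = atomic_fetch_sub(&p->ref_count, 1);

	return (((old - 1) & ~REF_BLOCKED) == 0);
}

/* Drop the object's flag-bit reference; true if nothing else holds the page. */
static bool
page_drop_object(struct page *p)
{
	uint32_t old = atomic_fetch_and(&p->ref_count, ~REF_OBJECT);

	return ((old & ~(REF_OBJECT | REF_BLOCKED)) == 0);
}
```

    In this model, once page_block_mapped() has run and no other path
    can take a reference, the counter can only decrease, which mirrors
    the guarantee stated above for an unbusied, unmapped page whose
    object's write lock is held.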
    
    The vm_page_wire() and vm_page_unwire() KPIs are changed.  The former
    requires that either the object lock or the busy lock is held.  The
    latter no longer has a return value and may free the page if it releases
    the last reference to that page.  vm_page_unwire_noq() behaves the same
    as before; the caller is responsible for checking its return value and
    freeing or enqueuing the page as appropriate.  vm_page_wire_mapped() is
    introduced for use in pmap_extract_and_hold().  It fails if the page is
    concurrently being unmapped, typically triggering a fallback to the
    fault handler.  vm_page_wire() no longer requires the page lock and
    vm_page_unwire() now internally acquires the page lock when releasing
    the last wiring of a page (since the page lock still protects a page's
    queue state).  In particular, synchronization details are no longer
    leaked into the caller.
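
    The difference between the two unwire flavours can be sketched with
    a small user-space model.  Everything here is an assumption made for
    illustration: struct model_page, MODEL_REF_OBJECT and the model_*()
    functions are hypothetical and do not reproduce the real KPI
    signatures, and the freed/enqueued booleans merely stand in for the
    page allocator and the paging queues.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_REF_OBJECT 0x80000000u	/* hypothetical object-reference bit */

struct model_page {
	_Atomic uint32_t refs;	/* wirings plus the object flag bit */
	bool freed;		/* stand-in for "returned to the allocator" */
	bool enqueued;		/* stand-in for "placed on a paging queue" */
};

/* Add a wiring; the caller is assumed to hold the object or busy lock. */
static void
model_wire(struct model_page *m)
{
	atomic_fetch_add(&m->refs, 1);
}

/*
 * "noq" flavour: report whether the last wiring was released and leave
 * freeing or enqueuing to the caller.
 */
static bool
model_unwire_noq(struct model_page *m)
{
	uint32_t old = atomic_fetch_sub(&m->refs, 1);

	return ((old & ~MODEL_REF_OBJECT) == 1);
}

/*
 * Self-contained flavour: no return value.  On the last wiring the page
 * is either freed (no object reference left) or requeued.  A real
 * implementation would need a lock around the queue update; the message
 * says the page lock still protects queue state.
 */
static void
model_unwire(struct model_page *m)
{
	uint32_t old = atomic_fetch_sub(&m->refs, 1);

	if ((old & ~MODEL_REF_OBJECT) != 1)
		return;		/* other wirings remain */
	if ((old & MODEL_REF_OBJECT) != 0)
		m->enqueued = true;
	else
		m->freed = true;
}
```

    A caller of model_unwire() never inspects the counter or takes a
    lock itself, whereas a caller of model_unwire_noq() must act on the
    returned value.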
    
    The change excises the page lock from several frequently executed code
    paths.  In particular, vm_object_terminate() no longer bounces between
    page locks as it releases an object's pages, and direct I/O and
    sendfile(SF_NOCACHE) completions no longer require the page lock.  In
    these latter cases we now get linear scalability in the common scenario
    where different threads are operating on different files.
    
    __FreeBSD_version is bumped.  The DRM ports have been updated to
    accommodate the KPI changes.
    
    Reviewed by:	jeff (earlier version)
    Tested by:	gallatin (earlier version), pho
    Sponsored by:	Netflix
    Differential Revision:	https://reviews.freebsd.org/D20486