1. 23 Mar, 2021 1 commit
    • MFC 6eb60f5b: · dd426d67
      Hans Petter Selasky authored
      Use the word "LinuxKPI" instead of "Linux compatibility", to not confuse with
      user-space Linux compatibility support. No functional change.
      
      Sponsored by:	Mellanox Technologies // NVIDIA Networking
      
      (cherry picked from commit 6eb60f5b)
  2. 19 Jan, 2021 1 commit
  3. 25 Jul, 2020 1 commit
    • Allow swi_sched() to be called from NMI context. · aba10e13
      Alexander Motin authored
      To handle hardware errors reported via NMIs, I need a way to escape NMI
      context, which is too restrictive to do anything significant.
      
      To do so, this change introduces a new swi_sched() flag, SWI_FROMNMI, which
      makes it careful about which KPIs it uses.  On platforms that allow sending
      IPIs from NMI context (x86 for now), it immediately wakes clk_intr_event
      via the new IPI_SWI; otherwise it works just like SWI_DELAY.  To handle the
      delayed SWIs, this patch calls clk_intr_event on every hardclock() tick.
      
      MFC after:	2 weeks
      Sponsored by:	iXsystems, Inc.
      Differential Revision:	https://reviews.freebsd.org/D25754
  4. 23 Jan, 2020 2 commits
  5. 13 Dec, 2019 1 commit
  6. 21 Nov, 2019 1 commit
  7. 20 May, 2019 1 commit
    • Extract eventhandler declarations to sys/_eventhandler.h · e2e050c8
      Conrad Meyer authored
      This allows replacing "sys/eventhandler.h" includes with "sys/_eventhandler.h"
      in other header files (e.g., sys/{bus,conf,cpu}.h) and reduces header
      pollution substantially.
      
      EVENTHANDLER_DECLARE and EVENTHANDLER_LIST_DECLAREs were moved out of .c
      files into appropriate headers (e.g., sys/proc.h, powernv/opal.h).
      
      As a side effect of reduced header pollution, many .c files and headers no
      longer contain needed definitions.  The remainder of the patch addresses
      adding appropriate includes to fix those files.
      
      LOCK_DEBUG and LOCK_FILE_LINE_ARG are moved to sys/_lock.h, as required by
      sys/mutex.h since r326106 (but silently protected by header pollution prior
      to this change).
      
      No functional change (intended).  Of course, any out of tree modules that
      relied on header pollution for sys/eventhandler.h, sys/lock.h, or
      sys/mutex.h inclusion need to be fixed.  __FreeBSD_version has been bumped.
  8. 10 May, 2019 1 commit
    • Bind TCP HPTS (pacer) threads to NUMA domains · 4e255d74
      Andrew Gallatin authored
      Bind the TCP pacer threads to NUMA domains and build per-domain
      pacer-thread lookup tables. These tables allow us to use the
      inpcb's NUMA domain information to match an inpcb with a pacer
      thread on the same domain.
      
      The motivation for this is to keep the TCP connection local to a
      NUMA domain as much as possible.
      
      Thanks to jhb for pre-reviewing an earlier version of the patch.
      
      Reviewed by:	rrs
      Sponsored by:	Netflix
      Differential Revision:	https://reviews.freebsd.org/D20134
  9. 02 Mar, 2019 1 commit
    • powerpc: Scale intrcnt by mp_ncpus · 51244b1e
      Justin Hibbits authored
      On very large powerpc64 systems (2x22x4 power9) it's very easy to run out of
      available IRQs and crash the system at boot.  Scale the count by mp_ncpus,
      similar to x86, so this doesn't happen.  Scaling the I/O IRQs as well is
      left for future work.
      
      Submitted by:	mmacy
      MFC after:	3 weeks
  10. 17 Dec, 2018 1 commit
    • add support for marking interrupt handlers as suspended · 82a5a275
      Andriy Gapon authored
      The goal of this change is to fix a problem with PCI shared interrupts
      during suspend and resume.
      
      I have observed a couple of variations of the following scenario.
      Devices A and B are on the same PCI bus and share the same interrupt.
      Device A's driver is suspended first and the device is powered down.
      Device B generates an interrupt. Interrupt handlers of both drivers are
      called. Device A's interrupt handler accesses registers of the powered
      down device and gets back bogus values (I assume all 0xff). That data is
      interpreted as interrupt status bits, etc. So, the interrupt handler
      gets confused and may produce some noise or enter an infinite loop, etc.
      
      This change affects only PCI devices.  The pci(4) bus driver marks a
      child's interrupt handler as suspended after the child's suspend method
      is called and before the device is powered down.  This is done only for
      traditional PCI interrupts, because only they can be shared.
      
      At the moment the change is only for x86.
      
      Notable changes in core subsystems / interfaces:
      - BUS_SUSPEND_INTR and BUS_RESUME_INTR methods are added to bus
        interface along with convenience functions bus_suspend_intr and
        bus_resume_intr;
      - rman_set_irq_cookie and rman_get_irq_cookie functions are added to
        provide a way to associate an interrupt resource with an interrupt
        cookie;
      - intr_event_suspend_handler and intr_event_resume_handler functions
        are added to the MI interrupt handler interface.
      
      I added two new interrupt handler flags, IH_SUSP and IH_CHANGED, to
      implement the new intr_event functions.  IH_SUSP marks a suspended
      interrupt handler.  IH_CHANGED is used to implement a barrier that
      ensures that a change to the interrupt handler's state is visible
      to future interrupts.
      While there, I fixed some whitespace issues in comments and changed a
      couple of logically boolean variables to be bool.
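
The effect of IH_SUSP on handler dispatch can be pictured with a small user-space sketch. This is not the kernel's code: `struct handler`, `run_handlers()`, `count_call()`, and the bit value chosen for `IH_SUSP` are assumptions made for illustration, and the IH_CHANGED barrier is left out entirely.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical bit value; only the flag name comes from the commit message. */
#define IH_SUSP 0x01

/* Hypothetical stand-in for one handler on a shared interrupt. */
struct handler {
	int		 flags;
	void		(*func)(void *);
	void		*arg;
	struct handler	*next;
};

/* Example handler body: counts how many times it has been invoked. */
static void
count_call(void *arg)
{
	(*(int *)arg)++;
}

/*
 * Run every handler on the list except those marked suspended.
 * Returns the number of handlers that were actually run.
 */
static int
run_handlers(struct handler *head)
{
	struct handler *ih;
	int ran = 0;

	for (ih = head; ih != NULL; ih = ih->next) {
		if (ih->flags & IH_SUSP)
			continue;	/* device may be powered down; skip it */
		assert(ih->func != NULL);
		ih->func(ih->arg);
		ran++;
	}
	return (ran);
}
```

With device A's handler marked suspended, an interrupt raised by device B runs only B's handler, so A's driver never reads registers of the powered-down device.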
      
      MFC after:	1 month (maybe)
      Differential Revision: https://reviews.freebsd.org/D15755
  11. 28 Aug, 2018 1 commit
    • Dynamically allocate IRQ ranges on x86. · fd036dea
      John Baldwin authored
      Previously, x86 used static ranges of IRQ values for different types
      of I/O interrupts.  Interrupt pins on I/O APICs and 8259A PICs used
      IRQ values from 0 to 254.  MSI interrupts used a compile-time-defined
      range starting at 256, and Xen event channels used a
      compile-time-defined range after MSI.  Some recent systems have more
      than 255 I/O APIC interrupt pins which resulted in those IRQ values
      overflowing into the MSI range triggering an assertion failure.
      
      Replace statically assigned ranges with dynamic ranges.  Do a single
      pass computing the sizes of the IRQ ranges (PICs, MSI, Xen) to
      determine the total number of IRQs required.  Allocate the interrupt
      source and interrupt count arrays dynamically once this pass has
      completed.  To minimize runtime complexity these arrays are only sized
      once during bootup.  The PIC range is determined by the PICs present
      in the system.  The MSI and Xen ranges continue to use a fixed size,
      though this does make it possible to turn the MSI range size into a
      tunable in the future.
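
The single sizing pass described above might look roughly like the following user-space sketch. All names here (`struct pic_desc`, `struct irq_layout`, `compute_irq_layout()`, `alloc_irq_counts()`) are hypothetical, and the sketch assumes the PIC range simply ends at the highest pin any PIC claims; the kernel's actual bookkeeping may differ.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical description of one PIC (I/O APIC or 8259A). */
struct pic_desc {
	unsigned irq_base;	/* first IRQ value served by this PIC */
	unsigned npins;		/* number of interrupt pins */
};

/* Hypothetical result of the sizing pass. */
struct irq_layout {
	unsigned first_msi;	/* first IRQ value after the PIC range */
	unsigned first_evtchn;	/* first IRQ value after the MSI range */
	unsigned num_irqs;	/* total number of IRQ values */
};

/* One pass over the PICs, then stack the fixed-size MSI and Xen ranges. */
static struct irq_layout
compute_irq_layout(const struct pic_desc *pics, size_t npics,
    unsigned num_msi, unsigned num_evtchn)
{
	struct irq_layout l = { 0, 0, 0 };
	size_t i;

	for (i = 0; i < npics; i++) {
		unsigned end = pics[i].irq_base + pics[i].npins;

		if (end > l.first_msi)
			l.first_msi = end;
	}
	l.first_evtchn = l.first_msi + num_msi;
	l.num_irqs = l.first_evtchn + num_evtchn;
	return (l);
}

/* Size the per-IRQ counter array once; the caller frees it. */
static unsigned long *
alloc_irq_counts(const struct irq_layout *l)
{
	assert(l->num_irqs > 0);
	return (calloc(l->num_irqs, sizeof(unsigned long)));
}
```

Because the MSI base is computed rather than fixed at 256, a system whose I/O APICs expose more than 255 pins no longer overflows into the MSI range.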
      
      As a result, various places are updated to use dynamic limits instead
      of constants.  In addition, the vmstat(8) utility has been taught to
      understand that some kernels may treat 'intrcnt' and 'intrnames' as
      pointers rather than arrays when extracting interrupt stats from a
      crashdump.  This is determined by the presence (vs absence) of a
      global 'nintrcnt' symbol.
      
      This change reverts r189404 which worked around a buggy BIOS which
      enumerated an I/O APIC twice (using the same memory mapped address for
      both entries but using an IRQ base of 256 for one entry and a valid
      IRQ base for the second entry).  Making the "base" of MSI IRQ values
      dynamic avoids the panic that r189404 worked around, and there may now
      be valid I/O APICs with an IRQ base above 256 which this workaround
      would incorrectly skip.
      
      If in the future the issue reported in PR 130483 reoccurs, we will
      have to add a pass over the I/O APIC entries in the MADT to detect
      duplicates using the memory mapped address and use some strategy to
      choose the "correct" one.
      
      While here, reserve room in intrcnts for the Hyper-V counters.
      
      PR:		229429, 130483
      Reviewed by:	kib, royger, cem
      Tested by:	royger (Xen), kib (DMAR)
      Approved by:	re (gjb)
      MFC after:	2 weeks
      Differential Revision:	https://reviews.freebsd.org/D16861
  12. 03 Aug, 2018 1 commit
    • safer wait-free iteration of shared interrupt handlers · e0fa977e
      Andriy Gapon authored
      The code that iterates a list of interrupt handlers for a (shared)
      interrupt, whether in the ISR context or in the context of an interrupt
      thread, does so in a lock-free fashion.   Thus, the routines that modify
      the list need to take special steps to ensure that the iterating code
      has a consistent view of the list.  Previously, those routines tried to
      play nice only with the code running in the ithread context.  The
      iteration in the ISR context was left to chance.
      
      After commit r336635, atomic operations and memory fences are used to
      ensure that the ie_handlers list is always safe to navigate with respect to
      insertion and removal of list elements.
      
      There is still a question of when it is safe to actually free a removed
      element.
      
      The idea of this change is somewhat similar to the idea of epoch-based
      reclamation.  There are some simplifications compared to general
      epoch-based reclamation.  All writers are serialized using a
      mutex, so we do not need to worry about concurrent modifications.  Also,
      all read accesses from the open context are serialized too.
      
      So, we can get away with just two epochs / phases.  When a thread removes an
      element, it switches the global phase from the current phase to the other
      and then drains the previous phase.  Only after the draining is the removed
      element actually freed.  The code that iterates the list in the ISR
      context takes a snapshot of the global phase and then increments the use
      count of that phase before iterating the list.  The use count (in the
      same phase) is decremented after the iteration.  This should ensure that
      no iteration over the removed element is in progress when it gets
      freed.
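
The two-phase scheme can be sketched in portable C11 as follows. This is a simplified illustration, not the kernel implementation: `struct phase_state` and the function names are hypothetical, default sequentially consistent atomics stand in for the kernel's explicit fences, and the writer is assumed to already hold the serializing mutex.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical names; these are not the kernel's actual fields. */
struct phase_state {
	atomic_int phase;	/* current global phase: 0 or 1 */
	atomic_int active[2];	/* iterations in progress, per phase */
};

static void
phase_init(struct phase_state *ps)
{
	atomic_init(&ps->phase, 0);
	atomic_init(&ps->active[0], 0);
	atomic_init(&ps->active[1], 0);
}

/* Reader (ISR side): snapshot the phase, then bump its use count. */
static int
reader_enter(struct phase_state *ps)
{
	int p = atomic_load(&ps->phase);

	atomic_fetch_add(&ps->active[p], 1);
	return (p);		/* pass this back to reader_exit() */
}

/* Reader: drop the use count of the phase taken at entry. */
static void
reader_exit(struct phase_state *ps, int p)
{
	atomic_fetch_sub(&ps->active[p], 1);
}

/* Writer: switch to the other phase; returns the phase to drain. */
static int
writer_flip(struct phase_state *ps)
{
	int old = atomic_load(&ps->phase);

	atomic_store(&ps->phase, !old);
	return (old);
}

/* Writer: true once no iteration that started in phase p remains. */
static bool
phase_drained(struct phase_state *ps, int p)
{
	return (atomic_load(&ps->active[p]) == 0);
}
```

A remover would unlink the element, call `writer_flip()`, spin until `phase_drained()` reports true for the returned phase, and only then free the element.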
      
      This commit also simplifies the coordination with the interrupt thread
      context.  Now we always schedule the interrupt thread when removing one
      of handlers for its interrupt.  This makes the code both simpler and
      safer as the interrupt thread masks the interrupt thus ensuring that
      there is no interaction with the ISR context.
      
      P.S.  This change matters only for shared interrupts and I realize that
      those are becoming a thing of the past (and quickly).  I also understand
      that the problem that I am trying to solve is extremely rare.
      
      PR:		229106
      Reviewed by:	cem
      Discussed with:	Samy Al Bahra
      MFC after:	5 weeks
      Differential Revision: https://reviews.freebsd.org/D15905
  13. 23 Jul, 2018 1 commit
    • change interrupt event's list of handlers from TAILQ to CK_SLIST · 111b043c
      Andriy Gapon authored
      The primary reason for this commit is to separate mechanical and nearly
      mechanical code changes from an upcoming fix for unsafe teardown of
      shared interrupt handlers that have only filters (see D15905).
      
      The technical rationale is that SLIST is sufficient.  The only operation
      that gets worse performance -- O(n) instead of O(1) -- is the removal of a
      handler, but it is not a critical operation and the list is expected to
      be rather short.
      
      Additionally, it is easier to reason about SLIST when considering the
      concurrent lock-free access to the list from the interrupt context and
      the interrupt thread.
      
      CK_SLIST is used because the upcoming change depends on the memory order
      provided by CK_SLIST insert and the fact that CK_SLIST remove does not
      trash the linkage in a removed element.
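
The two properties mentioned above, O(n) removal and intact linkage in the removed element, can be shown with a plain singly-linked list. This sketch uses ordinary pointers and hypothetical names (`struct handler`, `slist_remove()`); it does not use the CK_SLIST macros and omits the atomic stores and fences they provide.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for an interrupt handler list entry. */
struct handler {
	int		 id;
	struct handler	*next;
};

/*
 * Remove victim from the list by walking from the head: O(n).
 * Returns 1 if the element was found and unlinked, 0 otherwise.
 * victim->next is deliberately left untouched, so a reader that is
 * currently positioned on the removed element can still move on to
 * the rest of the list.
 */
static int
slist_remove(struct handler **head, struct handler *victim)
{
	struct handler **pp;

	assert(head != NULL && victim != NULL);
	for (pp = head; *pp != NULL; pp = &(*pp)->next) {
		if (*pp == victim) {
			*pp = victim->next;	/* unlink only */
			return (1);
		}
	}
	return (0);
}
```

Freeing the removed element is a separate step that has to wait until no reader can still be standing on it.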
      
      While here, I also fixed a couple of whitespace issues, made code under
      ifdef notyet compilable, added a lock assertion to ithread_update() and
      made intr_event_execute_handlers() static as it had no external callers.
      
      Reviewed by:	cem (earlier version)
      MFC after:	4 weeks
      Differential Revision: https://reviews.freebsd.org/D16016
  14. 27 Nov, 2017 1 commit
    • sys/sys: further adoption of SPDX licensing ID tags. · c4e20cad
      Pedro F. Giffuni authored
      Mainly focus on files that use BSD 2-Clause license, however the tool I
      was using misidentified many licenses so this was mostly a manual - error
      prone - task.
      
      The Software Package Data Exchange (SPDX) group provides a specification
      to make it easier for automated tools to detect and summarize well known
      opensource licenses. We are gradually adopting the specification, noting
      that the tags are considered only advisory and do not, in any way,
      supersede or replace the license texts.
  15. 03 May, 2017 1 commit
  16. 17 Sep, 2014 1 commit
  17. 18 Jul, 2011 1 commit
    • - Remove the eintrcnt/eintrnames usage and introduce the concept of · 521ea19d
      Attilio Rao authored
        sintrcnt/sintrnames which are symbols containing the size of the 2
        tables.
      - For amd64/i386 remove the storage of intr* stuff from assembly files.
        This area can be widely improved by applying the same to other
        architectures, likely finding a unified approach among them, and
        moving the whole code to be MI. More work in this area is expected to
        happen fairly soon.
      
      No MFC is planned for this patch.
      
      Tested by:	pluknet
      Reviewed by:	jhb
      Approved by:	re (kib)
  18. 21 Mar, 2011 1 commit
  19. 03 Nov, 2010 1 commit
  20. 19 Oct, 2010 1 commit
  21. 21 Jan, 2010 2 commits
    • MFC 198411: · 49cc1344
      John Baldwin authored
      - Fix several off-by-one errors when using MAXCOMLEN.  The p_comm[] and
        td_name[] arrays are actually MAXCOMLEN + 1 in size and a few places that
        created shadow copies of these arrays were just using MAXCOMLEN.
      - Prefer using sizeof() of an array type to explicit constants for the
        array length in a few places.
      - Ensure that all of p_comm[] and td_name[] is always zero'd during
        execve() to guard against any possible information leaks.  Previously
        trailing garbage in p_comm[] could be leaked to userland in ktrace
        record headers via td_name[].
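
The three points above can be illustrated with a small user-space sketch. `struct fake_proc` and `set_comm()` are hypothetical stand-ins, and the value 19 for `MAXCOMLEN` is an assumption about the FreeBSD definition; the code is not taken from the commit.

```c
#include <assert.h>
#include <string.h>

/* Assumed value; the point is the "+ 1", not the exact number. */
#define MAXCOMLEN 19

/* Hypothetical stand-in for the structure that holds p_comm[]. */
struct fake_proc {
	char p_comm[MAXCOMLEN + 1];	/* room for the terminating NUL */
};

/*
 * Set the command name without leaking stale bytes: zero the whole
 * array first, then copy at most sizeof - 1 characters.  Using
 * sizeof() on the array avoids repeating (and mistyping) the constant.
 */
static void
set_comm(struct fake_proc *p, const char *name)
{
	assert(p != NULL && name != NULL);
	memset(p->p_comm, 0, sizeof(p->p_comm));
	strncpy(p->p_comm, name, sizeof(p->p_comm) - 1);
}
```

A shadow copy declared as `char buf[MAXCOMLEN]` would be one byte too small to hold a full-length name plus its terminator, which is the off-by-one the first item describes.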
    • MFC 198134,198149,198170,198171,198391,200948: · 7b10638c
      John Baldwin authored
      Add a facility for associating optional descriptions with active interrupt
      handlers.  This is primarily intended as a way to allow devices that use
      multiple interrupts (e.g. MSI) to meaningfully distinguish the various
      interrupt handlers.
      - Add a new BUS_DESCRIBE_INTR() method to the bus interface to associate
        a description with an active interrupt handler setup by BUS_SETUP_INTR.
        It has a default method (bus_generic_describe_intr()) which simply passes
        the request up to the parent device.
      - Add a bus_describe_intr() wrapper around BUS_DESCRIBE_INTR() that supports
        printf(9) style formatting using var args.
      - Reserve MAXCOMLEN bytes in the intr_handler structure to hold the name of
        an interrupt handler and copy the name passed to intr_event_add_handler()
        into that buffer instead of just saving the pointer to the name.
      - Add a new intr_event_describe_handler() which appends a description string
        to an interrupt handler's name.
      - Implement support for interrupt descriptions on amd64, i386, and sparc64 by
        having the nexus(4) driver supply a custom bus_describe_intr method that
        invokes a new intr_describe() MD routine which in turn looks up the
        associated interrupt event and invokes intr_event_describe_handler().
  22. 23 Oct, 2009 2 commits
  23. 15 Oct, 2009 1 commit
    • Add a facility for associating optional descriptions with active interrupt · 37b8ef16
      John Baldwin authored
      handlers.  This is primarily intended as a way to allow devices that use
      multiple interrupts (e.g. MSI) to meaningfully distinguish the various
      interrupt handlers.
      - Add a new BUS_DESCRIBE_INTR() method to the bus interface to associate
        a description with an active interrupt handler setup by BUS_SETUP_INTR.
        It has a default method (bus_generic_describe_intr()) which simply passes
        the request up to the parent device.
      - Add a bus_describe_intr() wrapper around BUS_DESCRIBE_INTR() that supports
        printf(9) style formatting using var args.
      - Reserve MAXCOMLEN bytes in the intr_handler structure to hold the name of
        an interrupt handler and copy the name passed to intr_event_add_handler()
        into that buffer instead of just saving the pointer to the name.
      - Add a new intr_event_describe_handler() which appends a description string
        to an interrupt handler's name.
      - Implement support for interrupt descriptions on amd64 and i386 by having
        the nexus(4) driver supply a custom bus_describe_intr method that invokes
        a new intr_describe() MD routine which in turn looks up the associated
        interrupt event and invokes intr_event_describe_handler().
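
Appending a description to a handler name held in a fixed-size buffer could work along these lines. This is a sketch under assumptions, not the kernel routine: `struct fake_handler` and `describe_handler()` are hypothetical, the buffer is sized `MAXCOMLEN + 1` with an assumed value of 19, and the "name:description" layout with replacement of an earlier description is a design choice made for the example.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

#define MAXCOMLEN 19	/* assumed value */

/* Hypothetical stand-in for the handler structure's name buffer. */
struct fake_handler {
	char ih_name[MAXCOMLEN + 1];
};

/*
 * Append ":descr" to the handler's name.  An earlier description (any
 * text after the first ':') is replaced.  Returns ENOSPC and leaves
 * the name unchanged if the result would not fit.
 */
static int
describe_handler(struct fake_handler *ih, const char *descr)
{
	char *start;
	size_t space;

	assert(ih != NULL && descr != NULL);
	start = strchr(ih->ih_name, ':');
	if (start == NULL)
		start = ih->ih_name + strlen(ih->ih_name);
	/* Bytes left after the name, keeping one for the NUL. */
	space = sizeof(ih->ih_name) - (size_t)(start - ih->ih_name) - 1;
	if (strlen(descr) + 1 > space)
		return (ENOSPC);
	*start = ':';
	strcpy(start + 1, descr);
	return (0);
}
```

A driver with several MSI vectors could then label each handler, for example with per-queue names, so the vectors can be told apart in interrupt statistics.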
      
      Requested by:	many
      Reviewed by:	scottl
      MFC after:	2 weeks
  24. 20 May, 2009 1 commit
  25. 19 Oct, 2008 1 commit
  26. 15 Sep, 2008 1 commit
  27. 18 Jul, 2008 2 commits
  28. 11 Apr, 2008 1 commit
    • - Add the interrupt vector number to intr_event_create so MI code can · 9b33b154
      Jeff Roberson authored
         look up hard interrupt events by number.  Ignore the irq# for soft intrs.
       - Add support to cpuset for binding hardware interrupts.  This has the
         side effect of binding any ithread associated with the hard interrupt.
         As per restrictions imposed by MD code we can only bind interrupts to
         a single cpu presently.  Interrupts can be 'unbound' by binding them
         to all cpus.
      
      Reviewed by:	jhb
      Sponsored by:	Nokia
  29. 05 Apr, 2008 1 commit
    • Add a MI intr_event_handle() routine for the non-INTR_FILTER case. This · 1ee1b687
      John Baldwin authored
      allows all the INTR_FILTER #ifdef's to be removed from the MD interrupt
      code.
      - Rename the intr_event 'eoi', 'disable', and 'enable' hooks to
        'post_filter', 'pre_ithread', and 'post_ithread' to be less x86-centric.
        Also, add a comment describing what the MI code expects them to do.
      - On amd64, i386, and powerpc this is effectively a NOP.
      - On arm, don't bother masking the interrupt unless the ithread is
        scheduled in the non-INTR_FILTER case to match what INTR_FILTER did.
        Also, don't bother unmasking the interrupt in the post_filter case if
        we never masked it.  The INTR_FILTER case had been doing this by having
        arm_unmask_irq for the post_filter (formerly 'eoi') hook.
      - On ia64, stray interrupts are now masked for the non-INTR_FILTER case.
        They were already masked in the INTR_FILTER case.
      - On sparc64, use a NULL pre_ithread hook and use intr_enable_eoi() for
        both the 'post_filter' and 'post_ithread' hooks to match what the
        non-INTR_FILTER code did.
      - On sun4v, retire the ithread wrapper hack by using an appropriate
        'post_ithread' hook instead (it's what 'post_ithread'/'enable' was
        designed to do even in 5.x).
      
      Glanced at by:	piso
      Reviewed by:	marius
      Requested by:	marius [1], [5]
      Tested on:	amd64, i386, arm, sparc64
  30. 17 Mar, 2008 1 commit
    • Simplify the interrupt code a bit: · 6d2d1c04
      John Baldwin authored
      - Always include the ie_disable and ie_eoi methods in 'struct intr_event'
        and collapse down to one intr_event_create() routine.  The disable and
        eoi hooks simply aren't used currently in the !INTR_FILTER case.
      - Expand 'disab' to 'disable' in a few places.
      - Use function casts for arm and i386:intr_eoi_src() instead of wrapper
        routines to trim one extra indirection.
      
      Compiled on:	{arm,amd64,i386,ia64,ppc,sparc64} x {FILTER, !FILTER}
      Tested on:	{amd64,i386} x {FILTER, !FILTER}
  31. 14 Mar, 2008 1 commit
    • Add preliminary support for binding interrupts to CPUs: · eaf86d16
      John Baldwin authored
      - Add a new intr_event method ie_assign_cpu() that is invoked when the MI
        code wishes to bind an interrupt source to an individual CPU.  The MD
        code may reject the binding with an error.  If an assign_cpu function
        is not provided, then the kernel assumes the platform does not support
        binding interrupts to CPUs and fails all requests to do so.
      - Bind ithreads to CPUs on their next execution loop once an interrupt
        event is bound to a CPU.  Only shared ithreads are bound.  We currently
        leave private ithreads for drivers using filters + ithreads in the
        INTR_FILTER case unbound.
      - A new intr_event_bind() routine is used to bind an interrupt event to
        a CPU.
      - Implement binding on amd64 and i386 by way of the existing pic_assign_cpu
        PIC method.
      - For x86, provide a 'intr_bind(IRQ, cpu)' wrapper routine that looks up
        an interrupt source and binds its interrupt event to the specified CPU.
        MI code can currently (ab)use this by doing:
      
      	intr_bind(rman_get_start(irq_res), cpu);
      
        however, I plan to add a truly MI interface (probably a bus_bind_intr(9))
        where the implementation in the x86 nexus(4) driver would end up calling
        intr_bind() internally.
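
The optional-hook behaviour described in the first item can be sketched in user space as below. The structure, the function names, and the choice of `EOPNOTSUPP` for the "no hook" case are assumptions made for illustration; only the idea of an MD `assign_cpu` method that may be absent or may reject a request comes from the commit message.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical, simplified interrupt event. */
struct intr_event_sketch {
	int	irq;
	int	cpu;				/* -1 while unbound */
	int	(*assign_cpu)(int irq, int cpu);	/* MD hook; may be NULL */
};

/* Example MD hook that accepts every binding. */
static int
md_accept(int irq, int cpu)
{
	(void)irq;
	(void)cpu;
	return (0);
}

/* Example MD hook that rejects every binding. */
static int
md_reject(int irq, int cpu)
{
	(void)irq;
	(void)cpu;
	return (EINVAL);
}

/*
 * MI side: fail if the platform provides no hook, otherwise let the
 * MD code decide and record the CPU only on success.
 */
static int
event_bind(struct intr_event_sketch *ie, int cpu)
{
	int error;

	assert(ie != NULL);
	if (ie->assign_cpu == NULL)
		return (EOPNOTSUPP);
	error = ie->assign_cpu(ie->irq, cpu);
	if (error != 0)
		return (error);
	ie->cpu = cpu;
	return (0);
}
```

Keeping the hook optional lets platforms without binding support share the same MI code path and simply fail every request.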
      
      Requested by:	kmacy, gallatin, jeff
      Tested on:	{amd64, i386} x {regular, INTR_FILTER}
  32. 06 May, 2007 1 commit
    • Bring in the remaining bits to make interrupt filtering work: · bafe5a31
      Paolo Pisati authored
      o push much of the i386 and amd64 MD interrupt handling code
        (intr_machdep.c::intr_execute_handlers()) into MI code
        (kern_intr.c::ithread_loop())
      o move filter handling to kern_intr.c::intr_filter_loop()
      o factor out the code necessary to mask and ack an interrupt event
        (intr_machdep.c::intr_eoi_src() and intr_machdep.c::intr_disab_eoi_src()),
        and make them part of 'struct intr_event', passing them as arguments to
        kern_intr.c::intr_event_create().
      o spawn a private ithread per handler (struct intr_handler::ih_thread)
        with filter and ithread functions.
      
      Approved by: re (implicit?)
  33. 19 Apr, 2007 1 commit
    • Bump the interrupt storm detection counter to 1000. My slow fileserver · 0ae62c18
      Nate Lawson authored
      gets a bogus irq storm detected when periodic daily kicks off at 3 am
      and disconnects the disk.  Change the print logic to print once per second
      when the storm is occurring instead of only once.  Otherwise, it appeared
      that something else was causing the errors each night at 3 am since the
      print only occurred the first time.
      
      Reviewed by:	jhb
      MFC after:	1 week
  34. 23 Feb, 2007 1 commit
  35. 12 Dec, 2006 1 commit
  36. 26 Oct, 2005 1 commit