1. 11 Mar, 2021 10 commits
    • Warner Losh's avatar
      man: Remove obsolete info from hosts man page · c22076b5
      Warner Losh authored
      The NIC no longer provides a host database, and hasn't for quite some
      time. Remove that paragraph; it's not been relevant for many years. Also, hosts
      appeared in 4.1c, not 4.2, so correct that too.
      Noticed by: Henry Bent
    • Warner Losh's avatar
      nvme: use config_intrhook_drain to avoid removable card races · 8423f5d4
      Warner Losh authored
      nvme drives are configured early in boot. However, a number of the configuration
      steps take a while, so we defer those to a config intrhook that runs
      before the root filesystem is mounted. At the same time, the PCI hot plug wakes
      up and tests the status of the card. It may decide that the card has gone away
      and deletes the child. As part of that process nvme_detach is called. If this
      call happens after the config_intrhook starts to run, but before it is finished,
      there's a race where we can tear down the device's soft state while the
      config_intrhook is still using it. Use the new config_intrhook_drain to
      disestablish the hook. Either it will be removed w/o running, or the routine
      will wait for it to finish. This closes the race and allows safe hotplug at any
      time, even very early in boot.
      Sponsored by:		Netflix, Inc
      Reviewed by:		jhb, mav
      Differential Revision:	https://reviews.freebsd.org/D29006
    • Warner Losh's avatar
      config_intrhook: provide config_intrhook_drain · e5236836
      Warner Losh authored
      config_intrhook_drain will remove the hook from the list as
      config_intrhook_disestablish does if the hook hasn't been called.  If it has,
      config_intrhook_drain will wait for the hook to be disestablished in the normal
      course (or expedited, it's up to the driver to decide how and when
      to call config_intrhook_disestablish).
      This is intended for removable devices that use config_intrhook and might be
      attached early in boot, but that may be removed before the kernel can call the
      config_intrhook or before it ends. To prevent all races, the detach routine will
      need to call config_intrhook_drain.
      Sponsored by:		Netflix, Inc
      Reviewed by:		jhb, mav, gde (in D29006 for man page)
      Differential Revision:	https://reviews.freebsd.org/D29005
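
The drain semantics described in this commit can be modelled in user space. What follows is a minimal sketch using POSIX threads, not the kernel implementation. Every name in it (`model_hook`, `model_hook_run`, `model_hook_drain`, `model_count_hook`) is hypothetical, and it simplifies the real interface by marking the hook done when its function returns, rather than when the driver calls config_intrhook_disestablish.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical user-space model of a config hook; this is not kernel code. */
enum model_state { MODEL_QUEUED, MODEL_RUNNING, MODEL_DONE };

struct model_hook {
    pthread_mutex_t lock;
    pthread_cond_t cv;
    enum model_state state;
    void (*func)(void *);
    void *arg;
};

void
model_hook_init(struct model_hook *h, void (*func)(void *), void *arg)
{
    assert(h != NULL && func != NULL);
    pthread_mutex_init(&h->lock, NULL);
    pthread_cond_init(&h->cv, NULL);
    h->state = MODEL_QUEUED;
    h->func = func;
    h->arg = arg;
}

/* Boot side: run the hook unless a drain already removed it. */
bool
model_hook_run(struct model_hook *h)
{
    pthread_mutex_lock(&h->lock);
    if (h->state != MODEL_QUEUED) {
        pthread_mutex_unlock(&h->lock);
        return (false);
    }
    h->state = MODEL_RUNNING;
    pthread_mutex_unlock(&h->lock);

    h->func(h->arg);    /* may still be using the device's soft state */

    pthread_mutex_lock(&h->lock);
    h->state = MODEL_DONE;
    pthread_cond_broadcast(&h->cv);
    pthread_mutex_unlock(&h->lock);
    return (true);
}

/*
 * Detach side: returns the state seen on entry.  Once this returns, the
 * hook is neither queued nor running, so soft state can be torn down.
 */
enum model_state
model_hook_drain(struct model_hook *h)
{
    enum model_state seen;

    pthread_mutex_lock(&h->lock);
    seen = h->state;
    if (seen == MODEL_QUEUED)
        h->state = MODEL_DONE;    /* removed without ever running */
    while (h->state == MODEL_RUNNING)
        pthread_cond_wait(&h->cv, &h->lock);
    pthread_mutex_unlock(&h->lock);
    return (seen);
}

/* Example hook body: counts how many times it was invoked. */
void
model_count_hook(void *arg)
{
    (*(int *)arg)++;
}
```

In this model a detach routine would call `model_hook_drain()` before freeing anything the hook touches. A hook drained while still queued never runs, and a hook that is mid-run is waited for.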
    • Edward Tomasz Napierala's avatar
      linsysfs: create /sys/bus/ and /sys/subsystem/ · dc0119c2
      Edward Tomasz Napierala authored
      This looks like a no-op, but it prevents udevadm(8) from failing
      loudly, which in turn unbreaks installation of libfprint-2-2, which
      in Focal is a dependency for make-4.2.1-1.2.
      One might wonder why installing a build utility involves messing
      with device handling...
      Sponsored By:	The FreeBSD Foundation
      Differential Revision:	https://reviews.freebsd.org/D29133
    • Mark Johnston's avatar
      vm_reserv: Fix list locking in vm_reserv_reclaim_contig() · 968079f2
      Mark Johnston authored
      The per-domain partpop queue is locked by the combination of the
      per-domain lock and individual reservation mutexes.
      vm_reserv_reclaim_contig() scans the queue looking for partially
      populated reservations that can be reclaimed in order to satisfy the
      caller's allocation.
      During the scan, we drop the per-domain lock.  At this point, the rvn
      pointer may be invalidated.  Take care to load rvn after re-acquiring
      the per-domain lock.
      While here, simplify the condition used to check whether a reservation
      was dequeued while the per-domain lock was dropped.
      Reviewed by:	alc, kib
      Reported by:	gallatin
      MFC after:	3 days
      Sponsored by:	The FreeBSD Foundation
      Differential Revision:	https://reviews.freebsd.org/D29203
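
The "load rvn after re-acquiring the lock" rule can be illustrated with a single-threaded sketch. This is not the vm_reserv code: `demo_rv`, `demo_queue`, `demo_scan` and `demo_remove_next` are hypothetical names, the lock is only notional, and the sketch assumes the entry being visited stays on the queue during the unlocked window.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/queue.h>

/* Hypothetical stand-ins for a reservation and its partially populated queue. */
struct demo_rv {
    int id;
    TAILQ_ENTRY(demo_rv) link;
};
TAILQ_HEAD(demo_queue, demo_rv);

/* Work performed while the queue lock is (notionally) dropped. */
typedef void demo_unlocked_fn(struct demo_queue *, struct demo_rv *);

/*
 * Visit every entry still on the queue and return how many were visited.
 * The successor is loaded only after the unlocked window, so an entry that
 * was dequeued during that window is never followed.
 */
int
demo_scan(struct demo_queue *q, demo_unlocked_fn *unlocked_work)
{
    struct demo_rv *rv, *rvn;
    int visited = 0;

    assert(q != NULL && unlocked_work != NULL);
    for (rv = TAILQ_FIRST(q); rv != NULL; rv = rvn) {
        visited++;
        /* Real code would drop the per-domain lock here... */
        unlocked_work(q, rv);
        /* ...and re-acquire it here, before reading the list again. */
        rvn = TAILQ_NEXT(rv, link);
    }
    return (visited);
}

/* Example interference: dequeue the entry that follows rv. */
void
demo_remove_next(struct demo_queue *q, struct demo_rv *rv)
{
    struct demo_rv *next = TAILQ_NEXT(rv, link);

    if (next != NULL)
        TAILQ_REMOVE(q, next, link);
}
```

Had `rvn` been loaded before the call to `unlocked_work()`, the loop would go on to visit an entry that is no longer on the queue.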
    • Warner Losh's avatar
      usb: tiny formatting nit · 1645a4ae
      Warner Losh authored
      Format 300 baud like all the others here. No functional change.
    • Kristof Provost's avatar
      pf: Remove redundant kif != NULL checks · 913e7dc3
      Kristof Provost authored
      pf_kkif_free() already checks for NULL, so we don't have to check before
      we call it.
      Reviewed by:	melifaro@
      MFC after:	1 week
      Sponsored by:	Rubicon Communications, LLC ("Netgate")
      Differential Revision:	https://reviews.freebsd.org/D29195
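
The pattern behind this cleanup is a release routine that tolerates NULL, as free(3) does. The sketch below uses hypothetical names (`demo_kif`, `demo_kif_free`, `demo_cleanup`, `demo_frees`) and is not the pf code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for a pf interface object. */
struct demo_kif {
    int refs;
};

int demo_frees;    /* counts real frees, for illustration only */

/* Like free(3), the release routine accepts NULL and does nothing. */
void
demo_kif_free(struct demo_kif *kif)
{
    if (kif == NULL)
        return;
    assert(kif->refs == 0);    /* only unreferenced objects are released */
    free(kif);
    demo_frees++;
}

/* Callers can therefore release unconditionally. */
void
demo_cleanup(struct demo_kif *a, struct demo_kif *b)
{
    demo_kif_free(a);    /* no "if (a != NULL)" needed */
    demo_kif_free(b);
}
```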
    • Kristof Provost's avatar
      pf: Factor out pf_krule_free() · 5e9dae8e
      Kristof Provost authored
      Reviewed by:	melifaro@
      MFC after:	1 week
      Sponsored by:	Rubicon Communications, LLC ("Netgate")
      Differential Revision:	https://reviews.freebsd.org/D29194
    • Oskar Holmund's avatar
      usr.sbin/pwm/pwm add support for flags · 17b14d8f
      Oskar Holmund authored
      The pwm utility can't set the only flag defined (PWM_POLARITY_INVERTED), so this
      patch adds the option -I (capital letter i) to send it to the drivers.
      None of the existing PWM drivers have implemented support for flags,
      but soon-ish I will put up a review of a pwm driver using the TI OMAP DMTimer.
      Differential Revision: https://reviews.freebsd.org/D29137
      MFC after:   2 weeks
    • Oskar Holmund's avatar
      share/man/man9/pwmbus.9 fix types in arguments · 7d4a5de8
      Oskar Holmund authored
      Fix the types of period and duty in share/man/man9/pwmbus.9 to match those in sys/dev/pwm/pwmbus.c.
      Reviewed By: rpokala
      Differential Revision: https://reviews.freebsd.org/D29139
      MFC after:   3 days
  2. 10 Mar, 2021 12 commits
    • Greg V's avatar
      kern.mk: fix -Wno-error style to fix build with Clang 12 · 15565e0a
      Greg V authored
      Clang 12 no longer supports -Wno-error-..., only the -Wno-error=...
      style (which is already used everywhere else in the tree).
      Differential Revision:	https://reviews.freebsd.org/D29157
    • Alexander V. Chernikov's avatar
      Flush remaining routes from the routing table during VNET shutdown. · b1d63265
      Alexander V. Chernikov authored
      This fixes rtentry leak for the cloned interfaces created inside the
      PR:	253998
      Reported by:	rashey at superbox.pl
      MFC after:	3 days
      Loopback teardown order is `SI_SUB_INIT_IF`, which happens after `SI_SUB_PROTO_DOMAIN` (route table teardown).
      Thus, any route table operations are too late to schedule.
      As the intent of the vnet teardown procedures is to minimise the amount of effort by doing global cleanups instead of per-interface ones, address this by adding a relatively light-weight routing table cleanup function, `rib_flush_routes()`.
      It removes all remaining routes from the routing table and schedules the deletion, which will happen later, when `rtables_destroy()` waits for the current epoch to finish.
      Test Plan:
      set_skip:set_skip_group_lo  ->  passed  [0.053s]
      tail -n 200 /var/log/messages | grep rtentry
      Reviewers: #network, kp, bz
      Reviewed By: kp
      Subscribers: imp, ae
      Differential Revision: https://reviews.freebsd.org/D29116
    • John Baldwin's avatar
      ktls: Fix non-inplace TLS 1.3 encryption. · 3fa03421
      John Baldwin authored
      Copy the iovec for the trailer from the proper place.  This is the same
      fix for CBC encryption from ff6a7e4b.
      Reported by:	gallatin
      Reviewed by:	gallatin, markj
      Fixes:		49f6925c
      Sponsored by:	Netflix
      Differential Revision:	https://reviews.freebsd.org/D29177
    • Alexander Motin's avatar
      Move time math out of disabled interrupts sections. · 2cee045b
      Alexander Motin authored
      We don't need the result before next sleep time, so no reason to
      additionally increase interrupt latency.
      While there, remove the extra PM ticks to microseconds conversion, which made
      C2/C3 sleep times look 4 times smaller than they really are.  The conversion
      is already done by AcpiGetTimerDuration().  Now I see reported sleep
      times up to 0.5s, just as expected for planned 2 wakeups per second.
      MFC after:	1 month
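
The effect of a redundant unit conversion can be shown with a little arithmetic. The sketch below assumes the extra step approximated ticks to microseconds as a divide by four (the ACPI PM timer runs at about 3.58 MHz, so roughly four ticks per microsecond). All names are hypothetical, and `demo_timer_duration_usec()` is only a stand-in for a helper that already returns microseconds.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed approximation: about 4 PM timer ticks per microsecond. */
#define DEMO_PM_USEC(ticks)    ((ticks) >> 2)

static_assert(DEMO_PM_USEC(8u) == 2u, "eight ticks are about two microseconds");

/* Stand-in for a helper that already returns microseconds. */
uint32_t
demo_timer_duration_usec(uint32_t start_ticks, uint32_t end_ticks)
{
    return (DEMO_PM_USEC(end_ticks - start_ticks));
}

/* Before: the microsecond result was converted a second time. */
uint32_t
demo_sleep_usec_before(uint32_t start_ticks, uint32_t end_ticks)
{
    return (DEMO_PM_USEC(demo_timer_duration_usec(start_ticks, end_ticks)));
}

/* After: the helper's result is used as is. */
uint32_t
demo_sleep_usec_after(uint32_t start_ticks, uint32_t end_ticks)
{
    return (demo_timer_duration_usec(start_ticks, end_ticks));
}
```

Under these assumptions the "before" variant reports a quarter of the value the "after" variant reports for the same pair of tick readings.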
    • Olivier Houchard's avatar
      arm64: Fix COMPAT_FREEBSD32. · c328f64d
      Olivier Houchard authored
      The ENTRY() macro was modified by commit
      28d94520 to add an optional NOP instruction
      at the beginning of the function. It is of course an arm64 instruction, so
      unsuitable for the 32-bit sigcode. So just use EENTRY() instead for
      aarch32_sigcode. This should fix receiving signals when running 32-bit
      binaries on FreeBSD/arm64.
      MFC After: 1 week
    • Dag-Erling Smørgrav's avatar
      Fix post-start check when unbound.conf has moved. · 409388cf
      Dag-Erling Smørgrav authored
      Reported by:	phk@
      MFC after:	1 week
    • Dag-Erling Smørgrav's avatar
      Fix local-unbound setup for some IPv6 deployments. · e5f02c14
      Dag-Erling Smørgrav authored
      PR:		250984
      MFC after:	1 week
    • Mitchell Horne's avatar
      ns8250: don't drop IER_TXRDY on bus_grab/ungrab · 7e7f7bee
      Mitchell Horne authored
      It has been observed that some systems are often unable to resume from
      ddb after entering with debug.kdb.enter=1. Checking the status further
      shows the terminal is blocked waiting in tty_drain(), but it never makes
      progress in clearing the output queue, because sc->sc_txbusy is high.
      I noticed that when entering polling mode for the debugger, IER_TXRDY is
      set in the failure case. Since this bit is never tracked by the softc,
      it will not be restored by ns8250_bus_ungrab(). This creates a race in
      which a TX interrupt can be lost, creating the hang described above.
      Ensuring that this bit is restored is enough to prevent this, and resume
      from ddb as expected.
      The solution is to track this bit in the sc->ier field, for the same
      lifetime that TX interrupts are enabled.
      PR:		223917, 240122
      Reviewed by:	imp, manu
      Tested by:	bz
      MFC after:	5 days
      Sponsored by:	The FreeBSD Foundation
      Differential Revision:	https://reviews.freebsd.org/D29130
    • Alex Richardson's avatar
      AArch64: Clear VFP state on execve() · 953a7d7c
      Alex Richardson authored
      I noticed that many of the math-related tests were failing on AArch64.
      After a lot of debugging, I noticed that the floating point exception flags
      were not being reset when starting a new process. This change resets the
      VFP inside exec_setregs() to ensure no VFP register state is leaked from
      parent processes to children.
      This commit also moves the clearing of fpcr that was added in 65618fdd
      from fork() to execve() since that makes more sense: fork() can retain
      current register values, but execve() should result in a well-defined
      clean state.
      Reviewed By:	andrew
      MFC after:	1 week
      Differential Revision: https://reviews.freebsd.org/D29060
    • Hans Petter Selasky's avatar
      Allocating the LinuxKPI current structure from a software interrupt thread · dfb33cb0
      Hans Petter Selasky authored
      must be done using the M_NOWAIT flag after 1ae20f7c .
      MFC after:	1 week
      Sponsored by:	Mellanox Technologies // NVIDIA Networking
    • Hans Petter Selasky's avatar
      Use the word "LinuxKPI" instead of "Linux compatibility", to not confuse with · 6eb60f5b
      Hans Petter Selasky authored
      user-space Linux compatibility support. No functional change.
      MFC after:	1 week
      Sponsored by:	Mellanox Technologies // NVIDIA Networking
    • Hans Petter Selasky's avatar
      Allocating the LinuxKPI current structure from an interrupt thread must be · d1cbe790
      Hans Petter Selasky authored
      done using the M_NOWAIT flag after 1ae20f7c .
      MFC after:	1 week
      Sponsored by:	Mellanox Technologies // NVIDIA Networking
  3. 09 Mar, 2021 14 commits
    • Kyle Evans's avatar
      wg(4): note the persistent-keepalive ifconfig(8) option · ce53f92e
      Kyle Evans authored
      MFC after:	3 days
      Fixes:	b3dac391
    • Hans Petter Selasky's avatar
      Implement basic support for allocating memory from a specific numa node · ebe5cf35
      Hans Petter Selasky authored
      in the LinuxKPI.
      Differential Revision:	https://reviews.freebsd.org/D29077
      Reviewed by:	markj@ and kib@
      MFC after:	1 week
      Sponsored by:	Mellanox Technologies // NVIDIA Networking
    • Kyle Evans's avatar
      if_wg: export tx_bytes, rx_bytes, and last_handshake · 94dddbfd
      Kyle Evans authored
      The names are self-explanatory; these are currently only used by the
      wg(8) tool, but they are handy data points to have.
      Reviewed by:	grehan
      MFC after:	3 days
      Discussed with:	decke
      Differential Revision:	https://reviews.freebsd.org/D29143
    • Kyle Evans's avatar
      iflib: allow clone detach if not yet init · 0dd691b4
      Kyle Evans authored
      If we hit an error during init, then we'll unwind our state and attempt
      to detach the device -- don't block it.
      This was discovered by creating a wg0 with missing parameters; said
      failure ended up leaving this orphaned device in place and ended up
      panicking the system upon enumeration of the dev.* sysctl space.
      Reviewed by:	gallatin, markj
      MFC after:	3 days
      Differential Revision:	https://reviews.freebsd.org/D29145
    • Kyle Evans's avatar
      if_wg: wg_input: remove a couple locals (NFC) · 299f8977
      Kyle Evans authored
      We have no use for the udphdr or this hlen local, just spell out the
      addition inline.
      MFC after:	3 days
      Reviewed by:	grehan, markj
      Differential Revision:	https://reviews.freebsd.org/D29142
    • Jason A. Harmening's avatar
      amd64 pmap: convert to counter(9), add PV and pagetable page counts · e4b8deb2
      Jason A. Harmening authored
      This change converts most of the counters in the amd64 pmap from
      global atomics to scalable counter(9) counters.  Per discussion
      with kib@, it also removes the handrolled per-CPU PCID save count
      as it isn't considered generally useful.
      The bulk of these counters remain guarded by PV_STATS, as it seems
      unlikely that they will be useful outside of very specific debugging
      scenarios.  However, this change does add two new counters that
      are available without PV_STATS.  pv_page_count and pt_page_count
      track the number of active physical-to-virtual list pages and page
      table pages, respectively.  These will be useful in evaluating
      the memory footprint of pmap structures under various workloads,
      which will help to guide future changes in this area.
      Reviewed by:	kib
      Differential Revision:	https://reviews.freebsd.org/D28923
    • Leandro Lupori's avatar
      ofwfb: fix boot on LE · 043577b7
      Leandro Lupori authored
      Some framebuffer properties obtained from the device tree were not being
      properly converted to host endian.
      Replace OF_getprop calls by OF_getencprop where needed to fix this.
      This fixes boot on PowerPC64 LE, when using ofwfb as the system console.
      Reviewed by:    bdragon
      Sponsored by:   Eldorado Research Institute (eldorado.org.br)
      MFC after:      1 week
      Differential Revision:  https://reviews.freebsd.org/D27475
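
Device tree property cells are stored big-endian, which is why a raw copy works on a big-endian host and breaks on a little-endian one. The sketch below shows the conversion in plain user-space C. `demo_cell_decode` and `demo_cell_raw` are hypothetical names and do not reproduce the OF_getprop or OF_getencprop interfaces.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Decode one 32-bit device tree cell, which is stored big-endian. */
uint32_t
demo_cell_decode(const unsigned char raw[4])
{
    assert(raw != NULL);
    return ((uint32_t)raw[0] << 24 | (uint32_t)raw[1] << 16 |
        (uint32_t)raw[2] << 8 | (uint32_t)raw[3]);
}

/* Copy the raw bytes without conversion; only correct on big-endian hosts. */
uint32_t
demo_cell_raw(const unsigned char raw[4])
{
    uint32_t v;

    assert(raw != NULL);
    memcpy(&v, raw, sizeof(v));
    return (v);
}
```

For a cell holding the bytes 00 00 04 00, `demo_cell_decode()` yields 1024 on any host, while `demo_cell_raw()` yields 262144 on a little-endian host.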
    • Baptiste Daroussin's avatar
      Revert "rc: implement parallel boot" · f61831d2
      Baptiste Daroussin authored
      This is not ready yet for prime time
      This reverts commit 763db589.
      This reverts commit f1ab7999.
      This reverts commit 6e822e99.
      This reverts commit 77e1ccbe.
    • Kyle Evans's avatar
      ifconfig: allow displaying/setting persistent-keepalive · b3dac391
      Kyle Evans authored
      The kernel-side already accepted a persistent-keepalive-interval, so
      just add a verb to ifconfig(8) for it and start exporting it so that
      ifconfig(8) can view it.
      PR:		253790
      MFC after:	3 days
      Discussed with:	decke
    • Kyle Evans's avatar
      ifconfig: wg: stop requiring peer endpoints · 172a8241
      Kyle Evans authored
      The way that wireguard is designed does not actually require all peers
      to have endpoints. In an architecture that might mimic a traditional
      VPN server <-> client, the wg interface on a server would have a number
      of peers without set endpoints -- the expectation is that the "clients"
      will connect to the "server" peer, which will authenticate the
      connection as a known peer and learn the endpoint from there.
      MFC after:	3 days
      Discussed with:	decke, grehan (independently)
    • Kyle Evans's avatar
      kern: malloc: fix panic on M_WAITOK during THREAD_NO_SLEEPING() · 1ae20f7c
      Kyle Evans authored
      Simple condition flip; we wanted to panic here after epoch_trace_list().
      Reviewed by:	glebius, markj
      MFC after:	3 days
      Differential Revision:	https://reviews.freebsd.org/D29125
    • Kyle Evans's avatar
      if_wg: avoid sleeping under the net epoch · e80e371d
      Kyle Evans authored
      No sleeping allowed here, so avoid it.  Collect the subset of data we
      want inside of the epoch, as we'll need extra allocations when we add
      items to the nvlist.
      Reviewed by:	grehan (earlier version), markj
      MFC after:	3 days
      Differential Revision:	https://reviews.freebsd.org/D29124
    • Kyle Evans's avatar
      if_wg: return to m_defrag() of incoming mbuf, sans leak · bae59285
      Kyle Evans authored
      This partially reverts df554850 but still fixes the leak. It was
      overlooked (sigh) that some packets will exceed MHLEN and cannot be
      physically contiguous without clustering, but we don't actually need
      it to be. m_defrag() should pull up enough for any of the headers that
      we do need to be accessible.
      Fixes:	df554850
      Pointy hat:	kevans
    • Rick Macklem's avatar
      mountd(8): generate a syslog message when the "V4:" line is missing · 09673fc0
      Rick Macklem authored
      Daniel reported that NFSv4 mounts were not working despite having
      set "nfsv4_server_enable=YES" in /etc/rc.conf.  Mountd was logging a
      message that there was no /etc/exports file.
      He noted that creating a /etc/exports file with a "V4:" line in it
      was needed to make NFSv4 mounts work.
      At least one "V4:" line in one of the exports(5) file(s) is needed to
      make NFSv4 mounts work. This patch fixes mountd.c so that it logs a
      message indicating that there is no "V4:" line in any exports(5)
      file when NFSv4 mounts are enabled.
      To avoid this message being generated erroneously, /etc/rc.d/mountd
      is updated to make sure vfs.nfsd.server_max_nfsvers is properly set
      before mountd(8) is started.
      Reported by:	debdrup
      PR:	253901
      MFC after:	2 weeks
  4. 08 Mar, 2021 4 commits
    • Alexander Motin's avatar
      Do not read timer extra time when MWAIT is used. · 075e4807
      Alexander Motin authored
      When we enter a C2+ state via memory read, it may take the chipset some
      time to stop the CPU.  The extra register read covers that time.  But MWAIT
      makes the CPU stop immediately, so we don't need to waste time after
      wakeup with interrupts still disabled, increasing latency.
      On my system it reduces ping localhost latency, waking up all CPUs
      once a second, from 277us to 242us.
      MFC after:	1 month
    • Alexander Motin's avatar
      Change mwait_bm_avoidance use to match Linux. · 45521967
      Alexander Motin authored
      Even though the information is very limited, it seems the intent of
      this flag is to control ACPI_BITREG_BUS_MASTER_STATUS use for C3,
      not to force ACPI_BITREG_ARB_DISABLE manipulations for C2, where it was
      never needed, and where that register has not really done anything for years.
      It wasted lots of CPU time on the congested global ACPI hardware lock
      when many CPU cores were trying to enter/exit deep C-states at the same time.
      On an idle 80-core system it pushed ping localhost latency up to 20ms,
      since badport_bandlim() via counter_ratecheck() wakes up all CPUs
      at the same time once a second just to synchronously reset the counters.
      Now enabling C-states increases the latency from 0.1 to just 0.25ms.
      Discussed with:	kib
      MFC after:	1 month
    • Warner Losh's avatar
    • Warner Losh's avatar
      config_intrhook: Move from TAILQ to STAILQ and padding · 88a55912
      Warner Losh authored
      config_intrhook doesn't need to be a two-pointer TAILQ. We rarely add/delete
      from this and so those need not be optimized. Instead, use the one-pointer
      STAILQ plus a uintptr_t to be used as a flags word. This will allow these
      changes to be MFC'd to 12 and 13 to fix a race in removable devices.
      Feedback from: jhb
      Reviewed by: mav
      Differential Revision:	https://reviews.freebsd.org/D29004
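
One way to read "STAILQ and padding" is that swapping a two-pointer TAILQ entry for a one-pointer STAILQ entry plus a `uintptr_t` leaves the structure's size and member offsets unchanged. The sketch below checks that with hypothetical structures; `demo_hook_tailq` and `demo_hook_stailq` are illustrative and not the kernel's definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/queue.h>

/* Hypothetical "before" layout: doubly linked, two pointers of linkage. */
struct demo_hook_tailq {
    TAILQ_ENTRY(demo_hook_tailq) links;
    void (*func)(void *);
    void *arg;
};

/* Hypothetical "after" layout: one pointer of linkage plus a flags word. */
struct demo_hook_stailq {
    STAILQ_ENTRY(demo_hook_stailq) links;
    uintptr_t flags;
    void (*func)(void *);
    void *arg;
};

/* These hold wherever uintptr_t is as wide as a pointer. */
static_assert(sizeof(struct demo_hook_tailq) ==
    sizeof(struct demo_hook_stailq), "structure size is unchanged");
static_assert(offsetof(struct demo_hook_tailq, func) ==
    offsetof(struct demo_hook_stailq, func), "func stays at the same offset");
```

If the real structure follows the same shape, code compiled against the old layout would still find the function and argument members where it expects them.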