1. 22 May, 2018 40 commits
    • Handle reserved memory with the no-map property. · 1442afc1
      Andrew Turner authored
      We shouldn't be mapping this memory, so we need to find it so it
      can be excluded from the phys_avail map.
      
      Reviewed by:	manu
      Obtained from:	ABT Systems Ltd
      Sponsored by:	Turing Robotic Industries
      Differential Revision:	https://reviews.freebsd.org/D15518
    • Initialize the dumper struct before calling set_dumper(). · 9f78e2b8
      Mark Johnston authored
      Fields owned by the generic code were being left uninitialized,
      causing problems in clear_dumper() if an error occurred.
      
      Coverity CID:	1391200
      X-MFC with:	r333283
    • Add a SPD cache to speed up lookups. · f8e73c47
      Fabien Thomas authored
      When large SPDs are used, we face two problems:
      
      - too many CPU cycles are spent during the linear searches in the SPD
        for each packet
      - too much contention on multi socket systems, since we use a single
        shared lock.
      
      Main changes:
      
      - added the sysctl tree 'net.key.spdcache' to control the SPD cache
        (disabled by default).
      - cache the sp indexes that are used to perform SP lookups.
      - use a range of dedicated mutexes to protect the cache lines.
      
      Submitted by: Emeric Poupon <emeric.poupon@stormshield.eu>
      Reviewed by: ae
      Sponsored by:	Stormshield
      Differential Revision: https://reviews.freebsd.org/D15050
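
      The caching idea listed under "Main changes" can be sketched in user space
      as follows. This is only an illustration under stated assumptions: the
      struct layouts, the XOR hash, the table sizes, and every name (spd_lookup,
      spdcache_init, cache_slot) are hypothetical, and pthread mutexes stand in
      for kernel mutexes. It is not the code from D15050.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

#define NPOLICIES   3
#define CACHE_SLOTS 8			/* power of two, so the hash can mask */

struct policy { uint32_t src, dst; int action; };

struct cache_slot {
	pthread_mutex_t lock;		/* protects only this slot */
	uint32_t src, dst;
	const struct policy *sp;	/* NULL means the slot is empty */
	unsigned long hits;
};

/* Toy policy database, searched linearly on a cache miss. */
static const struct policy spd[NPOLICIES] = {
	{ 1, 2, 10 }, { 3, 4, 20 }, { 5, 6, 30 },
};
struct cache_slot cache[CACHE_SLOTS];

size_t
slot_index(uint32_t src, uint32_t dst)
{
	return ((src ^ dst) & (CACHE_SLOTS - 1));
}

void
spdcache_init(void)
{
	for (size_t i = 0; i < CACHE_SLOTS; i++) {
		int error = pthread_mutex_init(&cache[i].lock, NULL);

		assert(error == 0);
		(void)error;
		cache[i].sp = NULL;
		cache[i].hits = 0;
	}
}

static const struct policy *
spd_linear_lookup(uint32_t src, uint32_t dst)
{
	for (size_t i = 0; i < NPOLICIES; i++)
		if (spd[i].src == src && spd[i].dst == dst)
			return (&spd[i]);
	return (NULL);
}

/* Consult the per-slot cache first; fall back to the linear search. */
const struct policy *
spd_lookup(uint32_t src, uint32_t dst)
{
	struct cache_slot *slot = &cache[slot_index(src, dst)];
	const struct policy *sp;

	pthread_mutex_lock(&slot->lock);
	if (slot->sp != NULL && slot->src == src && slot->dst == dst) {
		sp = slot->sp;
		slot->hits++;
	} else {
		sp = spd_linear_lookup(src, dst);
		if (sp != NULL) {	/* misses are not cached in this sketch */
			slot->src = src;
			slot->dst = dst;
			slot->sp = sp;
		}
	}
	pthread_mutex_unlock(&slot->lock);
	return (sp);
}
```

      Because each slot has its own mutex, lookups that hash to different slots
      do not contend with each other, which is the point of the "range of
      dedicated mutexes" in the commit message.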
    • Use __SCCSID for SCCS IDs in libkvm sources. · 993d074b
      John Baldwin authored
      Rather than using #ifdef's around a static char array, use the
      existing helper macro from <sys/cdefs.h> for SCCS IDs.  To
      preserve existing behavior, add -DNO__SCCSID to CFLAGS to not
      include SCCS IDs in the built library by default.
      
      Reviewed by:	brooks, dab (older version)
      Reviewed by:	rgrimes
      Differential Revision:	https://reviews.freebsd.org/D15459
    • Revert r334035 for now. It breaks the boot on some boards as we expect to · 84cac654
      Andrew Turner authored
      be able to read UEFI RuntimeData memory via the DMAP region.
    • Typo. · be9292a8
      Mark Johnston authored
      Reported by:	rgrimes, vangyzen
      X-MFC with:	r334050
    • Flush caches before initiating a microcode update on Intel CPUs. · 6030b0c6
      Mark Johnston authored
      This apparently works around issues with updates of certain Broadwell
      CPUs.
      
      Reviewed by:	emaste, kib, sbruno
      MFC after:	3 days
      Differential Revision:	https://reviews.freebsd.org/D15520
    • Simplify lagg_input(). · db5a36bd
      Mark Johnston authored
      No functional change intended.
      
      MFC after:	2 weeks
    • sx: fixup a braino in r334024 · ee252fc9
      Mateusz Guzik authored
      If a thread waiting on sx dropped Giant it would not be properly
      reacquired on exit from the routine, later resulting in panics
      indicating Giant is not held (when it should be).
      
      The bug was not present in the original patch sent to pho, I wittingly
      added it just prior to the commit and only smoke-tested it.
      
      Reported by:	pho
    • intel-ucode-split: add -n flag to skip creating output files · f2b600b2
      Ed Maste authored
      Sponsored by:	The FreeBSD Foundation
    • Pass the array length into regions_to_avail. · 3a327967
      Andrew Turner authored
      On arm64 we will need to get the phys_avail array from before the kernel
      is excluded to create the DMAP region. In preparation for this, pass the
      array length into regions_to_avail.
    • Use local unique labels inside most often used macros. · 82a4284d
      Konstantin Belousov authored
      Discussed with:	bde
      Sponsored by:	The FreeBSD Foundation
      MFC after:	1 week
    • bus_dma(9): Correct arm64 BUS_DMA_COHERENT implementation note · 435b87a9
      Emmanuel Vadot authored
      BUS_DMA_COHERENT is supported in bus_dma_tag_create, not in
      bus_dmamap_create. Document it properly.
      
      Submitted by:	andrew
    • Fix double-load of %cr3 and double-copy of the stack frame for the · a3c7cd11
      Konstantin Belousov authored
      kernel entry from userspace vm86.
      
      Sponsored by:	The FreeBSD Foundation
      MFC after:	1 week
    • Restore the ability to keep states after parent rule deletion. · 67ad3c0b
      Andrey V. Elsukov authored
      This feature is disabled by default and was removed when the dynamic states
      implementation was changed to be lockless. It is now reimplemented with
      small differences: when the dyn_keep_states sysctl variable is enabled, the
      dyn_match_ipv[46]_state() function doesn't match child states of a deleted
      rule, and thus they are kept alive until they expire. The
      ipfw_dyn_lookup_state() function checks whether the state was orphaned and,
      if so, returns a pointer to default_rule and its position in the rules map.
      The main visible difference is that orphaned states still have the same
      rule number that they had before the parent rule was deleted, because a
      state now has many fields related to the rule, and changing them all
      atomically to point to default_rule seems hard enough.
      
      Reported by:	<lantw44 at gmail.com>
      MFC after:	2 days
    • Enable IBRS when entering an interrupt handler from usermode. · 14f7050d
      Konstantin Belousov authored
      Sponsored by:	The FreeBSD Foundation
      MFC after:	1 week
    • Only set realmem based on memory where the EXFLAG_NOALLOC is unset. This · 5a00bf53
      Andrew Turner authored
      will allow us to query the maps at any time without disturbing this value.
      
      Obtained from:	ABT Systems Ltd
      Sponsored by:	Turing Robotic Industries
    • On ThunderX2 we need to be careful to only map the memory the firmware · 89b5faf8
      Andrew Turner authored
      lists in the EFI memory map. As such we need to reduce the mappings to
      restrict them to not be the full 1G block. For now reduce this to a 2M
      block, however this may be further restricted to be 4k page aligned as
      other SoCs may require.
      
      This allows ThunderX2 to boot reliably to userspace without performing
      any speculative memory accesses to invalid physical memory.
      
      Sponsored by:	DARPA, AFRL
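
      The arithmetic behind mapping in 2M blocks can be illustrated with a small
      helper that shrinks a firmware-reported range to the 2 MiB-aligned part
      that lies entirely inside it. Rounding inward like this is an assumption
      made for the example, as are the names (struct range,
      clamp_to_2m_blocks); the commit message does not say how the pmap code
      handles partial blocks.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define BLOCK_2M	(UINT64_C(2) << 20)

struct range { uint64_t start, end; };	/* half-open: [start, end) */

/*
 * Shrink r to the largest 2 MiB-aligned range contained in it.
 * Returns false when not even one whole block fits.
 * Assumes r.start is far enough below UINT64_MAX that rounding up
 * cannot overflow.
 */
bool
clamp_to_2m_blocks(struct range r, struct range *out)
{
	uint64_t start, end;

	assert(r.start <= r.end);
	start = (r.start + BLOCK_2M - 1) & ~(BLOCK_2M - 1);	/* round up */
	end = r.end & ~(BLOCK_2M - 1);				/* round down */
	if (start >= end)
		return (false);
	out->start = start;
	out->end = end;
	return (true);
}
```

      With 1G blocks the same rounding would discard, or over-map, far more
      memory around each firmware region, which is why a smaller block size
      tracks the EFI memory map more closely.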
    • bus_dma(9): arm64 implementation notes · c6231a5f
      Emmanuel Vadot authored
      Indicate that BUS_DMA_COHERENT is supported for bus_dmamem_alloc and
      bus_dmamem_create in the arm64 implementation.
    • Stop using the DMAP region to map ACPI memory. · 9d0728e0
      Andrew Turner authored
      On some arm64 boards we need to access memory in ACPI tables that is not
      mapped in the DMAP region. To handle this create the needed mappings in
      pmap_mapbios in the KVA space.
      
      Submitted by:	Michal Stanek (mst@semihalf.com)
      Sponsored by:	Cavium
      Differential Revision:	https://reviews.freebsd.org/D15059
    • Switch arm64 to use the same physmem code as 32-bit arm. · 79402150
      Andrew Turner authored
      The main advantage of this is to allow us to exclude memory from being
      used by the kernel. This may be from the memreserve property, or ranges
      marked as no-map under the reserved-memory node.
      
      More work is still needed to remove the physmap array. This is still used
      for creating the DMAP region, however other patches need to be committed
      before we can remove this.
      
      Obtained from:	ABT Systems Ltd
      Sponsored by:	Turing Robotic Industries
    • Implement printf(3) family %m format string extension. · e95725fe
      Konstantin Belousov authored
      Reviewed by:	ed, dim (code only)
      Sponsored by:	Mellanox Technologies
      MFC after:	1 week
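
      For readers unfamiliar with the extension: %m consumes no argument and
      expands to the error string for the current errno. The helper below is a
      portable sketch of the same effect using strerror(3), so it builds on
      systems without %m support; the function name format_with_errno is
      hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>

/*
 * Portable equivalent of snprintf(buf, len, "%s: %m", what).
 * errno is captured first so that no later call can clobber it.
 */
int
format_with_errno(char *buf, size_t len, const char *what)
{
	int saved = errno;

	assert(buf != NULL && what != NULL);
	return (snprintf(buf, len, "%s: %s", what, strerror(saved)));
}
```

      With the extension available, the body collapses to a single format
      string and the explicit strerror() call goes away.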
    • Allow the 32-bit arm physmem code to work on arm64. · 66971d57
      Andrew Turner authored
      This will help simplify the arm64 code and allow us to properly exclude
      memory that should never be mapped.
      
      Obtained from:	ABT Systems Ltd
      Sponsored by:	Turing Robotic Industries
    • Coalesce adjacent physical mappings. · 89ae4d7f
      Andrew Turner authored
      This reduces the overhead when we have many small mappings, e.g. on some
      EFI systems. This is to help use this code on arm64 where we may have a
      large number of entries from the EFI firmware.
      
      Obtained from:	ABT Systems Ltd
      Sponsored by:	Turing Robotic Industries
      Differential Revision:	https://reviews.freebsd.org/D15477
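
      Coalescing can be sketched as a single pass over a sorted array that merges
      every entry starting exactly where the previous one ends. The struct and
      function names here are hypothetical, and the real physmem code may merge
      at insertion time instead; this only shows the technique.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct region { uint64_t addr; uint64_t size; };

/*
 * Merge adjacent regions in place. The input must be sorted by address
 * and non-overlapping. Returns the new number of entries.
 */
size_t
coalesce_regions(struct region *r, size_t n)
{
	size_t out = 0;

	assert(r != NULL || n == 0);
	if (n == 0)
		return (0);
	for (size_t i = 1; i < n; i++) {
		if (r[out].addr + r[out].size == r[i].addr)
			r[out].size += r[i].size;	/* extend current entry */
		else
			r[++out] = r[i];		/* start a new entry */
	}
	return (out + 1);
}
```

      An EFI memory map with many small back-to-back descriptors shrinks to a
      handful of entries this way, which keeps fixed-size region tables from
      filling up.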
    • xen-blkback: do not use state 3 (XenbusStateInitialised) · ffe4446b
      Roger Pau Monné authored
      Linux will not connect to a backend that's in state 3
      (XenbusStateInitialised); it needs to be in state 2
      (XenbusStateInitWait) for Linux to attempt to connect to the backend.
      
      The protocol seems to suggest that the backend should indeed wait in
      state 2 for the frontend to connect, which makes state 3 unusable for
      disk backends.
      
      Also make sure blkback will connect to the frontend if the frontend
      reaches state 3 (XenbusStateInitialised) before blkback has processed
      the results from the hotplug script (Submitted by Nathan Friess).
      
      MFC after:	1 week
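
      The ordering problem in the last paragraph can be modelled with a toy
      state machine in which the backend waits in state 2 and re-evaluates the
      connect condition on both events, so it no longer matters which arrives
      first. Only states 2 and 3 are taken from the message above; the other
      enum values follow the Xen protocol headers, and struct toy_backend and
      the function names are hypothetical stand-ins for the driver's logic.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum xenbus_state {
	XenbusStateUnknown = 0,
	XenbusStateInitialising = 1,
	XenbusStateInitWait = 2,
	XenbusStateInitialised = 3,
	XenbusStateConnected = 4,
};

struct toy_backend {
	enum xenbus_state state;	/* what the backend advertises */
	enum xenbus_state frontend;	/* last state seen from the frontend */
	bool hotplug_done;		/* hotplug script results processed */
};

void
backend_init(struct toy_backend *be)
{
	assert(be != NULL);
	be->state = XenbusStateInitWait;	/* wait in state 2, never 3 */
	be->frontend = XenbusStateInitialising;
	be->hotplug_done = false;
}

/* Connect only once both preconditions hold, whichever came last. */
static void
try_connect(struct toy_backend *be)
{
	if (be->state == XenbusStateInitWait && be->hotplug_done &&
	    (be->frontend == XenbusStateInitialised ||
	    be->frontend == XenbusStateConnected))
		be->state = XenbusStateConnected;
}

void
frontend_changed(struct toy_backend *be, enum xenbus_state st)
{
	be->frontend = st;
	try_connect(be);
}

void
hotplug_finished(struct toy_backend *be)
{
	be->hotplug_done = true;
	try_connect(be);
}
```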
    • Reduce sdt-related branch-fest in mi_switch. · 99ece3a9
      Mateusz Guzik authored
      The code was evaluating flags before resorting to checking if dtrace is
      enabled. This was inducing forward jumps in the common case.
    • top(1): increase size of 'status' buffer · dbcdf411
      Eitan Adler authored
      This corrects a warning issued by gcc9:
      /srv/src/freebsd/head/usr.bin/top/machine.c:988:22: warning: '%5zu'
      directive writing between 5 and 20 bytes into a
       region of size 15 [-Wformat-overflow=]
           sprintf(status, "?%5zu", state);
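
      The size in the warning follows from the format: '?' plus up to 20 decimal
      digits of a 64-bit size_t plus the terminating NUL needs 22 bytes. The
      sketch below sizes the buffer accordingly and uses snprintf(3) so that an
      undersized buffer is truncated rather than overrun. Note that snprintf is
      swapped in here for illustration (the commit itself only enlarges the
      buffer), and STATUS_BUFSZ and format_unknown_state are hypothetical
      names.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* '?' + at most 20 digits of a 64-bit size_t + terminating NUL. */
#define STATUS_BUFSZ	22

/*
 * Format an unknown process state. Returns the length the full string
 * needs; a return value >= len means the output was truncated.
 */
int
format_unknown_state(char *buf, size_t len, size_t state)
{
	assert(buf != NULL && len > 0);
	return (snprintf(buf, len, "?%5zu", state));
}
```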
    • sx: port over writer starvation prevention measures from rwlock · 2466d12b
      Mateusz Guzik authored
      A constant stream of readers could completely starve writers and this is not
      a hypothetical scenario.
      
      The 'poll2_threads' test from the will-it-scale suite reliably starves writers
      even with concurrency < 10 threads.
      
      The problem was run into and diagnosed by dillon@backplane.com
      
      There was next to no change in lock contention profile during -j 128 pkg build,
      despite an sx lock being at the top.
      
      Tested by:	pho
    • rw: decrease writer starvation · 9feec7ef
      Mateusz Guzik authored
      Writers waiting on readers to finish can set the RW_LOCK_WRITE_SPINNER
      bit. This prevents most new readers from coming on. However, the last
      reader to unlock also clears the bit which means new readers can sneak
      in and the cycle starts over.
      
      Change the code to keep the bit after last unlock.
      
      Note that starvation potential is still there: no matter how many write
      spinners are there, there is one bit. After the writer unlocks, the lock
      is free to get raided by readers again. It is good enough for the time
      being.
      
      The real fix would include counting writers.
      
      This runs into a caveat: the writer which set the bit may now be preempted.
      In order to get rid of the problem all attempts to set the bit are preceded
      by critical_enter.
      
      The bit gets cleared when the thread which set it goes to sleep. This way
      an invariant holds that if the bit is set, someone is actively spinning and
      will grab the lock soon. In particular this means that readers which find
      the lock in this transient state can safely spin until the lock finds itself
      an owner (i.e. they don't need to block or guess how long to spin
      speculatively).
      
      Tested by:	pho
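
      A minimal sketch of the single-bit scheme described above, using C11
      atomics. The lock-word layout and all names (struct toy_rw, WRITE_SPINNER,
      toy_*) are illustrative assumptions, not the kernel's rwlock code;
      critical_enter, sleeping, and clearing the bit when the spinner goes to
      sleep are left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdint.h>

#define WRITE_SPINNER	0x1u	/* a writer is actively spinning */
#define WRITE_LOCKED	0x2u	/* a writer owns the lock */
#define ONE_READER	0x4u	/* reader count lives in the upper bits */

struct toy_rw { _Atomic uint32_t word; };

/* New readers back off while a writer owns or is spinning for the lock. */
bool
toy_try_rlock(struct toy_rw *l)
{
	uint32_t v = atomic_load(&l->word);

	while ((v & (WRITE_SPINNER | WRITE_LOCKED)) == 0) {
		if (atomic_compare_exchange_weak(&l->word, &v, v + ONE_READER))
			return (true);
	}
	return (false);
}

/* Drop one reader. Only the count changes, so the spinner bit survives. */
void
toy_runlock(struct toy_rw *l)
{
	uint32_t old = atomic_fetch_sub(&l->word, ONE_READER);

	assert(old >= ONE_READER);
	(void)old;
}

void
toy_set_write_spinner(struct toy_rw *l)
{
	atomic_fetch_or(&l->word, WRITE_SPINNER);
}

/* Succeeds once no readers and no writer remain; consumes the spinner bit. */
bool
toy_try_wlock(struct toy_rw *l)
{
	uint32_t v = atomic_load(&l->word);

	while ((v & ~WRITE_SPINNER) == 0) {
		if (atomic_compare_exchange_weak(&l->word, &v, WRITE_LOCKED))
			return (true);
	}
	return (false);
}

void
toy_wunlock(struct toy_rw *l)
{
	atomic_store(&l->word, 0);
}
```

      The key line is toy_runlock(): because the last reader leaves the bit in
      place, no new reader can slip in between the final unlock and the
      writer's acquisition. As the message notes, once the writer unlocks the
      word is fully clear and readers may take the lock again.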
    • Conform to Berne Convention. · c76af090
      Cy Schubert authored
      MFC after:	3 days
    • Revert: r334016 · 92046bf1
      Marcelo Araujo authored
      Revert this change for now; it somehow breaks init_pci.
    • Matt Macy's avatar
      df58dad5
    • Include the atkbdc header, which declares the prototypes of the functions · 2d03aa59
      Marcelo Araujo authored
      atkbdc_event and atkbdc_init.
      
      MFC after:	4 weeks.
      Sponsored by:	iXsystems Inc.
    • fix i386 builds after r334005 and r334009 · 137fd41b
      Matt Macy authored
      r334005: add pc_ibpb_set as it is now referenced by common code
      (although presumably not needed on i386 since it has been there
      since the first spectre mitigation work on amd64)
      
      r334009: there is no amd64 rflags -> i386 eflags
    • pmcstat: add option to not decode the leaf function in top mode · 821a352a
      Matt Macy authored
      -I will allow the user to see the hot instruction in question
      as opposed to getting the name of the function
    • We must free the variable str. · b5e3928d
      Marcelo Araujo authored
      Spotted by:	clang's static analyzer
      Submitted by:	Tom Rix <trix_juniper.net>
      Reviewed by:	grehan
      MFC after:	4 weeks
      Sponsored by:	iXsystems Inc.
      Differential Revision:	https://reviews.freebsd.org/D10009
    • Add an IPMI attachment for PowerNV systems · 1a3eaf6c
      Justin Hibbits authored
      IPMI access on PowerNV systems is done through the OPAL firmware.  This adds a
      simple attachment for communicating with the FSP/BMC on these machines.  This
      has been tested on a Talos POWER9 workstation, only in the bootup phase, noting
      the successful attachment messages:
      
      ...
      ipmi0: IPMI device rev. 0, firmware rev. 2.00, version 2.0, device support mask 0
      ipmi0: Number of channels 2
      ...
      
      The ipmi device has not been added to GENERIC64, but may be after further
      testing.  It may also eventually be added to the ipmi module at that point.
    • Add a comment explaining the need of a global temporary variable · 5272c9bd
      Justin Hibbits authored
      cpu_xirr is used only as a temporary location for the OPAL call in
      PIC_DISPATCH().
      
      Requested by:	nwhitehorn
    • Basic OPAL sensor support for POWER9 platforms · 9c6ba29d
      Justin Hibbits authored
      Summary:
      PowerNV architectures (in the test case POWER9) export sensors via the device
      tree, which are accessed via OPAL calls.  This adds sysctl nodes for each
      device in a generic fashion.  New sysctl nodes are:
      
      dev.opal_sensor.N.sensor
      dev.opal_sensor.N.sensor_min
      dev.opal_sensor.N.sensor_max
      dev.opal_sensor.N.type
      dev.opal_sensor.N.label
      
      These are rooted at a parent attachment under opal, called opalsens.  This does
      not add support for the "sensor groups" defined in the device tree.
      
      Reviewed by:	breno.leitao_gmail.com
      Differential Revision: https://reviews.freebsd.org/D15362
    • top(1): unbreak build with gcc7; fix varargs · bc875b45
      Eitan Adler authored
      - use correct function for varargs argument
      - allow build to complete with gcc7 at current WARNS
      
      Reported by:	jhibbits, ian