For ARM64, ONLY "acpi=off", "acpi=on" or "acpi=force"
are available
- See also Documentation/power/runtime_pm.txt, pci=noacpi
+ See also Documentation/power/runtime_pm.rst, pci=noacpi
acpi_apic_instance= [ACPI, IOAPIC]
Format: <int>
ACPI_DEBUG_PRINT statements, e.g.,
ACPI_DEBUG_PRINT((ACPI_DB_INFO, ...
The debug_level mask defaults to "info". See
- Documentation/acpi/debug.txt for more information about
+ Documentation/firmware-guide/acpi/debug.rst for more information about
debug layers and levels.
Enable processor driver info messages:
acpi_sleep= [HW,ACPI] Sleep options
Format: { s3_bios, s3_mode, s3_beep, s4_nohwsig,
old_ordering, nonvs, sci_force_enable, nobl }
- See Documentation/power/video.txt for information on
+ See Documentation/power/video.rst for information on
s3_bios and s3_mode.
s3_beep is for debugging; it makes the PC's speaker beep
as soon as the kernel's real-mode entry point is called.
blkdevparts= Manual partition parsing of block device(s) for
embedded devices based on command line input.
- See Documentation/block/cmdline-partition.txt
+ See Documentation/block/cmdline-partition.rst
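+ A hypothetical example (device name and sizes are
+ illustrative): blkdevparts=mmcblk0:1G(boot),-(data)
+ creates a 1G "boot" partition followed by a "data"
+ partition spanning the remaining space.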
boot_delay= Milliseconds to delay each printk during boot.
Values larger than 10 seconds (10000) are changed to
no delay (0).
ccw_timeout_log [S390]
- See Documentation/s390/CommonIO for details.
+ See Documentation/s390/common_io.rst for details.
cgroup_disable= [KNL] Disable a particular controller
Format: {name of the controller(s) to disable}
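For example, cgroup_disable=memory disables the
memory controller at boot.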
/selinux/checkreqprot.
cio_ignore= [S390]
- See Documentation/s390/CommonIO for details.
+ See Documentation/s390/common_io.rst for details.
clk_ignore_unused
[CLK]
Prevents the clock framework from automatically gating
[KNL, x86_64] select a region under 4G first, and
fall back to reserve region above 4G when '@offset'
hasn't been specified.
- See Documentation/kdump/kdump.txt for further details.
+ See Documentation/admin-guide/kdump/kdump.rst for further details.
crashkernel=range1:size1[,range2:size2,...][@offset]
[KNL] Same as above, but depends on the memory
in the running system. The syntax of range is
start-[end] where start and end are both
a memory unit (amount[KMG]). See also
- Documentation/kdump/kdump.txt for an example.
+ Documentation/admin-guide/kdump/kdump.rst for an example.
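For instance (sizes are illustrative),
crashkernel=512M-2G:64M,2G-:128M reserves 64M of
memory when total RAM is between 512M and 2G, and
128M when it is 2G or more.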
crashkernel=size[KMG],high
[KNL, x86_64] range could be above 4G. Allow kernel
tracking down these problems.
debug_pagealloc=
- [KNL] When CONFIG_DEBUG_PAGEALLOC is set, this
- parameter enables the feature at boot time. In
- default, it is disabled. We can avoid allocating huge
- chunk of memory for debug pagealloc if we don't enable
- it at boot time and the system will work mostly same
- with the kernel built without CONFIG_DEBUG_PAGEALLOC.
+ [KNL] When CONFIG_DEBUG_PAGEALLOC is set, this parameter
+ enables the feature at boot time. By default, it is
+ disabled and the system will work mostly the same as a
+ kernel built without CONFIG_DEBUG_PAGEALLOC.
+ Note: to get the most out of the debug_pagealloc error
+ reports, it is useful to also enable the page_owner
+ functionality.
on: enable the feature
debugpat [X86] Enable PAT debugging
disable_radix [PPC]
Disable RADIX MMU mode on POWER9
+ disable_tlbie [PPC]
+ Disable TLBIE instruction. Currently does not work
+ with KVM, with HASH MMU, or with coherent accelerators.
+
disable_cpu_apicid= [X86,APIC,SMP]
Format: <int>
The number of initial APIC ID for the
edid/1680x1050.bin, or edid/1920x1080.bin is given
and no file with the same name exists. Details and
instructions on how to build your own EDID data are
- available in Documentation/EDID/HOWTO.txt. An EDID
+ available in Documentation/driver-api/edid.rst. An EDID
data set will only be used for a particular connector,
if its name and a colon are prepended to the EDID
name. Each connector may use a unique EDID data
for details.
nompx [X86] Disables Intel Memory Protection Extensions.
- See Documentation/x86/intel_mpx.txt for more
+ See Documentation/x86/intel_mpx.rst for more
information about the feature.
nopku [X86] Disable Memory Protection Keys CPU feature found
specified address. The serial port must already be
set up and configured. Options are not yet supported.
+ sbi
+ Use RISC-V SBI (Supervisor Binary Interface) for early
+ console.
+
smh Use ARM semihosting calls for early console.
s3c2410,<addr>
the framebuffer, pass the 'ram' option so that it is
mapped with the correct attributes.
+ linflex,<addr>
+ Use early console provided by Freescale LinFlex UART
+ serial driver for NXP S32V234 SoCs. A valid base
+ address must be provided, and the serial port must
+ already be set up and configured.
+
earlyprintk= [X86,SH,ARM,M68k,S390]
earlyprintk=vga
earlyprintk=sclp
that is to be dynamically loaded by Linux. If there are
multiple variables with the same name but with different
vendor GUIDs, all of them will be loaded. See
- Documentation/acpi/ssdt-overlays.txt for details.
+ Documentation/admin-guide/acpi/ssdt-overlays.rst for details.
eisa_irq_edge= [PARISC,HW]
See comment before function elanfreq_setup() in
arch/x86/kernel/cpu/cpufreq/elanfreq.c.
- elevator= [IOSCHED]
- Format: { "mq-deadline" | "kyber" | "bfq" }
- See Documentation/block/deadline-iosched.txt,
- Documentation/block/kyber-iosched.txt and
- Documentation/block/bfq-iosched.txt for details.
-
elfcorehdr=[size[KMG]@]offset[KMG] [IA64,PPC,SH,X86,S390]
Specifies physical address of start of kernel core
image elf header and optionally the size. Generally
kexec loader will pass this option to capture kernel.
- See Documentation/kdump/kdump.txt for details.
+ See Documentation/admin-guide/kdump/kdump.rst for details.
enable_mtrr_cleanup [X86]
The kernel tries to adjust MTRR layout from continuous
See also Documentation/fault-injection/.
floppy= [HW]
- See Documentation/blockdev/floppy.txt.
+ See Documentation/admin-guide/blockdev/floppy.rst.
force_pal_cache_flush
[IA-64] Avoid check_sal_cache_flush which may hang on
Valid parameters: "on", "off"
Default: "on"
- hisax= [HW,ISDN]
- See Documentation/isdn/README.HiSax.
-
hlt [BUGS=ARM,SH]
hpet= [X86-32,HPET] option to control HPET usage
Format: =0.0 to prevent dma on hda, =0.1 hdb =1.0 hdc
.vlb_clock .pci_clock .noflush .nohpa .noprobe .nowerr
.cdrom .chs .ignore_cable are additional options
- See Documentation/ide/ide.txt.
+ See Documentation/ide/ide.rst.
ide-generic.probe-mask= [HW] (E)IDE subsystem
Format: <int>
initrd= [BOOT] Specify the location of the initial ramdisk
+ init_on_alloc= [MM] Fill newly allocated pages and heap objects with
+ zeroes.
+ Format: 0 | 1
+ Default set by CONFIG_INIT_ON_ALLOC_DEFAULT_ON.
+
+ init_on_free= [MM] Fill freed pages and heap objects with zeroes.
+ Format: 0 | 1
+ Default set by CONFIG_INIT_ON_FREE_DEFAULT_ON.
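+ For example, booting with "init_on_alloc=1
+ init_on_free=1" enables both wipes regardless of the
+ Kconfig defaults.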
+
init_pkru= [x86] Specify the default memory protection keys rights
register contents for all processes. 0x55555554 by
default (disallow access to all but pkey 0). Can
Note that using this option lowers the security
provided by tboot because it makes the system
vulnerable to DMA attacks.
+ nobounce [Default off]
+ Disable the bounce buffer for untrusted devices such
+ as Thunderbolt devices. This will treat untrusted
+ devices as trusted ones, and hence might expose the
+ system to DMA attacks.
intel_idle.max_cstate= [KNL,HW,ACPI,X86]
0 disables intel_idle and falls back on acpi_idle.
synchronously.
iommu.passthrough=
- [ARM64] Configure DMA to bypass the IOMMU by default.
+ [ARM64, X86] Configure DMA to bypass the IOMMU by default.
Format: { "0" | "1" }
0 - Use IOMMU translation for DMA.
1 - Bypass the IOMMU for DMA.
Built with CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF=y,
the default is off.
+ kprobe_event=[probe-list]
+ [FTRACE] Add kprobe events and enable at boot time.
+ The probe-list is a semicolon-delimited list of probe
+ definitions. Each definition is the same as for the
+ kprobe_events interface, but the parameters are
+ comma-delimited.
+ For example, to add a kprobe event on vfs_read with
+ arg1 and arg2, add to the command line:
+
+ kprobe_event=p,vfs_read,$arg1,$arg2
+
+ See also Documentation/trace/kprobetrace.rst "Kernel
+ Boot Parameter" section.
+
kpti= [ARM64] Control page table isolation of user
and kernel address spaces.
Default: enabled on cores which need mitigation.
memblock=debug [KNL] Enable memblock debug messages.
load_ramdisk= [RAM] List of ramdisks to load from floppy
- See Documentation/blockdev/ramdisk.txt.
+ See Documentation/admin-guide/blockdev/ramdisk.rst.
lockd.nlm_grace_period=P [NFS] Assign grace period.
Format: <integer>
lockd.nlm_udpport=M [NFS] Assign UDP port.
Format: <integer>
+ lockdown= [SECURITY]
+ { integrity | confidentiality }
+ Enable the kernel lockdown feature. If set to
+ integrity, kernel features that allow userland to
+ modify the running kernel are disabled. If set to
+ confidentiality, kernel features that allow userland
+ to extract confidential information from the kernel
+ are also disabled.
+
locktorture.nreaders_stress= [KNL]
Set the number of locking read-acquisition kthreads.
Defaults to being automatically set based on the
machvec= [IA-64] Force the use of a particular machine-vector
(machvec) in a generic kernel.
- Example: machvec=hpzx1_swiotlb
+ Example: machvec=hpzx1
machtype= [Loongson] Share the same kernel image file between different
yeeloong laptop.
mce [X86-32] Machine Check Exception
- mce=option [X86-64] See Documentation/x86/x86_64/boot-options.txt
+ mce=option [X86-64] See Documentation/x86/x86_64/boot-options.rst
md= [HW] RAID subsystems devices and level
See Documentation/admin-guide/md.rst.
set according to the
CONFIG_MEMORY_HOTPLUG_DEFAULT_ONLINE kernel config
option.
- See Documentation/memory-hotplug.txt.
+ See Documentation/admin-guide/mm/memory-hotplug.rst.
memmap=exactmap [KNL,X86] Enable setting of an exact
E820 memory map, as specified by the user.
mem_encrypt=on: Activate SME
mem_encrypt=off: Do not activate SME
- Refer to Documentation/x86/amd-memory-encryption.txt
+ Refer to Documentation/virt/kvm/amd-memory-encryption.rst
for details on when memory encryption can be activated.
mem_sleep_default= [SUSPEND] Default system suspend mode:
expose users to several CPU vulnerabilities.
Equivalent to: nopti [X86,PPC]
kpti=0 [ARM64]
- nospectre_v1 [PPC]
+ nospectre_v1 [X86,PPC]
nobp=0 [S390]
nospectre_v2 [X86,PPC,S390,ARM64]
spectre_v2_user=off [X86]
0 - turn hardlockup detector in nmi_watchdog off
1 - turn hardlockup detector in nmi_watchdog on
When panic is specified, panic when an NMI watchdog
- timeout occurs (or 'nopanic' to override the opposite
- default). To disable both hard and soft lockup detectors,
+ timeout occurs (or 'nopanic' to not panic on an NMI
+ watchdog timeout, if CONFIG_BOOTPARAM_HARDLOCKUP_PANIC
+ is set).
+ To disable both hard and soft lockup detectors,
please see 'nowatchdog'.
This is useful when you use a panic=... timeout and
need the box quickly up again.
/sys/module/printk/parameters/console_suspend) to
turn on/off it dynamically.
+ novmcoredd [KNL,KDUMP]
+ Disable device dump. Device dump allows drivers to
+ append dump data to vmcore so you can collect
+ driver-specified debug info. Drivers can append the
+ data without any limit, and this data is stored in
+ memory, so this may cause significant memory stress.
+ Disabling device dump can help save memory, but the
+ driver debug data will no longer be available. This
+ parameter is only available when
+ CONFIG_PROC_VMCORE_DEVICE_DUMP is set.
+
noaliencache [MM, NUMA, SLAB] Disables the allocation of alien
caches in the slab allocator. Saves per-node memory,
but will impact performance.
register save and restore. The kernel will only save
legacy floating-point registers on task switch.
- nohugeiomap [KNL,x86] Disable kernel huge I/O mappings.
+ nohugeiomap [KNL,x86,PPC] Disable kernel huge I/O mappings.
nosmt [KNL,S390] Disable symmetric multithreading (SMT).
Equivalent to smt=1.
nosmt=force: Force disable SMT, cannot be undone
via the sysfs control file.
- nospectre_v1 [PPC] Disable mitigations for Spectre Variant 1 (bounds
- check bypass). With this option data leaks are possible
- in the system.
+ nospectre_v1 [X86,PPC] Disable mitigations for Spectre Variant 1
+ (bounds check bypass). With this option data leaks are
+ possible in the system.
nospectre_v2 [X86,PPC_FSL_BOOK3E,ARM64] Disable all mitigations for
the Spectre variant 2 (indirect branch prediction)
numa_zonelist_order= [KNL, BOOT] Select zonelist order for NUMA.
'node', 'default' can be specified
This can be set from sysctl after boot.
- See Documentation/sysctl/vm.txt for details.
+ See Documentation/admin-guide/sysctl/vm.rst for details.
ohci1394_dma=early [HW] enable debugging via the ohci1394 driver.
See Documentation/debugging-via-ohci1394.txt for more
pcd. [PARIDE]
See header of drivers/block/paride/pcd.c.
- See also Documentation/blockdev/paride.txt.
+ See also Documentation/admin-guide/blockdev/paride.rst.
pci=option[,option...] [PCI] various PCI subsystem options.
specify the device is described above.
If <order of align> is not specified,
PAGE_SIZE is used as alignment.
- PCI-PCI bridge can be specified, if resource
+ A PCI-PCI bridge can be specified if resource
windows need to be expanded.
To specify the alignment for several
instances of a device, the PCI vendor,
device, subvendor, and subdevice may be
- specified, e.g., 4096@pci:8086:9c22:103c:198f
+ specified, e.g., 12@pci:8086:9c22:103c:198f
+ for 4096-byte alignment.
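+ The leading value is an order, so the resulting
+ alignment is 2^12 = 4096 bytes.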
ecrc= Enable/disable PCIe ECRC (transaction layer
end-to-end CRC checking).
bios: Use BIOS/firmware settings. This is the
needed on a platform with proper driver support.
pd. [PARIDE]
- See Documentation/blockdev/paride.txt.
+ See Documentation/admin-guide/blockdev/paride.rst.
pdcchassis= [PARISC,HW] Disable/Enable PDC Chassis Status codes at
boot time.
and performance comparison.
pf. [PARIDE]
- See Documentation/blockdev/paride.txt.
+ See Documentation/admin-guide/blockdev/paride.rst.
pg. [PARIDE]
- See Documentation/blockdev/paride.txt.
+ See Documentation/admin-guide/blockdev/paride.rst.
pirq= [SMP,APIC] Manual mp-table setup
- See Documentation/x86/i386/IO-APIC.txt.
+ See Documentation/x86/i386/IO-APIC.rst.
plip= [PPT,NET] Parallel port network link
Format: { parport<nr> | timid | 0 }
prompt_ramdisk= [RAM] List of RAM disks to prompt for floppy disk
before loading.
- See Documentation/blockdev/ramdisk.txt.
+ See Documentation/admin-guide/blockdev/ramdisk.rst.
psi= [KNL] Enable or disable pressure stall information
tracking.
pstore.backend= Specify the name of the pstore backend to use
pt. [PARIDE]
- See Documentation/blockdev/paride.txt.
+ See Documentation/admin-guide/blockdev/paride.rst.
pti= [X86_64] Control Page Table Isolation of user and
kernel address spaces. Disabling this feature
See Documentation/admin-guide/md.rst.
ramdisk_size= [RAM] Sizes of RAM disks in kilobytes
- See Documentation/blockdev/ramdisk.txt.
+ See Documentation/admin-guide/blockdev/ramdisk.rst.
random.trust_cpu={on,off}
[KNL] Enable or disable trusting the use of the
the propagation of recent CPU-hotplug changes up
the rcu_node combining tree.
+ rcutree.use_softirq= [KNL]
+ If set to zero, move all RCU_SOFTIRQ processing to
+ per-CPU rcuc kthreads. Defaults to a non-zero
+ value, meaning that RCU_SOFTIRQ is used by default.
+ Specify rcutree.use_softirq=0 to use rcuc kthreads.
+
rcutree.rcu_fanout_exact= [KNL]
Disable autobalancing of the rcu_node combining
tree. This is used by rcutorture, and might
RCU_BOOST is not set, valid values are 0-99 and
the default is zero (non-realtime operation).
- rcutree.rcu_nocb_leader_stride= [KNL]
- Set the number of NOCB kthread groups, which
- defaults to the square root of the number of
- CPUs. Larger numbers reduces the wakeup overhead
- on the per-CPU grace-period kthreads, but increases
- that same overhead on each group's leader.
+ rcutree.rcu_nocb_gp_stride= [KNL]
+ Set the number of NOCB callback kthreads in
+ each group, which defaults to the square root
+ of the number of CPUs (for example, 8 per
+ group on a 64-CPU system). Larger numbers
+ reduce the wakeup overhead on the global
+ grace-period kthread, but increase that same
+ overhead on each group's NOCB grace-period
+ kthread.
rcutree.qhimark= [KNL]
Set threshold of queued RCU callbacks beyond which
rcutorture.verbose= [KNL]
Enable additional printk() statements.
+ rcupdate.rcu_cpu_stall_ftrace_dump= [KNL]
+ Dump ftrace buffer after reporting RCU CPU
+ stall warning.
+
rcupdate.rcu_cpu_stall_suppress= [KNL]
Suppress RCU CPU stall warning messages.
Run specified binary instead of /init from the ramdisk,
used for early userspace startup. See initrd.
+ rdrand= [X86]
+ force - Override the decision by the kernel to hide the
+ advertisement of RDRAND support (this affects
+ certain AMD processors because of buggy BIOS
+ support, specifically around the suspend/resume
+ path).
+
rdt= [HW,X86,RDT]
Turn on/off individual RDT features. List is:
cmt, mbmtotal, mbmlocal, l3cat, l3cdp, l2cat, l2cdp,
relax_domain_level=
[KNL, SMP] Set scheduler's default relax_domain_level.
- See Documentation/cgroup-v1/cpusets.txt.
+ See Documentation/admin-guide/cgroup-v1/cpusets.rst.
reserve= [KNL,BUGS] Force kernel to ignore I/O ports or memory
Format: <base1>,<size1>[,<base2>,<size2>,...]
Specify the offset from the beginning of the partition
given by "resume=" at which the swap header is located,
in <PAGE_SIZE> units (needed only for swap files).
- See Documentation/power/swsusp-and-swap-files.txt
+ See Documentation/power/swsusp-and-swap-files.rst
resumedelay= [HIBERNATION] Delay (in seconds) to pause before attempting to
read the resume files
Format: <integer>
sonypi.*= [HW] Sony Programmable I/O Control Device driver
- See Documentation/laptops/sonypi.txt
+ See Documentation/admin-guide/laptops/sonypi.rst
spectre_v2= [X86] Control mitigation of Spectre variant 2
(indirect branch speculation) vulnerability.
/sys/power/pm_test). Only available when CONFIG_PM_DEBUG
is set. Default value is 5.
+ svm= [PPC]
+ Format: { on | off | y | n | 1 | 0 }
+ This parameter controls use of the Protected
+ Execution Facility on pSeries.
+
swapaccount=[0|1]
[KNL] Enable accounting of swap in memory resource
controller if no parameter or 1 is given or disable
- it if 0 is given (See Documentation/cgroup-v1/memory.txt)
+ it if 0 is given (See Documentation/admin-guide/cgroup-v1/memory.rst)
swiotlb= [ARM,IA-64,PPC,MIPS,X86]
Format: { <int> | force | noforce }
Force threading of all interrupt handlers except those
marked explicitly IRQF_NO_THREAD.
- tmem [KNL,XEN]
- Enable the Transcendent memory driver if built-in.
-
- tmem.cleancache=0|1 [KNL, XEN]
- Default is on (1). Disable the usage of the cleancache
- API to send anonymous pages to the hypervisor.
-
- tmem.frontswap=0|1 [KNL, XEN]
- Default is on (1). Disable the usage of the frontswap
- API to send swap pages to the hypervisor. If disabled
- the selfballooning and selfshrinking are force disabled.
-
- tmem.selfballooning=0|1 [KNL, XEN]
- Default is on (1). Disable the driving of swap pages
- to the hypervisor.
-
- tmem.selfshrinking=0|1 [KNL, XEN]
- Default is on (1). Partial swapoff that immediately
- transfers pages from Xen hypervisor back to the
- kernel based on different criteria.
-
topology= [S390]
Format: {off | on}
Specify if the kernel should make use of the cpu
vector=percpu: enable percpu vector domain
video= [FB] Frame buffer configuration
- See Documentation/fb/modedb.txt.
+ See Documentation/fb/modedb.rst.
video.brightness_switch_enabled= [0,1]
If set to 1, on receiving an ACPI notify event
Can be used multiple times for multiple devices.
vga= [BOOT,X86-32] Select a particular video mode
- See Documentation/x86/boot.txt and
- Documentation/svga.txt.
+ See Documentation/x86/boot.rst and
+ Documentation/admin-guide/svga.rst.
Use vga=ask for menu.
This is actually a boot loader parameter; the value is
passed to the kernel using a special protocol.
targets for exploits that can control RIP.
emulate [default] Vsyscalls turn into traps and are
- emulated reasonably safely.
+ emulated reasonably safely. The vsyscall
+ page is readable.
- native Vsyscalls are native syscall instructions.
- This is a little bit faster than trapping
- and makes a few dynamic recompilers work
- better than they would in emulation mode.
- It also makes exploits much easier to write.
+ xonly Vsyscalls turn into traps and are
+ emulated reasonably safely. The vsyscall
+ page is not readable.
none Vsyscalls don't work at all. This makes
them quite hard to use for exploits but
Default: 3 = cyan.
watchdog timers [HW,WDT] For information on watchdog timers,
- see Documentation/watchdog/watchdog-parameters.txt
+ see Documentation/watchdog/watchdog-parameters.rst
or other driver-specific files in the
Documentation/watchdog/ directory.
xen_nopv [X86]
Disables the PV optimizations forcing the HVM guest to
run as generic HVM guest with no PV drivers.
+ This option is obsoleted by the "nopv" option, which
+ has an equivalent effect for the Xen platform.
xen_scrub_pages= [XEN]
Boolean option to control scrubbing pages before giving them back
improve timer resolution at the expense of processing
more timer interrupts.
+ nopv= [X86,XEN,KVM,HYPER_V,VMWARE]
+ Disables the PV optimizations, forcing the guest to
+ run as a generic guest with no PV drivers. Currently
+ supported for XEN HVM, KVM, HYPER_V and VMWARE guests.
+
xirc2ps_cs= [NET,PCMCIA]
Format:
<irq>,<irq_mask>,<io>,<full_duplex>,<do_sound>,<lockup_hack>[,<irq2>[,<irq3>[,<irq4>]]]
+ xive= [PPC]
+ By default on POWER9 and above, the kernel will
+ natively use the XIVE interrupt controller. This option
+ allows the fallback firmware mode to be used:
+
+ off Fallback to firmware control of XIVE interrupt
+ controller on both pseries and powernv
+ platforms. Only useful on POWER9 and above.
+
xhci-hcd.quirks [USB,KNL]
A hex value specifying bitmask with supplemental xhci
host controller quirks. Meaning of each bit can be
consulted in header drivers/usb/host/xhci.h.
+
+ xmon [PPC]
+ Format: { early | on | rw | ro | off }
+ Controls whether the xmon debugger is enabled. Default is off.
+ Passing only "xmon" is equivalent to "xmon=early".
+ early Call xmon as early as possible on boot; xmon
+ debugger is called from setup_arch().
+ on xmon debugger hooks will be installed so xmon
+ is only called on a kernel crash. Default mode,
+ i.e. either "ro" or "rw" mode, is controlled
+ with CONFIG_XMON_DEFAULT_RO_MODE.
+ rw xmon debugger hooks will be installed so xmon
+ is called only on a kernel crash; the mode is
+ read-write, meaning SPR registers, memory, and
+ other data can be written using xmon commands.
+ ro same as "rw" option above but SPR registers,
+ memory, and other data can't be written using
+ xmon commands.
+ off xmon is disabled.
select ARCH_HAS_DEBUG_VIRTUAL
select ARCH_HAS_DEVMEM_IS_ALLOWED
select ARCH_HAS_DMA_COHERENT_TO_PFN
- select ARCH_HAS_DMA_MMAP_PGPROT
select ARCH_HAS_DMA_PREP_COHERENT
select ARCH_HAS_ACPI_TABLE_UPGRADE if ACPI
- select ARCH_HAS_ELF_RANDOMIZE
select ARCH_HAS_FAST_MULTIPLIER
select ARCH_HAS_FORTIFY_SOURCE
select ARCH_HAS_GCOV_PROFILE_ALL
select ARCH_HAS_KCOV
select ARCH_HAS_KEEPINITRD
select ARCH_HAS_MEMBARRIER_SYNC_CORE
+ select ARCH_HAS_PTE_DEVMAP
select ARCH_HAS_PTE_SPECIAL
select ARCH_HAS_SETUP_DMA_OPS
+ select ARCH_HAS_SET_DIRECT_MAP
select ARCH_HAS_SET_MEMORY
select ARCH_HAS_STRICT_KERNEL_RWX
select ARCH_HAS_STRICT_MODULE_RWX
select ARCH_SUPPORTS_INT128 if GCC_VERSION >= 50000 || CC_IS_CLANG
select ARCH_SUPPORTS_NUMA_BALANCING
select ARCH_WANT_COMPAT_IPC_PARSE_VERSION if COMPAT
+ select ARCH_WANT_DEFAULT_TOPDOWN_MMAP_LAYOUT
select ARCH_WANT_FRAME_POINTERS
+ select ARCH_WANT_HUGE_PMD_SHARE if ARM64_4K_PAGES || (ARM64_16K_PAGES && !ARM64_VA_BITS_36)
select ARCH_HAS_UBSAN_SANITIZE_ALL
select ARM_AMBA
select ARM_ARCH_TIMER
select GENERIC_STRNCPY_FROM_USER
select GENERIC_STRNLEN_USER
select GENERIC_TIME_VSYSCALL
+ select GENERIC_GETTIMEOFDAY
+ select GENERIC_COMPAT_VDSO if (!CPU_BIG_ENDIAN && COMPAT)
select HANDLE_DOMAIN_IRQ
select HARDIRQS_SW_RESEND
select HAVE_PCI
select HAVE_ARCH_TRANSPARENT_HUGEPAGE
select HAVE_ARCH_VMAP_STACK
select HAVE_ARM_SMCCC
+ select HAVE_ASM_MODVERSIONS
select HAVE_EBPF_JIT
select HAVE_C_RECORDMCOUNT
select HAVE_CMPXCHG_DOUBLE
select HAVE_DMA_CONTIGUOUS
select HAVE_DYNAMIC_FTRACE
select HAVE_EFFICIENT_UNALIGNED_ACCESS
+ select HAVE_FAST_GUP
select HAVE_FTRACE_MCOUNT_RECORD
select HAVE_FUNCTION_TRACER
+ select HAVE_FUNCTION_ERROR_INJECTION
select HAVE_FUNCTION_GRAPH_TRACER
select HAVE_GCC_PLUGINS
select HAVE_HW_BREAKPOINT if PERF_EVENTS
select HAVE_SYSCALL_TRACEPOINTS
select HAVE_KPROBES
select HAVE_KRETPROBES
+ select HAVE_GENERIC_VDSO
select IOMMU_DMA if IOMMU_SUPPORT
select IRQ_DOMAIN
select IRQ_FORCED_THREADING
def_bool y
config ZONE_DMA32
- def_bool y
-
-config HAVE_GENERIC_GUP
- def_bool y
+ bool "Support DMA32 zone" if EXPERT
+ default y
config ARCH_ENABLE_MEMORY_HOTPLUG
def_bool y
int
default 2 if ARM64_16K_PAGES && ARM64_VA_BITS_36
default 2 if ARM64_64K_PAGES && ARM64_VA_BITS_42
- default 3 if ARM64_64K_PAGES && (ARM64_VA_BITS_48 || ARM64_USER_VA_BITS_52)
+ default 3 if ARM64_64K_PAGES && (ARM64_VA_BITS_48 || ARM64_VA_BITS_52)
default 3 if ARM64_4K_PAGES && ARM64_VA_BITS_39
default 3 if ARM64_16K_PAGES && ARM64_VA_BITS_47
default 4 if !ARM64_64K_PAGES && ARM64_VA_BITS_48
config ARCH_PROC_KCORE_TEXT
def_bool y
+config KASAN_SHADOW_OFFSET
+ hex
+ depends on KASAN
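+ # For reference: a shadow address is computed as
+ # (addr >> KASAN_SHADOW_SCALE_SHIFT) + KASAN_SHADOW_OFFSET,
+ # where the scale shift is 3 for generic KASAN and 4 for
+ # KASAN_SW_TAGS, hence the mode-specific offsets below.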
+ default 0xdfffa00000000000 if (ARM64_VA_BITS_48 || ARM64_VA_BITS_52) && !KASAN_SW_TAGS
+ default 0xdfffd00000000000 if ARM64_VA_BITS_47 && !KASAN_SW_TAGS
+ default 0xdffffe8000000000 if ARM64_VA_BITS_42 && !KASAN_SW_TAGS
+ default 0xdfffffd000000000 if ARM64_VA_BITS_39 && !KASAN_SW_TAGS
+ default 0xdffffffa00000000 if ARM64_VA_BITS_36 && !KASAN_SW_TAGS
+ default 0xefff900000000000 if (ARM64_VA_BITS_48 || ARM64_VA_BITS_52) && KASAN_SW_TAGS
+ default 0xefffc80000000000 if ARM64_VA_BITS_47 && KASAN_SW_TAGS
+ default 0xeffffe4000000000 if ARM64_VA_BITS_42 && KASAN_SW_TAGS
+ default 0xefffffc800000000 if ARM64_VA_BITS_39 && KASAN_SW_TAGS
+ default 0xeffffff900000000 if ARM64_VA_BITS_36 && KASAN_SW_TAGS
+ default 0xffffffffffffffff
+
source "arch/arm64/Kconfig.platforms"
menu "Kernel Features"
config ARM64_VA_BITS_48
bool "48-bit"
-config ARM64_USER_VA_BITS_52
- bool "52-bit (user)"
+config ARM64_VA_BITS_52
+ bool "52-bit"
depends on ARM64_64K_PAGES && (ARM64_PAN || !ARM64_SW_TTBR0_PAN)
help
Enable 52-bit virtual addressing for userspace when explicitly
- requested via a hint to mmap(). The kernel will continue to
- use 48-bit virtual addresses for its own mappings.
+ requested via a hint to mmap(). The kernel will also use 52-bit
+ virtual addresses for its own mappings (provided HW support for
+ this feature is available, otherwise it reverts to 48-bit).
NOTE: Enabling 52-bit virtual addressing in conjunction with
ARMv8.3 Pointer Authentication will result in the PAC being
config ARM64_FORCE_52BIT
bool "Force 52-bit virtual addresses for userspace"
- depends on ARM64_USER_VA_BITS_52 && EXPERT
+ depends on ARM64_VA_BITS_52 && EXPERT
help
For systems with 52-bit userspace VAs enabled, the kernel will attempt
to maintain compatibility with older software by providing 48-bit VAs
default 39 if ARM64_VA_BITS_39
default 42 if ARM64_VA_BITS_42
default 47 if ARM64_VA_BITS_47
- default 48 if ARM64_VA_BITS_48 || ARM64_USER_VA_BITS_52
+ default 48 if ARM64_VA_BITS_48
+ default 52 if ARM64_VA_BITS_52
choice
prompt "Physical address space size"
def_bool y
config ARCH_WANT_HUGE_PMD_SHARE
- def_bool y if ARM64_4K_PAGES || (ARM64_16K_PAGES && !ARM64_VA_BITS_36)
config ARCH_HAS_CACHE_LINE_SIZE
def_bool y
config PARAVIRT_TIME_ACCOUNTING
bool "Paravirtual steal time accounting"
select PARAVIRT
- default n
help
Select this option to enable fine granularity task steal time
accounting. Time spent executing other tasks in parallel with
for kernel and initramfs as opposed to list of segments as
accepted by previous system call.
- config KEXEC_VERIFY_SIG
+ config KEXEC_SIG
bool "Verify kernel signature during kexec_file_load() syscall"
depends on KEXEC_FILE
help
config KEXEC_IMAGE_VERIFY_SIG
bool "Enable Image signature verification support"
default y
- depends on KEXEC_VERIFY_SIG
+ depends on KEXEC_SIG
depends on EFI && SIGNED_PE_FILE_VERIFICATION
help
Enable Image signature verification support.
comment "Support for PE file signature verification disabled"
- depends on KEXEC_VERIFY_SIG
+ depends on KEXEC_SIG
depends on !EFI || !SIGNED_PE_FILE_VERIFICATION
config CRASH_DUMP
reserved region and then later executed after a crash by
kdump/kexec.
- For more details see Documentation/kdump/kdump.txt
+ For more details see Documentation/admin-guide/kdump/kdump.rst
config XEN_DOM0
def_bool y
zeroed area and reserved ASID. The user access routines
restore the valid TTBR0_EL1 temporarily.
+config ARM64_TAGGED_ADDR_ABI
+ bool "Enable the tagged user addresses syscall ABI"
+ default y
+ help
+ When this option is enabled, user applications can opt in to a
+ relaxed ABI via prctl() allowing tagged addresses to be passed
+ to system calls as pointer arguments. For details, see
+ Documentation/arm64/tagged-address-abi.rst.
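+ A minimal opt-in from userspace (illustrative) is:
+ prctl(PR_SET_TAGGED_ADDR_CTRL, PR_TAGGED_ADDR_ENABLE, 0, 0, 0);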
+
menuconfig COMPAT
bool "Kernel support for 32-bit EL0"
depends on ARM64_4K_PAGES || EXPERT
the system. This permits binaries to be run on ARMv4 through
to ARMv8 without modification.
- See Documentation/arm/kernel_user_helpers.txt for details.
+ See Documentation/arm/kernel_user_helpers.rst for details.
However, the fixed address nature of these helpers can be used
by ROP (return orientated programming) authors when creating
config ARM64_LSE_ATOMICS
bool "Atomic instructions"
+ depends on JUMP_LABEL
default y
help
As part of the Large System Extensions, ARMv8.1 introduces new
KVM in the same kernel image.
config ARM64_MODULE_PLTS
- bool
+ bool "Use PLTs to allow module memory to spill over into vmalloc area"
+ depends on MODULES
select HAVE_MOD_ARCH_SPECIFIC
+ help
+ Allocate PLTs when loading modules so that jumps and calls whose
+ targets are too far away for their relative offsets to be encoded
+ in the instructions themselves can be bounced via veneers in the
+ module's PLT. This allows modules to be allocated in the generic
+ vmalloc area after the dedicated module memory area has been
+ exhausted.
+
+ When running with address space randomization (KASLR), the module
+ region itself may be too far away for ordinary relative jumps and
+ calls, and so in that case, module PLTs are required and cannot be
+ disabled.
+
+ Specific errata workaround(s) might also force module PLTs to be
+ enabled (ARM64_ERRATUM_843419).
config ARM64_PSEUDO_NMI
bool "Support for NMI-like interrupts"
- depends on BROKEN # 1556553607-46531-1-git-send-email-julien.thierry@arm.com
select ARM_GIC_V3
help
Adds support for mimicking Non-Maskable Interrupts through the use of
If unsure, say N
+if ARM64_PSEUDO_NMI
+config ARM64_DEBUG_PRIORITY_MASKING
+ bool "Debug interrupt priority masking"
+ help
+ This adds runtime checks to functions enabling/disabling
+ interrupts when using priority masking. The additional checks verify
+ the validity of ICC_PMR_EL1 when the affected functions are called.
+
+ If unsure, say N
+endif
+
config RELOCATABLE
bool
+ select ARCH_HAS_RELR
help
This builds the kernel as a Position Independent Executable (PIE),
which retains all relocation metadata required to relocate the
def_bool y
config GENERIC_LOCKBREAK
- def_bool y if SMP && PREEMPT
+ def_bool y if PREEMPT
config PGSTE
def_bool y if KVM
select ARCH_HAS_GCOV_PROFILE_ALL
select ARCH_HAS_GIGANTIC_PAGE
select ARCH_HAS_KCOV
+ select ARCH_HAS_MEM_ENCRYPT
select ARCH_HAS_PTE_SPECIAL
select ARCH_HAS_SET_MEMORY
select ARCH_HAS_STRICT_KERNEL_RWX
select ARCH_INLINE_WRITE_UNLOCK_IRQRESTORE
select ARCH_KEEP_MEMBLOCK
select ARCH_SAVE_PAGE_KEYS if HIBERNATION
+ select ARCH_STACKWALK
select ARCH_SUPPORTS_ATOMIC_RMW
select ARCH_SUPPORTS_NUMA_BALANCING
select ARCH_USE_BUILTIN_BSWAP
select DYNAMIC_FTRACE if FUNCTION_TRACER
select GENERIC_CLOCKEVENTS
select GENERIC_CPU_AUTOPROBE
- select GENERIC_CPU_DEVICES if !SMP
select GENERIC_CPU_VULNERABILITIES
select GENERIC_FIND_FIRST_BIT
select GENERIC_SMP_IDLE_THREAD
select HAVE_ARCH_TRACEHOOK
select HAVE_ARCH_TRANSPARENT_HUGEPAGE
select HAVE_ARCH_VMAP_STACK
+ select HAVE_ASM_MODVERSIONS
select HAVE_EBPF_JIT if PACK_STACK && HAVE_MARCH_Z196_FEATURES
select HAVE_CMPXCHG_DOUBLE
select HAVE_CMPXCHG_LOCAL
select HAVE_DMA_CONTIGUOUS
select HAVE_DYNAMIC_FTRACE
select HAVE_DYNAMIC_FTRACE_WITH_REGS
+ select HAVE_FAST_GUP
select HAVE_EFFICIENT_UNALIGNED_ACCESS
select HAVE_FENTRY
select HAVE_FTRACE_MCOUNT_RECORD
select HAVE_FUNCTION_TRACER
select HAVE_FUTEX_CMPXCHG if FUTEX
select HAVE_GCC_PLUGINS
- select HAVE_GENERIC_GUP
select HAVE_KERNEL_BZIP2
select HAVE_KERNEL_GZIP
select HAVE_KERNEL_LZ4
select VIRT_CPU_ACCOUNTING
select ARCH_HAS_SCALED_CPUTIME
select HAVE_NMI
+ select ARCH_HAS_FORCE_DMA_UNENCRYPTED
+ select SWIOTLB
+ select GENERIC_ALLOCATOR
config SCHED_OMIT_FRAME_POINTER
def_bool n
select HAVE_MARCH_Z13_FEATURES
+config HAVE_MARCH_Z15_FEATURES
+ def_bool n
+ select HAVE_MARCH_Z14_FEATURES
+
choice
prompt "Processor type"
default MARCH_Z196
and 3906 series). The kernel will be slightly faster but will not
work on older machines.
+config MARCH_Z15
+ bool "IBM z15"
+ select HAVE_MARCH_Z15_FEATURES
+ help
+ Select this to enable optimizations for IBM z15 (8562
+ and 8561 series). The kernel will be slightly faster but will not
+ work on older machines.
+
endchoice
config MARCH_Z900_TUNE
config MARCH_Z14_TUNE
def_bool TUNE_Z14 || MARCH_Z14 && TUNE_DEFAULT
+config MARCH_Z15_TUNE
+ def_bool TUNE_Z15 || MARCH_Z15 && TUNE_DEFAULT
+
choice
prompt "Tune code generation"
default TUNE_DEFAULT
config TUNE_Z14
bool "IBM z14"
+config TUNE_Z15
+ bool "IBM z15"
+
endchoice
config 64BIT
config SMP
def_bool y
- prompt "Symmetric multi-processing support"
- ---help---
- This enables support for systems with more than one CPU. If you have
- a system with only one CPU, like most personal computers, say N. If
- you have a system with more than one CPU, say Y.
-
- If you say N here, the kernel will run on uni- and multiprocessor
- machines, but will use only one CPU of a multiprocessor machine. If
- you say Y here, the kernel will run on many, but not all,
- uniprocessor machines. On a uniprocessor machine, the kernel
- will run faster if you say N here.
-
- See also the SMP-HOWTO available at
- <http://www.tldp.org/docs.html#howto>.
-
- Even if you don't know what to do here, say Y.
config NR_CPUS
int "Maximum number of CPUs (2-512)"
range 2 512
- depends on SMP
default "64"
help
This allows you to specify the maximum number of CPUs which this
config HOTPLUG_CPU
def_bool y
- prompt "Support for hot-pluggable CPUs"
- depends on SMP
- help
- Say Y here to be able to turn CPUs off and on. CPUs
- can be controlled through /sys/devices/system/cpu/cpu#.
- Say N if you want to disable CPU hotplug.
# Some NUMA nodes have memory ranges that span
# other nodes. Even though a pfn is valid and
config NUMA
bool "NUMA support"
- depends on SMP && SCHED_TOPOLOGY
+ depends on SCHED_TOPOLOGY
default n
help
Enable NUMA support
config SCHED_TOPOLOGY
def_bool y
prompt "Topology scheduler support"
- depends on SMP
select SCHED_SMT
select SCHED_MC
select SCHED_BOOK
def_bool y
depends on KEXEC_FILE
- config KEXEC_VERIFY_SIG
+ config KEXEC_SIG
bool "Verify kernel signature during kexec_file_load() syscall"
- depends on KEXEC_FILE && SYSTEM_DATA_VERIFICATION
+ depends on KEXEC_FILE && MODULE_SIG_FORMAT
help
This option makes kernel signature verification mandatory for
the kexec_file_load() syscall.
config ARCH_SPARSEMEM_DEFAULT
def_bool y
-config ARCH_SELECT_MEMORY_MODEL
- def_bool y
-
config ARCH_ENABLE_MEMORY_HOTPLUG
def_bool y if SPARSEMEM
This allows you to specify the maximum number of PCI functions which
this kernel will support.
-endif # PCI
+endif # PCI
config HAS_IOMEM
def_bool PCI
config CRASH_DUMP
bool "kernel crash dumps"
- depends on SMP
select KEXEC
help
Generate crash dump after being started by kexec.
Crash dump kernels are loaded in the main kernel with kexec-tools
into a specially reserved region and then later executed after
a crash by kdump/kexec.
- Refer to <file:Documentation/s390/zfcpdump.txt> for more details on this.
+ Refer to <file:Documentation/s390/zfcpdump.rst> for more details on this.
This option also enables s390 zfcpdump.
- See also <file:Documentation/s390/zfcpdump.txt>
+ See also <file:Documentation/s390/zfcpdump.rst>
endmenu
#include <linux/elf.h>
#include <linux/errno.h>
#include <linux/kexec.h>
-#include <linux/module.h>
+#include <linux/module_signature.h>
#include <linux/verification.h>
#include <asm/boot_data.h>
#include <asm/ipl.h>
NULL,
};
- #ifdef CONFIG_KEXEC_VERIFY_SIG
+ #ifdef CONFIG_KEXEC_SIG
-/*
- * Module signature information block.
- *
- * The constituents of the signature section are, in order:
- *
- * - Signer's name
- * - Key identifier
- * - Signature data
- * - Information block
- */
-struct module_signature {
- u8 algo; /* Public-key crypto algorithm [0] */
- u8 hash; /* Digest algorithm [0] */
- u8 id_type; /* Key identifier type [PKEY_ID_PKCS7] */
- u8 signer_len; /* Length of signer's name [0] */
- u8 key_id_len; /* Length of key identifier [0] */
- u8 __pad[3];
- __be32 sig_len; /* Length of signature data */
-};
-
-#define PKEY_ID_PKCS7 2
-
int s390_verify_sig(const char *kernel, unsigned long kernel_len)
{
const unsigned long marker_len = sizeof(MODULE_SIG_STRING) - 1;
VERIFYING_MODULE_SIGNATURE,
NULL, NULL);
}
- #endif /* CONFIG_KEXEC_VERIFY_SIG */
+ #endif /* CONFIG_KEXEC_SIG */
static int kexec_file_update_purgatory(struct kimage *image,
struct s390_load_data *data)
select HAVE_DEBUG_STACKOVERFLOW
select MODULES_USE_ELF_REL
select OLD_SIGACTION
+ select GENERIC_VDSO_32
config X86_64
def_bool y
select ARCH_HAS_FORTIFY_SOURCE
select ARCH_HAS_GCOV_PROFILE_ALL
select ARCH_HAS_KCOV if X86_64
+ select ARCH_HAS_MEM_ENCRYPT
select ARCH_HAS_MEMBARRIER_SYNC_CORE
select ARCH_HAS_PMEM_API if X86_64
+ select ARCH_HAS_PTE_DEVMAP if X86_64
select ARCH_HAS_PTE_SPECIAL
select ARCH_HAS_REFCOUNT
select ARCH_HAS_UACCESS_FLUSHCACHE if X86_64
select ARCH_HAS_STRICT_MODULE_RWX
select ARCH_HAS_SYNC_CORE_BEFORE_USERMODE
select ARCH_HAS_UBSAN_SANITIZE_ALL
- select ARCH_HAS_ZONE_DEVICE if X86_64
select ARCH_HAVE_NMI_SAFE_CMPXCHG
select ARCH_MIGHT_HAVE_ACPI_PDC if ACPI
select ARCH_MIGHT_HAVE_PC_PARPORT
select ARCH_USE_QUEUED_SPINLOCKS
select ARCH_WANT_BATCHED_UNMAP_TLB_FLUSH
select ARCH_WANTS_DYNAMIC_TASK_STRUCT
+ select ARCH_WANT_HUGE_PMD_SHARE
select ARCH_WANTS_THP_SWAP if X86_64
select BUILDTIME_EXTABLE_SORT
select CLKEVT_I8253
select GENERIC_STRNCPY_FROM_USER
select GENERIC_STRNLEN_USER
select GENERIC_TIME_VSYSCALL
+ select GENERIC_GETTIMEOFDAY
+ select GUP_GET_PTE_LOW_HIGH if X86_PAE
select HARDLOCKUP_CHECK_TIMESTAMP if X86_64
select HAVE_ACPI_APEI if ACPI
select HAVE_ACPI_APEI_NMI if ACPI
select HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD if X86_64
select HAVE_ARCH_VMAP_STACK if X86_64
select HAVE_ARCH_WITHIN_STACK_FRAMES
+ select HAVE_ASM_MODVERSIONS
select HAVE_CMPXCHG_DOUBLE
select HAVE_CMPXCHG_LOCAL
select HAVE_CONTEXT_TRACKING if X86_64
select HAVE_EFFICIENT_UNALIGNED_ACCESS
select HAVE_EISA
select HAVE_EXIT_THREAD
+ select HAVE_FAST_GUP
select HAVE_FENTRY if X86_64 || DYNAMIC_FTRACE
select HAVE_FTRACE_MCOUNT_RECORD
select HAVE_FUNCTION_GRAPH_TRACER
select HAVE_SYSCALL_TRACEPOINTS
select HAVE_UNSTABLE_SCHED_CLOCK
select HAVE_USER_RETURN_NOTIFIER
+ select HAVE_GENERIC_VDSO
select HOTPLUG_SMT if SMP
select IRQ_FORCED_THREADING
select NEED_SG_DMA_LENGTH
select USER_STACKTRACE_SUPPORT
select VIRT_TO_BUS
select X86_FEATURE_NAMES if PROC_FS
+ select PROC_PID_ARCH_STATUS if PROC_FS
config INSTRUCTION_DECODER
def_bool y
config ARCH_SUSPEND_POSSIBLE
def_bool y
-config ARCH_WANT_HUGE_PMD_SHARE
- def_bool y
-
config ARCH_WANT_GENERAL_HUGETLB
def_bool y
Y to "Enhanced Real Time Clock Support", below. The "Advanced Power
Management" code will be disabled if you say Y here.
- See also <file:Documentation/x86/i386/IO-APIC.txt>,
- <file:Documentation/lockup-watchdogs.txt> and the SMP-HOWTO available at
+ See also <file:Documentation/x86/i386/IO-APIC.rst>,
+ <file:Documentation/admin-guide/lockup-watchdogs.rst> and the SMP-HOWTO available at
<http://www.tldp.org/docs.html#howto>.
If you don't know what to do here, say N.
If you are unsure how to answer this question, answer Y.
+config X86_HV_CALLBACK_VECTOR
+ def_bool n
+
source "arch/x86/xen/Kconfig"
config KVM_GUEST
bool "KVM Guest support (including kvmclock)"
depends on PARAVIRT
select PARAVIRT_CLOCK
+ select ARCH_CPUIDLE_HALTPOLL
default y
---help---
This option enables various optimizations for running under the KVM
underlying device model, the host provides the guest with
timing infrastructure such as time of day, and system time
+config ARCH_CPUIDLE_HALTPOLL
+ def_bool n
+ prompt "Disable host haltpoll when loading haltpoll driver"
+ help
+ If virtualized under KVM, disable host haltpoll.
+
config PVH
bool "Support for running PVH guests"
---help---
cell. You can leave this option disabled if you only want to start
Jailhouse and run Linux afterwards in the root cell.
+config ACRN_GUEST
+ bool "ACRN Guest support"
+ depends on X86_64
+ select X86_HV_CALLBACK_VECTOR
+ help
+ This option allows running Linux as a guest in the ACRN hypervisor.
+ ACRN is a flexible, lightweight, open-source reference hypervisor,
+ built with real-time and safety-criticality in mind. It is built for
+ embedded IoT with a small footprint and real-time features. More
+ details can be found at https://projectacrn.org/.
+
endif #HYPERVISOR_GUEST
source "arch/x86/Kconfig.cpu"
the Linux kernel.
The preferred method to load microcode from a detached initrd is described
- in Documentation/x86/microcode.txt. For that you need to enable
+ in Documentation/x86/microcode.rst. For that you need to enable
CONFIG_BLK_DEV_INITRD in order for the loader to be able to scan the
initrd for microcode blobs.
It is inadequate because it runs too late to be able to properly
load microcode on a machine and it needs special tools. Instead, you
should've switched to the early loading method with the initrd or
- builtin microcode by now: Documentation/x86/microcode.txt
+ builtin microcode by now: Documentation/x86/microcode.rst
config X86_MSR
tristate "/dev/cpu/*/msr - Model-specific register support"
A kernel with the option enabled can be booted on machines that
support 4- or 5-level paging.
- See Documentation/x86/x86_64/5level-paging.txt for more
+ See Documentation/x86/x86_64/5level-paging.rst for more
information.
Say N if unsure.
config X86_DIRECT_GBPAGES
def_bool y
- depends on X86_64 && !DEBUG_PAGEALLOC
+ depends on X86_64
---help---
Certain kernel features effectively disable kernel
linear 1 GB mappings (even if the CPU otherwise
helps to determine the effectiveness of preserving large and huge
page mappings when mapping protections are changed.
-config ARCH_HAS_MEM_ENCRYPT
- def_bool y
-
config AMD_MEM_ENCRYPT
bool "AMD Secure Memory Encryption (SME) support"
depends on X86_64 && CPU_SUP_AMD
select DYNAMIC_PHYSICAL_MASK
select ARCH_USE_MEMREMAP_PROT
+ select ARCH_HAS_FORCE_DMA_UNENCRYPTED
---help---
Say yes to enable support for the encryption of system memory.
This requires an AMD processor that supports Secure Memory
depends on X86_64 && MEMORY_HOTPLUG
help
This option enables a sysfs memory/probe interface for testing.
- See Documentation/memory-hotplug.txt for more information.
+ See Documentation/admin-guide/mm/memory-hotplug.rst for more information.
If you are unsure how to answer this question, answer N.
config ARCH_PROC_KCORE_TEXT
You can safely say Y even if your machine doesn't have MTRRs, you'll
just add about 9 KB to your kernel.
- See <file:Documentation/x86/mtrr.txt> for more information.
+ See <file:Documentation/x86/mtrr.rst> for more information.
config MTRR_SANITIZER
def_bool y
process and adds some branches to paths used during
exec() and munmap().
- For details, see Documentation/x86/intel_mpx.txt
+ For details, see Documentation/x86/intel_mpx.rst
If unsure, say N.
page-based protections, but without requiring modification of the
page tables when an application changes protection domains.
- For details, see Documentation/x86/protection-keys.txt
+ For details, see Documentation/core-api/protection-keys.rst
If unsure, say y.
This kernel feature allows a bzImage to be loaded directly
by EFI firmware without the use of a bootloader.
- See Documentation/efi-stub.txt for more information.
+ See Documentation/admin-guide/efi-stub.rst for more information.
config EFI_MIXED
bool "EFI mixed-mode support"
config ARCH_HAS_KEXEC_PURGATORY
def_bool KEXEC_FILE
- config KEXEC_VERIFY_SIG
+ config KEXEC_SIG
bool "Verify kernel signature during kexec_file_load() syscall"
depends on KEXEC_FILE
---help---
- This option makes kernel signature verification mandatory for
- the kexec_file_load() syscall.
- In addition to that option, you need to enable signature
+ This option makes the kexec_file_load() syscall check for a valid
+ signature of the kernel image. The image can still be loaded without
+ a valid signature unless you also enable KEXEC_SIG_FORCE, though if
+ there's a signature that we can check, then it must be valid.
+
+ In addition to this option, you need to enable signature
verification for the corresponding kernel image type being
loaded in order for this to work.
+ config KEXEC_SIG_FORCE
+ bool "Require a valid signature in kexec_file_load() syscall"
+ depends on KEXEC_SIG
+ ---help---
+ This option makes kernel signature verification mandatory for
+ the kexec_file_load() syscall.
+
config KEXEC_BZIMAGE_VERIFY_SIG
bool "Enable bzImage signature verification support"
- depends on KEXEC_VERIFY_SIG
+ depends on KEXEC_SIG
depends on SIGNED_PE_FILE_VERIFICATION
select SYSTEM_TRUSTED_KEYRING
---help---
to a memory address not used by the main kernel or BIOS using
PHYSICAL_START, or it must be built as a relocatable image
(CONFIG_RELOCATABLE=y).
- For more details see Documentation/kdump/kdump.txt
+ For more details see Documentation/admin-guide/kdump/kdump.rst
config KEXEC_JUMP
bool "kexec jump"
the reserved region. In other words, it can be set based on
the "X" value as specified in the "crashkernel=YM@XM"
command line boot parameter passed to the panic-ed
- kernel. Please take a look at Documentation/kdump/kdump.txt
+ kernel. Please take a look at Documentation/admin-guide/kdump/kdump.rst
for more details about crash dumps.
Usage of bzImage for capturing the crash dump is recommended as
choice
prompt "vsyscall table for legacy applications"
depends on X86_64
- default LEGACY_VSYSCALL_EMULATE
+ default LEGACY_VSYSCALL_XONLY
help
Legacy user code that does not know how to find the vDSO expects
to be able to issue three syscalls by calling fixed addresses in
it can be used to assist security vulnerability exploitation.
This setting can be changed at boot time via the kernel command
- line parameter vsyscall=[emulate|none].
+ line parameter vsyscall=[emulate|xonly|none].
On a system with recent enough glibc (2.14 or newer) and no
static binaries, you can say None without a performance penalty
to improve security.
- If unsure, select "Emulate".
+ If unsure, select "Emulate execution only".
config LEGACY_VSYSCALL_EMULATE
- bool "Emulate"
+ bool "Full emulation"
help
- The kernel traps and emulates calls into the fixed
- vsyscall address mapping. This makes the mapping
- non-executable, but it still contains known contents,
- which could be used in certain rare security vulnerability
- exploits. This configuration is recommended when userspace
- still uses the vsyscall area.
+ The kernel traps and emulates calls into the fixed vsyscall
+ address mapping. This makes the mapping non-executable, but
+ it still contains readable known contents, which could be
+ used in certain rare security vulnerability exploits. This
+ configuration is recommended when using legacy userspace
+ that still uses vsyscalls along with legacy binary
+ instrumentation tools that require code to be readable.
+
+ An example of this type of legacy userspace is running
+ Pin on an old binary that still uses vsyscalls.
+
+ config LEGACY_VSYSCALL_XONLY
+ bool "Emulate execution only"
+ help
+ The kernel traps and emulates calls into the fixed vsyscall
+ address mapping and does not allow reads. This
+ configuration is recommended when userspace might use the
+ legacy vsyscall area but support for legacy binary
+ instrumentation of legacy code is not needed. It mitigates
+ certain uses of the vsyscall area as an ASLR-bypassing
+ buffer.
config LEGACY_VSYSCALL_NONE
bool "None"
machines with more than one CPU.
In order to use APM, you will need supporting software. For location
- and more information, read <file:Documentation/power/apm-acpi.txt>
+ and more information, read <file:Documentation/power/apm-acpi.rst>
and the Battery Powered Linux mini-HOWTO, available from
<http://www.tldp.org/docs.html#howto>.
select OF
select OF_PROMTREE
select IRQ_DOMAIN
+ select OLPC_EC
---help---
Add support for detecting the unique features of the OLPC
XO hardware.
config X86_DEV_DMA_OPS
bool
-config HAVE_GENERIC_GUP
- def_bool y
-
source "drivers/firmware/Kconfig"
source "arch/x86/kvm/Kconfig"
*/
#define MAX_ADDR_LEN 19
- static acpi_physical_address get_acpi_rsdp(void)
+ static acpi_physical_address get_cmdline_acpi_rsdp(void)
{
acpi_physical_address addr = 0;
return addr;
}
-/* Search EFI system tables for RSDP. */
-static acpi_physical_address efi_get_rsdp_addr(void)
+/*
+ * Search EFI system tables for RSDP. If both ACPI_20_TABLE_GUID and
+ * ACPI_TABLE_GUID are found, take the former, which has more features.
+ */
+static acpi_physical_address
+__efi_get_rsdp_addr(unsigned long config_tables, unsigned int nr_tables,
+ bool efi_64)
{
acpi_physical_address rsdp_addr = 0;
#ifdef CONFIG_EFI
- unsigned long systab, systab_tables, config_tables;
+ int i;
+
+ /* Get EFI tables from systab. */
+ for (i = 0; i < nr_tables; i++) {
+ acpi_physical_address table;
+ efi_guid_t guid;
+
+ if (efi_64) {
+ efi_config_table_64_t *tbl = (efi_config_table_64_t *)config_tables + i;
+
+ guid = tbl->guid;
+ table = tbl->table;
+
+ if (!IS_ENABLED(CONFIG_X86_64) && table >> 32) {
+ debug_putstr("Error getting RSDP address: EFI config table located above 4GB.\n");
+ return 0;
+ }
+ } else {
+ efi_config_table_32_t *tbl = (efi_config_table_32_t *)config_tables + i;
+
+ guid = tbl->guid;
+ table = tbl->table;
+ }
+
+ if (!(efi_guidcmp(guid, ACPI_TABLE_GUID)))
+ rsdp_addr = table;
+ else if (!(efi_guidcmp(guid, ACPI_20_TABLE_GUID)))
+ return table;
+ }
+#endif
+ return rsdp_addr;
+}
+
+/* EFI/kexec support is 64-bit only. */
+#ifdef CONFIG_X86_64
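+/*
+ * Walk the setup_data list passed in by a previous kexec and return the
+ * SETUP_EFI entry, which carries the EFI state (including the config
+ * table pointer) across kexec.
+ */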
+static struct efi_setup_data *get_kexec_setup_data_addr(void)
+{
+ struct setup_data *data;
+ u64 pa_data;
+
+ pa_data = boot_params->hdr.setup_data;
+ while (pa_data) {
+ data = (struct setup_data *)pa_data;
+ if (data->type == SETUP_EFI)
+ return (struct efi_setup_data *)(pa_data + sizeof(struct setup_data));
+
+ pa_data = data->next;
+ }
+ return NULL;
+}
+
+static acpi_physical_address kexec_get_rsdp_addr(void)
+{
+ efi_system_table_64_t *systab;
+ struct efi_setup_data *esd;
+ struct efi_info *ei;
+ char *sig;
+
+ esd = get_kexec_setup_data_addr();
+ if (!esd)
+ return 0;
+
+ if (!esd->tables) {
+ debug_putstr("Wrong kexec SETUP_EFI data.\n");
+ return 0;
+ }
+
+ ei = &boot_params->efi_info;
+ sig = (char *)&ei->efi_loader_signature;
+ if (strncmp(sig, EFI64_LOADER_SIGNATURE, 4)) {
+ debug_putstr("Wrong kexec EFI loader signature.\n");
+ return 0;
+ }
+
+ /* Get systab from boot params. */
+ systab = (efi_system_table_64_t *) (ei->efi_systab | ((__u64)ei->efi_systab_hi << 32));
+ if (!systab)
+ error("EFI system table not found in kexec boot_params.");
+
+ return __efi_get_rsdp_addr((unsigned long)esd->tables, systab->nr_tables, true);
+}
+#else
+static acpi_physical_address kexec_get_rsdp_addr(void) { return 0; }
+#endif /* CONFIG_X86_64 */
+
+static acpi_physical_address efi_get_rsdp_addr(void)
+{
+#ifdef CONFIG_EFI
+ unsigned long systab, config_tables;
unsigned int nr_tables;
struct efi_info *ei;
bool efi_64;
- int size, i;
char *sig;
ei = &boot_params->efi_info;
config_tables = stbl->tables;
nr_tables = stbl->nr_tables;
- size = sizeof(efi_config_table_64_t);
} else {
efi_system_table_32_t *stbl = (efi_system_table_32_t *)systab;
config_tables = stbl->tables;
nr_tables = stbl->nr_tables;
- size = sizeof(efi_config_table_32_t);
}
if (!config_tables)
error("EFI config tables not found.");
- /* Get EFI tables from systab. */
- for (i = 0; i < nr_tables; i++) {
- acpi_physical_address table;
- efi_guid_t guid;
-
- config_tables += size;
-
- if (efi_64) {
- efi_config_table_64_t *tbl = (efi_config_table_64_t *)config_tables;
-
- guid = tbl->guid;
- table = tbl->table;
-
- if (!IS_ENABLED(CONFIG_X86_64) && table >> 32) {
- debug_putstr("Error getting RSDP address: EFI config table located above 4GB.\n");
- return 0;
- }
- } else {
- efi_config_table_32_t *tbl = (efi_config_table_32_t *)config_tables;
-
- guid = tbl->guid;
- table = tbl->table;
- }
-
- if (!(efi_guidcmp(guid, ACPI_TABLE_GUID)))
- rsdp_addr = table;
- else if (!(efi_guidcmp(guid, ACPI_20_TABLE_GUID)))
- return table;
- }
+ return __efi_get_rsdp_addr(config_tables, nr_tables, efi_64);
+#else
+ return 0;
#endif
- return rsdp_addr;
}
static u8 compute_checksum(u8 *buffer, u32 length)
{
acpi_physical_address pa;
- pa = get_acpi_rsdp();
-
- if (!pa)
- pa = boot_params->acpi_rsdp_addr;
+ pa = boot_params->acpi_rsdp_addr;
+ /*
+ * Try to get EFI data from setup_data. This can happen when we're a
+ * kexec'ed kernel and kexec(1) has passed all the required EFI info to
+ * us.
+ */
+ if (!pa)
+ pa = kexec_get_rsdp_addr();
+
if (!pa)
pa = efi_get_rsdp_addr();
char arg[10];
u8 *entry;
- rsdp = (struct acpi_table_rsdp *)(long)boot_params->acpi_rsdp_addr;
+ /*
+ * Check whether we were given an RSDP on the command line. We don't
+ * stash this in boot params because the kernel itself may have
+ * different ideas about whether to trust a command-line parameter.
+ */
+ rsdp = (struct acpi_table_rsdp *)get_cmdline_acpi_rsdp();
+
+ if (!rsdp)
+ rsdp = (struct acpi_table_rsdp *)(long)
+ boot_params->acpi_rsdp_addr;
+
if (!rsdp)
return 0;
/**
* struct x86_init_acpi - x86 ACPI init functions
+ * @set_root_pointer: set RSDP address
* @get_root_pointer: get RSDP address
* @reduced_hw_early_init: hardware reduced platform early init
*/
struct x86_init_acpi {
+ void (*set_root_pointer)(u64 addr);
u64 (*get_root_pointer)(void);
void (*reduced_hw_early_init)(void);
};
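/*
 * A minimal sketch of the default ->set_root_pointer hook, assuming it
 * simply stashes the address in boot_params so it survives kexec (an
 * illustration only; the actual default implementation lives in arch
 * code):
 *
 *	void x86_default_set_root_pointer(u64 addr)
 *	{
 *		boot_params.acpi_rsdp_addr = addr;
 *	}
 */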
extern void x86_early_init_platform_quirks(void);
extern void x86_init_noop(void);
extern void x86_init_uint_noop(unsigned int unused);
+extern bool bool_x86_init_noop(void);
+extern void x86_op_int_noop(int cpu);
extern bool x86_pnpbios_disabled(void);
#endif
static enum efi_secureboot_mode get_sb_mode(void)
{
efi_char16_t efi_SecureBoot_name[] = L"SecureBoot";
+ efi_char16_t efi_SetupMode_name[] = L"SetupMode";
efi_guid_t efi_variable_guid = EFI_GLOBAL_VARIABLE_GUID;
efi_status_t status;
unsigned long size;
- u8 secboot;
+ u8 secboot, setupmode;
size = sizeof(secboot);
return efi_secureboot_mode_unknown;
}
- if (secboot == 0) {
+ size = sizeof(setupmode);
+ status = efi.get_variable(efi_SetupMode_name, &efi_variable_guid,
+ NULL, &size, &setupmode);
+
+ if (status != EFI_SUCCESS) /* ignore unknown SetupMode */
+ setupmode = 0;
+
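+ /*
+ * Per the UEFI spec, SetupMode == 1 means the platform is in setup
+ * mode, where Secure Boot keys are not enforced, so treat it the
+ * same as secure boot being disabled.
+ */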
+ if (secboot == 0 || setupmode == 1) {
pr_info("ima: secureboot mode disabled\n");
return efi_secureboot_mode_disabled;
}
/* secureboot arch rules */
static const char * const sb_arch_rules[] = {
- #if !IS_ENABLED(CONFIG_KEXEC_VERIFY_SIG)
+ #if !IS_ENABLED(CONFIG_KEXEC_SIG)
"appraise func=KEXEC_KERNEL_CHECK appraise_type=imasig",
- #endif /* CONFIG_KEXEC_VERIFY_SIG */
+ #endif /* CONFIG_KEXEC_SIG */
"measure func=KEXEC_KERNEL_CHECK",
#if !IS_ENABLED(CONFIG_MODULE_SIG)
"appraise func=MODULE_CHECK appraise_type=imasig",
if (efi_enabled(EFI_OLD_MEMMAP))
return 0;
+ params->secure_boot = boot_params.secure_boot;
ei->efi_loader_signature = current_ei->efi_loader_signature;
ei->efi_systab = current_ei->efi_systab;
ei->efi_systab_hi = current_ei->efi_systab_hi;
return ret;
}
+ if (!(header->xloadflags & XLF_5LEVEL) && pgtable_l5_enabled()) {
+ pr_err("bzImage cannot handle 5-level paging mode.\n");
+ return ret;
+ }
+
/* I've got a bzImage */
pr_debug("It's a relocatable bzImage64\n");
ret = 0;
efi_map_offset = params_cmdline_sz;
efi_setup_data_offset = efi_map_offset + ALIGN(efi_map_sz, 16);
- /* Copy setup header onto bootparams. Documentation/x86/boot.txt */
+ /* Copy setup header onto bootparams. Documentation/x86/boot.rst */
setup_header_size = 0x0202 + kernel[0x0201] - setup_hdr_offset;
/* Is there a limit on setup header size? */
void __init x86_init_uint_noop(unsigned int unused) { }
static int __init iommu_init_noop(void) { return 0; }
static void iommu_shutdown_noop(void) { }
-static bool __init bool_x86_init_noop(void) { return false; }
-static void x86_op_int_noop(int cpu) { }
+bool __init bool_x86_init_noop(void) { return false; }
+void x86_op_int_noop(int cpu) { }
/*
* The platform setup functions are preset with the default functions
},
.acpi = {
+ .set_root_pointer = x86_default_set_root_pointer,
.get_root_pointer = x86_default_get_root_pointer,
.reduced_hw_early_init = acpi_generic_reduced_hw_init,
},
#include <linux/uaccess.h>
#include <linux/debugfs.h>
#include <linux/acpi.h>
+ #include <linux/security.h>
#include "internal.h"
struct acpi_table_header table;
acpi_status status;
+ int ret;
+
+ ret = security_locked_down(LOCKDOWN_ACPI_TABLES);
+ if (ret)
+ return ret;
if (!(*ppos)) {
/* parse the table header to get the table length */
if ((*ppos > max_size) ||
(*ppos + count > max_size) ||
(*ppos + count < count) ||
- (count > uncopied_bytes))
+ (count > uncopied_bytes)) {
+ kfree(buf);
return -EINVAL;
+ }
if (copy_from_user(buf + (*ppos), user_buf, count)) {
kfree(buf);
add_taint(TAINT_OVERRIDDEN_ACPI_TABLE, LOCKDEP_NOW_UNRELIABLE);
}
+ kfree(buf);
return count;
}
#include <linux/slab.h>
#include <linux/mm.h>
#include <linux/highmem.h>
+#include <linux/lockdep.h>
#include <linux/pci.h>
#include <linux/interrupt.h>
#include <linux/kmod.h>
#include <linux/list.h>
#include <linux/jiffies.h>
#include <linux/semaphore.h>
+ #include <linux/security.h>
#include <asm/io.h>
#include <linux/uaccess.h>
static LIST_HEAD(acpi_ioremaps);
static DEFINE_MUTEX(acpi_ioremap_lock);
+#define acpi_ioremap_lock_held() lock_is_held(&acpi_ioremap_lock.dep_map)
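The extra argument passed to the list walks below is the optional lockdep condition that list_for_each_entry_rcu() now accepts: the traversal must run either under rcu_read_lock() or with the stated condition true. Assuming the lookup helper shown next is acpi_map_lookup(), both of these contexts are then legal (illustrative sketch):

    /* Reader side: */
    rcu_read_lock();
    map = acpi_map_lookup(phys, size);
    rcu_read_unlock();

    /* Update side, where the mutex itself pins the list: */
    mutex_lock(&acpi_ioremap_lock);
    map = acpi_map_lookup(phys, size);
    mutex_unlock(&acpi_ioremap_lock);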
static void __init acpi_request_region (struct acpi_generic_address *gas,
unsigned int length, char *desc)
acpi_physical_address pa;
#ifdef CONFIG_KEXEC
- if (acpi_rsdp)
+ /*
+ * We may have been provided with an RSDP on the command line,
+ * but if a malicious user has done so they may be pointing us
+ * at modified ACPI tables that could alter kernel behaviour -
+ * so, we check the lockdown status before making use of
+ * it. If we trust it then also stash it in an architecture
+ * specific location (if appropriate) so it can be carried
+ * over further kexec()s.
+ */
+ if (acpi_rsdp && !security_locked_down(LOCKDOWN_ACPI_TABLES)) {
+ acpi_arch_set_root_pointer(acpi_rsdp);
return acpi_rsdp;
+ }
#endif
pa = acpi_arch_get_root_pointer();
if (pa)
{
struct acpi_ioremap *map;
- list_for_each_entry_rcu(map, &acpi_ioremaps, list)
+ list_for_each_entry_rcu(map, &acpi_ioremaps, list, acpi_ioremap_lock_held())
if (map->phys <= phys &&
phys + size <= map->phys + map->size)
return map;
{
struct acpi_ioremap *map;
- list_for_each_entry_rcu(map, &acpi_ioremaps, list)
+ list_for_each_entry_rcu(map, &acpi_ioremaps, list, acpi_ioremap_lock_held())
if (map->virt <= virt &&
virt + size <= map->virt + map->size)
return map;
* During early init (when acpi_permanent_mmap has not been set yet) this
* routine simply calls __acpi_map_table() to get the job done.
*/
-void __iomem *__ref
-acpi_os_map_iomem(acpi_physical_address phys, acpi_size size)
+void __iomem __ref
+*acpi_os_map_iomem(acpi_physical_address phys, acpi_size size)
{
struct acpi_ioremap *map;
void __iomem *virt;
#include <linux/memblock.h>
#include <linux/earlycpio.h>
#include <linux/initrd.h>
+ #include <linux/security.h>
#include "internal.h"
#ifdef CONFIG_ACPI_CUSTOM_DSDT
/* All but ACPI_SIG_RSDP and ACPI_SIG_FACS: */
static const char * const table_sigs[] = {
- ACPI_SIG_BERT, ACPI_SIG_CPEP, ACPI_SIG_ECDT, ACPI_SIG_EINJ,
- ACPI_SIG_ERST, ACPI_SIG_HEST, ACPI_SIG_MADT, ACPI_SIG_MSCT,
- ACPI_SIG_SBST, ACPI_SIG_SLIT, ACPI_SIG_SRAT, ACPI_SIG_ASF,
- ACPI_SIG_BOOT, ACPI_SIG_DBGP, ACPI_SIG_DMAR, ACPI_SIG_HPET,
- ACPI_SIG_IBFT, ACPI_SIG_IVRS, ACPI_SIG_MCFG, ACPI_SIG_MCHI,
- ACPI_SIG_SLIC, ACPI_SIG_SPCR, ACPI_SIG_SPMI, ACPI_SIG_TCPA,
- ACPI_SIG_UEFI, ACPI_SIG_WAET, ACPI_SIG_WDAT, ACPI_SIG_WDDT,
- ACPI_SIG_WDRT, ACPI_SIG_DSDT, ACPI_SIG_FADT, ACPI_SIG_PSDT,
- ACPI_SIG_RSDT, ACPI_SIG_XSDT, ACPI_SIG_SSDT, ACPI_SIG_IORT,
- ACPI_SIG_NFIT, ACPI_SIG_HMAT, ACPI_SIG_PPTT, NULL };
+ ACPI_SIG_BERT, ACPI_SIG_BGRT, ACPI_SIG_CPEP, ACPI_SIG_ECDT,
+ ACPI_SIG_EINJ, ACPI_SIG_ERST, ACPI_SIG_HEST, ACPI_SIG_MADT,
+ ACPI_SIG_MSCT, ACPI_SIG_SBST, ACPI_SIG_SLIT, ACPI_SIG_SRAT,
+ ACPI_SIG_ASF, ACPI_SIG_BOOT, ACPI_SIG_DBGP, ACPI_SIG_DMAR,
+ ACPI_SIG_HPET, ACPI_SIG_IBFT, ACPI_SIG_IVRS, ACPI_SIG_MCFG,
+ ACPI_SIG_MCHI, ACPI_SIG_SLIC, ACPI_SIG_SPCR, ACPI_SIG_SPMI,
+ ACPI_SIG_TCPA, ACPI_SIG_UEFI, ACPI_SIG_WAET, ACPI_SIG_WDAT,
+ ACPI_SIG_WDDT, ACPI_SIG_WDRT, ACPI_SIG_DSDT, ACPI_SIG_FADT,
+ ACPI_SIG_PSDT, ACPI_SIG_RSDT, ACPI_SIG_XSDT, ACPI_SIG_SSDT,
+ ACPI_SIG_IORT, ACPI_SIG_NFIT, ACPI_SIG_HMAT, ACPI_SIG_PPTT,
+ NULL };
#define ACPI_HEADER_SIZE sizeof(struct acpi_table_header)
if (table_nr == 0)
return;
+ if (security_locked_down(LOCKDOWN_ACPI_TABLES)) {
+ pr_notice("kernel is locked down, ignoring table override\n");
+ return;
+ }
+
acpi_tables_addr =
memblock_find_in_range(0, ACPI_TABLE_UPGRADE_MAX_PHYS,
all_tables_size, PAGE_SIZE);
#include <linux/export.h>
#include <linux/io.h>
#include <linux/uio.h>
-
#include <linux/uaccess.h>
+ #include <linux/security.h>
#ifdef CONFIG_IA64
# include <linux/efi.h>
}
#endif
+static inline bool should_stop_iteration(void)
+{
+ if (need_resched())
+ cond_resched();
+ return fatal_signal_pending(current);
+}
+
/*
* This function reads the *physical* memory. The f_pos points directly to the
* memory location.
p += sz;
count -= sz;
read += sz;
+ if (should_stop_iteration())
+ break;
}
kfree(bounce);
p += sz;
count -= sz;
written += sz;
+ if (should_stop_iteration())
+ break;
}
*ppos += written;
read += sz;
low_count -= sz;
count -= sz;
+ if (should_stop_iteration()) {
+ count = 0;
+ break;
+ }
}
}
buf += sz;
read += sz;
p += sz;
+ if (should_stop_iteration())
+ break;
}
free_page((unsigned long)kbuf);
}
p += sz;
count -= sz;
written += sz;
+ if (should_stop_iteration())
+ break;
}
*ppos += written;
buf += sz;
virtr += sz;
p += sz;
+ if (should_stop_iteration())
+ break;
}
free_page((unsigned long)kbuf);
}
static int open_port(struct inode *inode, struct file *filp)
{
- return capable(CAP_SYS_RAWIO) ? 0 : -EPERM;
+ if (!capable(CAP_SYS_RAWIO))
+ return -EPERM;
+
+ return security_locked_down(LOCKDOWN_DEV_MEM);
}
#define zero_lseek null_lseek
#include <linux/acpi.h>
#include <linux/ucs2_string.h>
#include <linux/memblock.h>
+ #include <linux/security.h>
#include <asm/early_ioremap.h>
.acpi20 = EFI_INVALID_TABLE_ADDR,
.smbios = EFI_INVALID_TABLE_ADDR,
.smbios3 = EFI_INVALID_TABLE_ADDR,
- .sal_systab = EFI_INVALID_TABLE_ADDR,
.boot_info = EFI_INVALID_TABLE_ADDR,
.hcdp = EFI_INVALID_TABLE_ADDR,
.uga = EFI_INVALID_TABLE_ADDR,
- .uv_systab = EFI_INVALID_TABLE_ADDR,
.fw_vendor = EFI_INVALID_TABLE_ADDR,
.runtime = EFI_INVALID_TABLE_ADDR,
.config_table = EFI_INVALID_TABLE_ADDR,
.mem_attr_table = EFI_INVALID_TABLE_ADDR,
.rng_seed = EFI_INVALID_TABLE_ADDR,
.tpm_log = EFI_INVALID_TABLE_ADDR,
+ .tpm_final_log = EFI_INVALID_TABLE_ADDR,
.mem_reserve = EFI_INVALID_TABLE_ADDR,
};
EXPORT_SYMBOL(efi);
-static unsigned long *efi_tables[] = {
- &efi.mps,
- &efi.acpi,
- &efi.acpi20,
- &efi.smbios,
- &efi.smbios3,
- &efi.sal_systab,
- &efi.boot_info,
- &efi.hcdp,
- &efi.uga,
- &efi.uv_systab,
- &efi.fw_vendor,
- &efi.runtime,
- &efi.config_table,
- &efi.esrt,
- &efi.properties_table,
- &efi.mem_attr_table,
-};
-
struct mm_struct efi_mm = {
.mm_rb = RB_ROOT,
.mm_users = ATOMIC_INIT(2),
static char efivar_ssdt[EFIVAR_SSDT_NAME_MAX] __initdata;
static int __init efivar_ssdt_setup(char *str)
{
+ int ret = security_locked_down(LOCKDOWN_ACPI_TABLES);
+
+ if (ret)
+ return ret;
+
if (strlen(str) < sizeof(efivar_ssdt))
memcpy(efivar_ssdt, str, strlen(str));
else
{ACPI_TABLE_GUID, "ACPI", &efi.acpi},
{HCDP_TABLE_GUID, "HCDP", &efi.hcdp},
{MPS_TABLE_GUID, "MPS", &efi.mps},
- {SAL_SYSTEM_TABLE_GUID, "SALsystab", &efi.sal_systab},
{SMBIOS_TABLE_GUID, "SMBIOS", &efi.smbios},
{SMBIOS3_TABLE_GUID, "SMBIOS 3.0", &efi.smbios3},
{UGA_IO_PROTOCOL_GUID, "UGA", &efi.uga},
{EFI_MEMORY_ATTRIBUTES_TABLE_GUID, "MEMATTR", &efi.mem_attr_table},
{LINUX_EFI_RANDOM_SEED_TABLE_GUID, "RNG", &efi.rng_seed},
{LINUX_EFI_TPM_EVENT_LOG_GUID, "TPMEventLog", &efi.tpm_log},
+ {LINUX_EFI_TPM_FINAL_LOG_GUID, "TPMFinalLog", &efi.tpm_final_log},
{LINUX_EFI_MEMRESERVE_TABLE_GUID, "MEMRESERVE", &efi.mem_reserve},
+#ifdef CONFIG_EFI_RCI2_TABLE
+ {DELLEMC_EFI_RCI2_TABLE_GUID, NULL, &rci2_table_phys},
+#endif
{NULL_GUID, NULL, NULL},
};
return err;
}
-bool efi_is_table_address(unsigned long phys_addr)
-{
- unsigned int i;
-
- if (phys_addr == EFI_INVALID_TABLE_ADDR)
- return false;
-
- for (i = 0; i < ARRAY_SIZE(efi_tables); i++)
- if (*(efi_tables[i]) == phys_addr)
- return true;
-
- return false;
-}
-
static DEFINE_SPINLOCK(efi_mem_reserve_persistent_lock);
static struct linux_efi_memreserve *efi_memreserve_root __ro_after_init;
return -EINVAL;
switch (linkstat & PCI_EXP_LNKSTA_CLS) {
+ case PCI_EXP_LNKSTA_CLS_32_0GB:
+ speed = "32 GT/s";
+ break;
case PCI_EXP_LNKSTA_CLS_16_0GB:
speed = "16 GT/s";
break;
}
return count;
}
-static struct device_attribute dev_rescan_attr = __ATTR(rescan,
- (S_IWUSR|S_IWGRP),
- NULL, dev_rescan_store);
+static DEVICE_ATTR_WO(dev_rescan);
static ssize_t remove_store(struct device *dev, struct device_attribute *attr,
const char *buf, size_t count)
pci_stop_and_remove_bus_device_locked(to_pci_dev(dev));
return count;
}
-static struct device_attribute dev_remove_attr = __ATTR(remove,
- (S_IWUSR|S_IWGRP),
- NULL, remove_store);
+static DEVICE_ATTR_IGNORE_LOCKDEP(remove, 0220, NULL,
+ remove_store);
-static ssize_t dev_bus_rescan_store(struct device *dev,
- struct device_attribute *attr,
- const char *buf, size_t count)
+static ssize_t bus_rescan_store(struct device *dev,
+ struct device_attribute *attr,
+ const char *buf, size_t count)
{
unsigned long val;
struct pci_bus *bus = to_pci_bus(dev);
}
return count;
}
-static DEVICE_ATTR(rescan, (S_IWUSR|S_IWGRP), NULL, dev_bus_rescan_store);
+static DEVICE_ATTR_WO(bus_rescan);
#if defined(CONFIG_PM) && defined(CONFIG_ACPI)
static ssize_t d3cold_allowed_store(struct device *dev,
static DEVICE_ATTR_RO(devspec);
#endif
-#ifdef CONFIG_PCI_IOV
-static ssize_t sriov_totalvfs_show(struct device *dev,
- struct device_attribute *attr,
- char *buf)
-{
- struct pci_dev *pdev = to_pci_dev(dev);
-
- return sprintf(buf, "%u\n", pci_sriov_get_totalvfs(pdev));
-}
-
-
-static ssize_t sriov_numvfs_show(struct device *dev,
- struct device_attribute *attr,
- char *buf)
-{
- struct pci_dev *pdev = to_pci_dev(dev);
-
- return sprintf(buf, "%u\n", pdev->sriov->num_VFs);
-}
-
-/*
- * num_vfs > 0; number of VFs to enable
- * num_vfs = 0; disable all VFs
- *
- * Note: SRIOV spec doesn't allow partial VF
- * disable, so it's all or none.
- */
-static ssize_t sriov_numvfs_store(struct device *dev,
- struct device_attribute *attr,
- const char *buf, size_t count)
-{
- struct pci_dev *pdev = to_pci_dev(dev);
- int ret;
- u16 num_vfs;
-
- ret = kstrtou16(buf, 0, &num_vfs);
- if (ret < 0)
- return ret;
-
- if (num_vfs > pci_sriov_get_totalvfs(pdev))
- return -ERANGE;
-
- device_lock(&pdev->dev);
-
- if (num_vfs == pdev->sriov->num_VFs)
- goto exit;
-
- /* is PF driver loaded w/callback */
- if (!pdev->driver || !pdev->driver->sriov_configure) {
- pci_info(pdev, "Driver doesn't support SRIOV configuration via sysfs\n");
- ret = -ENOENT;
- goto exit;
- }
-
- if (num_vfs == 0) {
- /* disable VFs */
- ret = pdev->driver->sriov_configure(pdev, 0);
- goto exit;
- }
-
- /* enable VFs */
- if (pdev->sriov->num_VFs) {
- pci_warn(pdev, "%d VFs already enabled. Disable before enabling %d VFs\n",
- pdev->sriov->num_VFs, num_vfs);
- ret = -EBUSY;
- goto exit;
- }
-
- ret = pdev->driver->sriov_configure(pdev, num_vfs);
- if (ret < 0)
- goto exit;
-
- if (ret != num_vfs)
- pci_warn(pdev, "%d VFs requested; only %d enabled\n",
- num_vfs, ret);
-
-exit:
- device_unlock(&pdev->dev);
-
- if (ret < 0)
- return ret;
-
- return count;
-}
-
-static ssize_t sriov_offset_show(struct device *dev,
- struct device_attribute *attr,
- char *buf)
-{
- struct pci_dev *pdev = to_pci_dev(dev);
-
- return sprintf(buf, "%u\n", pdev->sriov->offset);
-}
-
-static ssize_t sriov_stride_show(struct device *dev,
- struct device_attribute *attr,
- char *buf)
-{
- struct pci_dev *pdev = to_pci_dev(dev);
-
- return sprintf(buf, "%u\n", pdev->sriov->stride);
-}
-
-static ssize_t sriov_vf_device_show(struct device *dev,
- struct device_attribute *attr,
- char *buf)
-{
- struct pci_dev *pdev = to_pci_dev(dev);
-
- return sprintf(buf, "%x\n", pdev->sriov->vf_device);
-}
-
-static ssize_t sriov_drivers_autoprobe_show(struct device *dev,
- struct device_attribute *attr,
- char *buf)
-{
- struct pci_dev *pdev = to_pci_dev(dev);
-
- return sprintf(buf, "%u\n", pdev->sriov->drivers_autoprobe);
-}
-
-static ssize_t sriov_drivers_autoprobe_store(struct device *dev,
- struct device_attribute *attr,
- const char *buf, size_t count)
-{
- struct pci_dev *pdev = to_pci_dev(dev);
- bool drivers_autoprobe;
-
- if (kstrtobool(buf, &drivers_autoprobe) < 0)
- return -EINVAL;
-
- pdev->sriov->drivers_autoprobe = drivers_autoprobe;
-
- return count;
-}
-
-static struct device_attribute sriov_totalvfs_attr = __ATTR_RO(sriov_totalvfs);
-static struct device_attribute sriov_numvfs_attr =
- __ATTR(sriov_numvfs, (S_IRUGO|S_IWUSR|S_IWGRP),
- sriov_numvfs_show, sriov_numvfs_store);
-static struct device_attribute sriov_offset_attr = __ATTR_RO(sriov_offset);
-static struct device_attribute sriov_stride_attr = __ATTR_RO(sriov_stride);
-static struct device_attribute sriov_vf_device_attr = __ATTR_RO(sriov_vf_device);
-static struct device_attribute sriov_drivers_autoprobe_attr =
- __ATTR(sriov_drivers_autoprobe, (S_IRUGO|S_IWUSR|S_IWGRP),
- sriov_drivers_autoprobe_show, sriov_drivers_autoprobe_store);
-#endif /* CONFIG_PCI_IOV */
-
static ssize_t driver_override_store(struct device *dev,
struct device_attribute *attr,
const char *buf, size_t count)
};
static struct attribute *pcibus_attrs[] = {
- &dev_attr_rescan.attr,
+ &dev_attr_bus_rescan.attr,
&dev_attr_cpuaffinity.attr,
&dev_attr_cpulistaffinity.attr,
NULL,
!!(pdev->resource[PCI_ROM_RESOURCE].flags &
IORESOURCE_ROM_SHADOW));
}
-static struct device_attribute vga_attr = __ATTR_RO(boot_vga);
+static DEVICE_ATTR_RO(boot_vga);
static ssize_t pci_read_config(struct file *filp, struct kobject *kobj,
struct bin_attribute *bin_attr, char *buf,
unsigned int size = count;
loff_t init_off = off;
u8 *data = (u8 *) buf;
+ int ret;
+
+ ret = security_locked_down(LOCKDOWN_PCI_ACCESS);
+ if (ret)
+ return ret;
if (off > dev->cfg_size)
return 0;
sysfs_bin_attr_init(b->legacy_io);
b->legacy_io->attr.name = "legacy_io";
b->legacy_io->size = 0xffff;
- b->legacy_io->attr.mode = S_IRUSR | S_IWUSR;
+ b->legacy_io->attr.mode = 0600;
b->legacy_io->read = pci_read_legacy_io;
b->legacy_io->write = pci_write_legacy_io;
b->legacy_io->mmap = pci_mmap_legacy_io;
sysfs_bin_attr_init(b->legacy_mem);
b->legacy_mem->attr.name = "legacy_mem";
b->legacy_mem->size = 1024*1024;
- b->legacy_mem->attr.mode = S_IRUSR | S_IWUSR;
+ b->legacy_mem->attr.mode = 0600;
b->legacy_mem->mmap = pci_mmap_legacy_mem;
pci_adjust_legacy_attr(b, pci_mmap_mem);
error = device_create_bin_file(&b->dev, b->legacy_mem);
int bar = (unsigned long)attr->private;
enum pci_mmap_state mmap_type;
struct resource *res = &pdev->resource[bar];
+ int ret;
+
+ ret = security_locked_down(LOCKDOWN_PCI_ACCESS);
+ if (ret)
+ return ret;
if (res->flags & IORESOURCE_MEM && iomem_is_exclusive(res->start))
return -EINVAL;
struct bin_attribute *attr, char *buf,
loff_t off, size_t count)
{
+ int ret;
+
+ ret = security_locked_down(LOCKDOWN_PCI_ACCESS);
+ if (ret)
+ return ret;
+
return pci_resource_io(filp, kobj, attr, buf, off, count, true);
}
}
}
res_attr->attr.name = res_attr_name;
- res_attr->attr.mode = S_IRUSR | S_IWUSR;
+ res_attr->attr.mode = 0600;
res_attr->size = pci_resource_len(pdev, num);
res_attr->private = (void *)(unsigned long)num;
retval = sysfs_create_bin_file(&pdev->dev.kobj, res_attr);
static const struct bin_attribute pci_config_attr = {
.attr = {
.name = "config",
- .mode = S_IRUGO | S_IWUSR,
+ .mode = 0644,
},
.size = PCI_CFG_SPACE_SIZE,
.read = pci_read_config,
static const struct bin_attribute pcie_config_attr = {
.attr = {
.name = "config",
- .mode = S_IRUGO | S_IWUSR,
+ .mode = 0644,
},
.size = PCI_CFG_SPACE_EXP_SIZE,
.read = pci_read_config,
return count;
}
-static struct device_attribute reset_attr = __ATTR(reset, 0200, NULL, reset_store);
+static DEVICE_ATTR(reset, 0200, NULL, reset_store);
static int pci_create_capabilities_sysfs(struct pci_dev *dev)
{
pcie_aspm_create_sysfs_dev_files(dev);
if (dev->reset_fn) {
- retval = device_create_file(&dev->dev, &reset_attr);
+ retval = device_create_file(&dev->dev, &dev_attr_reset);
if (retval)
goto error;
}
sysfs_bin_attr_init(attr);
attr->size = rom_size;
attr->attr.name = "rom";
- attr->attr.mode = S_IRUSR | S_IWUSR;
+ attr->attr.mode = 0600;
attr->read = pci_read_rom;
attr->write = pci_write_rom;
retval = sysfs_create_bin_file(&pdev->dev.kobj, attr);
pcie_vpd_remove_sysfs_dev_files(dev);
pcie_aspm_remove_sysfs_dev_files(dev);
if (dev->reset_fn) {
- device_remove_file(&dev->dev, &reset_attr);
+ device_remove_file(&dev->dev, &dev_attr_reset);
dev->reset_fn = 0;
}
}
late_initcall(pci_sysfs_init);
static struct attribute *pci_dev_dev_attrs[] = {
- &vga_attr.attr,
+ &dev_attr_boot_vga.attr,
NULL,
};
struct device *dev = kobj_to_dev(kobj);
struct pci_dev *pdev = to_pci_dev(dev);
- if (a == &vga_attr.attr)
+ if (a == &dev_attr_boot_vga.attr)
if ((pdev->class >> 8) != PCI_CLASS_DISPLAY_VGA)
return 0;
}
static struct attribute *pci_dev_hp_attrs[] = {
- &dev_remove_attr.attr,
- &dev_rescan_attr.attr,
+ &dev_attr_remove.attr,
+ &dev_attr_dev_rescan.attr,
NULL,
};
.is_visible = pci_dev_hp_attrs_are_visible,
};
-#ifdef CONFIG_PCI_IOV
-static struct attribute *sriov_dev_attrs[] = {
- &sriov_totalvfs_attr.attr,
- &sriov_numvfs_attr.attr,
- &sriov_offset_attr.attr,
- &sriov_stride_attr.attr,
- &sriov_vf_device_attr.attr,
- &sriov_drivers_autoprobe_attr.attr,
- NULL,
-};
-
-static umode_t sriov_attrs_are_visible(struct kobject *kobj,
- struct attribute *a, int n)
-{
- struct device *dev = kobj_to_dev(kobj);
-
- if (!dev_is_pf(dev))
- return 0;
-
- return a->mode;
-}
-
-static const struct attribute_group sriov_dev_attr_group = {
- .attrs = sriov_dev_attrs,
- .is_visible = sriov_attrs_are_visible,
-};
-#endif /* CONFIG_PCI_IOV */
-
static const struct attribute_group pci_dev_attr_group = {
.attrs = pci_dev_dev_attrs,
.is_visible = pci_dev_attrs_are_visible,
#include <linux/seq_file.h>
#include <linux/capability.h>
#include <linux/uaccess.h>
+ #include <linux/security.h>
#include <asm/byteorder.h>
#include "pci.h"
struct pci_dev *dev = PDE_DATA(ino);
int pos = *ppos;
int size = dev->cfg_size;
- int cnt;
+ int cnt, ret;
+
+ ret = security_locked_down(LOCKDOWN_PCI_ACCESS);
+ if (ret)
+ return ret;
if (pos >= size)
return 0;
#endif /* HAVE_PCI_MMAP */
int ret = 0;
+ ret = security_locked_down(LOCKDOWN_PCI_ACCESS);
+ if (ret)
+ return ret;
+
switch (cmd) {
case PCIIOC_CONTROLLER:
ret = pci_domain_nr(dev->bus);
struct pci_filp_private *fpriv = file->private_data;
int i, ret, write_combine = 0, res_bit = IORESOURCE_MEM;
- if (!capable(CAP_SYS_RAWIO))
+ if (!capable(CAP_SYS_RAWIO) ||
+ security_locked_down(LOCKDOWN_PCI_ACCESS))
return -EPERM;
if (fpriv->mmap_state == pci_mmap_io) {
}
seq_putc(m, '\t');
if (drv)
- seq_printf(m, "%s", drv->name);
+ seq_puts(m, drv->name);
seq_putc(m, '\n');
return 0;
}
#include <linux/serial_core.h>
#include <linux/delay.h>
#include <linux/mutex.h>
+ #include <linux/security.h>
#include <linux/irq.h>
#include <linux/uaccess.h>
goto check_and_exit;
}
+ retval = security_locked_down(LOCKDOWN_TIOCSSERIAL);
+ if (retval && (change_irq || change_port))
+ goto exit;
+
/*
* Ask the low level driver to verify the settings.
*/
{
struct uart_state *state = container_of(port, struct uart_state, port);
struct uart_port *uport;
+ int ret;
uport = uart_port_check(state);
if (!uport || uport->flags & UPF_DEAD)
/*
* Start up the serial port.
*/
- return uart_startup(tty, state, 0);
+ ret = uart_startup(tty, state, 0);
+ if (ret > 0)
+ tty_port_set_active(port, 1);
+
+ return ret;
}
static const char *uart_type(struct uart_port *port)
#include <linux/atomic.h>
#include <linux/device.h>
#include <linux/poll.h>
+ #include <linux/security.h>
#include "internal.h"
}
EXPORT_SYMBOL_GPL(debugfs_file_put);
+ /*
+ * Only permit access to world-readable files when the kernel is locked down.
+ * We also need to exclude any file that has ways to write to or alter it, as
+ * root can bypass the permissions check.
+ */
+ static bool debugfs_is_locked_down(struct inode *inode,
+ struct file *filp,
+ const struct file_operations *real_fops)
+ {
+ if ((inode->i_mode & 07777) == 0444 &&
+ !(filp->f_mode & FMODE_WRITE) &&
+ !real_fops->unlocked_ioctl &&
+ !real_fops->compat_ioctl &&
+ !real_fops->mmap)
+ return false;
+
+ return security_locked_down(LOCKDOWN_DEBUGFS);
+ }
+
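The net effect: a 0444 file whose fops provide only read() remains readable under lockdown, while anything writable or exposing ioctl()/mmap() is refused. Illustrative, with hypothetical names:

    /* Still readable when locked down: 0444, no ioctl/mmap in stats_ro_fops. */
    debugfs_create_file("stats", 0444, parent, data, &stats_ro_fops);

    /* Refused when locked down: writable mode. */
    debugfs_create_file("control", 0644, parent, data, &ctl_fops);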
static int open_proxy_open(struct inode *inode, struct file *filp)
{
struct dentry *dentry = F_DENTRY(filp);
return r == -EIO ? -ENOENT : r;
real_fops = debugfs_real_fops(filp);
+
+ r = debugfs_is_locked_down(inode, filp, real_fops);
+ if (r)
+ goto out;
+
real_fops = fops_get(real_fops);
if (!real_fops) {
/* Huh? Module did not clean up after itself at exit? */
return r == -EIO ? -ENOENT : r;
real_fops = debugfs_real_fops(filp);
+
+ r = debugfs_is_locked_down(inode, filp, real_fops);
+ if (r)
+ goto out;
+
real_fops = fops_get(real_fops);
if (!real_fops) {
/* Huh? Module did not cleanup after itself at exit? */
* @array as data. If the @mode variable is so set it can be read from.
* Writing is not supported. Seek within the file is also not supported.
* Once array is created its size can not be changed.
- *
- * The function returns a pointer to dentry on success. If an error occurs,
- * %ERR_PTR(-ERROR) or NULL will be returned. If debugfs is not enabled in
- * the kernel, the value %ERR_PTR(-ENODEV) will be returned.
*/
-struct dentry *debugfs_create_u32_array(const char *name, umode_t mode,
- struct dentry *parent,
- u32 *array, u32 elements)
+void debugfs_create_u32_array(const char *name, umode_t mode,
+ struct dentry *parent, u32 *array, u32 elements)
{
struct array_data *data = kmalloc(sizeof(*data), GFP_KERNEL);
if (data == NULL)
- return NULL;
+ return;
data->array = array;
data->elements = elements;
- return debugfs_create_file_unsafe(name, mode, parent, data,
- &u32_array_fops);
+ debugfs_create_file_unsafe(name, mode, parent, data, &u32_array_fops);
}
EXPORT_SYMBOL_GPL(debugfs_create_u32_array);
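Callers can consequently drop their return-value checks; usage becomes fire-and-forget (sketch, hypothetical names):

    static u32 samples[8];

    /* No dentry is returned; creation failures are deliberately ignored. */
    debugfs_create_u32_array("samples", 0444, dir, samples, ARRAY_SIZE(samples));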
/*
* inode.c - part of debugfs, a tiny little debug file system
*
- * Copyright (C) 2004 Greg Kroah-Hartman <greg@kroah.com>
+ * Copyright (C) 2004,2019 Greg Kroah-Hartman <greg@kroah.com>
* Copyright (C) 2004 IBM Inc.
+ * Copyright (C) 2019 Linux Foundation <gregkh@linuxfoundation.org>
*
* debugfs is for people to use instead of /proc or /sys.
* See ./Documentation/core-api/kernel-api.rst for more details.
*/
+#define pr_fmt(fmt) "debugfs: " fmt
+
#include <linux/module.h>
#include <linux/fs.h>
#include <linux/mount.h>
#include <linux/parser.h>
#include <linux/magic.h>
#include <linux/slab.h>
+ #include <linux/security.h>
#include "internal.h"
static int debugfs_mount_count;
static bool debugfs_registered;
+ /*
+ * Don't allow access attributes to be changed whilst the kernel is locked down
+ * so that we can use the file mode as part of a heuristic to determine whether
+ * to lock down individual files.
+ */
+ static int debugfs_setattr(struct dentry *dentry, struct iattr *ia)
+ {
+ int ret = security_locked_down(LOCKDOWN_DEBUGFS);
+
+ if (ret && (ia->ia_valid & (ATTR_MODE | ATTR_UID | ATTR_GID)))
+ return ret;
+ return simple_setattr(dentry, ia);
+ }
+
+ static const struct inode_operations debugfs_file_inode_operations = {
+ .setattr = debugfs_setattr,
+ };
+ static const struct inode_operations debugfs_dir_inode_operations = {
+ .lookup = simple_lookup,
+ .setattr = debugfs_setattr,
+ };
+ static const struct inode_operations debugfs_symlink_inode_operations = {
+ .get_link = simple_get_link,
+ .setattr = debugfs_setattr,
+ };
+
static struct inode *debugfs_get_inode(struct super_block *sb)
{
struct inode *inode = new_inode(sb);
struct dentry *dentry;
int error;
- pr_debug("debugfs: creating file '%s'\n",name);
+ pr_debug("creating file '%s'\n", name);
if (IS_ERR(parent))
return parent;
error = simple_pin_fs(&debug_fs_type, &debugfs_mount,
&debugfs_mount_count);
- if (error)
+ if (error) {
+ pr_err("Unable to pin filesystem for file '%s'\n", name);
return ERR_PTR(error);
+ }
/* If the parent is not specified, we create it in the root.
* We need the root dentry to do this, which is in the super
inode_lock(d_inode(parent));
dentry = lookup_one_len(name, parent, strlen(name));
if (!IS_ERR(dentry) && d_really_is_positive(dentry)) {
+ if (d_is_dir(dentry))
+ pr_err("Directory '%s' with parent '%s' already present!\n",
+ name, parent->d_name.name);
+ else
+ pr_err("File '%s' in directory '%s' already present!\n",
+ name, parent->d_name.name);
dput(dentry);
dentry = ERR_PTR(-EEXIST);
}
return dentry;
inode = debugfs_get_inode(dentry->d_sb);
- if (unlikely(!inode))
+ if (unlikely(!inode)) {
+ pr_err("out of free dentries, can not create file '%s'\n",
+ name);
return failed_creating(dentry);
+ }
inode->i_mode = mode;
inode->i_private = data;
+ inode->i_op = &debugfs_file_inode_operations;
inode->i_fop = proxy_fops;
dentry->d_fsdata = (void *)((unsigned long)real_fops |
DEBUGFS_FSDATA_IS_REAL_FOPS_BIT);
return dentry;
inode = debugfs_get_inode(dentry->d_sb);
- if (unlikely(!inode))
+ if (unlikely(!inode)) {
+ pr_err("out of free dentries, can not create directory '%s'\n",
+ name);
return failed_creating(dentry);
+ }
inode->i_mode = S_IFDIR | S_IRWXU | S_IRUGO | S_IXUGO;
- inode->i_op = &simple_dir_inode_operations;
+ inode->i_op = &debugfs_dir_inode_operations;
inode->i_fop = &simple_dir_operations;
/* directory inodes start off with i_nlink == 2 (for "." entry) */
return dentry;
inode = debugfs_get_inode(dentry->d_sb);
- if (unlikely(!inode))
+ if (unlikely(!inode)) {
+ pr_err("out of free dentries, can not create automount '%s'\n",
+ name);
return failed_creating(dentry);
+ }
make_empty_dir_inode(inode);
inode->i_flags |= S_AUTOMOUNT;
inode = debugfs_get_inode(dentry->d_sb);
if (unlikely(!inode)) {
+ pr_err("out of free dentries, can not create symlink '%s'\n",
+ name);
kfree(link);
return failed_creating(dentry);
}
inode->i_mode = S_IFLNK | S_IRWXUGO;
- inode->i_op = &simple_symlink_inode_operations;
+ inode->i_op = &debugfs_symlink_inode_operations;
inode->i_link = link;
d_instantiate(dentry, inode);
return end_creating(dentry);
}
EXPORT_SYMBOL_GPL(debugfs_create_symlink);
-static void __debugfs_remove_file(struct dentry *dentry, struct dentry *parent)
+static void __debugfs_file_removed(struct dentry *dentry)
{
struct debugfs_fsdata *fsd;
- simple_unlink(d_inode(parent), dentry);
- d_delete(dentry);
-
/*
* Paired with the closing smp_mb() implied by a successful
* cmpxchg() in debugfs_file_get(): either
if (simple_positive(dentry)) {
dget(dentry);
- if (!d_is_reg(dentry)) {
- if (d_is_dir(dentry))
- ret = simple_rmdir(d_inode(parent), dentry);
- else
- simple_unlink(d_inode(parent), dentry);
+ if (d_is_dir(dentry)) {
+ ret = simple_rmdir(d_inode(parent), dentry);
if (!ret)
- d_delete(dentry);
+ fsnotify_rmdir(d_inode(parent), dentry);
} else {
- __debugfs_remove_file(dentry, parent);
+ simple_unlink(d_inode(parent), dentry);
+ fsnotify_unlink(d_inode(parent), dentry);
}
+ if (!ret)
+ d_delete(dentry);
+ if (d_is_reg(dentry))
+ __debugfs_file_removed(dentry);
dput(dentry);
}
return ret;
#include <linux/parser.h>
#include <linux/magic.h>
#include <linux/slab.h>
+ #include <linux/security.h>
#define TRACEFS_DEFAULT_MODE 0700
static int tracefs_mount_count;
static bool tracefs_registered;
+ static int default_open_file(struct inode *inode, struct file *filp)
+ {
+ struct dentry *dentry = filp->f_path.dentry;
+ struct file_operations *real_fops;
+ int ret;
+
+ if (!dentry)
+ return -EINVAL;
+
+ ret = security_locked_down(LOCKDOWN_TRACEFS);
+ if (ret)
+ return ret;
+
+ real_fops = dentry->d_fsdata;
+ if (!real_fops->open)
+ return 0;
+ return real_fops->open(inode, filp);
+ }
+
static ssize_t default_read_file(struct file *file, char __user *buf,
size_t count, loff_t *ppos)
{
return 0;
}
+ static void tracefs_destroy_inode(struct inode *inode)
+ {
+ if (S_ISREG(inode->i_mode))
+ kfree(inode->i_fop);
+ }
+
static int tracefs_remount(struct super_block *sb, int *flags, char *data)
{
int err;
static const struct super_operations tracefs_super_operations = {
.statfs = simple_statfs,
.remount_fs = tracefs_remount,
+ .destroy_inode = tracefs_destroy_inode,
.show_options = tracefs_show_options,
};
struct dentry *parent, void *data,
const struct file_operations *fops)
{
+ struct file_operations *proxy_fops;
struct dentry *dentry;
struct inode *inode;
if (unlikely(!inode))
return failed_creating(dentry);
+ proxy_fops = kzalloc(sizeof(struct file_operations), GFP_KERNEL);
+ if (unlikely(!proxy_fops)) {
+ iput(inode);
+ return failed_creating(dentry);
+ }
+
+ if (!fops)
+ fops = &tracefs_file_operations;
+
+ dentry->d_fsdata = (void *)fops;
+ memcpy(proxy_fops, fops, sizeof(*proxy_fops));
+ proxy_fops->open = default_open_file;
inode->i_mode = mode;
- inode->i_fop = fops ? fops : &tracefs_file_operations;
+ inode->i_fop = proxy_fops;
inode->i_private = data;
d_instantiate(dentry, inode);
fsnotify_create(dentry->d_parent->d_inode, dentry);
switch (dentry->d_inode->i_mode & S_IFMT) {
case S_IFDIR:
ret = simple_rmdir(parent->d_inode, dentry);
+ if (!ret)
+ fsnotify_rmdir(parent->d_inode, dentry);
break;
default:
simple_unlink(parent->d_inode, dentry);
+ fsnotify_unlink(parent->d_inode, dentry);
break;
}
if (!ret)
#endif
#ifdef CONFIG_FTRACE_MCOUNT_RECORD
+#ifdef CC_USING_PATCHABLE_FUNCTION_ENTRY
+#define MCOUNT_REC() . = ALIGN(8); \
+ __start_mcount_loc = .; \
+ KEEP(*(__patchable_function_entries)) \
+ __stop_mcount_loc = .;
+#else
#define MCOUNT_REC() . = ALIGN(8); \
__start_mcount_loc = .; \
KEEP(*(__mcount_loc)) \
__stop_mcount_loc = .;
+#endif
#else
#define MCOUNT_REC()
#endif
__start_lsm_info = .; \
KEEP(*(.lsm_info.init)) \
__end_lsm_info = .;
+ #define EARLY_LSM_TABLE() . = ALIGN(8); \
+ __start_early_lsm_info = .; \
+ KEEP(*(.early_lsm_info.init)) \
+ __end_early_lsm_info = .;
#else
#define LSM_TABLE()
+ #define EARLY_LSM_TABLE()
#endif
#define ___OF_TABLE(cfg, name) _OF_TABLE_##cfg(name)
#define ACPI_PROBE_TABLE(name)
#endif
+#ifdef CONFIG_THERMAL
+#define THERMAL_TABLE(name) \
+ . = ALIGN(8); \
+ __##name##_thermal_table = .; \
+ KEEP(*(__##name##_thermal_table)) \
+ __##name##_thermal_table_end = .;
+#else
+#define THERMAL_TABLE(name)
+#endif
+
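The section is populated by entry macros on the consumer side; a sketch of the pairing such a table implies (names illustrative, chosen to match the thermal governor series that consumes this):

    #define THERMAL_TABLE_ENTRY(table, name)                      \
        static typeof(name) *__thermal_table_entry_##name         \
        __used __section(__##table##_thermal_table) = &name

    #define THERMAL_GOVERNOR_DECLARE(name)                        \
        THERMAL_TABLE_ENTRY(governor, name)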
#define KERNEL_DTB() \
STRUCT_ALIGN(); \
__dtb_start = .; \
IRQCHIP_OF_MATCH_TABLE() \
ACPI_PROBE_TABLE(irqchip) \
ACPI_PROBE_TABLE(timer) \
+ THERMAL_TABLE(governor) \
EARLYCON_TABLE() \
- LSM_TABLE()
+ LSM_TABLE() \
+ EARLY_LSM_TABLE()
#define INIT_TEXT \
*(.init.text .init.text.*) \
#include <linux/errno.h>
#include <linux/ioport.h> /* for struct resource */
+#include <linux/irqdomain.h>
#include <linux/resource_ext.h>
#include <linux/device.h>
#include <linux/property.h>
void acpi_set_irq_model(enum acpi_irq_model_id model,
struct fwnode_handle *fwnode);
+struct irq_domain *acpi_irq_create_hierarchy(unsigned int flags,
+ unsigned int size,
+ struct fwnode_handle *fwnode,
+ const struct irq_domain_ops *ops,
+ void *host_data);
+
#ifdef CONFIG_X86_IO_APIC
extern int acpi_get_override_irq(u32 gsi, int *trigger, int *polarity);
#else
-#define acpi_get_override_irq(gsi, trigger, polarity) (-1)
+static inline int acpi_get_override_irq(u32 gsi, int *trigger, int *polarity)
+{
+ return -1;
+}
#endif
/*
* This function undoes the effect of one call to acpi_register_gsi().
extern acpi_status wmi_remove_notify_handler(const char *guid);
extern acpi_status wmi_get_event_data(u32 event, struct acpi_buffer *out);
extern bool wmi_has_guid(const char *guid);
+extern char *wmi_get_acpi_device_uid(const char *guid);
#endif /* CONFIG_ACPI_WMI */
int acpi_arch_timer_mem_init(struct arch_timer_mem *timer_mem, int *timer_count);
#endif
+ #ifndef ACPI_HAVE_ARCH_SET_ROOT_POINTER
+ static inline void acpi_arch_set_root_pointer(u64 addr)
+ {
+ }
+ #endif
+
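An architecture opts in by defining ACPI_HAVE_ARCH_SET_ROOT_POINTER and supplying its own inline; on x86 this presumably routes through the new x86_init hook (sketch):

    #define ACPI_HAVE_ARCH_SET_ROOT_POINTER
    static inline void acpi_arch_set_root_pointer(u64 addr)
    {
        x86_init.acpi.set_root_pointer(addr);
    }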
#ifndef ACPI_HAVE_ARCH_GET_ROOT_POINTER
static inline u64 acpi_arch_get_root_pointer(void)
{
#endif
#if defined(CONFIG_ACPI) && defined(CONFIG_PM_SLEEP)
-int acpi_dev_suspend_late(struct device *dev);
int acpi_subsys_prepare(struct device *dev);
void acpi_subsys_complete(struct device *dev);
int acpi_subsys_suspend_late(struct device *dev);
int acpi_subsys_suspend_noirq(struct device *dev);
-int acpi_subsys_resume_noirq(struct device *dev);
-int acpi_subsys_resume_early(struct device *dev);
int acpi_subsys_suspend(struct device *dev);
int acpi_subsys_freeze(struct device *dev);
-int acpi_subsys_freeze_late(struct device *dev);
-int acpi_subsys_freeze_noirq(struct device *dev);
-int acpi_subsys_thaw_noirq(struct device *dev);
+int acpi_subsys_poweroff(struct device *dev);
+void acpi_ec_mark_gpe_for_wake(void);
+void acpi_ec_set_gpe_wake_mask(u8 action);
#else
-static inline int acpi_dev_resume_early(struct device *dev) { return 0; }
static inline int acpi_subsys_prepare(struct device *dev) { return 0; }
static inline void acpi_subsys_complete(struct device *dev) {}
static inline int acpi_subsys_suspend_late(struct device *dev) { return 0; }
static inline int acpi_subsys_suspend_noirq(struct device *dev) { return 0; }
-static inline int acpi_subsys_resume_noirq(struct device *dev) { return 0; }
-static inline int acpi_subsys_resume_early(struct device *dev) { return 0; }
static inline int acpi_subsys_suspend(struct device *dev) { return 0; }
static inline int acpi_subsys_freeze(struct device *dev) { return 0; }
-static inline int acpi_subsys_freeze_late(struct device *dev) { return 0; }
-static inline int acpi_subsys_freeze_noirq(struct device *dev) { return 0; }
-static inline int acpi_subsys_thaw_noirq(struct device *dev) { return 0; }
+static inline int acpi_subsys_poweroff(struct device *dev) { return 0; }
+static inline void acpi_ec_mark_gpe_for_wake(void) {}
+static inline void acpi_ec_set_gpe_wake_mask(u8 action) {}
#endif
#ifdef CONFIG_ACPI
#endif
#endif
-struct acpi_gpio_params {
- unsigned int crs_entry_index;
- unsigned int line_index;
- bool active_low;
-};
-
-struct acpi_gpio_mapping {
- const char *name;
- const struct acpi_gpio_params *data;
- unsigned int size;
-
-/* Ignore IoRestriction field */
-#define ACPI_GPIO_QUIRK_NO_IO_RESTRICTION BIT(0)
-/*
- * When ACPI GPIO mapping table is in use the index parameter inside it
- * refers to the GPIO resource in _CRS method. That index has no
- * distinction of actual type of the resource. When consumer wants to
- * get GpioIo type explicitly, this quirk may be used.
- */
-#define ACPI_GPIO_QUIRK_ONLY_GPIOIO BIT(1)
-
- unsigned int quirks;
-};
-
#if defined(CONFIG_ACPI) && defined(CONFIG_GPIOLIB)
-int acpi_dev_add_driver_gpios(struct acpi_device *adev,
- const struct acpi_gpio_mapping *gpios);
-
-static inline void acpi_dev_remove_driver_gpios(struct acpi_device *adev)
-{
- if (adev)
- adev->driver_gpios = NULL;
-}
-
-int devm_acpi_dev_add_driver_gpios(struct device *dev,
- const struct acpi_gpio_mapping *gpios);
-void devm_acpi_dev_remove_driver_gpios(struct device *dev);
-
bool acpi_gpio_get_irq_resource(struct acpi_resource *ares,
struct acpi_resource_gpio **agpio);
int acpi_dev_gpio_irq_get(struct acpi_device *adev, int index);
#else
-static inline int acpi_dev_add_driver_gpios(struct acpi_device *adev,
- const struct acpi_gpio_mapping *gpios)
-{
- return -ENXIO;
-}
-static inline void acpi_dev_remove_driver_gpios(struct acpi_device *adev) {}
-
-static inline int devm_acpi_dev_add_driver_gpios(struct device *dev,
- const struct acpi_gpio_mapping *gpios)
-{
- return -ENXIO;
-}
-static inline void devm_acpi_dev_remove_driver_gpios(struct device *dev) {}
-
static inline bool acpi_gpio_get_irq_resource(struct acpi_resource *ares,
struct acpi_resource_gpio **agpio)
{
#endif
#ifdef CONFIG_ACPI_PPTT
+int acpi_pptt_cpu_is_thread(unsigned int cpu);
int find_acpi_cpu_topology(unsigned int cpu, int level);
int find_acpi_cpu_topology_package(unsigned int cpu);
+int find_acpi_cpu_topology_hetero_id(unsigned int cpu);
int find_acpi_cpu_cache_topology(unsigned int cpu, int level);
#else
+static inline int acpi_pptt_cpu_is_thread(unsigned int cpu)
+{
+ return -EINVAL;
+}
static inline int find_acpi_cpu_topology(unsigned int cpu, int level)
{
return -EINVAL;
{
return -EINVAL;
}
+static inline int find_acpi_cpu_topology_hetero_id(unsigned int cpu)
+{
+ return -EINVAL;
+}
static inline int find_acpi_cpu_cache_topology(unsigned int cpu, int level)
{
return -EINVAL;
extern int ima_post_read_file(struct file *file, void *buf, loff_t size,
enum kernel_read_file_id id);
extern void ima_post_path_mknod(struct dentry *dentry);
+extern void ima_kexec_cmdline(const void *buf, int size);
#ifdef CONFIG_IMA_KEXEC
extern void ima_add_kexec_buffer(struct kimage *image);
return;
}
+static inline void ima_kexec_cmdline(const void *buf, int size) {}
#endif /* CONFIG_IMA */
#ifndef CONFIG_IMA_KEXEC
return 0;
}
#endif /* CONFIG_IMA_APPRAISE */
+
+ #if defined(CONFIG_IMA_APPRAISE) && defined(CONFIG_INTEGRITY_TRUSTED_KEYRING)
+ extern bool ima_appraise_signature(enum kernel_read_file_id func);
+ #else
+ static inline bool ima_appraise_signature(enum kernel_read_file_id func)
+ {
+ return false;
+ }
+ #endif /* CONFIG_IMA_APPRAISE && CONFIG_INTEGRITY_TRUSTED_KEYRING */
#endif /* _LINUX_IMA_H */
unsigned long cmdline_len);
typedef int (kexec_cleanup_t)(void *loader_data);
- #ifdef CONFIG_KEXEC_VERIFY_SIG
+ #ifdef CONFIG_KEXEC_SIG
typedef int (kexec_verify_sig_t)(const char *kernel_buf,
unsigned long kernel_len);
#endif
kexec_probe_t *probe;
kexec_load_t *load;
kexec_cleanup_t *cleanup;
- #ifdef CONFIG_KEXEC_VERIFY_SIG
+ #ifdef CONFIG_KEXEC_SIG
kexec_verify_sig_t *verify_sig;
#endif
};
bool get_value);
void *kexec_purgatory_get_symbol_addr(struct kimage *image, const char *name);
+int __weak arch_kexec_kernel_image_probe(struct kimage *image, void *buf,
+ unsigned long buf_len);
void * __weak arch_kexec_kernel_image_load(struct kimage *image);
int __weak arch_kexec_apply_relocations_add(struct purgatory_info *pi,
Elf_Shdr *section,
void **addr, unsigned long *sz);
#endif /* CONFIG_KEXEC_FILE */
+#ifdef CONFIG_KEXEC_ELF
+struct kexec_elf_info {
+ /*
+ * Where the ELF binary contents are kept.
+ * Memory managed by the user of the struct.
+ */
+ const char *buffer;
+
+ const struct elfhdr *ehdr;
+ const struct elf_phdr *proghdrs;
+};
+
+int kexec_build_elf_info(const char *buf, size_t len, struct elfhdr *ehdr,
+ struct kexec_elf_info *elf_info);
+
+int kexec_elf_load(struct kimage *image, struct elfhdr *ehdr,
+ struct kexec_elf_info *elf_info,
+ struct kexec_buf *kbuf,
+ unsigned long *lowest_load_addr);
+
+void kexec_free_elf_info(struct kexec_elf_info *elf_info);
+int kexec_elf_probe(const char *buf, unsigned long len);
+#endif
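A typical consumer, an arch kexec_file ELF loader, would follow this shape (sketch only; error paths trimmed and the kexec_buf setup assumed):

    struct kexec_elf_info elf_info;
    struct kexec_buf kbuf = { .image = image };
    unsigned long lowest_addr;
    struct elfhdr ehdr;
    int ret;

    ret = kexec_build_elf_info(buf, len, &ehdr, &elf_info);
    if (ret)
        return ERR_PTR(ret);

    ret = kexec_elf_load(image, &ehdr, &elf_info, &kbuf, &lowest_addr);
    kexec_free_elf_info(&elf_info);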
struct kimage {
kimage_entry_t head;
kimage_entry_t *entry;
* state. This is called immediately after commit_creds().
*
* Security hooks for mount using fs_context.
- * [See also Documentation/filesystems/mounting.txt]
+ * [See also Documentation/filesystems/mount_api.txt]
*
* @fs_context_dup:
* Allocate and attach a security structure to sc->security. This pointer
* Check for permission to change root directory.
* @path contains the path structure.
* Return 0 if permission is granted.
+ * @path_notify:
+ * Check permissions before setting a watch on events as defined by @mask,
+ * on an object at @path, whose type is defined by @obj_type.
* @inode_readlink:
* Check the permission to read the symbolic link.
* @dentry contains the dentry structure for the file link.
* @bpf_prog_free_security:
* Clean up the security information stored inside bpf prog.
*
+ * @locked_down:
+ * Determine whether a kernel feature that potentially enables arbitrary
+ * code execution in kernel space should be permitted.
+ *
+ * @what: kernel feature being accessed
*/
union security_list_options {
int (*binder_set_context_mgr)(struct task_struct *mgr);
int (*path_chown)(const struct path *path, kuid_t uid, kgid_t gid);
int (*path_chroot)(const struct path *path);
#endif
-
+ /* Needed for inode based security check */
+ int (*path_notify)(const struct path *path, u64 mask,
+ unsigned int obj_type);
int (*inode_alloc_security)(struct inode *inode);
void (*inode_free_security)(struct inode *inode);
int (*inode_init_security)(struct inode *inode, struct inode *dir,
int (*bpf_prog_alloc_security)(struct bpf_prog_aux *aux);
void (*bpf_prog_free_security)(struct bpf_prog_aux *aux);
#endif /* CONFIG_BPF_SYSCALL */
+ int (*locked_down)(enum lockdown_reason what);
};
struct security_hook_heads {
struct hlist_head path_chown;
struct hlist_head path_chroot;
#endif
+ /* Needed for inode based modules as well */
+ struct hlist_head path_notify;
struct hlist_head inode_alloc_security;
struct hlist_head inode_free_security;
struct hlist_head inode_init_security;
struct hlist_head bpf_prog_alloc_security;
struct hlist_head bpf_prog_free_security;
#endif /* CONFIG_BPF_SYSCALL */
+ struct hlist_head locked_down;
} __randomize_layout;
/*
};
extern struct lsm_info __start_lsm_info[], __end_lsm_info[];
+ extern struct lsm_info __start_early_lsm_info[], __end_early_lsm_info[];
#define DEFINE_LSM(lsm) \
static struct lsm_info __lsm_##lsm \
__used __section(.lsm_info.init) \
__aligned(sizeof(unsigned long))
+ #define DEFINE_EARLY_LSM(lsm) \
+ static struct lsm_info __early_lsm_##lsm \
+ __used __section(.early_lsm_info.init) \
+ __aligned(sizeof(unsigned long))
+
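Early LSMs register just like ordinary ones, only against the new section; the lockdown LSM is presumably the first user (sketch):

    DEFINE_EARLY_LSM(lockdown) = {
        .name = "lockdown",
        .init = lockdown_lsm_init,
    };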
#ifdef CONFIG_SECURITY_SELINUX_DISABLE
/*
* Assuring the safety of deleting a security module is up to
LSM_POLICY_CHANGE,
};
+ /*
+ * These are reasons that can be passed to the security_locked_down()
+ * LSM hook. Lockdown reasons that protect kernel integrity (ie, the
+ * ability for userland to modify kernel code) are placed before
+ * LOCKDOWN_INTEGRITY_MAX. Lockdown reasons that protect kernel
+ * confidentiality (ie, the ability for userland to extract
+ * information from the running kernel that would otherwise be
+ * restricted) are placed before LOCKDOWN_CONFIDENTIALITY_MAX.
+ *
+ * LSM authors should note that the semantics of any given lockdown
+ * reason are not guaranteed to be stable - the same reason may block
+ * one set of features in one kernel release, and a slightly different
+ * set of features in a later kernel release. LSMs that seek to expose
+ * lockdown policy at any level of granularity other than "none",
+ * "integrity" or "confidentiality" are responsible for either
+ * ensuring that they expose a consistent level of functionality to
+ * userland, or ensuring that userland is aware that this is
+ * potentially a moving target. It is easy to misuse this information
+ * in a way that could break userspace. Please be careful not to do
+ * so.
+ *
+ * If you add to this, remember to extend lockdown_reasons in
+ * security/lockdown/lockdown.c.
+ */
+ enum lockdown_reason {
+ LOCKDOWN_NONE,
+ LOCKDOWN_MODULE_SIGNATURE,
+ LOCKDOWN_DEV_MEM,
+ LOCKDOWN_KEXEC,
+ LOCKDOWN_HIBERNATION,
+ LOCKDOWN_PCI_ACCESS,
+ LOCKDOWN_IOPORT,
+ LOCKDOWN_MSR,
+ LOCKDOWN_ACPI_TABLES,
+ LOCKDOWN_PCMCIA_CIS,
+ LOCKDOWN_TIOCSSERIAL,
+ LOCKDOWN_MODULE_PARAMETERS,
+ LOCKDOWN_MMIOTRACE,
+ LOCKDOWN_DEBUGFS,
+ LOCKDOWN_INTEGRITY_MAX,
+ LOCKDOWN_KCORE,
+ LOCKDOWN_KPROBES,
+ LOCKDOWN_BPF_READ,
+ LOCKDOWN_PERF,
+ LOCKDOWN_TRACEFS,
+ LOCKDOWN_CONFIDENTIALITY_MAX,
+ };
+
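Because integrity reasons sort before LOCKDOWN_INTEGRITY_MAX and confidentiality reasons before LOCKDOWN_CONFIDENTIALITY_MAX, an LSM can implement the hook as a simple threshold check (minimal lockdown-style sketch, names illustrative):

    static enum lockdown_reason kernel_locked_down;

    static int example_locked_down(enum lockdown_reason what)
    {
        if (kernel_locked_down >= what)
            return -EPERM;
        return 0;
    }

    static struct security_hook_list example_hooks[] __lsm_ro_after_init = {
        LSM_HOOK_INIT(locked_down, example_locked_down),
    };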
/* These functions are in security/commoncap.c */
extern int cap_capable(const struct cred *cred, struct user_namespace *ns,
int cap, unsigned int opts);
#ifdef CONFIG_SECURITY
-int call_lsm_notifier(enum lsm_event event, void *data);
-int register_lsm_notifier(struct notifier_block *nb);
-int unregister_lsm_notifier(struct notifier_block *nb);
+int call_blocking_lsm_notifier(enum lsm_event event, void *data);
+int register_blocking_lsm_notifier(struct notifier_block *nb);
+int unregister_blocking_lsm_notifier(struct notifier_block *nb);
/* prototypes */
extern int security_init(void);
+ extern int early_security_init(void);
/* Security operations */
int security_binder_set_context_mgr(struct task_struct *mgr);
struct qstr *name,
const struct cred *old,
struct cred *new);
-
+int security_path_notify(const struct path *path, u64 mask,
+ unsigned int obj_type);
int security_inode_alloc(struct inode *inode);
void security_inode_free(struct inode *inode);
int security_inode_init_security(struct inode *inode, struct inode *dir,
int security_secid_to_secctx(u32 secid, char **secdata, u32 *seclen);
int security_secctx_to_secid(const char *secdata, u32 seclen, u32 *secid);
void security_release_secctx(char *secdata, u32 seclen);
-
void security_inode_invalidate_secctx(struct inode *inode);
int security_inode_notifysecctx(struct inode *inode, void *ctx, u32 ctxlen);
int security_inode_setsecctx(struct dentry *dentry, void *ctx, u32 ctxlen);
int security_inode_getsecctx(struct inode *inode, void **ctx, u32 *ctxlen);
+ int security_locked_down(enum lockdown_reason what);
#else /* CONFIG_SECURITY */
-static inline int call_lsm_notifier(enum lsm_event event, void *data)
+static inline int call_blocking_lsm_notifier(enum lsm_event event, void *data)
{
return 0;
}
-static inline int register_lsm_notifier(struct notifier_block *nb)
+static inline int register_blocking_lsm_notifier(struct notifier_block *nb)
{
return 0;
}
-static inline int unregister_lsm_notifier(struct notifier_block *nb)
+static inline int unregister_blocking_lsm_notifier(struct notifier_block *nb)
{
return 0;
}
return 0;
}
+ static inline int early_security_init(void)
+ {
+ return 0;
+ }
+
static inline int security_binder_set_context_mgr(struct task_struct *mgr)
{
return 0;
return 0;
}
+static inline int security_path_notify(const struct path *path, u64 mask,
+ unsigned int obj_type)
+{
+ return 0;
+}
+
static inline int security_inode_alloc(struct inode *inode)
{
return 0;
{
return -EOPNOTSUPP;
}
+ static inline int security_locked_down(enum lockdown_reason what)
+ {
+ return 0;
+ }
#endif /* CONFIG_SECURITY */
#ifdef CONFIG_SECURITY_NETWORK
int
default $(shell,$(srctree)/scripts/clang-version.sh $(CC))
+config CC_CAN_LINK
+ def_bool $(success,$(srctree)/scripts/cc-can-link.sh $(CC))
+
config CC_HAS_ASM_GOTO
def_bool $(success,$(srctree)/scripts/gcc-goto.sh $(CC))
+config TOOLS_SUPPORT_RELR
+ def_bool $(success,env "CC=$(CC)" "LD=$(LD)" "NM=$(NM)" "OBJCOPY=$(OBJCOPY)" $(srctree)/scripts/tools-support-relr.sh)
+
+config CC_HAS_ASM_INLINE
+ def_bool $(success,echo 'void foo(void) { asm inline (""); }' | $(CC) -x c - -c -o /dev/null)
+
config CC_HAS_WARN_MAYBE_UNINITIALIZED
def_bool $(cc-option,-Wmaybe-uninitialized)
help
config CONSTRUCTORS
bool
- depends on !UML
config IRQ_WORK
bool
here. If you are a user/distributor, say N here to exclude useless
drivers to be distributed.
+config HEADER_TEST
+ bool "Compile test headers that should be standalone compilable"
+ help
+ Compile test headers listed in the header-test-y target to ensure they are
+ self-contained, i.e. compilable as standalone units.
+
+ If you are a developer or tester and want to ensure the requested
+ headers are self-contained, say Y here. Otherwise, choose N.
+
+config KERNEL_HEADER_TEST
+ bool "Compile test kernel headers"
+ depends on HEADER_TEST
+ help
+ Headers in include/ are used to build external modules.
+ Compile test them to ensure they are self-contained, i.e.
+ compilable as standalone units.
+
+ If you are a developer or tester and want to ensure the headers
+ in include/ are self-contained, say Y here. Otherwise, choose N.
+
+config UAPI_HEADER_TEST
+ bool "Compile test UAPI headers"
+ depends on HEADER_TEST && HEADERS_INSTALL && CC_CAN_LINK
+ help
+ Compile test headers exported to user-space to ensure they are
+ self-contained, i.e. compilable as standalone units.
+
+ If you are a developer or tester and want to ensure the exported
+ headers are self-contained, say Y here. Otherwise, choose N.
+
config LOCALVERSION
string "Local version - append to kernel release"
help
have cpu.pressure, memory.pressure, and io.pressure files,
which aggregate pressure stalls for the grouped tasks only.
- For more details see Documentation/accounting/psi.txt.
+ For more details see Documentation/accounting/psi.rst.
Say N if unsure.
config GENERIC_SCHED_CLOCK
bool
+menu "Scheduler features"
+
+config UCLAMP_TASK
+ bool "Enable utilization clamping for RT/FAIR tasks"
+ depends on CPU_FREQ_GOV_SCHEDUTIL
+ help
+ This feature enables the scheduler to track the clamped utilization
+ of each CPU based on RUNNABLE tasks scheduled on that CPU.
+
+ With this option, the user can specify the min and max CPU
+ utilization allowed for RUNNABLE tasks. The max utilization defines
+ the maximum frequency a task should use while the min utilization
+ defines the minimum frequency it should use.
+
+ Both min and max utilization clamp values are hints to the scheduler,
+ aiming at improving its frequency selection policy, but they do not
+ enforce or grant any specific bandwidth for tasks.
+
+ If in doubt, say N.
+
+config UCLAMP_BUCKETS_COUNT
+ int "Number of supported utilization clamp buckets"
+ range 5 20
+ default 5
+ depends on UCLAMP_TASK
+ help
+ Defines the number of clamp buckets to use. The range of each bucket
+ will be SCHED_CAPACITY_SCALE/UCLAMP_BUCKETS_COUNT. The higher the
+ number of clamp buckets, the finer their granularity and the higher
+ the precision of clamping aggregation and tracking at run-time.
+
+ For example, with the minimum configuration value we will have 5
+ clamp buckets tracking 20% utilization each. A 25% boosted task will
+ be refcounted in the [20..39]% bucket and will set the bucket clamp
+ effective value to 25%.
+ If a second, 30% boosted task is co-scheduled on the same CPU, that
+ task will be refcounted in the same bucket as the first task, and
+ it will boost the bucket clamp effective value to 30%.
+ The clamp effective value of a bucket is reset to its nominal value
+ (20% in the example above) when there are no more tasks refcounted in
+ that bucket.
+
+ An additional boost/capping margin can be added to some tasks. In the
+ example above the 25% task will be boosted to 30% until it exits the
+ CPU. If that is not acceptable on certain systems,
+ it's always possible to reduce the margin by increasing the number of
+ clamp buckets to trade off used memory for run-time tracking
+ precision.
+
+ If in doubt, use the default value.
+
+endmenu
+
#
# For architectures that want to enable the support for NUMA-affine scheduler
# balancing logic:
use with process control subsystems such as Cpusets, CFS, memory
controls or device isolation.
See
- - Documentation/scheduler/sched-design-CFS.txt (CFS)
- - Documentation/cgroup-v1/ (features for grouping, isolation
+ - Documentation/scheduler/sched-design-CFS.rst (CFS)
+ - Documentation/admin-guide/cgroup-v1/ (features for grouping, isolation
and resource control)
Say N if unsure.
CONFIG_CFQ_GROUP_IOSCHED=y; for enabling throttling policy, set
CONFIG_BLK_DEV_THROTTLING=y.
- See Documentation/cgroup-v1/blkio-controller.txt for more information.
-
-config DEBUG_BLK_CGROUP
- bool "IO controller debugging"
- depends on BLK_CGROUP
- default n
- ---help---
- Enable some debugging help. Currently it exports additional stat
- files in a cgroup which can be useful for debugging.
+ See Documentation/admin-guide/cgroup-v1/blkio-controller.rst for more information.
config CGROUP_WRITEBACK
bool
tasks running within the fair group scheduler. Groups with no limit
set are considered to be unconstrained and will run with no
restriction.
- See Documentation/scheduler/sched-bwc.txt for more information.
+ See Documentation/scheduler/sched-bwc.rst for more information.
config RT_GROUP_SCHED
bool "Group scheduling for SCHED_RR/FIFO"
to task groups. If enabled, it will also make it impossible to
schedule realtime tasks for non-root users until you allocate
realtime bandwidth for them.
- See Documentation/scheduler/sched-rt-group.txt for more information.
+ See Documentation/scheduler/sched-rt-group.rst for more information.
endif #CGROUP_SCHED
+config UCLAMP_TASK_GROUP
+ bool "Utilization clamping per group of tasks"
+ depends on CGROUP_SCHED
+ depends on UCLAMP_TASK
+ default n
+ help
+ This feature enables the scheduler to track the clamped utilization
+ of each CPU based on RUNNABLE tasks currently scheduled on that CPU.
+
+ When this option is enabled, the user can specify a min and max
+ CPU bandwidth which is allowed for each single task in a group.
+ The max bandwidth clamps the maximum frequency a task can use,
+ while the min bandwidth defines the minimum frequency a task will
+ always use.
+
+ When task group based utilization clamping is enabled, a task-specific
+ clamp value, if any, is constrained by the clamp value specified for
+ the cgroup. Neither the minimum nor the maximum task clamp can exceed
+ the corresponding clamp defined at the task group level.
+
+ If in doubt, say N.
+
config CGROUP_PIDS
bool "PIDs controller"
help
default CC_OPTIMIZE_FOR_PERFORMANCE
config CC_OPTIMIZE_FOR_PERFORMANCE
- bool "Optimize for performance"
+ bool "Optimize for performance (-O2)"
help
This is the default optimization level for the kernel, building
with the "-O2" compiler flag for best performance and most
helpful compile-time warnings.
-config CC_OPTIMIZE_FOR_SIZE
- bool "Optimize for size"
+config CC_OPTIMIZE_FOR_PERFORMANCE_O3
+ bool "Optimize more for performance (-O3)"
+ depends on ARC
imply CC_DISABLE_WARN_MAYBE_UNINITIALIZED # avoid false positives
help
- Enabling this option will pass "-Os" instead of "-O2" to
- your compiler resulting in a smaller kernel.
+ Choosing this option will pass "-O3" to your compiler to optimize
+ the kernel yet more for performance.
- If unsure, say N.
+config CC_OPTIMIZE_FOR_SIZE
+ bool "Optimize for size (-Os)"
+ imply CC_DISABLE_WARN_MAYBE_UNINITIALIZED # avoid false positives
+ help
+ Choosing this option will pass "-Os" to your compiler resulting
+ in a smaller kernel.
endchoice
help
Many kernel heap attacks try to target slab cache metadata and
other infrastructure. This option makes minor performance
- sacrifies to harden the kernel slab allocator against common
+ sacrifices to harden the kernel slab allocator against common
freelist exploit methods.
config SHUFFLE_PAGE_ALLOCATOR
depends on SLUB && SMP
bool "SLUB per cpu partial cache"
help
- Per cpu partial caches accellerate objects allocation and freeing
+ Per cpu partial caches accelerate objects allocation and freeing
that is local to a processor at the price of more indeterminism
in the latency of the free. On overflow these caches will be cleared
which requires the taking of locks that may cause latency spikes.
default 0 if BASE_FULL
default 1 if !BASE_FULL
+config MODULE_SIG_FORMAT
+ def_bool n
+ select SYSTEM_DATA_VERIFICATION
+
menuconfig MODULES
bool "Enable loadable module support"
option modules
make them incompatible with the kernel you are running. If
unsure, say N.
+config ASM_MODVERSIONS
+ bool
+ default HAVE_ASM_MODVERSIONS && MODVERSIONS
+ help
+ This enables module versioning for exported symbols also from
+ assembly. This can be enabled only when the target architecture
+ supports it.
+
config MODULE_REL_CRCS
bool
depends on MODVERSIONS
config MODULE_SIG
bool "Module signature verification"
- depends on MODULES
- select SYSTEM_DATA_VERIFICATION
+ select MODULE_SIG_FORMAT
help
Check modules for valid signatures upon load: the signature
is simply appended to the module. For more information see
kernel build dependency so that the signing tool can use its crypto
library.
+ You should enable this option if you wish to use either
+ CONFIG_SECURITY_LOCKDOWN_LSM or lockdown functionality imposed via
+ another LSM - otherwise unsigned modules will be loadable regardless
+ of the lockdown policy.
+
!!!WARNING!!! If you enable this option, you MUST make sure that the
module DOES NOT get stripped after being signed. This includes the
debuginfo strip done by some packagers (such as rpmbuild) and
config MODULE_COMPRESS
bool "Compress modules on installation"
- depends on MODULES
help
Compresses kernel modules when 'make modules_install' is run; gzip or
endchoice
+config MODULE_ALLOW_MISSING_NAMESPACE_IMPORTS
+ bool "Allow loading of modules with missing namespace imports"
+ help
+ Symbols exported with EXPORT_SYMBOL_NS*() are considered exported in
+ a namespace. A module that makes use of a symbol exported with such a
+ namespace is required to import the namespace via MODULE_IMPORT_NS().
+ There is no technical reason to enforce correct namespace imports,
+ but it creates consistency between symbols defining namespaces and
+ users importing namespaces they make use of. This option relaxes the
+ requirement, demoting a missing import to a warning at module load.
+
+ If unsure, say N.
+
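To illustrate the mechanism this help text describes, a minimal sketch (all
names here are made up for the example): an exporting module tags a symbol
with a namespace, and a consumer must import that namespace before it may
link against the symbol.

    /* exporting module */
    #include <linux/module.h>

    int foo_do_thing(void) { return 0; }
    EXPORT_SYMBOL_NS_GPL(foo_do_thing, FOO_NS);

    /* consuming module: without this line the load fails, or merely
     * warns when MODULE_ALLOW_MISSING_NAMESPACE_IMPORTS=y */
    MODULE_IMPORT_NS(FOO_NS);
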
+config UNUSED_SYMBOLS
+ bool "Enable unused/obsolete exported symbols"
+ default y if X86
+ help
+ Unused but exported symbols make the kernel needlessly bigger. For
+ that reason most of these unused exports will soon be removed. This
+ option is provided temporarily, as a transition period, in case
+ some external kernel module needs one of these symbols anyway. If you
+ encounter such a case in your module, consider if you are actually
+ using the right API. (rationale: since nobody in the kernel is using
+ this in a module, there is a pretty good chance it's actually the
+ wrong interface to use). If you really need the symbol, please send a
+ mail to the linux kernel mailing list mentioning the symbol and why
+ you really need it, and what your plan is for merging the module
+ into the mainline kernel.
+
config TRIM_UNUSED_KSYMS
bool "Trim unused exported kernel symbols"
- depends on MODULES && !UNUSED_SYMBOLS
+ depends on !UNUSED_SYMBOLS
help
The kernel and some modules make many symbols available for
other modules to use via EXPORT_SYMBOL() and variants. Depending
/*
* Enable might_sleep() and smp_processor_id() checks.
- * They cannot be enabled earlier because with CONFIG_PREEMPT=y
+ * They cannot be enabled earlier because with CONFIG_PREEMPTION=y
* kernel_thread() would trigger might_sleep() splats. With
* CONFIG_PREEMPT_VOLUNTARY=y the init task might have scheduled
* already, but it's stuck on the kthreadd_done completion.
void __init __weak poking_init(void) { }
-void __init __weak pgd_cache_init(void) { }
+void __init __weak pgtable_cache_init(void) { }
bool initcall_debug;
core_param(initcall_debug, initcall_debug, bool, 0644);
}
#endif
+/* Report memory auto-initialization states for this boot. */
+static void __init report_meminit(void)
+{
+ const char *stack;
+
+ if (IS_ENABLED(CONFIG_INIT_STACK_ALL))
+ stack = "all";
+ else if (IS_ENABLED(CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF_ALL))
+ stack = "byref_all";
+ else if (IS_ENABLED(CONFIG_GCC_PLUGIN_STRUCTLEAK_BYREF))
+ stack = "byref";
+ else if (IS_ENABLED(CONFIG_GCC_PLUGIN_STRUCTLEAK_USER))
+ stack = "__user";
+ else
+ stack = "off";
+
+ pr_info("mem auto-init: stack:%s, heap alloc:%s, heap free:%s\n",
+ stack, want_init_on_alloc(GFP_KERNEL) ? "on" : "off",
+ want_init_on_free() ? "on" : "off");
+ if (want_init_on_free())
+ pr_info("mem auto-init: clearing system memory may take some time...\n");
+}
+
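For reference, with e.g. CONFIG_INIT_STACK_ALL plus init_on_alloc enabled and
init_on_free disabled, the line printed above reads:

    mem auto-init: stack:all, heap alloc:on, heap free:off
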
/*
* Set up kernel memory allocators
*/
* bigger than MAX_ORDER unless SPARSEMEM.
*/
page_ext_init_flatmem();
+ report_meminit();
mem_init();
kmem_cache_init();
+ kmemleak_init();
pgtable_init();
debug_objects_mem_init();
vmalloc_init();
init_espfix_bsp();
/* Should be run after espfix64 is set up. */
pti_init();
- pgd_cache_init();
}
void __init __weak arch_call_rest_init(void)
boot_cpu_init();
page_address_init();
pr_notice("%s", linux_banner);
+ early_security_init();
setup_arch(&command_line);
- mm_init_cpumask(&init_mm);
setup_command_line(command_line);
setup_nr_cpu_ids();
setup_per_cpu_areas();
initrd_start = 0;
}
#endif
- kmemleak_init();
setup_per_cpu_pageset();
numa_policy_init();
acpi_early_init();
static void __init do_basic_setup(void)
{
cpuset_init_smp();
- shmem_init();
driver_init();
init_irq_proc();
do_ctors();
cpuctx->hrtimer_interval = ns_to_ktime(NSEC_PER_MSEC * interval);
raw_spin_lock_init(&cpuctx->hrtimer_lock);
- hrtimer_init(timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_PINNED);
+ hrtimer_init(timer, CLOCK_MONOTONIC, HRTIMER_MODE_ABS_PINNED_HARD);
timer->function = perf_mux_hrtimer_handler;
}
if (!cpuctx->hrtimer_active) {
cpuctx->hrtimer_active = 1;
hrtimer_forward_now(timer, cpuctx->hrtimer_interval);
- hrtimer_start_expires(timer, HRTIMER_MODE_ABS_PINNED);
+ hrtimer_start_expires(timer, HRTIMER_MODE_ABS_PINNED_HARD);
}
raw_spin_unlock_irqrestore(&cpuctx->hrtimer_lock, flags);
ctx->generation++;
}
+static int
+perf_aux_output_match(struct perf_event *event, struct perf_event *aux_event)
+{
+ if (!has_aux(aux_event))
+ return 0;
+
+ if (!event->pmu->aux_output_match)
+ return 0;
+
+ return event->pmu->aux_output_match(aux_event);
+}
+
+static void put_event(struct perf_event *event);
+static void event_sched_out(struct perf_event *event,
+ struct perf_cpu_context *cpuctx,
+ struct perf_event_context *ctx);
+
+static void perf_put_aux_event(struct perf_event *event)
+{
+ struct perf_event_context *ctx = event->ctx;
+ struct perf_cpu_context *cpuctx = __get_cpu_context(ctx);
+ struct perf_event *iter;
+
+ /*
+ * If the event uses an aux_event, tear down the link
+ */
+ if (event->aux_event) {
+ iter = event->aux_event;
+ event->aux_event = NULL;
+ put_event(iter);
+ return;
+ }
+
+ /*
+ * If the event is an aux_event, tear down all links to
+ * it from other events.
+ */
+ for_each_sibling_event(iter, event->group_leader) {
+ if (iter->aux_event != event)
+ continue;
+
+ iter->aux_event = NULL;
+ put_event(event);
+
+ /*
+ * If it's ACTIVE, schedule it out and put it into ERROR
+ * state so that we don't try to schedule it again. Note
+ * that perf_event_enable() will clear the ERROR status.
+ */
+ event_sched_out(iter, cpuctx, ctx);
+ perf_event_set_state(event, PERF_EVENT_STATE_ERROR);
+ }
+}
+
+static int perf_get_aux_event(struct perf_event *event,
+ struct perf_event *group_leader)
+{
+ /*
+ * Our group leader must be an aux event if we want to be
+ * an aux_output. This way, the aux event will precede its
+ * aux_output events in the group, and therefore will always
+ * schedule first.
+ */
+ if (!group_leader)
+ return 0;
+
+ if (!perf_aux_output_match(event, group_leader))
+ return 0;
+
+ if (!atomic_long_inc_not_zero(&group_leader->refcount))
+ return 0;
+
+ /*
+ * Link aux_outputs to their aux event; this is undone in
+ * perf_group_detach() by perf_put_aux_event(). When the
+ * group is torn down, the aux_output events lose their
+ * link to the aux_event and can't schedule any more.
+ */
+ event->aux_event = group_leader;
+
+ return 1;
+}
+
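The aux_output linkage above is driven from userspace through the new
perf_event_attr.aux_output bit. A minimal sketch of the assumed usage (the
PMU type lookup is elided; pt_pmu_type stands in for a value read from
sysfs beforehand):

    #include <linux/perf_event.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    /* pt_pmu_type: assumed to be read from
     * /sys/bus/event_source/devices/<pmu>/type */
    struct perf_event_attr aux = {
            .type = pt_pmu_type,
            .size = sizeof(aux),
    };
    int leader = syscall(__NR_perf_event_open, &aux, 0, -1, -1, 0);

    struct perf_event_attr evt = {
            .type       = PERF_TYPE_HARDWARE,
            .config     = PERF_COUNT_HW_INSTRUCTIONS,
            .size       = sizeof(evt),
            .aux_output = 1,  /* records go into the leader's AUX buffer */
    };
    int fd = syscall(__NR_perf_event_open, &evt, 0, -1, leader, 0);

If the leader's PMU lacks PERF_PMU_CAP_AUX_OUTPUT, the second call fails with
EOPNOTSUPP, matching the check added in the event-allocation path further down.
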
static void perf_group_detach(struct perf_event *event)
{
struct perf_event *sibling, *tmp;
event->attach_state &= ~PERF_ATTACH_GROUP;
+ perf_put_aux_event(event);
+
/*
* If this is a sibling, remove it from its group.
*/
*
* If event->ctx is a cloned context, callers must make sure that
* every task struct that event->ctx->task could possibly point to
- * remains valid. This condition is satisifed when called through
+ * remains valid. This condition is satisfied when called through
* perf_event_for_each_child or perf_event_for_each because they
* hold the top-level event's child_mutex, so any descendant that
* goes to exit will block in perf_event_exit_event().
return ret;
}
+static bool exclusive_event_installable(struct perf_event *event,
+ struct perf_event_context *ctx);
+
/*
* Attach a performance event to a context.
*
lockdep_assert_held(&ctx->mutex);
+ WARN_ON_ONCE(!exclusive_event_installable(event, ctx));
+
if (event->cpu != -1)
event->cpu = cpu;
if (!ctx->nr_active || !(is_active & EVENT_ALL))
return;
+ /*
+ * If we had been multiplexing, no rotations are necessary now that
+ * no events are active.
+ */
+ ctx->rotate_necessary = 0;
+
perf_pmu_disable(ctx->pmu);
if (is_active & EVENT_PINNED) {
list_for_each_entry_safe(event, tmp, &ctx->pinned_active, active_list)
return 0;
if (group_can_go_on(event, sid->cpuctx, sid->can_add_hw)) {
- if (!group_sched_in(event, sid->cpuctx, sid->ctx))
- list_add_tail(&event->active_list, &sid->ctx->flexible_active);
- else
+ int ret = group_sched_in(event, sid->cpuctx, sid->ctx);
+ if (ret) {
sid->can_add_hw = 0;
+ sid->ctx->rotate_necessary = 1;
+ return 0;
+ }
+ list_add_tail(&event->active_list, &sid->ctx->flexible_active);
}
return 0;
static bool perf_rotate_context(struct perf_cpu_context *cpuctx)
{
struct perf_event *cpu_event = NULL, *task_event = NULL;
- bool cpu_rotate = false, task_rotate = false;
- struct perf_event_context *ctx = NULL;
+ struct perf_event_context *task_ctx = NULL;
+ int cpu_rotate, task_rotate;
/*
* Since we run this from IRQ context, nobody can install new
* events, thus the event count values are stable.
*/
- if (cpuctx->ctx.nr_events) {
- if (cpuctx->ctx.nr_events != cpuctx->ctx.nr_active)
- cpu_rotate = true;
- }
-
- ctx = cpuctx->task_ctx;
- if (ctx && ctx->nr_events) {
- if (ctx->nr_events != ctx->nr_active)
- task_rotate = true;
- }
+ cpu_rotate = cpuctx->ctx.rotate_necessary;
+ task_ctx = cpuctx->task_ctx;
+ task_rotate = task_ctx ? task_ctx->rotate_necessary : 0;
if (!(cpu_rotate || task_rotate))
return false;
perf_pmu_disable(cpuctx->ctx.pmu);
if (task_rotate)
- task_event = ctx_first_active(ctx);
+ task_event = ctx_first_active(task_ctx);
if (cpu_rotate)
cpu_event = ctx_first_active(&cpuctx->ctx);
* As per the order given at ctx_resched() first 'pop' task flexible
* and then, if needed CPU flexible.
*/
- if (task_event || (ctx && cpu_event))
- ctx_sched_out(ctx, cpuctx, EVENT_FLEXIBLE);
+ if (task_event || (task_ctx && cpu_event))
+ ctx_sched_out(task_ctx, cpuctx, EVENT_FLEXIBLE);
if (cpu_event)
cpu_ctx_sched_out(cpuctx, EVENT_FLEXIBLE);
if (task_event)
- rotate_ctx(ctx, task_event);
+ rotate_ctx(task_ctx, task_event);
if (cpu_event)
rotate_ctx(&cpuctx->ctx, cpu_event);
- perf_event_sched_in(cpuctx, ctx, current);
+ perf_event_sched_in(cpuctx, task_ctx, current);
perf_pmu_enable(cpuctx->ctx.pmu);
perf_ctx_unlock(cpuctx, cpuctx->task_ctx);
return NULL;
__perf_event_init_context(ctx);
- if (task) {
- ctx->task = task;
- get_task_struct(task);
- }
+ if (task)
+ ctx->task = get_task_struct(task);
ctx->pmu = pmu;
return ctx;
{
struct pmu *pmu = event->pmu;
- if (!(pmu->capabilities & PERF_PMU_CAP_EXCLUSIVE))
+ if (!is_exclusive_pmu(pmu))
return 0;
/*
{
struct pmu *pmu = event->pmu;
- if (!(pmu->capabilities & PERF_PMU_CAP_EXCLUSIVE))
+ if (!is_exclusive_pmu(pmu))
return;
/* see comment in exclusive_event_init() */
return false;
}
-/* Called under the same ctx::mutex as perf_install_in_context() */
static bool exclusive_event_installable(struct perf_event *event,
struct perf_event_context *ctx)
{
struct perf_event *iter_event;
struct pmu *pmu = event->pmu;
- if (!(pmu->capabilities & PERF_PMU_CAP_EXCLUSIVE))
+ lockdep_assert_held(&ctx->mutex);
+
+ if (!is_exclusive_pmu(pmu))
return true;
list_for_each_entry(iter_event, &ctx->event_list, event_entry) {
if (event->destroy)
event->destroy(event);
- if (event->ctx)
- put_ctx(event->ctx);
-
+ /*
+ * Must be after ->destroy(), due to uprobe_perf_close() using
+ * hw.target.
+ */
if (event->hw.target)
put_task_struct(event->hw.target);
+ /*
+ * perf_event_free_task() relies on put_ctx() being 'last', in particular
+ * all task references must be cleaned up.
+ */
+ if (event->ctx)
+ put_ctx(event->ctx);
+
exclusive_event_destroy(event);
module_put(event->pmu->module);
mutex_unlock(&event->child_mutex);
list_for_each_entry_safe(child, tmp, &free_list, child_list) {
+ void *var = &child->ctx->refcount;
+
list_del(&child->child_list);
free_event(child);
+
+ /*
+ * Wake any perf_event_free_task() waiting for this event to be
+ * freed.
+ */
+ smp_mb(); /* pairs with wait_var_event() */
+ wake_up_var(var);
}
no_ctx:
* Get remaining task size from user stack pointer.
*
* It'd be better to take stack vma map and limit this more
- * precisly, but there's no way to get it safely under interrupt,
+ * precisely, but there's no way to get it safely under interrupt,
* so using TASK_SIZE as limit.
*/
static u64 perf_ustack_task_size(struct pt_regs *regs)
if (sample_type & PERF_SAMPLE_STACK_USER) {
/*
- * Either we need PERF_SAMPLE_STACK_USER bit to be allways
+ * Either we need PERF_SAMPLE_STACK_USER bit to be always
* processed as the last one or have additional check added
* in case new sample type is added, because we could eat
* up the rest of the sample size.
if (event->hw.state & PERF_HES_STOPPED)
return 0;
/*
- * All tracepoints are from kernel-space.
+ * If exclude_kernel, only trace user-space tracepoints (uprobes)
*/
- if (event->attr.exclude_kernel)
+ if (event->attr.exclude_kernel && !user_mode(regs))
return 0;
if (!perf_tp_filter_match(event, data))
period = max_t(u64, 10000, hwc->sample_period);
}
hrtimer_start(&hwc->hrtimer, ns_to_ktime(period),
- HRTIMER_MODE_REL_PINNED);
+ HRTIMER_MODE_REL_PINNED_HARD);
}
static void perf_swevent_cancel_hrtimer(struct perf_event *event)
if (!is_sampling_event(event))
return;
- hrtimer_init(&hwc->hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL);
+ hrtimer_init(&hwc->hrtimer, CLOCK_MONOTONIC, HRTIMER_MODE_REL_HARD);
hwc->hrtimer.function = perf_swevent_hrtimer;
/*
if (ret)
goto del_dev;
+ if (pmu->attr_update)
+ ret = sysfs_update_groups(&pmu->dev->kobj, pmu->attr_update);
+
+ if (ret)
+ goto del_dev;
+
out:
return ret;
* and we cannot use the ctx information because we need the
* pmu before we get a ctx.
*/
- get_task_struct(task);
- event->hw.target = task;
+ event->hw.target = get_task_struct(task);
}
event->clock = &local_clock;
goto err_ns;
}
+ if (event->attr.aux_output &&
+ !(pmu->capabilities & PERF_PMU_CAP_AUX_OUTPUT)) {
+ err = -EOPNOTSUPP;
+ goto err_pmu;
+ }
+
err = exclusive_event_init(event);
if (err)
goto err_pmu;
break;
case CLOCK_BOOTTIME:
- event->clock = &ktime_get_boot_ns;
+ event->clock = &ktime_get_boottime_ns;
break;
case CLOCK_TAI:
- event->clock = &ktime_get_tai_ns;
+ event->clock = &ktime_get_clocktai_ns;
break;
default:
perf_paranoid_kernel() && !capable(CAP_SYS_ADMIN))
return -EACCES;
+ err = security_locked_down(LOCKDOWN_PERF);
+ if (err && (attr.sample_type & PERF_SAMPLE_REGS_INTR))
+ /* REGS_INTR can leak data, lockdown must prevent this */
+ return err;
+
+ err = 0;
+
/*
* In cgroup mode, the pid argument is used to pass the fd
* opened to the cgroup directory in cgroupfs. The cpu argument
goto err_alloc;
}
- if ((pmu->capabilities & PERF_PMU_CAP_EXCLUSIVE) && group_leader) {
- err = -EBUSY;
- goto err_context;
- }
-
/*
* Look up the group leader (we will attach this event to it):
*/
move_group = 0;
}
}
+
+ /*
+ * Failure to create exclusive events returns -EBUSY.
+ */
+ err = -EBUSY;
+ if (!exclusive_event_installable(group_leader, ctx))
+ goto err_locked;
+
+ for_each_sibling_event(sibling, group_leader) {
+ if (!exclusive_event_installable(sibling, ctx))
+ goto err_locked;
+ }
} else {
mutex_lock(&ctx->mutex);
}
}
}
+ if (event->attr.aux_output && !perf_get_aux_event(event, group_leader))
+ goto err_locked;
/*
* Must be under the same ctx::mutex as perf_install_in_context(),
* because we need to serialize with concurrent event creation.
*/
if (!exclusive_event_installable(event, ctx)) {
- /* exclusive and group stuff are assumed mutually exclusive */
- WARN_ON_ONCE(move_group);
-
err = -EBUSY;
goto err_locked;
}
goto err_unlock;
}
- perf_install_in_context(ctx, event, cpu);
+ perf_install_in_context(ctx, event, event->cpu);
perf_unpin_context(ctx);
mutex_unlock(&ctx->mutex);
}
/*
- * Free an unexposed, unused context as created by inheritance by
- * perf_event_init_task below, used by fork() in case of fail.
+ * Free a context as created by inheritance by perf_event_init_task() below,
+ * used by fork() in case of fail.
*
- * Not all locks are strictly required, but take them anyway to be nice and
- * help out with the lockdep assertions.
+ * Even though the task has never lived, the context and events have been
+ * exposed through the child_list, so we must take care tearing it all down.
*/
void perf_event_free_task(struct task_struct *task)
{
perf_free_event(event, ctx);
mutex_unlock(&ctx->mutex);
- put_ctx(ctx);
+
+ /*
+ * perf_event_release_kernel() could've stolen some of our
+ * child events and still have them on its free_list. In that
+ * case we must wait for these events to have been freed (in
+ * particular all their references to this task must've been
+ * dropped).
+ *
+ * Without this copy_process() will unconditionally free this
+ * task (irrespective of its reference count) and
+ * _free_event()'s put_task_struct(event->hw.target) will be a
+ * use-after-free.
+ *
+ * Wait for all events to drop their context reference.
+ */
+ wait_var_event(&ctx->refcount, refcount_read(&ctx->refcount) == 1);
+ put_ctx(ctx); /* must be last */
}
}
struct file *perf_event_get(unsigned int fd)
{
- struct file *file;
-
- file = fget_raw(fd);
+ struct file *file = fget(fd);
if (!file)
return ERR_PTR(-EBADF);
return kexec_image_post_load_cleanup_default(image);
}
- #ifdef CONFIG_KEXEC_VERIFY_SIG
+ #ifdef CONFIG_KEXEC_SIG
static int kexec_image_verify_sig_default(struct kimage *image, void *buf,
unsigned long buf_len)
{
image->image_loader_data = NULL;
}
+ #ifdef CONFIG_KEXEC_SIG
+ static int
+ kimage_validate_signature(struct kimage *image)
+ {
+ const char *reason;
+ int ret;
+
+ ret = arch_kexec_kernel_verify_sig(image, image->kernel_buf,
+ image->kernel_buf_len);
+ switch (ret) {
+ case 0:
+ break;
+
+ /* Certain verification errors are non-fatal when we aren't
+ * mandating that there must be a valid signature.
+ */
+ case -ENODATA:
+ reason = "kexec of unsigned image";
+ goto decide;
+ case -ENOPKG:
+ reason = "kexec of image with unsupported crypto";
+ goto decide;
+ case -ENOKEY:
+ reason = "kexec of image with unavailable key";
+ decide:
+ if (IS_ENABLED(CONFIG_KEXEC_SIG_FORCE)) {
+ pr_notice("%s rejected\n", reason);
+ return ret;
+ }
+
+ /* If IMA is guaranteed to appraise a signature on the kexec
+ * image, permit it even if the kernel is otherwise locked
+ * down.
+ */
+ if (!ima_appraise_signature(READING_KEXEC_IMAGE) &&
+ security_locked_down(LOCKDOWN_KEXEC))
+ return -EPERM;
+
+ return 0;
+
+ /* All other errors are fatal, including nomem, unparseable
+ * signatures and signature check failures - even if signatures
+ * aren't required.
+ */
+ default:
+ pr_notice("kernel signature verification failed (%d).\n", ret);
+ }
+
+ return ret;
+ }
+ #endif
+
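In short: kimage_validate_signature() treats a missing signature (-ENODATA),
unsupported crypto (-ENOPKG) or an unavailable key (-ENOKEY) as fatal only
when CONFIG_KEXEC_SIG_FORCE is set. Without it, such images are still
accepted unless the kernel is locked down and IMA will not appraise them.
Any other verification error rejects the image unconditionally.
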
/*
* In file mode list of segments is prepared by kernel. Copy relevant
* data from user space, do error checking, prepare segment list
const char __user *cmdline_ptr,
unsigned long cmdline_len, unsigned flags)
{
- int ret = 0;
+ int ret;
void *ldata;
loff_t size;
return ret;
image->kernel_buf_len = size;
- /* IMA needs to pass the measurement list to the next kernel. */
- ima_add_kexec_buffer(image);
-
/* Call arch image probe handlers */
ret = arch_kexec_kernel_image_probe(image, image->kernel_buf,
image->kernel_buf_len);
if (ret)
goto out;
- #ifdef CONFIG_KEXEC_VERIFY_SIG
- ret = arch_kexec_kernel_verify_sig(image, image->kernel_buf,
- image->kernel_buf_len);
- if (ret) {
- pr_debug("kernel signature verification failed.\n");
+ #ifdef CONFIG_KEXEC_SIG
+ ret = kimage_validate_signature(image);
+
+ if (ret)
goto out;
- }
- pr_debug("kernel signature verification successful.\n");
#endif
/* It is possible that there no initramfs is being loaded */
if (!(flags & KEXEC_FILE_NO_INITRAMFS)) {
ret = -EINVAL;
goto out;
}
+
+ ima_kexec_cmdline(image->cmdline_buf,
+ image->cmdline_buf_len - 1);
}
+ /* IMA needs to pass the measurement list to the next kernel. */
+ ima_add_kexec_buffer(image);
+
/* Call arch image load handlers */
ldata = arch_kexec_kernel_image_load(image);
#include <linux/export.h>
#include <linux/extable.h>
#include <linux/moduleloader.h>
+#include <linux/module_signature.h>
#include <linux/trace_events.h>
#include <linux/init.h>
#include <linux/kallsyms.h>
/*
* Modules' sections will be aligned on page boundaries
* to ensure complete separation of code and data, but
- * only when CONFIG_STRICT_MODULE_RWX=y
+ * only when CONFIG_ARCH_HAS_STRICT_MODULE_RWX=y
*/
-#ifdef CONFIG_STRICT_MODULE_RWX
+#ifdef CONFIG_ARCH_HAS_STRICT_MODULE_RWX
# define debug_align(X) ALIGN(X, PAGE_SIZE)
#else
# define debug_align(X) (X)
#endif
}
-static int cmp_name(const void *va, const void *vb)
+static const char *kernel_symbol_namespace(const struct kernel_symbol *sym)
+{
+#ifdef CONFIG_HAVE_ARCH_PREL32_RELOCATIONS
+ if (!sym->namespace_offset)
+ return NULL;
+ return offset_to_ptr(&sym->namespace_offset);
+#else
+ return sym->namespace;
+#endif
+}
+
+static int cmp_name(const void *name, const void *sym)
{
- const char *a;
- const struct kernel_symbol *b;
- a = va; b = vb;
- return strcmp(a, kernel_symbol_name(b));
+ return strcmp(name, kernel_symbol_name(sym));
}
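The simplified comparator is shaped for bsearch() over the sorted export
tables; a sketch of an assumed call site (the wrapper name here is
hypothetical):

    #include <linux/bsearch.h>

    static const struct kernel_symbol *
    search_ksymtab(const char *name, const struct kernel_symbol *start,
                   const struct kernel_symbol *stop)
    {
            return bsearch(name, start, stop - start,
                           sizeof(struct kernel_symbol), cmp_name);
    }
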
static bool find_exported_symbol_in_section(const struct symsearch *syms,
}
#endif /* CONFIG_MODVERSIONS */
+static char *get_modinfo(const struct load_info *info, const char *tag);
+static char *get_next_modinfo(const struct load_info *info, const char *tag,
+ char *prev);
+
+static int verify_namespace_is_imported(const struct load_info *info,
+ const struct kernel_symbol *sym,
+ struct module *mod)
+{
+ const char *namespace;
+ char *imported_namespace;
+
+ namespace = kernel_symbol_namespace(sym);
+ if (namespace) {
+ imported_namespace = get_modinfo(info, "import_ns");
+ while (imported_namespace) {
+ if (strcmp(namespace, imported_namespace) == 0)
+ return 0;
+ imported_namespace = get_next_modinfo(
+ info, "import_ns", imported_namespace);
+ }
+#ifdef CONFIG_MODULE_ALLOW_MISSING_NAMESPACE_IMPORTS
+ pr_warn(
+#else
+ pr_err(
+#endif
+ "%s: module uses symbol (%s) from namespace %s, but does not import it.\n",
+ mod->name, kernel_symbol_name(sym), namespace);
+#ifndef CONFIG_MODULE_ALLOW_MISSING_NAMESPACE_IMPORTS
+ return -EINVAL;
+#endif
+ }
+ return 0;
+}
+
/* Resolve a symbol for this module. I.e. if we find one, record usage. */
static const struct kernel_symbol *resolve_symbol(struct module *mod,
const struct load_info *info,
goto getname;
}
+ err = verify_namespace_is_imported(info, sym, mod);
+ if (err) {
+ sym = ERR_PTR(err);
+ goto getname;
+ }
+
err = ref_module(mod, owner);
if (err) {
sym = ERR_PTR(err);
for (i = 0; i < info->hdr->e_shnum; i++)
if (!sect_empty(&info->sechdrs[i]))
nloaded++;
- size[0] = ALIGN(sizeof(*sect_attrs)
- + nloaded * sizeof(sect_attrs->attrs[0]),
+ size[0] = ALIGN(struct_size(sect_attrs, attrs, nloaded),
sizeof(sect_attrs->grp.attrs[0]));
size[1] = (nloaded + 1) * sizeof(sect_attrs->grp.attrs[0]);
sect_attrs = kzalloc(size[0] + size[1], GFP_KERNEL);
return ret;
}
+static void module_remove_modinfo_attrs(struct module *mod, int end);
+
static int module_add_modinfo_attrs(struct module *mod)
{
struct module_attribute *attr;
return -ENOMEM;
temp_attr = mod->modinfo_attrs;
- for (i = 0; (attr = modinfo_attrs[i]) && !error; i++) {
+ for (i = 0; (attr = modinfo_attrs[i]); i++) {
if (!attr->test || attr->test(mod)) {
memcpy(temp_attr, attr, sizeof(*temp_attr));
sysfs_attr_init(&temp_attr->attr);
error = sysfs_create_file(&mod->mkobj.kobj,
&temp_attr->attr);
+ if (error)
+ goto error_out;
++temp_attr;
}
}
+
+ return 0;
+
+error_out:
+ if (i > 0)
+ module_remove_modinfo_attrs(mod, --i);
return error;
}
-static void module_remove_modinfo_attrs(struct module *mod)
+static void module_remove_modinfo_attrs(struct module *mod, int end)
{
struct module_attribute *attr;
int i;
for (i = 0; (attr = &mod->modinfo_attrs[i]); i++) {
+ if (end >= 0 && i > end)
+ break;
/* pick a field to test for end of list */
if (!attr->attr.name)
break;
return 0;
out_unreg_modinfo_attrs:
- module_remove_modinfo_attrs(mod);
+ module_remove_modinfo_attrs(mod, -1);
out_unreg_param:
module_param_sysfs_remove(mod);
out_unreg_holders:
{
}
-static void module_remove_modinfo_attrs(struct module *mod)
+static void module_remove_modinfo_attrs(struct module *mod, int end)
{
}
static void mod_sysfs_teardown(struct module *mod)
{
del_usage_links(mod);
- module_remove_modinfo_attrs(mod);
+ module_remove_modinfo_attrs(mod, -1);
module_param_sysfs_remove(mod);
kobject_put(mod->mkobj.drivers_dir);
kobject_put(mod->holders_dir);
mod_sysfs_fini(mod);
}
-#ifdef CONFIG_STRICT_MODULE_RWX
+#ifdef CONFIG_ARCH_HAS_STRICT_MODULE_RWX
/*
* LKM RO/NX protection: protect module's text/ro-data
* from modification and any data from execution.
layout->text_size >> PAGE_SHIFT);
}
+#ifdef CONFIG_STRICT_MODULE_RWX
static void frob_rodata(const struct module_layout *layout,
int (*set_memory)(unsigned long start, int num_pages))
{
set_vm_flush_reset_perms(mod->core_layout.base);
set_vm_flush_reset_perms(mod->init_layout.base);
frob_text(&mod->core_layout, set_memory_ro);
- frob_text(&mod->core_layout, set_memory_x);
frob_rodata(&mod->core_layout, set_memory_ro);
-
frob_text(&mod->init_layout, set_memory_ro);
- frob_text(&mod->init_layout, set_memory_x);
-
frob_rodata(&mod->init_layout, set_memory_ro);
if (after_init)
}
mutex_unlock(&module_mutex);
}
-#else
+#else /* !CONFIG_STRICT_MODULE_RWX */
static void module_enable_nx(const struct module *mod) { }
-#endif
+#endif /* CONFIG_STRICT_MODULE_RWX */
+static void module_enable_x(const struct module *mod)
+{
+ frob_text(&mod->core_layout, set_memory_x);
+ frob_text(&mod->init_layout, set_memory_x);
+}
+#else /* !CONFIG_ARCH_HAS_STRICT_MODULE_RWX */
+static void module_enable_nx(const struct module *mod) { }
+static void module_enable_x(const struct module *mod) { }
+#endif /* CONFIG_ARCH_HAS_STRICT_MODULE_RWX */
+
#ifdef CONFIG_LIVEPATCH
/*
return string;
}
-static char *get_modinfo(struct load_info *info, const char *tag)
+static char *get_next_modinfo(const struct load_info *info, const char *tag,
+ char *prev)
{
char *p;
unsigned int taglen = strlen(tag);
* get_modinfo() calls made before rewrite_section_headers()
* must use sh_offset, as sh_addr isn't set!
*/
- for (p = (char *)info->hdr + infosec->sh_offset; p; p = next_string(p, &size)) {
+ char *modinfo = (char *)info->hdr + infosec->sh_offset;
+
+ if (prev) {
+ size -= prev - modinfo;
+ modinfo = next_string(prev, &size);
+ }
+
+ for (p = modinfo; p; p = next_string(p, &size)) {
if (strncmp(p, tag, taglen) == 0 && p[taglen] == '=')
return p + taglen + 1;
}
return NULL;
}
+static char *get_modinfo(const struct load_info *info, const char *tag)
+{
+ return get_next_modinfo(info, tag, NULL);
+}
+
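For context, the .modinfo section these helpers walk is a flat sequence of
NUL-terminated "tag=value" strings, e.g. (hypothetical contents):

    license=GPL\0import_ns=FOO\0import_ns=BAR\0

get_next_modinfo() resumes the scan after a previous match, which is what
lets a module carry several import_ns tags and have each one found by
verify_namespace_is_imported() above.
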
static void setup_modinfo(struct module *mod, struct load_info *info)
{
struct module_attribute *attr;
return vmalloc_exec(size);
}
+bool __weak module_exit_section(const char *name)
+{
+ return strstarts(name, ".exit");
+}
+
#ifdef CONFIG_DEBUG_KMEMLEAK
static void kmemleak_load_module(const struct module *mod,
const struct load_info *info)
#ifdef CONFIG_MODULE_SIG
static int module_sig_check(struct load_info *info, int flags)
{
- int err = -ENOKEY;
+ int err = -ENODATA;
const unsigned long markerlen = sizeof(MODULE_SIG_STRING) - 1;
+ const char *reason;
const void *mod = info->hdr;
/*
err = mod_verify_sig(mod, info);
}
- if (!err) {
+ switch (err) {
+ case 0:
info->sig_ok = true;
return 0;
- }
- /* Not having a signature is only an error if we're strict. */
- if (err == -ENOKEY && !is_module_sig_enforced())
- err = 0;
+ /* We don't permit modules to be loaded into trusted kernels
+ * without a valid signature on them, but if we're not
+ * enforcing, certain errors are non-fatal.
+ */
+ case -ENODATA:
+ reason = "Loading of unsigned module";
+ goto decide;
+ case -ENOPKG:
+ reason = "Loading of module with unsupported crypto";
+ goto decide;
+ case -ENOKEY:
+ reason = "Loading of module with unavailable key";
+ decide:
+ if (is_module_sig_enforced()) {
+ pr_notice("%s is rejected\n", reason);
+ return -EKEYREJECTED;
+ }
- return err;
+ return security_locked_down(LOCKDOWN_MODULE_SIGNATURE);
+
+ /* All other errors are fatal, including nomem, unparseable
+ * signatures and signature check failures - even if signatures
+ * aren't required.
+ */
+ default:
+ return err;
+ }
}
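The net behaviour: an unsigned module (-ENODATA), unsupported crypto
(-ENOPKG) or an unavailable key (-ENOKEY) now yields -EKEYREJECTED only when
signatures are enforced (CONFIG_MODULE_SIG_FORCE or module.sig_enforce=1);
otherwise the load is permitted unless the lockdown policy denies
LOCKDOWN_MODULE_SIGNATURE. Every other error, such as a malformed signature,
remains fatal.
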
#else /* !CONFIG_MODULE_SIG */
static int module_sig_check(struct load_info *info, int flags)
#ifndef CONFIG_MODULE_UNLOAD
/* Don't load .exit sections */
- if (strstarts(info->secstrings+shdr->sh_name, ".exit"))
+ if (module_exit_section(info->secstrings+shdr->sh_name))
shdr->sh_flags &= ~(unsigned long)SHF_ALLOC;
#endif
}
sizeof(*mod->tracepoints_ptrs),
&mod->num_tracepoints);
#endif
+#ifdef CONFIG_TREE_SRCU
+ mod->srcu_struct_ptrs = section_objs(info, "___srcu_struct_ptrs",
+ sizeof(*mod->srcu_struct_ptrs),
+ &mod->num_srcu_structs);
+#endif
#ifdef CONFIG_BPF_EVENTS
mod->bpf_raw_events = section_objs(info, "__bpf_raw_tp_map",
sizeof(*mod->bpf_raw_events),
sched_annotate_sleep();
mutex_lock(&module_mutex);
mod = find_module_all(name, strlen(name), true);
- ret = !mod || mod->state == MODULE_STATE_LIVE
- || mod->state == MODULE_STATE_GOING;
+ ret = !mod || mod->state == MODULE_STATE_LIVE;
mutex_unlock(&module_mutex);
return ret;
mutex_lock(&module_mutex);
old = find_module_all(mod->name, strlen(mod->name), true);
if (old != NULL) {
- if (old->state == MODULE_STATE_COMING
- || old->state == MODULE_STATE_UNFORMED) {
+ if (old->state != MODULE_STATE_LIVE) {
/* Wait in case it fails to load. */
mutex_unlock(&module_mutex);
err = wait_event_interruptible(module_wq,
module_enable_ro(mod, false);
module_enable_nx(mod);
+ module_enable_x(mod);
/* Mark state as coming so strong_try_module_get() ignores us,
* but kallsyms etc. can see us. */
#include "trace_probe.h"
#include "trace.h"
+#define bpf_event_rcu_dereference(p) \
+ rcu_dereference_protected(p, lockdep_is_held(&bpf_event_mutex))
+
#ifdef CONFIG_MODULES
struct bpf_trace_module {
struct module *module;
{
int ret;
+ ret = security_locked_down(LOCKDOWN_BPF_READ);
+ if (ret < 0)
+ goto out;
+
ret = probe_kernel_read(dst, unsafe_ptr, size);
if (unlikely(ret < 0))
+ out:
memset(dst, 0, size);
return ret;
{
int ret;
+ ret = security_locked_down(LOCKDOWN_BPF_READ);
+ if (ret < 0)
+ goto out;
+
/*
* The strncpy_from_unsafe() call will likely not fill the entire
* buffer, but that's okay in this circumstance as we're probing
*/
ret = strncpy_from_unsafe(dst, unsafe_ptr, size);
if (unlikely(ret < 0))
+ out:
memset(dst, 0, size);
return ret;
.arg3_type = ARG_ANYTHING,
};
+struct send_signal_irq_work {
+ struct irq_work irq_work;
+ struct task_struct *task;
+ u32 sig;
+};
+
+static DEFINE_PER_CPU(struct send_signal_irq_work, send_signal_work);
+
+static void do_bpf_send_signal(struct irq_work *entry)
+{
+ struct send_signal_irq_work *work;
+
+ work = container_of(entry, struct send_signal_irq_work, irq_work);
+ group_send_sig_info(work->sig, SEND_SIG_PRIV, work->task, PIDTYPE_TGID);
+}
+
+BPF_CALL_1(bpf_send_signal, u32, sig)
+{
+ struct send_signal_irq_work *work = NULL;
+
+ /* Similar to bpf_probe_write_user, task needs to be
+ * in a sound condition and kernel memory access must be
+ * permitted in order to send a signal to the current
+ * task.
+ */
+ if (unlikely(current->flags & (PF_KTHREAD | PF_EXITING)))
+ return -EPERM;
+ if (unlikely(uaccess_kernel()))
+ return -EPERM;
+ if (unlikely(!nmi_uaccess_okay()))
+ return -EPERM;
+
+ if (in_nmi()) {
+ /* Do an early check on signal validity. Otherwise,
+ * the error is lost in deferred irq_work.
+ */
+ if (unlikely(!valid_signal(sig)))
+ return -EINVAL;
+
+ work = this_cpu_ptr(&send_signal_work);
+ if (work->irq_work.flags & IRQ_WORK_BUSY)
+ return -EBUSY;
+
+ /* Add the current task, which is the target of the signal,
+ * to the irq_work. The current task may change when queued
+ * irq works get executed.
+ */
+ work->task = current;
+ work->sig = sig;
+ irq_work_queue(&work->irq_work);
+ return 0;
+ }
+
+ return group_send_sig_info(sig, SEND_SIG_PRIV, current, PIDTYPE_TGID);
+}
+
+static const struct bpf_func_proto bpf_send_signal_proto = {
+ .func = bpf_send_signal,
+ .gpl_only = false,
+ .ret_type = RET_INTEGER,
+ .arg1_type = ARG_ANYTHING,
+};
+
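A minimal BPF-side sketch of the new helper (hypothetical program; the
attach point is illustrative and the helper-header location varies across
libbpf versions):

    #include <linux/bpf.h>
    #include "bpf_helpers.h"

    SEC("kprobe/__x64_sys_nanosleep")
    int send_sig_on_sleep(void *ctx)
    {
            bpf_send_signal(10);    /* deliver SIGUSR1 to the current task */
            return 0;
    }

    char _license[] SEC("license") = "GPL";
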
static const struct bpf_func_proto *
tracing_func_proto(enum bpf_func_id func_id, const struct bpf_prog *prog)
{
case BPF_FUNC_get_current_cgroup_id:
return &bpf_get_current_cgroup_id_proto;
#endif
+ case BPF_FUNC_send_signal:
+ return &bpf_send_signal_proto;
default:
return NULL;
}
int perf_event_attach_bpf_prog(struct perf_event *event,
struct bpf_prog *prog)
{
- struct bpf_prog_array __rcu *old_array;
+ struct bpf_prog_array *old_array;
struct bpf_prog_array *new_array;
int ret = -EEXIST;
if (event->prog)
goto unlock;
- old_array = event->tp_event->prog_array;
+ old_array = bpf_event_rcu_dereference(event->tp_event->prog_array);
if (old_array &&
bpf_prog_array_length(old_array) >= BPF_TRACE_MAX_PROGS) {
ret = -E2BIG;
void perf_event_detach_bpf_prog(struct perf_event *event)
{
- struct bpf_prog_array __rcu *old_array;
+ struct bpf_prog_array *old_array;
struct bpf_prog_array *new_array;
int ret;
if (!event->prog)
goto unlock;
- old_array = event->tp_event->prog_array;
+ old_array = bpf_event_rcu_dereference(event->tp_event->prog_array);
ret = bpf_prog_array_copy(old_array, event->prog, NULL, &new_array);
if (ret == -ENOENT)
goto unlock;
{
struct perf_event_query_bpf __user *uquery = info;
struct perf_event_query_bpf query = {};
+ struct bpf_prog_array *progs;
u32 *ids, prog_cnt, ids_len;
int ret;
*/
mutex_lock(&bpf_event_mutex);
- ret = bpf_prog_array_copy_info(event->tp_event->prog_array,
- ids,
- ids_len,
- &prog_cnt);
+ progs = bpf_event_rcu_dereference(event->tp_event->prog_array);
+ ret = bpf_prog_array_copy_info(progs, ids, ids_len, &prog_cnt);
mutex_unlock(&bpf_event_mutex);
if (copy_to_user(&uquery->prog_cnt, &prog_cnt, sizeof(prog_cnt)) ||
return err;
}
+static int __init send_signal_irq_work_init(void)
+{
+ int cpu;
+ struct send_signal_irq_work *work;
+
+ for_each_possible_cpu(cpu) {
+ work = per_cpu_ptr(&send_signal_work, cpu);
+ init_irq_work(&work->irq_work, do_bpf_send_signal);
+ }
+ return 0;
+}
+
+subsys_initcall(send_signal_irq_work_init);
+
#ifdef CONFIG_MODULES
static int bpf_event_notify(struct notifier_block *nb, unsigned long op,
void *module)
#include <linux/uaccess.h>
#include <linux/rculist.h>
#include <linux/error-injection.h>
+ #include <linux/security.h>
+#include <asm/setup.h> /* for COMMAND_LINE_SIZE */
+
#include "trace_dynevent.h"
#include "trace_kprobe_selftest.h"
#include "trace_probe.h"
#define KPROBE_EVENT_SYSTEM "kprobes"
#define KRETPROBE_MAXACTIVE_MAX 4096
+#define MAX_KPROBE_CMDLINE_SIZE 1024
+
+/* Kprobe early definition from command line */
+static char kprobe_boot_events_buf[COMMAND_LINE_SIZE] __initdata;
+static bool kprobe_boot_events_enabled __initdata;
+
+static int __init set_kprobe_boot_events(char *str)
+{
+ strlcpy(kprobe_boot_events_buf, str, COMMAND_LINE_SIZE);
+ return 0;
+}
+__setup("kprobe_event=", set_kprobe_boot_events);
static int trace_kprobe_create(int argc, const char **argv);
static int trace_kprobe_show(struct seq_file *m, struct dyn_event *ev);
static int trace_kprobe_release(struct dyn_event *ev);
static bool trace_kprobe_is_busy(struct dyn_event *ev);
static bool trace_kprobe_match(const char *system, const char *event,
- struct dyn_event *ev);
+ int argc, const char **argv, struct dyn_event *ev);
static struct dyn_event_operations trace_kprobe_ops = {
.create = trace_kprobe_create,
return trace_probe_is_enabled(&tk->tp);
}
+static bool trace_kprobe_match_command_head(struct trace_kprobe *tk,
+ int argc, const char **argv)
+{
+ char buf[MAX_ARGSTR_LEN + 1];
+
+ if (!argc)
+ return true;
+
+ if (!tk->symbol)
+ snprintf(buf, sizeof(buf), "0x%p", tk->rp.kp.addr);
+ else if (tk->rp.kp.offset)
+ snprintf(buf, sizeof(buf), "%s+%u",
+ trace_kprobe_symbol(tk), tk->rp.kp.offset);
+ else
+ snprintf(buf, sizeof(buf), "%s", trace_kprobe_symbol(tk));
+ if (strcmp(buf, argv[0]))
+ return false;
+ argc--; argv++;
+
+ return trace_probe_match_command_args(&tk->tp, argc, argv);
+}
+
static bool trace_kprobe_match(const char *system, const char *event,
- struct dyn_event *ev)
+ int argc, const char **argv, struct dyn_event *ev)
{
struct trace_kprobe *tk = to_trace_kprobe(ev);
- return strcmp(trace_event_name(&tk->tp.call), event) == 0 &&
- (!system || strcmp(tk->tp.call.class->system, system) == 0);
+ return strcmp(trace_probe_name(&tk->tp), event) == 0 &&
+ (!system || strcmp(trace_probe_group_name(&tk->tp), system) == 0) &&
+ trace_kprobe_match_command_head(tk, argc, argv);
}
static nokprobe_inline unsigned long trace_kprobe_nhit(struct trace_kprobe *tk)
return nhit;
}
+static nokprobe_inline bool trace_kprobe_is_registered(struct trace_kprobe *tk)
+{
+ return !(list_empty(&tk->rp.kp.list) &&
+ hlist_unhashed(&tk->rp.kp.hlist));
+}
+
/* Return 0 if it fails to find the symbol address */
static nokprobe_inline
unsigned long trace_kprobe_address(struct trace_kprobe *tk)
return addr;
}
+static nokprobe_inline struct trace_kprobe *
+trace_kprobe_primary_from_call(struct trace_event_call *call)
+{
+ struct trace_probe *tp;
+
+ tp = trace_probe_primary_from_call(call);
+ if (WARN_ON_ONCE(!tp))
+ return NULL;
+
+ return container_of(tp, struct trace_kprobe, tp);
+}
+
bool trace_kprobe_on_func_entry(struct trace_event_call *call)
{
- struct trace_kprobe *tk = (struct trace_kprobe *)call->data;
+ struct trace_kprobe *tk = trace_kprobe_primary_from_call(call);
- return kprobe_on_func_entry(tk->rp.kp.addr,
+ return tk ? kprobe_on_func_entry(tk->rp.kp.addr,
tk->rp.kp.addr ? NULL : tk->rp.kp.symbol_name,
- tk->rp.kp.addr ? 0 : tk->rp.kp.offset);
+ tk->rp.kp.addr ? 0 : tk->rp.kp.offset) : false;
}
bool trace_kprobe_error_injectable(struct trace_event_call *call)
{
- struct trace_kprobe *tk = (struct trace_kprobe *)call->data;
+ struct trace_kprobe *tk = trace_kprobe_primary_from_call(call);
- return within_error_injection_list(trace_kprobe_address(tk));
+ return tk ? within_error_injection_list(trace_kprobe_address(tk)) :
+ false;
}
static int register_kprobe_event(struct trace_kprobe *tk);
static int kretprobe_dispatcher(struct kretprobe_instance *ri,
struct pt_regs *regs);
+static void free_trace_kprobe(struct trace_kprobe *tk)
+{
+ if (tk) {
+ trace_probe_cleanup(&tk->tp);
+ kfree(tk->symbol);
+ free_percpu(tk->nhit);
+ kfree(tk);
+ }
+}
+
/*
* Allocate new trace_probe and initialize it (including kprobes).
*/
tk->rp.kp.pre_handler = kprobe_dispatcher;
tk->rp.maxactive = maxactive;
+ INIT_HLIST_NODE(&tk->rp.kp.hlist);
+ INIT_LIST_HEAD(&tk->rp.kp.list);
- if (!event || !group) {
- ret = -EINVAL;
- goto error;
- }
-
- tk->tp.call.class = &tk->tp.class;
- tk->tp.call.name = kstrdup(event, GFP_KERNEL);
- if (!tk->tp.call.name)
- goto error;
-
- tk->tp.class.system = kstrdup(group, GFP_KERNEL);
- if (!tk->tp.class.system)
+ ret = trace_probe_init(&tk->tp, event, group);
+ if (ret < 0)
goto error;
dyn_event_init(&tk->devent, &trace_kprobe_ops);
- INIT_LIST_HEAD(&tk->tp.files);
return tk;
error:
- kfree(tk->tp.call.name);
- kfree(tk->symbol);
- free_percpu(tk->nhit);
- kfree(tk);
+ free_trace_kprobe(tk);
return ERR_PTR(ret);
}
-static void free_trace_kprobe(struct trace_kprobe *tk)
-{
- int i;
-
- if (!tk)
- return;
-
- for (i = 0; i < tk->tp.nr_args; i++)
- traceprobe_free_probe_arg(&tk->tp.args[i]);
-
- kfree(tk->tp.call.class->system);
- kfree(tk->tp.call.name);
- kfree(tk->symbol);
- free_percpu(tk->nhit);
- kfree(tk);
-}
-
static struct trace_kprobe *find_trace_kprobe(const char *event,
const char *group)
{
struct trace_kprobe *tk;
for_each_trace_kprobe(tk, pos)
- if (strcmp(trace_event_name(&tk->tp.call), event) == 0 &&
- strcmp(tk->tp.call.class->system, group) == 0)
+ if (strcmp(trace_probe_name(&tk->tp), event) == 0 &&
+ strcmp(trace_probe_group_name(&tk->tp), group) == 0)
return tk;
return NULL;
}
{
int ret = 0;
- if (trace_probe_is_registered(&tk->tp) && !trace_kprobe_has_gone(tk)) {
+ if (trace_kprobe_is_registered(tk) && !trace_kprobe_has_gone(tk)) {
if (trace_kprobe_is_return(tk))
ret = enable_kretprobe(&tk->rp);
else
return ret;
}
+static void __disable_trace_kprobe(struct trace_probe *tp)
+{
+ struct trace_probe *pos;
+ struct trace_kprobe *tk;
+
+ list_for_each_entry(pos, trace_probe_probe_list(tp), list) {
+ tk = container_of(pos, struct trace_kprobe, tp);
+ if (!trace_kprobe_is_registered(tk))
+ continue;
+ if (trace_kprobe_is_return(tk))
+ disable_kretprobe(&tk->rp);
+ else
+ disable_kprobe(&tk->rp.kp);
+ }
+}
+
/*
* Enable trace_probe
* if the file is NULL, enable "perf" handler, or enable "trace" handler.
*/
-static int
-enable_trace_kprobe(struct trace_kprobe *tk, struct trace_event_file *file)
+static int enable_trace_kprobe(struct trace_event_call *call,
+ struct trace_event_file *file)
{
- struct event_file_link *link;
+ struct trace_probe *pos, *tp;
+ struct trace_kprobe *tk;
+ bool enabled;
int ret = 0;
- if (file) {
- link = kmalloc(sizeof(*link), GFP_KERNEL);
- if (!link) {
- ret = -ENOMEM;
- goto out;
- }
+ tp = trace_probe_primary_from_call(call);
+ if (WARN_ON_ONCE(!tp))
+ return -ENODEV;
+ enabled = trace_probe_is_enabled(tp);
- link->file = file;
- list_add_tail_rcu(&link->list, &tk->tp.files);
+ /* This also changes "enabled" state */
+ if (file) {
+ ret = trace_probe_add_file(tp, file);
+ if (ret)
+ return ret;
+ } else
+ trace_probe_set_flag(tp, TP_FLAG_PROFILE);
- tk->tp.flags |= TP_FLAG_TRACE;
- ret = __enable_trace_kprobe(tk);
- if (ret) {
- list_del_rcu(&link->list);
- kfree(link);
- tk->tp.flags &= ~TP_FLAG_TRACE;
- }
+ if (enabled)
+ return 0;
- } else {
- tk->tp.flags |= TP_FLAG_PROFILE;
+ list_for_each_entry(pos, trace_probe_probe_list(tp), list) {
+ tk = container_of(pos, struct trace_kprobe, tp);
+ if (trace_kprobe_has_gone(tk))
+ continue;
ret = __enable_trace_kprobe(tk);
if (ret)
- tk->tp.flags &= ~TP_FLAG_PROFILE;
+ break;
+ enabled = true;
}
- out:
+
+ if (ret) {
+ /* Failed to enable one of them. Roll back all */
+ if (enabled)
+ __disable_trace_kprobe(tp);
+ if (file)
+ trace_probe_remove_file(tp, file);
+ else
+ trace_probe_clear_flag(tp, TP_FLAG_PROFILE);
+ }
+
return ret;
}
* Disable trace_probe
* if the file is NULL, disable "perf" handler, or disable "trace" handler.
*/
-static int
-disable_trace_kprobe(struct trace_kprobe *tk, struct trace_event_file *file)
+static int disable_trace_kprobe(struct trace_event_call *call,
+ struct trace_event_file *file)
{
- struct event_file_link *link = NULL;
- int wait = 0;
- int ret = 0;
+ struct trace_probe *tp;
- if (file) {
- link = find_event_file_link(&tk->tp, file);
- if (!link) {
- ret = -EINVAL;
- goto out;
- }
+ tp = trace_probe_primary_from_call(call);
+ if (WARN_ON_ONCE(!tp))
+ return -ENODEV;
- list_del_rcu(&link->list);
- wait = 1;
- if (!list_empty(&tk->tp.files))
+ if (file) {
+ if (!trace_probe_get_file_link(tp, file))
+ return -ENOENT;
+ if (!trace_probe_has_single_file(tp))
goto out;
-
- tk->tp.flags &= ~TP_FLAG_TRACE;
+ trace_probe_clear_flag(tp, TP_FLAG_TRACE);
} else
- tk->tp.flags &= ~TP_FLAG_PROFILE;
+ trace_probe_clear_flag(tp, TP_FLAG_PROFILE);
- if (!trace_probe_is_enabled(&tk->tp) && trace_probe_is_registered(&tk->tp)) {
- if (trace_kprobe_is_return(tk))
- disable_kretprobe(&tk->rp);
- else
- disable_kprobe(&tk->rp.kp);
- wait = 1;
- }
+ if (!trace_probe_is_enabled(tp))
+ __disable_trace_kprobe(tp);
- /*
- * if tk is not added to any list, it must be a local trace_kprobe
- * created with perf_event_open. We don't need to wait for these
- * trace_kprobes
- */
- if (list_empty(&tk->devent.list))
- wait = 0;
out:
- if (wait) {
+ if (file)
/*
- * Synchronize with kprobe_trace_func/kretprobe_trace_func
- * to ensure disabled (all running handlers are finished).
- * This is not only for kfree(), but also the caller,
- * trace_remove_event_call() supposes it for releasing
- * event_call related objects, which will be accessed in
- * the kprobe_trace_func/kretprobe_trace_func.
+ * Synchronization is done in the function below. For a perf
+ * event, file == NULL and perf_trace_event_unreg() calls
+ * tracepoint_synchronize_unregister() to synchronize that
+ * event, so we don't need to care about it here.
*/
- synchronize_rcu();
- kfree(link); /* Ignored if link == NULL */
- }
+ trace_probe_remove_file(tp, file);
- return ret;
+ return 0;
}
#if defined(CONFIG_KPROBES_ON_FTRACE) && \
{
int i, ret;
- if (trace_probe_is_registered(&tk->tp))
+ ret = security_locked_down(LOCKDOWN_KPROBES);
+ if (ret)
+ return ret;
+
+ if (trace_kprobe_is_registered(tk))
return -EINVAL;
if (within_notrace_func(tk)) {
else
ret = register_kprobe(&tk->rp.kp);
- if (ret == 0)
- tk->tp.flags |= TP_FLAG_REGISTERED;
return ret;
}
/* Internal unregister function - just handle k*probes and flags */
static void __unregister_trace_kprobe(struct trace_kprobe *tk)
{
- if (trace_probe_is_registered(&tk->tp)) {
+ if (trace_kprobe_is_registered(tk)) {
if (trace_kprobe_is_return(tk))
unregister_kretprobe(&tk->rp);
else
unregister_kprobe(&tk->rp.kp);
- tk->tp.flags &= ~TP_FLAG_REGISTERED;
- /* Cleanup kprobe for reuse */
+ /* Cleanup kprobe for reuse and mark it unregistered */
+ INIT_HLIST_NODE(&tk->rp.kp.hlist);
+ INIT_LIST_HEAD(&tk->rp.kp.list);
if (tk->rp.kp.symbol_name)
tk->rp.kp.addr = NULL;
}
/* Unregister a trace_probe and probe_event */
static int unregister_trace_kprobe(struct trace_kprobe *tk)
{
+ /* If other probes are on the event, just unregister kprobe */
+ if (trace_probe_has_sibling(&tk->tp))
+ goto unreg;
+
/* Enabled event can not be unregistered */
if (trace_probe_is_enabled(&tk->tp))
return -EBUSY;
if (unregister_kprobe_event(tk))
return -EBUSY;
+unreg:
__unregister_trace_kprobe(tk);
dyn_event_remove(&tk->devent);
+ trace_probe_unlink(&tk->tp);
return 0;
}
+static bool trace_kprobe_has_same_kprobe(struct trace_kprobe *orig,
+ struct trace_kprobe *comp)
+{
+ struct trace_probe_event *tpe = orig->tp.event;
+ struct trace_probe *pos;
+ int i;
+
+ list_for_each_entry(pos, &tpe->probes, list) {
+ orig = container_of(pos, struct trace_kprobe, tp);
+ if (strcmp(trace_kprobe_symbol(orig),
+ trace_kprobe_symbol(comp)) ||
+ trace_kprobe_offset(orig) != trace_kprobe_offset(comp))
+ continue;
+
+ /*
+ * trace_probe_compare_arg_type() ensured that nr_args and
+ * each argument name and type are the same. Let's compare comm.
+ */
+ for (i = 0; i < orig->tp.nr_args; i++) {
+ if (strcmp(orig->tp.args[i].comm,
+ comp->tp.args[i].comm))
+ break;
+ }
+
+ if (i == orig->tp.nr_args)
+ return true;
+ }
+
+ return false;
+}
+
+static int append_trace_kprobe(struct trace_kprobe *tk, struct trace_kprobe *to)
+{
+ int ret;
+
+ ret = trace_probe_compare_arg_type(&tk->tp, &to->tp);
+ if (ret) {
+ /* Note that arguments start at index 2 */
+ trace_probe_log_set_index(ret + 1);
+ trace_probe_log_err(0, DIFF_ARG_TYPE);
+ return -EEXIST;
+ }
+ if (trace_kprobe_has_same_kprobe(to, tk)) {
+ trace_probe_log_set_index(0);
+ trace_probe_log_err(0, SAME_PROBE);
+ return -EEXIST;
+ }
+
+ /* Append to existing event */
+ ret = trace_probe_append(&tk->tp, &to->tp);
+ if (ret)
+ return ret;
+
+ /* Register k*probe */
+ ret = __register_trace_kprobe(tk);
+ if (ret == -ENOENT && !trace_kprobe_module_exist(tk)) {
+ pr_warn("This probe might be able to register after target module is loaded. Continue.\n");
+ ret = 0;
+ }
+
+ if (ret)
+ trace_probe_unlink(&tk->tp);
+ else
+ dyn_event_add(&tk->devent);
+
+ return ret;
+}
+
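With appending in place, writing a second definition under an existing event
name adds another probe to the same event instead of failing, e.g. (sketch of
the tracefs usage; argument names and types must match):

    echo 'p:myevent vfs_read $arg1'  >> kprobe_events
    echo 'p:myevent vfs_write $arg1' >> kprobe_events
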
/* Register a trace_probe and probe_event */
static int register_trace_kprobe(struct trace_kprobe *tk)
{
mutex_lock(&event_mutex);
- /* Delete old (same name) event if exist */
- old_tk = find_trace_kprobe(trace_event_name(&tk->tp.call),
- tk->tp.call.class->system);
+ old_tk = find_trace_kprobe(trace_probe_name(&tk->tp),
+ trace_probe_group_name(&tk->tp));
if (old_tk) {
- ret = unregister_trace_kprobe(old_tk);
- if (ret < 0)
- goto end;
- free_trace_kprobe(old_tk);
+ if (trace_kprobe_is_return(tk) != trace_kprobe_is_return(old_tk)) {
+ trace_probe_log_set_index(0);
+ trace_probe_log_err(0, DIFF_PROBE_TYPE);
+ ret = -EEXIST;
+ } else {
+ ret = append_trace_kprobe(tk, old_tk);
+ }
+ goto end;
}
/* Register new event */
ret = __register_trace_kprobe(tk);
if (ret)
pr_warn("Failed to re-register probe %s on %s: %d\n",
- trace_event_name(&tk->tp.call),
+ trace_probe_name(&tk->tp),
mod->name, ret);
}
}
goto error; /* This can be -ENOMEM */
}
+ ret = traceprobe_set_print_fmt(&tk->tp, is_return);
+ if (ret < 0)
+ goto error;
+
ret = register_trace_kprobe(tk);
if (ret) {
trace_probe_log_set_index(1);
trace_probe_log_err(0, BAD_INSN_BNDRY);
else if (ret == -ENOENT)
trace_probe_log_err(0, BAD_PROBE_ADDR);
- else if (ret != -ENOMEM)
+ else if (ret != -ENOMEM && ret != -EEXIST)
trace_probe_log_err(0, FAIL_REG_PROBE);
goto error;
}
int i;
seq_putc(m, trace_kprobe_is_return(tk) ? 'r' : 'p');
- seq_printf(m, ":%s/%s", tk->tp.call.class->system,
- trace_event_name(&tk->tp.call));
+ seq_printf(m, ":%s/%s", trace_probe_group_name(&tk->tp),
+ trace_probe_name(&tk->tp));
if (!tk->symbol)
seq_printf(m, " 0x%p", tk->rp.kp.addr);
tk = to_trace_kprobe(ev);
seq_printf(m, " %-44s %15lu %15lu\n",
- trace_event_name(&tk->tp.call),
+ trace_probe_name(&tk->tp),
trace_kprobe_nhit(tk),
tk->rp.kp.nmissed);
return (ret < 0) ? ret : len;
}
+/* Return the length of the string, including the terminating NUL byte */
+static nokprobe_inline int
+fetch_store_strlen_user(unsigned long addr)
+{
+ const void __user *uaddr = (__force const void __user *)addr;
+
+ return strnlen_unsafe_user(uaddr, MAX_STRING_SIZE);
+}
+
/*
* Fetch a null-terminated string. Caller MUST set *(u32 *)buf with max
* length and relative data location.
fetch_store_string(unsigned long addr, void *dest, void *base)
{
int maxlen = get_loc_len(*(u32 *)dest);
- u8 *dst = get_loc_data(dest, base);
+ void *__dest;
long ret;
if (unlikely(!maxlen))
return -ENOMEM;
+
+ __dest = get_loc_data(dest, base);
+
/*
* Try to get string again, since the string can be changed while
* probing.
*/
- ret = strncpy_from_unsafe(dst, (void *)addr, maxlen);
+ ret = strncpy_from_unsafe(__dest, (void *)addr, maxlen);
+ if (ret >= 0)
+ *(u32 *)dest = make_data_loc(ret, __dest - base);
+
+ return ret;
+}
+
+/*
+ * Fetch a null-terminated string from user. Caller MUST set *(u32 *)buf
+ * with max length and relative data location.
+ */
+static nokprobe_inline int
+fetch_store_string_user(unsigned long addr, void *dest, void *base)
+{
+ const void __user *uaddr = (__force const void __user *)addr;
+ int maxlen = get_loc_len(*(u32 *)dest);
+ void *__dest;
+ long ret;
+ if (unlikely(!maxlen))
+ return -ENOMEM;
+
+ __dest = get_loc_data(dest, base);
+
+ ret = strncpy_from_unsafe_user(__dest, uaddr, maxlen);
if (ret >= 0)
- *(u32 *)dest = make_data_loc(ret, (void *)dst - base);
+ *(u32 *)dest = make_data_loc(ret, __dest - base);
+
return ret;
}
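These user-space fetchers back the new user-memory access types for probe
events (the string variant is assumed here to be named ustring, from the
same series), e.g.:

    p:myopen do_sys_open path=+0($arg2):ustring
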
return probe_kernel_read(dest, src, size);
}
+static nokprobe_inline int
+probe_mem_read_user(void *dest, void *src, size_t size)
+{
+ const void __user *uaddr = (__force const void __user *)src;
+
+ return probe_user_read(dest, uaddr, size);
+}
+
/* Note that we don't verify it, since the code does not come from user space */
static int
process_fetch_insn(struct fetch_insn *code, struct pt_regs *regs, void *dest,
case FETCH_OP_COMM:
val = (unsigned long)current->comm;
break;
+ case FETCH_OP_DATA:
+ val = (unsigned long)code->data;
+ break;
#ifdef CONFIG_HAVE_FUNCTION_ARG_ACCESS_API
case FETCH_OP_ARG:
val = regs_get_kernel_argument(regs, code->param);
struct ring_buffer *buffer;
int size, dsize, pc;
unsigned long irq_flags;
- struct trace_event_call *call = &tk->tp.call;
+ struct trace_event_call *call = trace_probe_event_call(&tk->tp);
WARN_ON(call != trace_file->event_call);
{
struct event_file_link *link;
- list_for_each_entry_rcu(link, &tk->tp.files, list)
+ trace_probe_for_each_link_rcu(link, &tk->tp)
__kprobe_trace_func(tk, regs, link->file);
}
NOKPROBE_SYMBOL(kprobe_trace_func);
struct ring_buffer *buffer;
int size, pc, dsize;
unsigned long irq_flags;
- struct trace_event_call *call = &tk->tp.call;
+ struct trace_event_call *call = trace_probe_event_call(&tk->tp);
WARN_ON(call != trace_file->event_call);
{
struct event_file_link *link;
- list_for_each_entry_rcu(link, &tk->tp.files, list)
+ trace_probe_for_each_link_rcu(link, &tk->tp)
__kretprobe_trace_func(tk, ri, regs, link->file);
}
NOKPROBE_SYMBOL(kretprobe_trace_func);
struct trace_probe *tp;
field = (struct kprobe_trace_entry_head *)iter->ent;
- tp = container_of(event, struct trace_probe, call.event);
+ tp = trace_probe_primary_from_call(
+ container_of(event, struct trace_event_call, event));
+ if (WARN_ON_ONCE(!tp))
+ goto out;
- trace_seq_printf(s, "%s: (", trace_event_name(&tp->call));
+ trace_seq_printf(s, "%s: (", trace_probe_name(tp));
if (!seq_print_ip_sym(s, field->ip, flags | TRACE_ITER_SYM_OFFSET))
goto out;
struct trace_probe *tp;
field = (struct kretprobe_trace_entry_head *)iter->ent;
- tp = container_of(event, struct trace_probe, call.event);
+ tp = trace_probe_primary_from_call(
+ container_of(event, struct trace_event_call, event));
+ if (WARN_ON_ONCE(!tp))
+ goto out;
- trace_seq_printf(s, "%s: (", trace_event_name(&tp->call));
+ trace_seq_printf(s, "%s: (", trace_probe_name(tp));
if (!seq_print_ip_sym(s, field->ret_ip, flags | TRACE_ITER_SYM_OFFSET))
goto out;
{
int ret;
struct kprobe_trace_entry_head field;
- struct trace_kprobe *tk = (struct trace_kprobe *)event_call->data;
+ struct trace_probe *tp;
+
+ tp = trace_probe_primary_from_call(event_call);
+ if (WARN_ON_ONCE(!tp))
+ return -ENOENT;
DEFINE_FIELD(unsigned long, ip, FIELD_STRING_IP, 0);
- return traceprobe_define_arg_fields(event_call, sizeof(field), &tk->tp);
+ return traceprobe_define_arg_fields(event_call, sizeof(field), tp);
}
static int kretprobe_event_define_fields(struct trace_event_call *event_call)
{
int ret;
struct kretprobe_trace_entry_head field;
- struct trace_kprobe *tk = (struct trace_kprobe *)event_call->data;
+ struct trace_probe *tp;
+
+ tp = trace_probe_primary_from_call(event_call);
+ if (WARN_ON_ONCE(!tp))
+ return -ENOENT;
DEFINE_FIELD(unsigned long, func, FIELD_STRING_FUNC, 0);
DEFINE_FIELD(unsigned long, ret_ip, FIELD_STRING_RETIP, 0);
- return traceprobe_define_arg_fields(event_call, sizeof(field), &tk->tp);
+ return traceprobe_define_arg_fields(event_call, sizeof(field), tp);
}
#ifdef CONFIG_PERF_EVENTS
static int
kprobe_perf_func(struct trace_kprobe *tk, struct pt_regs *regs)
{
- struct trace_event_call *call = &tk->tp.call;
+ struct trace_event_call *call = trace_probe_event_call(&tk->tp);
struct kprobe_trace_entry_head *entry;
struct hlist_head *head;
int size, __size, dsize;
kretprobe_perf_func(struct trace_kprobe *tk, struct kretprobe_instance *ri,
struct pt_regs *regs)
{
- struct trace_event_call *call = &tk->tp.call;
+ struct trace_event_call *call = trace_probe_event_call(&tk->tp);
struct kretprobe_trace_entry_head *entry;
struct hlist_head *head;
int size, __size, dsize;
static int kprobe_register(struct trace_event_call *event,
enum trace_reg type, void *data)
{
- struct trace_kprobe *tk = (struct trace_kprobe *)event->data;
struct trace_event_file *file = data;
switch (type) {
case TRACE_REG_REGISTER:
- return enable_trace_kprobe(tk, file);
+ return enable_trace_kprobe(event, file);
case TRACE_REG_UNREGISTER:
- return disable_trace_kprobe(tk, file);
+ return disable_trace_kprobe(event, file);
#ifdef CONFIG_PERF_EVENTS
case TRACE_REG_PERF_REGISTER:
- return enable_trace_kprobe(tk, NULL);
+ return enable_trace_kprobe(event, NULL);
case TRACE_REG_PERF_UNREGISTER:
- return disable_trace_kprobe(tk, NULL);
+ return disable_trace_kprobe(event, NULL);
case TRACE_REG_PERF_OPEN:
case TRACE_REG_PERF_CLOSE:
case TRACE_REG_PERF_ADD:
raw_cpu_inc(*tk->nhit);
- if (tk->tp.flags & TP_FLAG_TRACE)
+ if (trace_probe_test_flag(&tk->tp, TP_FLAG_TRACE))
kprobe_trace_func(tk, regs);
#ifdef CONFIG_PERF_EVENTS
- if (tk->tp.flags & TP_FLAG_PROFILE)
+ if (trace_probe_test_flag(&tk->tp, TP_FLAG_PROFILE))
ret = kprobe_perf_func(tk, regs);
#endif
return ret;
raw_cpu_inc(*tk->nhit);
- if (tk->tp.flags & TP_FLAG_TRACE)
+ if (trace_probe_test_flag(&tk->tp, TP_FLAG_TRACE))
kretprobe_trace_func(tk, ri, regs);
#ifdef CONFIG_PERF_EVENTS
- if (tk->tp.flags & TP_FLAG_PROFILE)
+ if (trace_probe_test_flag(&tk->tp, TP_FLAG_PROFILE))
kretprobe_perf_func(tk, ri, regs);
#endif
return 0; /* We don't tweak the kernel, so just return 0 */
.trace = print_kprobe_event
};
-static inline void init_trace_event_call(struct trace_kprobe *tk,
- struct trace_event_call *call)
+static inline void init_trace_event_call(struct trace_kprobe *tk)
{
- INIT_LIST_HEAD(&call->class->fields);
+ struct trace_event_call *call = trace_probe_event_call(&tk->tp);
+
if (trace_kprobe_is_return(tk)) {
call->event.funcs = &kretprobe_funcs;
call->class->define_fields = kretprobe_event_define_fields;
call->flags = TRACE_EVENT_FL_KPROBE;
call->class->reg = kprobe_register;
- call->data = tk;
}
static int register_kprobe_event(struct trace_kprobe *tk)
{
- struct trace_event_call *call = &tk->tp.call;
- int ret = 0;
+ init_trace_event_call(tk);
- init_trace_event_call(tk, call);
-
- if (traceprobe_set_print_fmt(&tk->tp, trace_kprobe_is_return(tk)) < 0)
- return -ENOMEM;
- ret = register_trace_event(&call->event);
- if (!ret) {
- kfree(call->print_fmt);
- return -ENODEV;
- }
- ret = trace_add_event_call(call);
- if (ret) {
- pr_info("Failed to register kprobe event: %s\n",
- trace_event_name(call));
- kfree(call->print_fmt);
- unregister_trace_event(&call->event);
- }
- return ret;
+ return trace_probe_register_event_call(&tk->tp);
}
static int unregister_kprobe_event(struct trace_kprobe *tk)
{
- int ret;
-
- /* tp->event is unregistered in trace_remove_event_call() */
- ret = trace_remove_event_call(&tk->tp.call);
- if (!ret)
- kfree(tk->tp.call.print_fmt);
- return ret;
+ return trace_probe_unregister_event_call(&tk->tp);
}
#ifdef CONFIG_PERF_EVENTS
return ERR_CAST(tk);
}
- init_trace_event_call(tk, &tk->tp.call);
+ init_trace_event_call(tk);
if (traceprobe_set_print_fmt(&tk->tp, trace_kprobe_is_return(tk)) < 0) {
ret = -ENOMEM;
}
ret = __register_trace_kprobe(tk);
- if (ret < 0) {
- kfree(tk->tp.call.print_fmt);
+ if (ret < 0)
goto error;
- }
- return &tk->tp.call;
+ return trace_probe_event_call(&tk->tp);
error:
free_trace_kprobe(tk);
return ERR_PTR(ret);
{
struct trace_kprobe *tk;
- tk = container_of(event_call, struct trace_kprobe, tp.call);
+ tk = trace_kprobe_primary_from_call(event_call);
+ if (unlikely(!tk))
+ return;
if (trace_probe_is_enabled(&tk->tp)) {
WARN_ON(1);
__unregister_trace_kprobe(tk);
- kfree(tk->tp.call.print_fmt);
free_trace_kprobe(tk);
}
#endif /* CONFIG_PERF_EVENTS */
+static __init void enable_boot_kprobe_events(void)
+{
+ struct trace_array *tr = top_trace_array();
+ struct trace_event_file *file;
+ struct trace_kprobe *tk;
+ struct dyn_event *pos;
+
+ mutex_lock(&event_mutex);
+ for_each_trace_kprobe(tk, pos) {
+ list_for_each_entry(file, &tr->events, list)
+ if (file->event_call == trace_probe_event_call(&tk->tp))
+ trace_event_enable_disable(file, 1, 0);
+ }
+ mutex_unlock(&event_mutex);
+}
+
+static __init void setup_boot_kprobe_events(void)
+{
+ char *p, *cmd = kprobe_boot_events_buf;
+ int ret;
+
+ strreplace(kprobe_boot_events_buf, ',', ' ');
+
+ while (cmd && *cmd != '\0') {
+ p = strchr(cmd, ';');
+ if (p)
+ *p++ = '\0';
+
+ ret = trace_run_command(cmd, create_or_delete_trace_kprobe);
+ if (ret)
+ pr_warn("Failed to add event(%d): %s\n", ret, cmd);
+ else
+ kprobe_boot_events_enabled = true;
+
+ cmd = p;
+ }
+
+ enable_boot_kprobe_events();
+}
+
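For context, the loop above consumes the new kprobe_event= boot parameter:
commas stand in for the spaces of the usual kprobe_events syntax, and
semicolons separate probe definitions. An illustrative command line
(assuming vfs_read is probeable and $argN fetchargs are supported on the
target architecture):

  kprobe_event=p,vfs_read,$arg1,$arg2;r,vfs_read,$retval

After strreplace() and the ';' split, each definition reaches
create_or_delete_trace_kprobe() exactly as if it had been written to the
kprobe_events tracefs file.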
/* Make a tracefs interface for controlling probe points */
static __init int init_kprobe_trace(void)
{
if (!entry)
pr_warn("Could not create tracefs 'kprobe_profile' entry\n");
+
+ setup_boot_kprobe_events();
+
return 0;
}
fs_initcall(init_kprobe_trace);
struct trace_event_file *file;
list_for_each_entry(file, &tr->events, list)
- if (file->event_call == &tk->tp.call)
+ if (file->event_call == trace_probe_event_call(&tk->tp))
return file;
return NULL;
if (tracing_is_disabled())
return -ENODEV;
+ if (kprobe_boot_events_enabled) {
+ pr_info("Skipping kprobe tests due to kprobe_event on cmdline\n");
+ return 0;
+ }
+
target = kprobe_trace_selftest_target;
pr_info("Testing kprobe tracing: ");
pr_warn("error on getting probe file.\n");
warn++;
} else
- enable_trace_kprobe(tk, file);
+ enable_trace_kprobe(
+ trace_probe_event_call(&tk->tp), file);
}
}
pr_warn("error on getting probe file.\n");
warn++;
} else
- enable_trace_kprobe(tk, file);
+ enable_trace_kprobe(
+ trace_probe_event_call(&tk->tp), file);
}
}
pr_warn("error on getting probe file.\n");
warn++;
} else
- disable_trace_kprobe(tk, file);
+ disable_trace_kprobe(
+ trace_probe_event_call(&tk->tp), file);
}
tk = find_trace_kprobe("testprobe2", KPROBE_EVENT_SYSTEM);
pr_warn("error on getting probe file.\n");
warn++;
} else
- disable_trace_kprobe(tk, file);
+ disable_trace_kprobe(
+ trace_probe_event_call(&tk->tp), file);
}
ret = trace_run_command("-:testprobe", create_or_delete_trace_kprobe);
ensuring that the majority of kernel addresses are not mapped
into userspace.
- See Documentation/x86/pti.txt for more details.
+ See Documentation/x86/pti.rst for more details.
config SECURITY_INFINIBAND
bool "Infiniband Security Hooks"
See <http://www.intel.com/technology/security/> for more information
about Intel(R) TXT.
See <http://tboot.sourceforge.net> for more information about tboot.
- See Documentation/intel_txt.txt for a description of how to enable
+ See Documentation/x86/intel_txt.rst for a description of how to enable
Intel TXT support in a kernel boot.
If you are unsure as to whether this is required, answer N.
source "security/loadpin/Kconfig"
source "security/yama/Kconfig"
source "security/safesetid/Kconfig"
+ source "security/lockdown/Kconfig"
source "security/integrity/Kconfig"
config LSM
string "Ordered list of enabled LSMs"
- default "yama,loadpin,safesetid,integrity,smack,selinux,tomoyo,apparmor" if DEFAULT_SECURITY_SMACK
- default "yama,loadpin,safesetid,integrity,apparmor,selinux,smack,tomoyo" if DEFAULT_SECURITY_APPARMOR
- default "yama,loadpin,safesetid,integrity,tomoyo" if DEFAULT_SECURITY_TOMOYO
- default "yama,loadpin,safesetid,integrity" if DEFAULT_SECURITY_DAC
- default "yama,loadpin,safesetid,integrity,selinux,smack,tomoyo,apparmor"
+ default "lockdown,yama,loadpin,safesetid,integrity,smack,selinux,tomoyo,apparmor" if DEFAULT_SECURITY_SMACK
+ default "lockdown,yama,loadpin,safesetid,integrity,apparmor,selinux,smack,tomoyo" if DEFAULT_SECURITY_APPARMOR
+ default "lockdown,yama,loadpin,safesetid,integrity,tomoyo" if DEFAULT_SECURITY_TOMOYO
+ default "lockdown,yama,loadpin,safesetid,integrity" if DEFAULT_SECURITY_DAC
+ default "lockdown,yama,loadpin,safesetid,integrity,selinux,smack,tomoyo,apparmor"
help
A comma-separated list of LSMs, in initialization order.
Any LSMs left off this list will be ignored. This can be
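(Aside, not part of this hunk: with "lockdown" now first in every default
list it initializes ahead of the other LSMs, and the built-in order can
still be overridden at boot with the lsm= parameter, e.g.:

  lsm=lockdown,yama,loadpin,safesetid,integrity,selinux

naming only the LSMs to initialize, in order.)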
config IMA_ARCH_POLICY
bool "Enable loading an IMA architecture specific policy"
- depends on (KEXEC_VERIFY_SIG && IMA) || IMA_APPRAISE \
- depends on KEXEC_SIG || IMA_APPRAISE && INTEGRITY_ASYMMETRIC_KEYS
+ depends on (KEXEC_SIG && IMA) || IMA_APPRAISE \
+ && INTEGRITY_ASYMMETRIC_KEYS
default n
help
This option enables loading an IMA architecture specific policy
This option enables the different "ima_appraise=" modes
(eg. fix, log) from the boot command line.
+config IMA_APPRAISE_MODSIG
+ bool "Support module-style signatures for appraisal"
+ depends on IMA_APPRAISE
+ depends on INTEGRITY_ASYMMETRIC_KEYS
+ select PKCS7_MESSAGE_PARSER
+ select MODULE_SIG_FORMAT
+ default n
+ help
+ Adds support for signatures appended to files. The format of the
+ appended signature is the same as that used for signed kernel modules.
+ The modsig keyword can be used in the IMA policy to allow a hook
+ to accept such signatures.
+
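A sketch of how the new option is exercised from the IMA policy; the set
of hooks that accept an appended signature is gated by
ima_hook_supports_modsig(), so treat the func below as illustrative:

  appraise func=KEXEC_KERNEL_CHECK appraise_type=imasig|modsig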
config IMA_TRUSTED_KEYRING
bool "Require all keys on the .ima keyring be signed (deprecated)"
depends on IMA_APPRAISE && SYSTEM_TRUSTED_KEYRING
const unsigned char *filename;
struct evm_ima_xattr_data *xattr_value;
int xattr_len;
+ const struct modsig *modsig;
const char *violation;
+ const void *buf;
+ int buf_len;
};
/* IMA template field data definition */
u64 count;
};
+ extern const int read_idmap[];
+
#ifdef CONFIG_HAVE_IMA_KEXEC
void ima_load_kexec_buffer(void);
#else
int ima_init_crypto(void);
void ima_putc(struct seq_file *m, void *data, int datalen);
void ima_print_digest(struct seq_file *m, u8 *digest, u32 size);
+int template_desc_init_fields(const char *template_fmt,
+ const struct ima_template_field ***fields,
+ int *num_fields);
struct ima_template_desc *ima_template_desc_current(void);
+struct ima_template_desc *lookup_template_desc(const char *name);
+bool ima_template_has_modsig(const struct ima_template_desc *ima_template);
int ima_restore_measurement_entry(struct ima_template_entry *entry);
int ima_restore_measurement_list(loff_t bufsize, void *buf);
int ima_measurements_show(struct seq_file *m, void *v);
int ima_init_template(void);
void ima_init_template_list(void);
int __init ima_init_digests(void);
+int ima_lsm_policy_change(struct notifier_block *nb, unsigned long event,
+ void *lsm_data);
/*
* used to protect h_table and sha_table
hook(KEXEC_KERNEL_CHECK) \
hook(KEXEC_INITRAMFS_CHECK) \
hook(POLICY_CHECK) \
+ hook(KEXEC_CMDLINE) \
hook(MAX_CHECK)
#define __ima_hook_enumify(ENUM) ENUM,
__ima_hooks(__ima_hook_enumify)
};
+extern const char *const func_tokens[];
+
+struct modsig;
+
/* LIM API function definitions */
int ima_get_action(struct inode *inode, const struct cred *cred, u32 secid,
- int mask, enum ima_hooks func, int *pcr);
+ int mask, enum ima_hooks func, int *pcr,
+ struct ima_template_desc **template_desc);
int ima_must_measure(struct inode *inode, int mask, enum ima_hooks func);
int ima_collect_measurement(struct integrity_iint_cache *iint,
struct file *file, void *buf, loff_t size,
- enum hash_algo algo);
+ enum hash_algo algo, struct modsig *modsig);
void ima_store_measurement(struct integrity_iint_cache *iint, struct file *file,
const unsigned char *filename,
struct evm_ima_xattr_data *xattr_value,
- int xattr_len, int pcr);
+ int xattr_len, const struct modsig *modsig, int pcr,
+ struct ima_template_desc *template_desc);
void ima_audit_measurement(struct integrity_iint_cache *iint,
const unsigned char *filename);
int ima_alloc_init_template(struct ima_event_data *event_data,
- struct ima_template_entry **entry);
+ struct ima_template_entry **entry,
+ struct ima_template_desc *template_desc);
int ima_store_template(struct ima_template_entry *entry, int violation,
struct inode *inode,
const unsigned char *filename, int pcr);
/* IMA policy related functions */
int ima_match_policy(struct inode *inode, const struct cred *cred, u32 secid,
- enum ima_hooks func, int mask, int flags, int *pcr);
+ enum ima_hooks func, int mask, int flags, int *pcr,
+ struct ima_template_desc **template_desc);
void ima_init_policy(void);
void ima_update_policy(void);
void ima_update_policy_flag(void);
struct integrity_iint_cache *iint,
struct file *file, const unsigned char *filename,
struct evm_ima_xattr_data *xattr_value,
- int xattr_len);
+ int xattr_len, const struct modsig *modsig);
int ima_must_appraise(struct inode *inode, int mask, enum ima_hooks func);
void ima_update_xattr(struct integrity_iint_cache *iint, struct file *file);
enum integrity_status ima_get_cache_status(struct integrity_iint_cache *iint,
struct file *file,
const unsigned char *filename,
struct evm_ima_xattr_data *xattr_value,
- int xattr_len)
+ int xattr_len,
+ const struct modsig *modsig)
{
return INTEGRITY_UNKNOWN;
}
#endif /* CONFIG_IMA_APPRAISE */
+#ifdef CONFIG_IMA_APPRAISE_MODSIG
+bool ima_hook_supports_modsig(enum ima_hooks func);
+int ima_read_modsig(enum ima_hooks func, const void *buf, loff_t buf_len,
+ struct modsig **modsig);
+void ima_collect_modsig(struct modsig *modsig, const void *buf, loff_t size);
+int ima_get_modsig_digest(const struct modsig *modsig, enum hash_algo *algo,
+ const u8 **digest, u32 *digest_size);
+int ima_get_raw_modsig(const struct modsig *modsig, const void **data,
+ u32 *data_len);
+void ima_free_modsig(struct modsig *modsig);
+#else
+static inline bool ima_hook_supports_modsig(enum ima_hooks func)
+{
+ return false;
+}
+
+static inline int ima_read_modsig(enum ima_hooks func, const void *buf,
+ loff_t buf_len, struct modsig **modsig)
+{
+ return -EOPNOTSUPP;
+}
+
+static inline void ima_collect_modsig(struct modsig *modsig, const void *buf,
+ loff_t size)
+{
+}
+
+static inline int ima_get_modsig_digest(const struct modsig *modsig,
+ enum hash_algo *algo, const u8 **digest,
+ u32 *digest_size)
+{
+ return -EOPNOTSUPP;
+}
+
+static inline int ima_get_raw_modsig(const struct modsig *modsig,
+ const void **data, u32 *data_len)
+{
+ return -EOPNOTSUPP;
+}
+
+static inline void ima_free_modsig(struct modsig *modsig)
+{
+}
+#endif /* CONFIG_IMA_APPRAISE_MODSIG */
+
/* LSM based policy rules require audit */
#ifdef CONFIG_IMA_LSM_RULES
int ima_hash_algo = HASH_ALGO_SHA1;
static int hash_setup_done;
+static struct notifier_block ima_lsm_policy_notifier = {
+ .notifier_call = ima_lsm_policy_change,
+};
+
static int __init hash_setup(char *str)
{
struct ima_template_desc *template_desc = ima_template_desc_current();
}
__setup("ima_hash=", hash_setup);
+/* Prevent mmap'ing a file for execute that is already mmap'ed for write */
+static int mmap_violation_check(enum ima_hooks func, struct file *file,
+ char **pathbuf, const char **pathname,
+ char *filename)
+{
+ struct inode *inode;
+ int rc = 0;
+
+ if ((func == MMAP_CHECK) && mapping_writably_mapped(file->f_mapping)) {
+ rc = -ETXTBSY;
+ inode = file_inode(file);
+
+ if (!*pathbuf) /* ima_rdwr_violation possibly pre-fetched */
+ *pathname = ima_d_path(&file->f_path, pathbuf,
+ filename);
+ integrity_audit_msg(AUDIT_INTEGRITY_DATA, inode, *pathname,
+ "mmap_file", "mmapped_writers", rc, 0);
+ }
+ return rc;
+}
+
/*
* ima_rdwr_violation_check
*
{
struct inode *inode = file_inode(file);
struct integrity_iint_cache *iint = NULL;
- struct ima_template_desc *template_desc;
+ struct ima_template_desc *template_desc = NULL;
char *pathbuf = NULL;
char filename[NAME_MAX];
const char *pathname = NULL;
int rc = 0, action, must_appraise = 0;
int pcr = CONFIG_IMA_MEASURE_PCR_IDX;
struct evm_ima_xattr_data *xattr_value = NULL;
+ struct modsig *modsig = NULL;
int xattr_len = 0;
bool violation_check;
enum hash_algo hash_algo;
* bitmask based on the appraise/audit/measurement policy.
* Included is the appraise submask.
*/
- action = ima_get_action(inode, cred, secid, mask, func, &pcr);
+ action = ima_get_action(inode, cred, secid, mask, func, &pcr,
+ &template_desc);
violation_check = ((func == FILE_CHECK || func == MMAP_CHECK) &&
(ima_policy_flag & IMA_MEASURE));
if (!action && !violation_check)
/* Nothing to do, just return existing appraised status */
if (!action) {
- if (must_appraise)
- rc = ima_get_cache_status(iint, func);
+ if (must_appraise) {
+ rc = mmap_violation_check(func, file, &pathbuf,
+ &pathname, filename);
+ if (!rc)
+ rc = ima_get_cache_status(iint, func);
+ }
goto out_locked;
}
- template_desc = ima_template_desc_current();
if ((action & IMA_APPRAISE_SUBMASK) ||
- strcmp(template_desc->name, IMA_TEMPLATE_IMA_NAME) != 0)
+ strcmp(template_desc->name, IMA_TEMPLATE_IMA_NAME) != 0) {
/* read 'security.ima' */
xattr_len = ima_read_xattr(file_dentry(file), &xattr_value);
+ /*
+ * Read the appended modsig if allowed by the policy, and allow
+ * an additional measurement list entry, if needed, based on the
+ * template format and whether the file was already measured.
+ */
+ if (iint->flags & IMA_MODSIG_ALLOWED) {
+ rc = ima_read_modsig(func, buf, size, &modsig);
+
+ if (!rc && ima_template_has_modsig(template_desc) &&
+ iint->flags & IMA_MEASURED)
+ action |= IMA_MEASURE;
+ }
+ }
+
hash_algo = ima_get_hash_algo(xattr_value, xattr_len);
- rc = ima_collect_measurement(iint, file, buf, size, hash_algo);
+ rc = ima_collect_measurement(iint, file, buf, size, hash_algo, modsig);
if (rc != 0 && rc != -EBADF && rc != -EINVAL)
goto out_locked;
if (action & IMA_MEASURE)
ima_store_measurement(iint, file, pathname,
- xattr_value, xattr_len, pcr);
+ xattr_value, xattr_len, modsig, pcr,
+ template_desc);
if (rc == 0 && (action & IMA_APPRAISE_SUBMASK)) {
inode_lock(inode);
rc = ima_appraise_measurement(func, iint, file, pathname,
- xattr_value, xattr_len);
+ xattr_value, xattr_len, modsig);
inode_unlock(inode);
+ if (!rc)
+ rc = mmap_violation_check(func, file, &pathbuf,
+ &pathname, filename);
}
if (action & IMA_AUDIT)
ima_audit_measurement(iint, pathname);
rc = -EACCES;
mutex_unlock(&iint->mutex);
kfree(xattr_value);
+ ima_free_modsig(modsig);
out:
if (pathbuf)
__putname(pathbuf);
return 0;
}
- static const int read_idmap[READING_MAX_ID] = {
+ const int read_idmap[READING_MAX_ID] = {
[READING_FIRMWARE] = FIRMWARE_CHECK,
[READING_FIRMWARE_PREALLOC_BUFFER] = FIRMWARE_CHECK,
[READING_MODULE] = MODULE_CHECK,
switch (id) {
case LOADING_KEXEC_IMAGE:
- if (IS_ENABLED(CONFIG_KEXEC_VERIFY_SIG)
+ if (IS_ENABLED(CONFIG_KEXEC_SIG)
&& arch_ima_get_secureboot()) {
pr_err("impossible to appraise a kernel image without a file descriptor; try using kexec_file_load syscall.\n");
return -EACCES;
return 0;
}
+/*
+ * process_buffer_measurement - Measure the buffer into the IMA log.
+ * @buf: pointer to the buffer that needs to be added to the log.
+ * @size: size of buffer (in bytes).
+ * @eventname: event name to be used for the buffer entry.
+ * @cred: a pointer to a credentials structure for user validation.
+ * @secid: the secid of the task to be validated.
+ *
+ * Based on policy, the buffer is measured into the IMA log.
+ */
+static void process_buffer_measurement(const void *buf, int size,
+ const char *eventname,
+ const struct cred *cred, u32 secid)
+{
+ int ret = 0;
+ struct ima_template_entry *entry = NULL;
+ struct integrity_iint_cache iint = {};
+ struct ima_event_data event_data = {.iint = &iint,
+ .filename = eventname,
+ .buf = buf,
+ .buf_len = size};
+ struct ima_template_desc *template_desc = NULL;
+ struct {
+ struct ima_digest_data hdr;
+ char digest[IMA_MAX_DIGEST_SIZE];
+ } hash = {};
+ int violation = 0;
+ int pcr = CONFIG_IMA_MEASURE_PCR_IDX;
+ int action = 0;
+
+ action = ima_get_action(NULL, cred, secid, 0, KEXEC_CMDLINE, &pcr,
+ &template_desc);
+ if (!(action & IMA_MEASURE))
+ return;
+
+ iint.ima_hash = &hash.hdr;
+ iint.ima_hash->algo = ima_hash_algo;
+ iint.ima_hash->length = hash_digest_size[ima_hash_algo];
+
+ ret = ima_calc_buffer_hash(buf, size, iint.ima_hash);
+ if (ret < 0)
+ goto out;
+
+ ret = ima_alloc_init_template(&event_data, &entry, template_desc);
+ if (ret < 0)
+ goto out;
+
+ ret = ima_store_template(entry, violation, NULL, buf, pcr);
+
+ if (ret < 0)
+ ima_free_template_entry(entry);
+
+out:
+ return;
+}
+
+/**
+ * ima_kexec_cmdline - measure kexec cmdline boot args
+ * @buf: pointer to buffer
+ * @size: size of buffer
+ *
+ * Buffers can only be measured, not appraised.
+ */
+void ima_kexec_cmdline(const void *buf, int size)
+{
+ u32 secid;
+
+ if (buf && size != 0) {
+ security_task_getsecid(current, &secid);
+ process_buffer_measurement(buf, size, "kexec-cmdline",
+ current_cred(), secid);
+ }
+}
+
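Nothing is recorded unless the policy contains a matching measure rule.
A minimal illustrative rule, optionally pinning the per-rule template
added elsewhere in this series (assuming the ima-buf template is
available):

  measure func=KEXEC_CMDLINE template=ima-buf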
static int __init init_ima(void)
{
int error;
error = ima_init();
}
+ error = register_blocking_lsm_notifier(&ima_lsm_policy_notifier);
+ if (error)
+ pr_warn("Couldn't register LSM notifier, error %d\n", error);
+
if (!error)
ima_update_policy_flag();
* ima_policy.c
* - initialize default measure policy rules
*/
+
+#define pr_fmt(fmt) KBUILD_MODNAME ": " fmt
+
#include <linux/init.h>
#include <linux/list.h>
#include <linux/fs.h>
int type; /* audit type */
} lsm[MAX_LSM_RULES];
char *fsname;
+ struct ima_template_desc *template;
};
/*
};
/* An array of architecture specific rules */
-struct ima_rule_entry *arch_policy_entry __ro_after_init;
+static struct ima_rule_entry *arch_policy_entry __ro_after_init;
static LIST_HEAD(ima_default_rules);
static LIST_HEAD(ima_policy_rules);
}
__setup("ima_appraise_tcb", default_appraise_policy_setup);
+static void ima_lsm_free_rule(struct ima_rule_entry *entry)
+{
+ int i;
+
+ for (i = 0; i < MAX_LSM_RULES; i++) {
+ kfree(entry->lsm[i].rule);
+ kfree(entry->lsm[i].args_p);
+ }
+ kfree(entry);
+}
+
+static struct ima_rule_entry *ima_lsm_copy_rule(struct ima_rule_entry *entry)
+{
+ struct ima_rule_entry *nentry;
+ int i, result;
+
+ nentry = kmalloc(sizeof(*nentry), GFP_KERNEL);
+ if (!nentry)
+ return NULL;
+
+ /*
+ * Immutable elements are copied over as pointers and data; only
+ * lsm rules can change
+ */
+ memcpy(nentry, entry, sizeof(*nentry));
+ memset(nentry->lsm, 0, FIELD_SIZEOF(struct ima_rule_entry, lsm));
+
+ for (i = 0; i < MAX_LSM_RULES; i++) {
+ if (!entry->lsm[i].rule)
+ continue;
+
+ nentry->lsm[i].type = entry->lsm[i].type;
+ nentry->lsm[i].args_p = kstrdup(entry->lsm[i].args_p,
+ GFP_KERNEL);
+ if (!nentry->lsm[i].args_p)
+ goto out_err;
+
+ result = security_filter_rule_init(nentry->lsm[i].type,
+ Audit_equal,
+ nentry->lsm[i].args_p,
+ &nentry->lsm[i].rule);
+ if (result == -EINVAL)
+ pr_warn("ima: rule for LSM \'%d\' is undefined\n",
+ entry->lsm[i].type);
+ }
+ return nentry;
+
+out_err:
+ ima_lsm_free_rule(nentry);
+ return NULL;
+}
+
+static int ima_lsm_update_rule(struct ima_rule_entry *entry)
+{
+ struct ima_rule_entry *nentry;
+
+ nentry = ima_lsm_copy_rule(entry);
+ if (!nentry)
+ return -ENOMEM;
+
+ list_replace_rcu(&entry->list, &nentry->list);
+ synchronize_rcu();
+ ima_lsm_free_rule(entry);
+
+ return 0;
+}
+
/*
* The LSM policy can be reloaded, leaving the IMA LSM based rules referring
* to the old, stale LSM policy. Update the IMA LSM based rules to reflect
- * the reloaded LSM policy. We assume the rules still exist; and BUG_ON() if
- * they don't.
+ * the reloaded LSM policy.
*/
static void ima_lsm_update_rules(void)
{
- struct ima_rule_entry *entry;
- int result;
- int i;
+ struct ima_rule_entry *entry, *e;
+ int i, result, needs_update;
- list_for_each_entry(entry, &ima_policy_rules, list) {
+ list_for_each_entry_safe(entry, e, &ima_policy_rules, list) {
+ needs_update = 0;
for (i = 0; i < MAX_LSM_RULES; i++) {
- if (!entry->lsm[i].rule)
- continue;
- result = security_filter_rule_init(entry->lsm[i].type,
- Audit_equal,
- entry->lsm[i].args_p,
- &entry->lsm[i].rule);
- BUG_ON(!entry->lsm[i].rule);
+ if (entry->lsm[i].rule) {
+ needs_update = 1;
+ break;
+ }
+ }
+ if (!needs_update)
+ continue;
+
+ result = ima_lsm_update_rule(entry);
+ if (result) {
+ pr_err("ima: lsm rule update error %d\n",
+ result);
+ return;
}
}
}
+int ima_lsm_policy_change(struct notifier_block *nb, unsigned long event,
+ void *lsm_data)
+{
+ if (event != LSM_POLICY_CHANGE)
+ return NOTIFY_DONE;
+
+ ima_lsm_update_rules();
+ return NOTIFY_OK;
+}
+
/**
* ima_match_rules - determine whether an inode matches the measure rule.
* @rule: a pointer to a rule
{
int i;
+ if (func == KEXEC_CMDLINE) {
+ if ((rule->flags & IMA_FUNC) && (rule->func == func))
+ return true;
+ return false;
+ }
if ((rule->flags & IMA_FUNC) &&
(rule->func != func && func != POST_SETATTR))
return false;
for (i = 0; i < MAX_LSM_RULES; i++) {
int rc = 0;
u32 osid;
- int retried = 0;
if (!rule->lsm[i].rule)
continue;
-retry:
+
switch (i) {
case LSM_OBJ_USER:
case LSM_OBJ_ROLE:
default:
break;
}
- if ((rc < 0) && (!retried)) {
- retried = 1;
- ima_lsm_update_rules();
- goto retry;
- }
if (!rc)
return false;
}
* @func: IMA hook identifier
* @mask: requested action (MAY_READ | MAY_WRITE | MAY_APPEND | MAY_EXEC)
* @pcr: set the pcr to extend
+ * @template_desc: the template that should be used for this rule
*
* Measure decision based on func/mask/fsmagic and LSM(subj/obj/type)
* conditions.
* than writes, so ima_match_policy() is a classical RCU candidate.
*/
int ima_match_policy(struct inode *inode, const struct cred *cred, u32 secid,
- enum ima_hooks func, int mask, int flags, int *pcr)
+ enum ima_hooks func, int mask, int flags, int *pcr,
+ struct ima_template_desc **template_desc)
{
struct ima_rule_entry *entry;
int action = 0, actmask = flags | (flags << 1);
+ if (template_desc)
+ *template_desc = ima_template_desc_current();
+
rcu_read_lock();
list_for_each_entry_rcu(entry, ima_rules, list) {
action |= IMA_FAIL_UNVERIFIABLE_SIGS;
}
+
if (entry->action & IMA_DO_MASK)
actmask &= ~(entry->action | entry->action << 1);
else
if ((pcr) && (entry->flags & IMA_PCR))
*pcr = entry->pcr;
+ if (template_desc && entry->template)
+ *template_desc = entry->template;
+
if (!actmask)
break;
}
Opt_uid_gt, Opt_euid_gt, Opt_fowner_gt,
Opt_uid_lt, Opt_euid_lt, Opt_fowner_lt,
Opt_appraise_type, Opt_permit_directio,
- Opt_pcr, Opt_err
+ Opt_pcr, Opt_template, Opt_err
};
static const match_table_t policy_tokens = {
{Opt_appraise_type, "appraise_type=%s"},
{Opt_permit_directio, "permit_directio"},
{Opt_pcr, "pcr=%s"},
+ {Opt_template, "template=%s"},
{Opt_err, NULL}
};
ima_log_string_op(ab, key, value, NULL);
}
+/*
+ * Validating the appended signature included in the measurement list requires
+ * the file hash calculated without the appended signature (i.e., the 'd-modsig'
+ * field). Therefore, notify the user if they have the 'modsig' field but not
+ * the 'd-modsig' field in the template.
+ */
+static void check_template_modsig(const struct ima_template_desc *template)
+{
+#define MSG "template with 'modsig' field also needs 'd-modsig' field\n"
+ bool has_modsig, has_dmodsig;
+ static bool checked;
+ int i;
+
+ /* We only need to notify the user once. */
+ if (checked)
+ return;
+
+ has_modsig = has_dmodsig = false;
+ for (i = 0; i < template->num_fields; i++) {
+ if (!strcmp(template->fields[i]->field_id, "modsig"))
+ has_modsig = true;
+ else if (!strcmp(template->fields[i]->field_id, "d-modsig"))
+ has_dmodsig = true;
+ }
+
+ if (has_modsig && !has_dmodsig)
+ pr_notice(MSG);
+
+ checked = true;
+#undef MSG
+}
+
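For reference, a template format that passes this check carries both of
the field ids compared above, for instance (illustrative, modelled on the
ima-sig fields):

  d-ng|n-ng|sig|d-modsig|modsig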
static int ima_parse_rule(char *rule, struct ima_rule_entry *entry)
{
struct audit_buffer *ab;
char *from;
char *p;
bool uid_token;
+ struct ima_template_desc *template_desc;
int result = 0;
ab = integrity_audit_log_start(audit_context(), GFP_KERNEL,
entry->func = KEXEC_INITRAMFS_CHECK;
else if (strcmp(args[0].from, "POLICY_CHECK") == 0)
entry->func = POLICY_CHECK;
+ else if (strcmp(args[0].from, "KEXEC_CMDLINE") == 0)
+ entry->func = KEXEC_CMDLINE;
else
result = -EINVAL;
if (!result)
ima_log_string(ab, "appraise_type", args[0].from);
if ((strcmp(args[0].from, "imasig")) == 0)
entry->flags |= IMA_DIGSIG_REQUIRED;
+ else if (ima_hook_supports_modsig(entry->func) &&
+ strcmp(args[0].from, "imasig|modsig") == 0)
+ entry->flags |= IMA_DIGSIG_REQUIRED |
+ IMA_MODSIG_ALLOWED;
else
result = -EINVAL;
break;
else
entry->flags |= IMA_PCR;
+ break;
+ case Opt_template:
+ ima_log_string(ab, "template", args[0].from);
+ if (entry->action != MEASURE) {
+ result = -EINVAL;
+ break;
+ }
+ template_desc = lookup_template_desc(args[0].from);
+ if (!template_desc || entry->template) {
+ result = -EINVAL;
+ break;
+ }
+
+ /*
+ * template_desc_init_fields() does nothing if
+ * the template is already initialised, so
+ * it's safe to do this unconditionally
+ */
+ template_desc_init_fields(template_desc->fmt,
+ &(template_desc->fields),
+ &(template_desc->num_fields));
+ entry->template = template_desc;
break;
case Opt_err:
ima_log_string(ab, "UNKNOWN", p);
else if (entry->action == APPRAISE)
temp_ima_appraise |= ima_appraise_flag(entry->func);
+ if (!result && entry->flags & IMA_MODSIG_ALLOWED) {
+ template_desc = entry->template ? entry->template :
+ ima_template_desc_current();
+ check_template_modsig(template_desc);
+ }
+
audit_log_format(ab, "res=%d", !result);
audit_log_end(ab);
return result;
}
}
+#define __ima_hook_stringify(str) (#str),
+
+const char *const func_tokens[] = {
+ __ima_hooks(__ima_hook_stringify)
+};
+
#ifdef CONFIG_IMA_READ_POLICY
enum {
mask_exec = 0, mask_write, mask_read, mask_append
"^MAY_APPEND"
};
-#define __ima_hook_stringify(str) (#str),
-
-static const char *const func_tokens[] = {
- __ima_hooks(__ima_hook_stringify)
-};
-
void *ima_policy_start(struct seq_file *m, loff_t *pos)
{
loff_t l = *pos;
}
}
}
- if (entry->flags & IMA_DIGSIG_REQUIRED)
- seq_puts(m, "appraise_type=imasig ");
+ if (entry->template)
+ seq_printf(m, "template=%s ", entry->template->name);
+ if (entry->flags & IMA_DIGSIG_REQUIRED) {
+ if (entry->flags & IMA_MODSIG_ALLOWED)
+ seq_puts(m, "appraise_type=imasig|modsig ");
+ else
+ seq_puts(m, "appraise_type=imasig ");
+ }
if (entry->flags & IMA_PERMIT_DIRECTIO)
seq_puts(m, "permit_directio ");
rcu_read_unlock();
return 0;
}
#endif /* CONFIG_IMA_READ_POLICY */
+
+ #if defined(CONFIG_IMA_APPRAISE) && defined(CONFIG_INTEGRITY_TRUSTED_KEYRING)
+ /*
+ * ima_appraise_signature: whether IMA will appraise a given function using
+ * an IMA digital signature. This is restricted to cases where the kernel
+ * has a set of built-in trusted keys in order to avoid an attacker simply
+ * loading additional keys.
+ */
+ bool ima_appraise_signature(enum kernel_read_file_id id)
+ {
+ struct ima_rule_entry *entry;
+ bool found = false;
+ enum ima_hooks func;
+
+ if (id >= READING_MAX_ID)
+ return false;
+
+ func = read_idmap[id] ?: FILE_CHECK;
+
+ rcu_read_lock();
+ list_for_each_entry_rcu(entry, ima_rules, list) {
+ if (entry->action != APPRAISE)
+ continue;
+
+ /*
+ * A generic entry will match; otherwise, require that the entry's
+ * func match the one we're looking for.
+ */
+ if (entry->func && entry->func != func)
+ continue;
+
+ /*
+ * We require this to be a digital signature, not a raw IMA
+ * hash.
+ */
+ if (entry->flags & IMA_DIGSIG_REQUIRED)
+ found = true;
+
+ /*
+ * We've found a rule that matches, so break now even if it
+ * didn't require a digital signature - a later rule that does
+ * won't override it, so would be a false positive.
+ */
+ break;
+ }
+
+ rcu_read_unlock();
+ return found;
+ }
+ #endif /* CONFIG_IMA_APPRAISE && CONFIG_INTEGRITY_TRUSTED_KEYRING */
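A sketch of the intended consumer, not shown in this hunk: a loader can
fall back to a lockdown decision only when IMA will not appraise the
file, roughly (READING_KEXEC_IMAGE and LOCKDOWN_KEXEC are assumed to
exist in this tree):

  /* illustrative kexec_file-style call site */
  if (!ima_appraise_signature(READING_KEXEC_IMAGE) &&
      security_locked_down(LOCKDOWN_KEXEC))
          return -EPERM;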
/* How many LSMs were built into the kernel? */
#define LSM_COUNT (__end_lsm_info - __start_lsm_info)
+ #define EARLY_LSM_COUNT (__end_early_lsm_info - __start_early_lsm_info)
struct security_hook_heads security_hook_heads __lsm_ro_after_init;
-static ATOMIC_NOTIFIER_HEAD(lsm_notifier_chain);
+static BLOCKING_NOTIFIER_HEAD(blocking_lsm_notifier_chain);
static struct kmem_cache *lsm_file_cache;
static struct kmem_cache *lsm_inode_cache;
static void __init lsm_early_cred(struct cred *cred);
static void __init lsm_early_task(struct task_struct *task);
+ static int lsm_append(const char *new, char **result);
+
static void __init ordered_lsm_init(void)
{
struct lsm_info **lsm;
kfree(ordered_lsms);
}
+ int __init early_security_init(void)
+ {
+ int i;
+ struct hlist_head *list = (struct hlist_head *) &security_hook_heads;
+ struct lsm_info *lsm;
+
+ for (i = 0; i < sizeof(security_hook_heads) / sizeof(struct hlist_head);
+ i++)
+ INIT_HLIST_HEAD(&list[i]);
+
+ for (lsm = __start_early_lsm_info; lsm < __end_early_lsm_info; lsm++) {
+ if (!lsm->enabled)
+ lsm->enabled = &lsm_enabled_true;
+ prepare_lsm(lsm);
+ initialize_lsm(lsm);
+ }
+
+ return 0;
+ }
+
/**
* security_init - initializes the security framework
*
*/
int __init security_init(void)
{
- int i;
- struct hlist_head *list = (struct hlist_head *) &security_hook_heads;
+ struct lsm_info *lsm;
pr_info("Security Framework initializing\n");
- for (i = 0; i < sizeof(security_hook_heads) / sizeof(struct hlist_head);
- i++)
- INIT_HLIST_HEAD(&list[i]);
+ /*
+ * Append the names of the early LSM modules now that kmalloc() is
+ * available
+ */
+ for (lsm = __start_early_lsm_info; lsm < __end_early_lsm_info; lsm++) {
+ if (lsm->enabled)
+ lsm_append(lsm->name, &lsm_names);
+ }
/* Load LSMs in specified order. */
ordered_lsm_init();
return !strcmp(last, lsm);
}
- static int lsm_append(char *new, char **result)
+ static int lsm_append(const char *new, char **result)
{
char *cp;
hooks[i].lsm = lsm;
hlist_add_tail_rcu(&hooks[i].list, hooks[i].head);
}
- if (lsm_append(lsm, &lsm_names) < 0)
- panic("%s - Cannot get early memory.\n", __func__);
+
+ /*
+ * Don't try to append during early_security_init(), we'll come back
+ * and fix this up afterwards.
+ */
+ if (slab_is_available()) {
+ if (lsm_append(lsm, &lsm_names) < 0)
+ panic("%s - Cannot get early memory.\n", __func__);
+ }
}
-int call_lsm_notifier(enum lsm_event event, void *data)
+int call_blocking_lsm_notifier(enum lsm_event event, void *data)
{
- return atomic_notifier_call_chain(&lsm_notifier_chain, event, data);
+ return blocking_notifier_call_chain(&blocking_lsm_notifier_chain,
+ event, data);
}
-EXPORT_SYMBOL(call_lsm_notifier);
+EXPORT_SYMBOL(call_blocking_lsm_notifier);
-int register_lsm_notifier(struct notifier_block *nb)
+int register_blocking_lsm_notifier(struct notifier_block *nb)
{
- return atomic_notifier_chain_register(&lsm_notifier_chain, nb);
+ return blocking_notifier_chain_register(&blocking_lsm_notifier_chain,
+ nb);
}
-EXPORT_SYMBOL(register_lsm_notifier);
+EXPORT_SYMBOL(register_blocking_lsm_notifier);
-int unregister_lsm_notifier(struct notifier_block *nb)
+int unregister_blocking_lsm_notifier(struct notifier_block *nb)
{
- return atomic_notifier_chain_unregister(&lsm_notifier_chain, nb);
+ return blocking_notifier_chain_unregister(&blocking_lsm_notifier_chain,
+ nb);
}
-EXPORT_SYMBOL(unregister_lsm_notifier);
+EXPORT_SYMBOL(unregister_blocking_lsm_notifier);
/**
* lsm_cred_alloc - allocate a composite cred blob
return call_int_hook(move_mount, 0, from_path, to_path);
}
+int security_path_notify(const struct path *path, u64 mask,
+ unsigned int obj_type)
+{
+ return call_int_hook(path_notify, 0, path, mask, obj_type);
+}
+
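A hypothetical call site for the new hook, as a watch-backed subsystem
might use it (the mask and object-type values below are illustrative
fsnotify constants):

  /* vet a new watch before attaching it */
  ret = security_path_notify(path, FS_MODIFY, FSNOTIFY_OBJ_TYPE_INODE);
  if (ret)
          return ret;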
int security_inode_alloc(struct inode *inode)
{
int rc = lsm_inode_alloc(inode);
call_void_hook(bpf_prog_free_security, aux);
}
#endif /* CONFIG_BPF_SYSCALL */
+
+ int security_locked_down(enum lockdown_reason what)
+ {
+ return call_int_hook(locked_down, 0, what);
+ }
+ EXPORT_SYMBOL(security_locked_down);
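A minimal caller-side sketch (illustrative; LOCKDOWN_DEV_MEM is one of
the enum lockdown_reason values introduced by this series):

  /* e.g. when opening a raw memory device */
  rc = security_locked_down(LOCKDOWN_DEV_MEM);
  if (rc)
          return rc;      /* -EPERM while locked down */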