possible to determine what the correct size should be.
This option provides an override for these situations.
+ carrier_timeout=
+ [NET] Specifies amount of time (in seconds) that
+ the kernel should wait for a network carrier. By default
+ it waits 120 seconds.
+
ca_keys= [KEYS] This parameter identifies a specific key(s) on
the system trusted keyring to be used for certificate
trust validation.
The filter can be disabled or changed to another
driver later using sysfs.
+ driver_async_probe= [KNL]
+ List of driver names to be probed asynchronously.
+ Format: <driver_name1>,<driver_name2>...
+
drm.edid_firmware=[<connector>:]<file>[,[<connector>:]<file>]
Broken monitors, graphic adapters, KVMs and EDIDless
panels may send no or incorrect EDID data sets.
specified address. The serial port must already be
setup and configured. Options are not yet supported.
+ efifb,[options]
+ Start an early, unaccelerated console on the EFI
+ memory mapped framebuffer (if available). On cache
+ coherent non-x86 systems that use system memory for
+ the framebuffer, pass the 'ram' option so that it is
+ mapped with the correct attributes.
+
earlyprintk= [X86,SH,ARM,M68k,S390]
earlyprintk=vga
- earlyprintk=efi
earlyprintk=sclp
earlyprintk=xen
earlyprintk=serial[,ttySn[,baudrate]]
arch/x86/kernel/cpu/cpufreq/elanfreq.c.
elevator= [IOSCHED]
- Format: {"cfq" | "deadline" | "noop"}
- See Documentation/block/cfq-iosched.txt and
- Documentation/block/deadline-iosched.txt for details.
+ Format: { "mq-deadline" | "kyber" | "bfq" }
+ See Documentation/block/deadline-iosched.txt,
+ Documentation/block/kyber-iosched.txt and
+ Documentation/block/bfq-iosched.txt for details.
elfcorehdr=[size[KMG]@]offset[KMG] [IA64,PPC,SH,X86,S390]
Specifies physical address of start of kernel core
to let secondary kernels in charge of setting up
LPIs.
+ irqchip.gicv3_pseudo_nmi= [ARM64]
+ Enables support for pseudo-NMIs in the kernel. This
+ requires the kernel to be built with
+ CONFIG_ARM64_PSEUDO_NMI.
+
irqfixup [HW]
When an interrupt is not handled search all handlers
for it. Intended to get systems with badly broken
Built with CONFIG_DEBUG_KMEMLEAK_DEFAULT_OFF=y,
the default is off.
+ kpti= [ARM64] Control page table isolation of user
+ and kernel address spaces.
+ Default: enabled on cores which need mitigation.
+ 0: force disabled
+ 1: force enabled
+
kvm.ignore_msrs=[KVM] Ignore guest accesses to unhandled MSRs.
Default is 0 (don't ignore, but inject #GP)
lsm.debug [SECURITY] Enable LSM initialization debugging output.
+ lsm=lsm1,...,lsmN
+ [SECURITY] Choose order of LSM initialization. This
+ overrides CONFIG_LSM, and the "security=" parameter.
+
machvec= [IA-64] Force the use of a particular machine-vector
(machvec) in a generic kernel.
Example: machvec=hpzx1_swiotlb
in the "bleeding edge" mini2440 support kernel at
http://repo.or.cz/w/linux-2.6/mini2440.git
+ mitigations=
+ [X86,PPC,S390] Control optional mitigations for CPU
+ vulnerabilities. This is a set of curated,
+ arch-independent options, each of which is an
+ aggregation of existing arch-specific options.
+
+ off
+ Disable all optional CPU mitigations. This
+ improves system performance, but it may also
+ expose users to several CPU vulnerabilities.
+ Equivalent to: nopti [X86,PPC]
+ nospectre_v1 [PPC]
+ nobp=0 [S390]
+ nospectre_v2 [X86,PPC,S390]
+ spectre_v2_user=off [X86]
+ spec_store_bypass_disable=off [X86,PPC]
+ l1tf=off [X86]
+
+ auto (default)
+ Mitigate all CPU vulnerabilities, but leave SMT
+ enabled, even if it's vulnerable. This is for
+ users who don't want to be surprised by SMT
+ getting disabled across kernel upgrades, or who
+ have other ways of avoiding SMT-based attacks.
+ Equivalent to: (default behavior)
+
+ auto,nosmt
+ Mitigate all CPU vulnerabilities, disabling SMT
+ if needed. This is for users who always want to
+ be fully mitigated, even if it means losing SMT.
+ Equivalent to: l1tf=flush,nosmt [X86]
+
mminit_loglevel=
[KNL] When CONFIG_DEBUG_MEMORY_INIT is set, this
parameter allows control of the logging verbosity for
latencies, which will choose a value aligned
with the appropriate hardware boundaries.
- rcutree.jiffies_till_sched_qs= [KNL]
- Set required age in jiffies for a
- given grace period before RCU starts
- soliciting quiescent-state help from
- rcu_note_context_switch(). If not specified, the
- kernel will calculate a value based on the most
- recent settings of rcutree.jiffies_till_first_fqs
- and rcutree.jiffies_till_next_fqs.
- This calculated value may be viewed in
- rcutree.jiffies_to_sched_qs. Any attempt to
- set rcutree.jiffies_to_sched_qs will be
- cheerfully overwritten.
-
rcutree.jiffies_till_first_fqs= [KNL]
Set delay from grace-period initialization to
first attempt to force quiescent states.
quiescent states. Units are jiffies, minimum
value is one, and maximum value is HZ.
+ rcutree.jiffies_till_sched_qs= [KNL]
+ Set required age in jiffies for a
+ given grace period before RCU starts
+ soliciting quiescent-state help from
+ rcu_note_context_switch() and cond_resched().
+ If not specified, the kernel will calculate
+ a value based on the most recent settings
+ of rcutree.jiffies_till_first_fqs
+ and rcutree.jiffies_till_next_fqs.
+ This calculated value may be viewed in
+ rcutree.jiffies_to_sched_qs. Any attempt to set
+ rcutree.jiffies_to_sched_qs will be cheerfully
+ overwritten.
+
rcutree.kthread_prio= [KNL,BOOT]
Set the SCHED_FIFO priority of the RCU per-CPU
kthreads (rcuc/N). This value is also used for
This wake_up() will be accompanied by a
WARN_ONCE() splat and an ftrace_dump().
+ rcutree.sysrq_rcu= [KNL]
+ Commandeer a sysrq key to dump out Tree RCU's
+ rcu_node tree with an eye towards determining
+ why a new grace period has not yet started.
+
rcuperf.gp_async= [KNL]
Measure performance of asynchronous
grace-period primitives such as call_rcu().
Note: increases power consumption, thus should only be
enabled if running jitter sensitive (HPC/RT) workloads.
- security= [SECURITY] Choose a security module to enable at boot.
- If this boot parameter is not specified, only the first
- security module asking for security registration will be
- loaded. An invalid security module name will be treated
- as if no module has been chosen.
+ security= [SECURITY] Choose a legacy "major" security module to
+ enable at boot. This has been deprecated by the
+ "lsm=" parameter.
selinux= [SELINUX] Disable or enable SELinux at boot time.
Format: { "0" | "1" }
usbcore.authorized_default=
[USB] Default USB device authorization:
(default -1 = authorized except for wireless USB,
- 0 = not authorized, 1 = authorized)
+ 0 = not authorized, 1 = authorized, 2 = authorized
+ if device connected to internal port)
usbcore.autosuspend=
[USB] The autosuspend time delay (in seconds) used
or other driver-specific files in the
Documentation/watchdog/ directory.
+ watchdog_thresh=
+ [KNL]
+ Set the hard lockup detector stall duration
+ threshold in seconds. The soft lockup detector
+ threshold is set to twice the value. A value of 0
+ disables both lockup detectors. Default is 10
+ seconds.
+
workqueue.watchdog_thresh=
If CONFIG_WQ_WATCHDOG is configured, workqueue can
warn stall conditions and dump internal state to
enable = security_ftr_enabled(SEC_FTR_FAVOUR_SECURITY) &&
security_ftr_enabled(SEC_FTR_BNDS_CHK_SPEC_BAR);
- if (!no_nospec)
+ if (!no_nospec && !cpu_mitigations_off())
enable_barrier_nospec(enable);
}
early_param("nospectre_v2", handle_nospectre_v2);
void setup_spectre_v2(void)
{
- if (no_spectrev2)
+ if (no_spectrev2 || cpu_mitigations_off())
do_btb_flush_fixups();
else
btb_flush_enabled = true;
bcs = security_ftr_enabled(SEC_FTR_BCCTRL_SERIALISED);
ccd = security_ftr_enabled(SEC_FTR_COUNT_CACHE_DISABLED);
- if (bcs || ccd || count_cache_flush_type != COUNT_CACHE_FLUSH_NONE) {
- bool comma = false;
+ if (bcs || ccd) {
seq_buf_printf(&s, "Mitigation: ");
- if (bcs) {
+ if (bcs)
seq_buf_printf(&s, "Indirect branch serialisation (kernel only)");
- comma = true;
- }
- if (ccd) {
- if (comma)
- seq_buf_printf(&s, ", ");
- seq_buf_printf(&s, "Indirect branch cache disabled");
- comma = true;
- }
-
- if (comma)
+ if (bcs && ccd)
seq_buf_printf(&s, ", ");
- seq_buf_printf(&s, "Software count cache flush");
+ if (ccd)
+ seq_buf_printf(&s, "Indirect branch cache disabled");
+ } else if (count_cache_flush_type != COUNT_CACHE_FLUSH_NONE) {
+ seq_buf_printf(&s, "Mitigation: Software count cache flush");
if (count_cache_flush_type == COUNT_CACHE_FLUSH_HW)
- seq_buf_printf(&s, "(hardware accelerated)");
+ seq_buf_printf(&s, " (hardware accelerated)");
} else if (btb_flush_enabled) {
seq_buf_printf(&s, "Mitigation: Branch predictor state flush");
} else {
stf_enabled_flush_types = type;
- if (!no_stf_barrier)
+ if (!no_stf_barrier && !cpu_mitigations_off())
stf_barrier_enable(enable);
}
static void *__init alloc_stack(unsigned long limit, int cpu)
{
- unsigned long pa;
+ void *ptr;
BUILD_BUG_ON(STACK_INT_FRAME_SIZE % 16);
- pa = memblock_alloc_base_nid(THREAD_SIZE, THREAD_SIZE, limit,
- early_cpu_to_node(cpu), MEMBLOCK_NONE);
- if (!pa) {
- pa = memblock_alloc_base(THREAD_SIZE, THREAD_SIZE, limit);
- if (!pa)
- panic("cannot allocate stacks");
- }
+ ptr = memblock_alloc_try_nid(THREAD_SIZE, THREAD_SIZE,
+ MEMBLOCK_LOW_LIMIT, limit,
+ early_cpu_to_node(cpu));
+ if (!ptr)
+ panic("cannot allocate stacks");
- return __va(pa);
+ return ptr;
}
void __init irqstack_early_init(void)
}
#endif
-/*
- * Emergency stacks are used for a range of things, from asynchronous
- * NMIs (system reset, machine check) to synchronous, process context.
- * We set preempt_count to zero, even though that isn't necessarily correct. To
- * get the right value we'd need to copy it from the previous thread_info, but
- * doing that might fault causing more problems.
- * TODO: what to do with accounting?
- */
-static void emerg_stack_init_thread_info(struct thread_info *ti, int cpu)
-{
- ti->task = NULL;
- ti->cpu = cpu;
- ti->preempt_count = 0;
- ti->local_flags = 0;
- ti->flags = 0;
- klp_init_thread_info(ti);
-}
-
/*
* Stack space used when we detect a bad kernel stack pointer, and
* early in SMP boots before relocation is enabled. Exclusive emergency
limit = min(ppc64_bolted_size(), ppc64_rma_size);
for_each_possible_cpu(i) {
- struct thread_info *ti;
-
- ti = alloc_stack(limit, i);
- memset(ti, 0, THREAD_SIZE);
- emerg_stack_init_thread_info(ti, i);
- paca_ptrs[i]->emergency_sp = (void *)ti + THREAD_SIZE;
+ paca_ptrs[i]->emergency_sp = alloc_stack(limit, i) + THREAD_SIZE;
#ifdef CONFIG_PPC_BOOK3S_64
/* emergency stack for NMI exception handling. */
- ti = alloc_stack(limit, i);
- memset(ti, 0, THREAD_SIZE);
- emerg_stack_init_thread_info(ti, i);
- paca_ptrs[i]->nmi_emergency_sp = (void *)ti + THREAD_SIZE;
+ paca_ptrs[i]->nmi_emergency_sp = alloc_stack(limit, i) + THREAD_SIZE;
/* emergency stack for machine check exception handling. */
- ti = alloc_stack(limit, i);
- memset(ti, 0, THREAD_SIZE);
- emerg_stack_init_thread_info(ti, i);
- paca_ptrs[i]->mc_emergency_sp = (void *)ti + THREAD_SIZE;
+ paca_ptrs[i]->mc_emergency_sp = alloc_stack(limit, i) + THREAD_SIZE;
#endif
}
}
* hardware prefetch runoff. We don't have a recipe for load patterns to
* reliably avoid the prefetcher.
*/
- l1d_flush_fallback_area = __va(memblock_alloc_base(l1d_size * 2, l1d_size, limit));
- memset(l1d_flush_fallback_area, 0, l1d_size * 2);
+ l1d_flush_fallback_area = memblock_alloc_try_nid(l1d_size * 2,
+ l1d_size, MEMBLOCK_LOW_LIMIT,
+ limit, NUMA_NO_NODE);
+ if (!l1d_flush_fallback_area)
+ panic("%s: Failed to allocate %llu bytes align=0x%llx max_addr=%pa\n",
+ __func__, l1d_size * 2, l1d_size, &limit);
+
for_each_possible_cpu(cpu) {
struct paca_struct *paca = paca_ptrs[cpu];
enabled_flush_types = types;
- if (!no_rfi_flush)
+ if (!no_rfi_flush && !cpu_mitigations_off())
rfi_flush_enable(enable);
}
char arg[20];
int ret, i;
- if (cmdline_find_option_bool(boot_command_line, "nospectre_v2"))
+ if (cmdline_find_option_bool(boot_command_line, "nospectre_v2") ||
+ cpu_mitigations_off())
return SPECTRE_V2_CMD_NONE;
ret = cmdline_find_option(boot_command_line, "spectre_v2", arg, sizeof(arg));
char arg[20];
int ret, i;
- if (cmdline_find_option_bool(boot_command_line, "nospec_store_bypass_disable")) {
+ if (cmdline_find_option_bool(boot_command_line, "nospec_store_bypass_disable") ||
+ cpu_mitigations_off()) {
return SPEC_STORE_BYPASS_CMD_NONE;
} else {
ret = cmdline_find_option(boot_command_line, "spec_store_bypass_disable",
if (task_spec_ssb_force_disable(task))
return -EPERM;
task_clear_spec_ssb_disable(task);
+ task_clear_spec_ssb_noexec(task);
task_update_spec_tif(task);
break;
case PR_SPEC_DISABLE:
task_set_spec_ssb_disable(task);
+ task_clear_spec_ssb_noexec(task);
task_update_spec_tif(task);
break;
case PR_SPEC_FORCE_DISABLE:
task_set_spec_ssb_disable(task);
task_set_spec_ssb_force_disable(task);
+ task_clear_spec_ssb_noexec(task);
+ task_update_spec_tif(task);
+ break;
+ case PR_SPEC_DISABLE_NOEXEC:
+ if (task_spec_ssb_force_disable(task))
+ return -EPERM;
+ task_set_spec_ssb_disable(task);
+ task_set_spec_ssb_noexec(task);
task_update_spec_tif(task);
break;
default:
case SPEC_STORE_BYPASS_PRCTL:
if (task_spec_ssb_force_disable(task))
return PR_SPEC_PRCTL | PR_SPEC_FORCE_DISABLE;
+ if (task_spec_ssb_noexec(task))
+ return PR_SPEC_PRCTL | PR_SPEC_DISABLE_NOEXEC;
if (task_spec_ssb_disable(task))
return PR_SPEC_PRCTL | PR_SPEC_DISABLE;
return PR_SPEC_PRCTL | PR_SPEC_ENABLE;
if (!boot_cpu_has_bug(X86_BUG_L1TF))
return;
+ if (cpu_mitigations_off())
+ l1tf_mitigation = L1TF_MITIGATION_OFF;
+ else if (cpu_mitigations_auto_nosmt())
+ l1tf_mitigation = L1TF_MITIGATION_FLUSH_NOSMT;
+
override_cache_bits(&boot_cpu_data);
switch (l1tf_mitigation) {
#include <linux/spinlock.h>
#include <linux/mm.h>
#include <linux/uaccess.h>
+ #include <linux/cpu.h>
#include <asm/cpufeature.h>
#include <asm/hypervisor.h>
pr_info("%s\n", reason);
}
-enum pti_mode {
+static enum pti_mode {
PTI_AUTO = 0,
PTI_FORCE_OFF,
PTI_FORCE_ON
}
}
- if (cmdline_find_option_bool(boot_command_line, "nopti")) {
+ if (cmdline_find_option_bool(boot_command_line, "nopti") ||
+ cpu_mitigations_off()) {
pti_mode = PTI_FORCE_OFF;
pti_print_if_insecure("disabled on command line.");
return;
set_memory_global(start, (end_global - start) >> PAGE_SHIFT);
}
-void pti_set_kernel_image_nonglobal(void)
+static void pti_set_kernel_image_nonglobal(void)
{
/*
* The identity map is created with PMDs, regardless of the
void lockdep_assert_cpus_held(void)
{
+ /*
+ * We can't have hotplug operations before userspace starts running,
+ * and some init codepaths will knowingly not take the hotplug lock.
+ * This is all valid, so mute lockdep until it makes sense to report
+ * unheld locks.
+ */
+ if (system_state < SYSTEM_RUNNING)
+ return;
+
percpu_rwsem_assert_held(&cpu_hotplug_lock);
}
cpuhp_invoke_callback(cpu, st->state, false, NULL, NULL);
}
+static inline bool can_rollback_cpu(struct cpuhp_cpu_state *st)
+{
+ if (IS_ENABLED(CONFIG_HOTPLUG_CPU))
+ return true;
+ /*
+ * When CPU hotplug is disabled, then taking the CPU down is not
+ * possible because takedown_cpu() and the architecture and
+ * subsystem specific mechanisms are not available. So the CPU
+ * which would be completely unplugged again needs to stay around
+ * in the current state.
+ */
+ return st->state <= CPUHP_BRINGUP_CPU;
+}
+
static int cpuhp_up_callbacks(unsigned int cpu, struct cpuhp_cpu_state *st,
enum cpuhp_state target)
{
st->state++;
ret = cpuhp_invoke_callback(cpu, st->state, true, NULL, NULL);
if (ret) {
- st->target = prev_state;
- undo_cpu_up(cpu, st);
+ if (can_rollback_cpu(st)) {
+ st->target = prev_state;
+ undo_cpu_up(cpu, st);
+ }
break;
}
}
#endif
this_cpu_write(cpuhp_state.state, CPUHP_ONLINE);
}
+
+ enum cpu_mitigations cpu_mitigations __ro_after_init = CPU_MITIGATIONS_AUTO;
+
+ static int __init mitigations_parse_cmdline(char *arg)
+ {
+ if (!strcmp(arg, "off"))
+ cpu_mitigations = CPU_MITIGATIONS_OFF;
+ else if (!strcmp(arg, "auto"))
+ cpu_mitigations = CPU_MITIGATIONS_AUTO;
+ else if (!strcmp(arg, "auto,nosmt"))
+ cpu_mitigations = CPU_MITIGATIONS_AUTO_NOSMT;
+
+ return 0;
+ }
+ early_param("mitigations", mitigations_parse_cmdline);