cgroup: make per-cgroup pressure stall tracking configurable
author Suren Baghdasaryan <surenb@google.com>
Mon, 24 May 2021 19:53:39 +0000 (12:53 -0700)
committer Tejun Heo <tj@kernel.org>
Tue, 8 Jun 2021 18:59:02 +0000 (14:59 -0400)
commit 3958e2d0c34e18c41b60dc01832bd670a59ef70f
tree 429f3434bd8e89fb688e9366c9087208b31fcda3
parent 2ca11b0e043be6f5c2b188897e9a32275eaab046

PSI accounts stalls for each cgroup separately and aggregates them at
each level of the hierarchy. This adds overhead because psi_avgs_work is
called for each cgroup in the hierarchy. psi_avgs_work has been highly
optimized, but on systems with a large number of cgroups the overhead
becomes noticeable.
Systems that use PSI only at the system level could avoid this overhead
if PSI could be configured to skip per-cgroup stall accounting.
Add a "cgroup_disable=pressure" kernel command-line option to request
system-wide-only pressure stall accounting. When set, it keeps
system-wide accounting under /proc/pressure/ but skips accounting for
individual cgroups and does not expose PSI nodes in the cgroup hierarchy.
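
For illustration only, a user-space sketch of how a tool might detect
that the option was passed at boot. This is not the kernel-side parsing
added by this patch. The helper names (list_has, cgroup_feature_disabled,
psi_cgroup_accounting_disabled) are hypothetical, and the sketch assumes
/proc/cmdline holds whitespace-separated parameters with no quoting and
that cgroup_disable= takes a comma-separated list of names.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Return true if the comma-separated list of `len` bytes contains `name`. */
static bool list_has(const char *list, size_t len, const char *name)
{
	size_t nlen = strlen(name);
	const char *end = list + len;

	while (list < end) {
		const char *comma = memchr(list, ',', (size_t)(end - list));
		size_t tlen = comma ? (size_t)(comma - list)
				    : (size_t)(end - list);

		/* Require an exact token match, not a prefix match. */
		if (tlen == nlen && memcmp(list, name, nlen) == 0)
			return true;
		if (!comma)
			break;
		list = comma + 1;
	}
	return false;
}

/*
 * Hypothetical helper: scan a kernel command line for a
 * "cgroup_disable=" parameter whose list includes `name`.
 * Quoted parameters are not handled (assumption).
 */
bool cgroup_feature_disabled(const char *cmdline, const char *name)
{
	static const char key[] = "cgroup_disable=";
	const size_t klen = sizeof(key) - 1;
	const char *p = cmdline;

	while (*p) {
		size_t plen;

		p += strspn(p, " \t\n");	/* skip separators */
		plen = strcspn(p, " \t\n");	/* length of this parameter */
		if (plen > klen && strncmp(p, key, klen) == 0 &&
		    list_has(p + klen, plen - klen, name))
			return true;
		p += plen;
	}
	return false;
}

/* Hypothetical helper: check the running kernel's command line. */
bool psi_cgroup_accounting_disabled(void)
{
	char buf[4096];
	size_t n;
	FILE *f = fopen("/proc/cmdline", "r");

	if (!f)
		return false;	/* cannot tell; assume not disabled */
	n = fread(buf, 1, sizeof(buf) - 1, f);
	fclose(f);
	buf[n] = '\0';
	return cgroup_feature_disabled(buf, "pressure");
}
```

A monitoring agent could call psi_cgroup_accounting_disabled() at startup
and fall back to the system-wide files under /proc/pressure/ when it
returns true, since the per-cgroup PSI nodes would then be absent.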

Signed-off-by: Suren Baghdasaryan <surenb@google.com>
Acked-by: Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by: Johannes Weiner <hannes@cmpxchg.org>
Signed-off-by: Tejun Heo <tj@kernel.org>
Documentation/admin-guide/kernel-parameters.txt
include/linux/cgroup-defs.h
include/linux/cgroup.h
kernel/cgroup/cgroup.c
kernel/sched/psi.c