ring-buffer: Make non-consuming read less expensive with lots of cpus.
author David Miller <davem@davemloft.net>
Tue, 20 Apr 2010 22:47:11 +0000 (15:47 -0700)
committer Steven Rostedt <rostedt@goodmis.org>
Tue, 27 Apr 2010 17:06:35 +0000 (13:06 -0400)
commit 72c9ddfd4c5bf54ef03cfdf57026416cb678eeba
tree bd2c2b6b411975a8219d7138ba7699ee5d324e77
parent 62b915f1060996a8e1f69be50e3b8e9e43b710cb

When performing a non-consuming read, a synchronize_sched() is
performed once for every cpu which is actively tracing.

This is very expensive: with lots of cpus, opening the 'trace' file
can take several seconds.

Only one synchronize_sched() call is actually necessary.  What is
desired is for all cpus to see the disabling state change.  So we
transform the existing sequence:

	for_each_cpu() {
		ring_buffer_read_start();
	}

where each ring_buffer_read_start() call performs a synchronize_sched(),
into the following:

	for_each_cpu() {
		ring_buffer_read_prepare();
	}
	ring_buffer_read_prepare_sync();
	for_each_cpu() {
		ring_buffer_read_start();
	}

wherein only the single ring_buffer_read_prepare_sync() call needs to
do the synchronize_sched().

The first phase, via ring_buffer_read_prepare(), allocates the 'iter'
memory and increments ->record_disabled.

In the second phase, ring_buffer_read_prepare_sync() makes sure this
->record_disabled state is fully visible to all cpus.

In the third and final phase, the ring_buffer_read_start() calls reset
the 'iter' objects allocated in the first phase, since we now know that
none of the cpus are adding trace entries any more.
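
The three phases described above can be illustrated with a small
userspace sketch. It is not the kernel implementation: every mock_*
name, both structs, and the counter standing in for
synchronize_sched() are assumptions made up for illustration. It only
shows that allocation and disabling happen per cpu, the wait happens
once, and the iterator reset happens after the wait.

```c
#include <assert.h>
#include <stdlib.h>

#define MOCK_NR_CPUS 128	/* hypothetical cpu count */

/* Hypothetical per-cpu buffer state; not the kernel's structure. */
struct mock_cpu_buffer {
	int record_disabled;	/* writers back off while non-zero */
	unsigned long head;	/* where a fresh reader starts */
};

/* Hypothetical iterator; not the kernel's 'iter' layout. */
struct mock_iter {
	struct mock_cpu_buffer *cpu_buffer;
	unsigned long pos;
	int started;
};

static struct mock_cpu_buffer mock_buffers[MOCK_NR_CPUS];
static unsigned long mock_sync_calls;

/* Phase 1: allocate the iterator and disable recording; no waiting. */
struct mock_iter *mock_read_prepare(int cpu)
{
	struct mock_iter *iter = calloc(1, sizeof(*iter));

	if (!iter)
		return NULL;
	iter->cpu_buffer = &mock_buffers[cpu];
	iter->cpu_buffer->record_disabled++;
	return iter;
}

/* Phase 2: a single wait covers every buffer prepared so far. */
void mock_read_prepare_sync(void)
{
	mock_sync_calls++;	/* stand-in for synchronize_sched() */
}

/* Phase 3: recording is known to be off, so reset the iterator. */
void mock_read_start(struct mock_iter *iter)
{
	assert(iter && iter->cpu_buffer->record_disabled > 0);
	iter->pos = iter->cpu_buffer->head;
	iter->started = 1;
}

/*
 * Runs all three phases for cpus 0..nr_cpus-1 (nr_cpus must not
 * exceed MOCK_NR_CPUS) and returns how many waits were performed.
 */
unsigned long mock_open_trace(struct mock_iter **iters, int nr_cpus)
{
	int cpu;

	mock_sync_calls = 0;
	for (cpu = 0; cpu < nr_cpus; cpu++)
		iters[cpu] = mock_read_prepare(cpu);
	mock_read_prepare_sync();
	for (cpu = 0; cpu < nr_cpus; cpu++)
		if (iters[cpu])
			mock_read_start(iters[cpu]);
	return mock_sync_calls;
}
```

In this sketch, mock_open_trace() performs one wait regardless of
nr_cpus. Had the wait been placed inside mock_read_start(), it would
have been performed once per cpu.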

This makes opening the 'trace' file nearly instantaneous on a
sparc64 Niagara2 box with 128 cpus tracing.

Signed-off-by: David S. Miller <davem@davemloft.net>
LKML-Reference: <20100420.154711.11246950.davem@davemloft.net>
Signed-off-by: Steven Rostedt <rostedt@goodmis.org>
include/linux/ring_buffer.h
kernel/trace/ring_buffer.c
kernel/trace/trace.c