At some point, we have to process the data in each CPU buffer and enter
it into the main (event) buffer. The file buffer_sync.c contains the
relevant code. We periodically (currently every HZ/4 jiffies) start
the synchronisation process. In addition, we process the buffers on
certain events, such as an application calling munmap(). This is
particularly important for exit(): because the CPU buffers contain pointers
to the task structure, if we don't process all the buffers before the
task is actually destroyed and the task structure freed, then we could
end up trying to dereference a bogus pointer in one of the CPU buffers.

We also add a notification when a kernel module is loaded; this is so
that user-space can re-read /proc/modules to
determine the load addresses of kernel module text sections. Without
this notification, samples for a newly-loaded module could get lost or
be attributed to the wrong module.

The synchronisation itself works in the following manner: first, mutual
exclusion on the event buffer is taken. Remember, we do not need to do
that for each CPU buffer, as we only read from the tail iterator.
Interrupts may still arrive on the same buffer in the meantime, but they
write at the position of the head iterator, leaving previously written
entries intact. Then, we process each CPU buffer in turn. A CPU switch
notification is added to the event buffer first (for --separate=cpu
support). Then the processing of the actual data starts.
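
The pass described above can be sketched in user-space C. This is only an illustration, not the code in buffer_sync.c: the names sync_all(), add_event(), cpu_bufs and the CPU_SWITCH marker value are hypothetical, a pthread mutex stands in for the kernel's lock, and the per-CPU buffers are simplified to flat arrays.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define NR_CPUS     2        /* hypothetical CPU count */
#define CPU_SAMPLES 4        /* hypothetical per-CPU capacity */
#define EVENT_MAX   64       /* hypothetical event buffer size */
#define CPU_SWITCH  (~0UL)   /* hypothetical marker value */

struct cpu_buf {
    unsigned long samples[CPU_SAMPLES];
    size_t count;
};

static struct cpu_buf cpu_bufs[NR_CPUS];
static unsigned long event_buf[EVENT_MAX];
static size_t event_pos;
static pthread_mutex_t event_mutex = PTHREAD_MUTEX_INITIALIZER;

/* Append one word to the event buffer. */
static void add_event(unsigned long v)
{
    assert(event_pos < EVENT_MAX);
    event_buf[event_pos++] = v;
}

/* One synchronisation pass: lock the event buffer once, then visit
 * every CPU buffer in turn. */
static void sync_all(void)
{
    pthread_mutex_lock(&event_mutex);
    for (int cpu = 0; cpu < NR_CPUS; cpu++) {
        struct cpu_buf *b = &cpu_bufs[cpu];

        /* The CPU switch notification precedes that CPU's data. */
        add_event(CPU_SWITCH);
        add_event((unsigned long)cpu);

        for (size_t i = 0; i < b->count; i++)
            add_event(b->samples[i]);
        b->count = 0;
    }
    pthread_mutex_unlock(&event_mutex);
}
```

Only the event buffer is protected by the lock here; the per-CPU arrays are touched without one, mirroring the argument in the text that the reader and the interrupt-time writer work at different ends of each CPU buffer.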

As mentioned, the CPU buffer consists of task switch entries and the
actual samples. When the routine sync_buffer() sees a task switch, the
process ID and process group ID are recorded into the event buffer,
along with a dcookie (see below) identifying the application binary
(e.g. /bin/bash). The mmap_sem for the task is then taken, to allow safe
iteration across the task's list of mapped areas. Each sample is then
processed as described in the next section.
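
As a rough illustration of that dispatch, the following user-space sketch walks a list of entries and writes a task switch record or a sample into an event buffer. Everything in it is an assumption made for the example: the struct layouts, the CTX_SWITCH marker, the order of the recorded fields, and fake_cookie(), which is a toy string hash standing in for a real dcookie lookup.

```c
#include <assert.h>
#include <stddef.h>

enum entry_kind { ENTRY_TASK_SWITCH, ENTRY_SAMPLE };

/* Hypothetical stand-in for the kernel's task structure. */
struct task {
    int pid;
    int group_id;
    const char *binary;
};

struct cpu_entry {
    enum entry_kind kind;
    const struct task *task;   /* used by ENTRY_TASK_SWITCH */
    unsigned long pc;          /* used by ENTRY_SAMPLE */
};

#define EVENT_MAX  32
#define CTX_SWITCH (~0UL)      /* hypothetical marker value */

static unsigned long event_buf[EVENT_MAX];
static size_t event_pos;

static void add_event(unsigned long v)
{
    assert(event_pos < EVENT_MAX);
    event_buf[event_pos++] = v;
}

/* Toy cookie: a simple hash of the path, NOT a real dcookie. */
static unsigned long fake_cookie(const char *path)
{
    unsigned long h = 5381;
    while (*path)
        h = h * 33 + (unsigned char)*path++;
    return h;
}

static void process_entries(const struct cpu_entry *e, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        if (e[i].kind == ENTRY_TASK_SWITCH) {
            /* Record who is running from this point on. */
            add_event(CTX_SWITCH);
            add_event((unsigned long)e[i].task->pid);
            add_event((unsigned long)e[i].task->group_id);
            add_event(fake_cookie(e[i].task->binary));
            /* The kernel would take the task's mmap_sem here
             * before looking at its mapped areas. */
        } else {
            add_event(e[i].pc);
        }
    }
}
```

Because the task switch record is written before the samples that follow it, a reader of the event buffer can attribute each sample to the most recently announced task.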

After a buffer has been read, the tail iterator is updated to reflect
how much of the buffer was processed. Note that when we determined how
much data there was to read in the CPU buffer, we also called
cpu_buffer_reset() to reset last_task and last_is_kernel, as we've
already mentioned. During the processing, more samples may have been
arriving in the CPU buffer; this is OK because we are careful to advance
the tail iterator only by the amount we actually read. On the next
buffer synchronisation, we will start again from that point.
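
The snapshot-then-advance idea can be shown with a small ring buffer. This is a single-threaded sketch under stated assumptions: struct cpu_buffer here is a simplified stand-in, buf_write(), buf_available() and buf_consume() are hypothetical names, and a real implementation shared with an interrupt handler would also need appropriate memory ordering, which is omitted.

```c
#include <assert.h>
#include <stddef.h>

#define BUF_SIZE 8   /* hypothetical capacity; one slot is kept free */

struct cpu_buffer {
    unsigned long data[BUF_SIZE];
    size_t head;   /* next slot the writer fills */
    size_t tail;   /* next slot the reader consumes */
};

/* Writer side: store at head; drop the value if the buffer is full. */
static int buf_write(struct cpu_buffer *b, unsigned long v)
{
    size_t next = (b->head + 1) % BUF_SIZE;

    if (next == b->tail)
        return 0;
    b->data[b->head] = v;
    b->head = next;
    return 1;
}

/* Number of entries readable at this instant. */
static size_t buf_available(const struct cpu_buffer *b)
{
    return (b->head + BUF_SIZE - b->tail) % BUF_SIZE;
}

/* Reader side: consume exactly `count` entries (a snapshot taken
 * earlier) and advance tail by that amount only. The sum is a
 * stand-in for real per-entry processing. */
static unsigned long buf_consume(struct cpu_buffer *b, size_t count)
{
    unsigned long sum = 0;

    assert(count <= buf_available(b));
    for (size_t i = 0; i < count; i++)
        sum += b->data[(b->tail + i) % BUF_SIZE];
    b->tail = (b->tail + count) % BUF_SIZE;
    return sum;
}
```

If the reader snapshots buf_available() and further writes arrive before buf_consume() runs, those later entries stay in the buffer and are picked up by the next pass, since tail moves only past what was actually read.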