If OProfile supports the hardware performance counters found on
a particular architecture, the code that sets up and manages these
counters can be found in the kernel source tree in the relevant
arch/<arch>/oprofile/ directory (where <arch> is the architecture name).
The architecture-specific implementation works by
filling in the oprofile_operations structure at init time. This
provides a set of operations such as setup(), start(), stop(), etc.
that manage the hardware-specific details of fiddling with the
performance counter registers.
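
The operations-table pattern described above can be sketched in ordinary userspace C. This is only an illustration under stated assumptions: every name prefixed with demo_ is hypothetical, the "hardware" is a pair of flags, and the kernel's real oprofile_operations structure has further members not shown here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, simplified stand-in for the kernel's oprofile_operations
 * structure: a table of function pointers that arch code fills in. */
struct demo_profiler_ops {
    int  (*setup)(void);    /* program the counter registers */
    int  (*start)(void);    /* enable counting */
    void (*stop)(void);     /* disable counting */
    const char *cpu_type;   /* identifies the counter model */
};

/* Stand-ins for hardware state; a real backend would touch registers. */
static int demo_counters_programmed;
static int demo_counters_running;

static int demo_setup(void)
{
    demo_counters_programmed = 1;
    return 0;
}

static int demo_start(void)
{
    if (!demo_counters_programmed)
        return -1;          /* refuse to start before setup */
    demo_counters_running = 1;
    return 0;
}

static void demo_stop(void)
{
    demo_counters_running = 0;
}

/* Called once at init time: the arch code fills in the table. */
int demo_arch_init(struct demo_profiler_ops *ops)
{
    assert(ops != NULL);
    ops->setup = demo_setup;
    ops->start = demo_start;
    ops->stop = demo_stop;
    ops->cpu_type = "demo/fake-cpu";   /* made-up identifier */
    return 0;
}

/* Generic code only ever calls through the table, so it needs no
 * knowledge of how a given architecture drives its counters. */
int demo_run_session(const struct demo_profiler_ops *ops)
{
    assert(ops != NULL && ops->setup && ops->start && ops->stop);
    if (ops->setup() != 0)
        return -1;
    if (ops->start() != 0)
        return -1;
    ops->stop();
    return 0;
}
```

The point of the indirection is that the generic side stays identical across architectures; only the function that fills in the table changes.
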
The other important facility available to the architecture code is
oprofile_add_sample(). This is where a particular sample
taken at interrupt time is fed into the generic OProfile driver code.
OProfile implements a pseudo-filesystem known as "oprofilefs", mounted from
userspace at /dev/oprofile. This consists of small
files for reporting and receiving configuration from userspace, as well
as the actual character device that the OProfile userspace receives samples
from. At setup() time, the architecture-specific code may
add further configuration files related to the details of the performance
counters. For example, on x86, one numbered directory for each hardware
performance counter is added, with files in each for the event type,
reset value, etc.
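
To make the per-counter layout concrete, here is a small sketch of how a userspace tool might build the path of one such configuration file. The helper demo_counter_path() is hypothetical, and the file name "event" used in the usage note is an assumption for illustration; the files actually present depend on the architecture code.

```c
#include <assert.h>
#include <stdio.h>

/* Build "<mount>/<counter>/<file>" into out.
 * Returns 0 on success, -1 if the result does not fit in outlen bytes.
 * Hypothetical helper, not part of OProfile. */
int demo_counter_path(char *out, size_t outlen, const char *mount,
                      unsigned counter, const char *file)
{
    int n;

    assert(out != NULL && mount != NULL && file != NULL);
    n = snprintf(out, outlen, "%s/%u/%s", mount, counter, file);
    if (n < 0 || (size_t)n >= outlen)
        return -1;          /* encoding error or truncated output */
    return 0;
}
```

For example, demo_counter_path(buf, sizeof buf, "/dev/oprofile", 0, "event") fills buf with "/dev/oprofile/0/event", the kind of path a tool would open to select the event for the first counter.
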
The filesystem also contains a stats directory with
a number of useful counters for various OProfile events.
The generic OProfile driver lives in drivers/oprofile/ and forms the core of
how OProfile works in the kernel. Its job is to take samples delivered
from the architecture-specific code (via oprofile_add_sample()),
and buffer this data, in a transformed form as described later, until releasing
the data to the userspace daemon via the /dev/oprofile/buffer
character device.
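
The producer/consumer relationship described above can be pictured with a minimal ring buffer. This is a sketch under loud assumptions: all demo_ names and the buffer capacity are hypothetical, there is no locking or per-CPU handling, and the transformation the real driver applies to the data is not modelled at all.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_BUF_SLOTS 8    /* hypothetical capacity; one slot stays empty */

struct demo_sample {
    unsigned long pc;       /* program counter at interrupt time */
    unsigned long event;    /* which counter fired */
};

struct demo_buffer {
    struct demo_sample slots[DEMO_BUF_SLOTS];
    size_t head;            /* next slot to write */
    size_t tail;            /* next slot to read */
    size_t lost;            /* samples dropped because the buffer was full */
};

void demo_buffer_init(struct demo_buffer *b)
{
    assert(b != NULL);
    b->head = b->tail = b->lost = 0;
}

/* Producer side: what the interrupt path would call for each sample.
 * Returns 0 if the sample was stored, -1 if it had to be dropped. */
int demo_add_sample(struct demo_buffer *b, unsigned long pc,
                    unsigned long event)
{
    size_t next = (b->head + 1) % DEMO_BUF_SLOTS;

    if (next == b->tail) {  /* full: count the loss rather than block */
        b->lost++;
        return -1;
    }
    b->slots[b->head].pc = pc;
    b->slots[b->head].event = event;
    b->head = next;
    return 0;
}

/* Consumer side: what a read on the character device would drain.
 * Copies at most max samples to out and returns how many were copied. */
size_t demo_read_samples(struct demo_buffer *b, struct demo_sample *out,
                         size_t max)
{
    size_t n = 0;

    while (n < max && b->tail != b->head) {
        out[n++] = b->slots[b->tail];
        b->tail = (b->tail + 1) % DEMO_BUF_SLOTS;
    }
    return n;
}
```

Dropping on overflow instead of waiting reflects the constraint that the producer runs at interrupt time and cannot sleep; counting the drops is the sort of information a statistics file could expose.
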
The OProfile userspace daemon's job is to take the raw data provided by the
kernel and write it to the disk. It takes the single data stream from the
kernel and logs sample data against a number of sample files (found in
$SESSION_DIR/samples/current/, by default located at
/var/lib/oprofile/samples/current/). For the benefit
of the "separate" functionality, the names/paths of these sample files
are mangled to reflect where the samples were from: this can include
thread IDs, the binary file path, the event type used, and more.
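
The idea of encoding a sample's origin in its file name can be sketched as follows. The layout used here (a "{root}" marker, then the binary path, then "<event>.<tid>") is a deliberately simplified, hypothetical scheme and not the daemon's actual naming format; demo_sample_filename() and the event name in the usage note are likewise made up.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build a sample file path of the hypothetical form
 *   <session_dir>/samples/current/{root}<binary_path>/<event>.<tid>
 * A negative tid means samples are not separated per thread, in which
 * case the last field is "all".
 * Returns 0 on success, -1 on invalid input or if out is too small. */
int demo_sample_filename(char *out, size_t outlen, const char *session_dir,
                         const char *binary_path, const char *event,
                         long tid)
{
    int n;

    assert(out != NULL && session_dir != NULL);
    assert(binary_path != NULL && event != NULL);

    /* Require an absolute binary path and a non-empty event name. */
    if (binary_path[0] != '/' || strlen(event) == 0)
        return -1;

    if (tid < 0)
        n = snprintf(out, outlen, "%s/samples/current/{root}%s/%s.all",
                     session_dir, binary_path, event);
    else
        n = snprintf(out, outlen, "%s/samples/current/{root}%s/%s.%ld",
                     session_dir, binary_path, event, tid);

    if (n < 0 || (size_t)n >= outlen)
        return -1;          /* encoding error or truncated output */
    return 0;
}
```

With session_dir "/var/lib/oprofile", binary "/usr/bin/foo", a placeholder event name "CYCLES" and thread ID 1234, this yields "/var/lib/oprofile/samples/current/{root}/usr/bin/foo/CYCLES.1234". Because the separation keys are part of the path, each combination of binary, event and thread lands in its own file.
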
After this final step from interrupt to disk file, the data is now persistent (that is, changes in the running of the system do not invalidate stored data). So the post-profiling tools can run on this data at any time (assuming the original binary files are still available and unchanged, naturally).