Overview
OProfile is an open source project that includes a statistical profiler for Linux systems, capable of profiling
all running code at low overhead. In version 0.9.9, an event counting tool, ocount, was added to the project.
OProfile is released under the GNU GPL. It has proven stable over a large number
of differing configurations; it is being used on machines ranging from laptops to
16-way NUMA-Q boxes. As always, there is no warranty.
For versions 0.9.7 and earlier, the profiler consisted of a kernel driver and a daemon for collecting sample data.
In version 0.9.8, with the introduction of operf, the legacy kernel driver/daemon method of collecting sample data
was deprecated in favor of profiling with the Linux Kernel Performance Events Subsystem (which requires kernel version 2.6.31 or higher).
As of version 1.0.0, the legacy profiler has been removed.
OProfile uses the CPU's hardware performance counters to profile a wide variety
of hardware events; the same counters can also be used for basic
time-spent profiling. All code is profiled: hardware and software interrupt handlers, kernel modules, the kernel,
shared libraries, and applications. Several post-profiling tools are available for turning profile data into
human-readable information.
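The idea of statistical, counter-driven sampling can be pictured with a small simulation. This is only a toy model under stated assumptions, not OProfile's implementation: the function name sample_profile, the symbol names, and the event counts are all hypothetical. A counter accumulates events, and each time it reaches the sampling period one sample is attributed to whatever code was running at that moment.

```python
from collections import Counter


def sample_profile(trace, events_per_sample):
    """Simulate counter-overflow sampling (toy model, not OProfile code).

    trace: iterable of (symbol, events) pairs, i.e. which code ran and how
           many counted events it generated while running.
    events_per_sample: the simulated counter "overflows" and records one
           sample each time this many events have accumulated.
    """
    if events_per_sample <= 0:
        raise ValueError("events_per_sample must be positive")
    hits = Counter()
    counter = 0
    for symbol, events in trace:
        counter += events
        # Every overflow inside this slice is charged to the running symbol.
        while counter >= events_per_sample:
            counter -= events_per_sample
            hits[symbol] += 1
    return hits


# Hypothetical workload: 70% / 25% / 5% of events in three symbols.
trace = [("memcpy", 700), ("parse_input", 250), ("do_irq", 50)] * 100
hits = sample_profile(trace, events_per_sample=130)

total = sum(hits.values())
for symbol, count in hits.most_common():
    print(f"{symbol:12s} {count:5d} {100.0 * count / total:6.2f}%")
```

The sample shares come out close to the true event shares without instrumenting the code itself. Note that a period that divides evenly into a repeating workload (1000 here) would charge every sample to the same symbol, which is one reason the choice of sampling period matters.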
Features
- Unobtrusive
No special recompilations, wrapper libraries or the like are necessary. Even debug symbols
(-g option to gcc) are not necessary unless you want to produce annotated source.
Kernel patches are usually unnecessary, except in cases where the running kernel may not yet support
some newer processor models.
- System-wide profiling
All code running on the system is profiled, enabling analysis of system performance. Note: Root
authority is required to do system-wide profiling.
- Single process profiling
Application developers will find the single process profiling feature very convenient since it
does not require root authority, and profile data is collected only for the specified process
(or command). This method has the added benefit of "following" fork/execs and collecting
profile information on those child processes as well.
- Event counting
OProfile can be used to count native hardware events occurring in
a given application, a set of processes or threads, a subset of active system processors, or
the entire system.
- Performance counter support
Enables collection of various low-level data, and association with particular sections
of code.
- Call-graph support
With a 2.6 kernel on x86 or ARM, OProfile can provide gprof-style call-graph
profiling data.
- Low overhead
OProfile has a typical overhead of 1-8%, dependent on sampling frequency and workload.
- Post-profile analysis
Profile data can be produced at function-level or instruction-level detail. Source trees
annotated with profile information can be created. A hit list of applications and functions
that take the most time across the whole system can be produced.
- System support
OProfile works across a range of CPUs, including the Intel range, AMD's Athlon and AMD64 processor ranges,
the Alpha, ARM, IBM PowerPC, and more.
OProfile will work against almost any 2.2, 2.4 or 2.6 kernel, and works on both UP and SMP
systems from desktops to the scariest NUMA-Q boxes. Note: As of version 0.9.8, only 2.6 kernels
are supported.
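The hit list described under post-profile analysis can be pictured as a simple aggregation over collected samples. The sketch below is an illustration only, not the code of OProfile's reporting tools: the helper hit_list, the (application, symbol) sample layout, and the sample data are all assumptions made for the example.

```python
from collections import Counter


def hit_list(samples, key):
    """Return (name, count, percent) rows, most frequently sampled first.

    samples: iterable of (application, symbol) tuples (hypothetical layout).
    key: function selecting what to group by, e.g. the application name.
    """
    counts = Counter(key(s) for s in samples)
    total = sum(counts.values())
    return [(name, n, 100.0 * n / total) for name, n in counts.most_common()]


# Hypothetical samples gathered across the whole system.
samples = (
    [("httpd", "parse_request")] * 5
    + [("httpd", "send_reply")] * 2
    + [("vmlinux", "schedule")] * 3
)

by_application = hit_list(samples, key=lambda s: s[0])
by_function = hit_list(samples, key=lambda s: s)

for name, count, percent in by_application:
    print(f"{count:6d} {percent:6.2f}% {name}")
for (app, symbol), count, percent in by_function:
    print(f"{count:6d} {percent:6.2f}% {app}:{symbol}")
```

Grouping the same samples by a different key gives the application-level or function-level view, so the level of detail is chosen at report time rather than at collection time.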
Example reports
You can see what sort of output OProfile can produce with the example reports.
History
The early versions of OProfile were developed as part credit for an M.Sc. in Computer Science. The
basic principles of the design were inspired by Compaq's DCPI profiler.
Don't speculate - benchmark.
- Dan Bernstein
2020/07/20