An example of extended feature implementation can be seen by examining the AMD Instruction-Based Sampling support.
Instruction-Based Sampling (IBS) is a new performance measurement technique available on AMD Family 10h processors. To enable IBS profiling, specify IBS performance events through the "--event=" option:
    opcontrol --event=IBS_FETCH_XXX:<count>:<um>:<kernel>:<user>
    opcontrol --event=IBS_OP_XXX:<count>:<um>:<kernel>:<user>
Note: The count and unitmask must be the same for all IBS fetch events, and likewise for all IBS op events.
IBS performance events are listed in opcontrol --list-events.
When users specify these events, opcontrol verifies them using ophelp, which
checks for the ext:ibs_fetch or ext:ibs_op tag in the
events/x86-64/family10/events file.
Then, it configures the driver interface (/dev/oprofile/ibs_fetch/... and
/dev/oprofile/ibs_op/...) and starts the OProfile daemon as follows.
    oprofiled \
        --ext-feature=ibs:\
        fetch:<IBS_FETCH_EVENT1>,<IBS_FETCH_EVENT2>,...,:<IBS fetch count>:<IBS Fetch um>|\
        op:<IBS_OP_EVENT1>,<IBS_OP_EVENT2>,...,:<IBS op count>:<IBS op um>
Here, the OProfile daemon parses the --ext-feature option and checks the
feature name ("ibs") before calling the initialization function to handle
the string containing IBS events, counts, and unitmasks.
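The first parsing step described above can be sketched in C as follows. This is only an illustration, assuming the option string has the "ibs:fetch:...|op:..." shape shown in the oprofiled command line; the names demo_feature_args(), demo_split_ibs_args(), struct demo_ibs_args, and DEMO_ARG_MAX are hypothetical and do not come from the OProfile sources.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_ARG_MAX 256  /* hypothetical buffer size for this sketch */

/* Hypothetical container for the two halves of the IBS argument string. */
struct demo_ibs_args {
    char fetch[DEMO_ARG_MAX];  /* "<events>,:<count>:<um>" for IBS fetch */
    char op[DEMO_ARG_MAX];     /* "<events>,:<count>:<um>" for IBS op */
};

/* Return the text after "<name>:" if opt names that feature, else NULL. */
static const char *demo_feature_args(const char *opt, const char *name)
{
    size_t n = strlen(name);

    if (strncmp(opt, name, n) != 0 || opt[n] != ':')
        return NULL;
    return opt + n + 1;
}

/* Split "fetch:...|op:..." into its groups; 0 on success, -1 on error. */
static int demo_split_ibs_args(const char *args, struct demo_ibs_args *out)
{
    out->fetch[0] = '\0';
    out->op[0] = '\0';

    while (*args != '\0') {
        size_t len = strcspn(args, "|");  /* length of the current group */
        size_t skip;
        char *dst;

        if (strncmp(args, "fetch:", 6) == 0) {
            dst = out->fetch;
            skip = 6;
        } else if (strncmp(args, "op:", 3) == 0) {
            dst = out->op;
            skip = 3;
        } else {
            return -1;  /* unknown group name */
        }
        if (len - skip >= DEMO_ARG_MAX)
            return -1;  /* group too long for the demo buffer */

        memcpy(dst, args + skip, len - skip);
        dst[len - skip] = '\0';

        args += len;
        if (*args == '|')
            args++;  /* step over the group separator */
    }
    return 0;
}
```

Each resulting group still holds the comma-separated event list followed by the count and unitmask; splitting those fields further is left out of the sketch.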
The daemon then stores each event in the IBS virtual-counter table
(struct opd_event ibs_vc[OP_MAX_IBS_COUNTERS]) and stores the event index
in the IBS Virtual Counter Index (VCI) map
(ibs_vci_map[OP_MAX_IBS_COUNTERS]), with the IBS event value as the map key.
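The relationship between the virtual-counter table and the VCI map might look roughly like the sketch below. It assumes that the map key is the event value offset from a base value so that it fits in the array; that offset, the constants, and every demo_ name are assumptions made for illustration, and struct demo_event is a simplified stand-in for struct opd_event.

```c
#include <assert.h>
#include <stddef.h>

#define DEMO_MAX_IBS_COUNTERS 16      /* hypothetical stand-in for OP_MAX_IBS_COUNTERS */
#define DEMO_IBS_EVENT_BASE   0xf000  /* hypothetical value of the first IBS event */

/* Simplified stand-in for struct opd_event. */
struct demo_event {
    unsigned long value;
    unsigned long count;
    unsigned long um;
};

static struct demo_event demo_vc[DEMO_MAX_IBS_COUNTERS];  /* virtual-counter table */
static int demo_vci_map[DEMO_MAX_IBS_COUNTERS];           /* event key -> table index */
static int demo_vc_used;

/* Mark every map slot as unused. */
static void demo_vc_reset(void)
{
    for (int i = 0; i < DEMO_MAX_IBS_COUNTERS; i++)
        demo_vci_map[i] = -1;
    demo_vc_used = 0;
}

/* Register an event; returns its virtual-counter index, or -1 on error. */
static int demo_vc_add(unsigned long value, unsigned long count, unsigned long um)
{
    unsigned long key;

    if (value < DEMO_IBS_EVENT_BASE)
        return -1;
    key = value - DEMO_IBS_EVENT_BASE;
    if (key >= DEMO_MAX_IBS_COUNTERS || demo_vc_used >= DEMO_MAX_IBS_COUNTERS)
        return -1;
    if (demo_vci_map[key] >= 0)
        return demo_vci_map[key];  /* already registered */

    demo_vc[demo_vc_used].value = value;
    demo_vc[demo_vc_used].count = count;
    demo_vc[demo_vc_used].um = um;
    demo_vci_map[key] = demo_vc_used;
    return demo_vc_used++;
}

/* Look up a registered event by its event value; NULL if not registered. */
static const struct demo_event *demo_vc_find(unsigned long value)
{
    unsigned long key;

    if (value < DEMO_IBS_EVENT_BASE)
        return NULL;
    key = value - DEMO_IBS_EVENT_BASE;
    if (key >= DEMO_MAX_IBS_COUNTERS || demo_vci_map[key] < 0)
        return NULL;
    return &demo_vc[demo_vci_map[key]];
}
```

The map makes the lookup from an event value to its table slot a constant-time array access, which matters because the lookup happens for every derived event.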
During a profile session, the OProfile daemon identifies IBS samples in the
event buffer using the codes "IBS_FETCH_CODE" or "IBS_OP_CODE". These codes
trigger the handlers code_ibs_fetch_sample() or code_ibs_op_sample(), which
are listed in the handler_t handlers[] vector in daemon/opd_trans.c. These
handlers are responsible for processing IBS samples and translating them
into IBS performance events.
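A code-indexed handler vector of this kind can be sketched as below. The enum values, struct demo_state, and all demo_ functions are hypothetical; the real code values and handler signatures are defined in the OProfile sources and are not reproduced here.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical code numbering, used only for this sketch. */
enum demo_code { DEMO_IBS_FETCH_CODE, DEMO_IBS_OP_CODE, DEMO_CODE_COUNT };

/* Minimal stand-in for the daemon's per-buffer translation state. */
struct demo_state {
    unsigned fetch_samples;
    unsigned op_samples;
};

typedef void (*demo_handler_t)(struct demo_state *);

/* Stand-ins for the IBS sample handlers: here they only count samples. */
static void demo_ibs_fetch_sample(struct demo_state *s) { s->fetch_samples++; }
static void demo_ibs_op_sample(struct demo_state *s)    { s->op_samples++; }

/* Handler vector, ordered to match enum demo_code. */
static const demo_handler_t demo_handlers[DEMO_CODE_COUNT] = {
    demo_ibs_fetch_sample,  /* DEMO_IBS_FETCH_CODE */
    demo_ibs_op_sample,     /* DEMO_IBS_OP_CODE */
};

/* Call the handler registered for a code; -1 if the code is unknown. */
static int demo_dispatch(unsigned long code, struct demo_state *s)
{
    if (code >= DEMO_CODE_COUNT || demo_handlers[code] == NULL)
        return -1;
    demo_handlers[code](s);
    return 0;
}
```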
Unlike traditional performance events, each IBS sample can yield
multiple IBS performance events. For each event that the user specifies,
a combination of bits from Model-Specific Registers (MSRs) is checked
against the bitmask defining the event. If the condition is met, the event
is recorded. The derivation logic is in the files daemon/opd_ibs_macro.h
and daemon/opd_ibs_trans.[h,c].
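The derivation step can be illustrated with a small table of mask-and-match rules applied to one register value. The rule table, bit positions, and event values below are invented for the example and do not reflect the actual IBS register layout or the macros in daemon/opd_ibs_macro.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical rule: record `event` when (reg & mask) == match. */
struct demo_rule {
    unsigned long event;  /* event value to record */
    uint64_t mask;        /* which register bits to inspect */
    uint64_t match;       /* required value of those bits */
};

/* Invented rules; real IBS events use different bits and values. */
static const struct demo_rule demo_rules[] = {
    { 0xf000, 0x0, 0x0 },  /* recorded for every sample */
    { 0xf001, 0x1, 0x1 },  /* bit 0 set */
    { 0xf002, 0x6, 0x2 },  /* bit 1 set and bit 2 clear */
    { 0xf003, 0x8, 0x8 },  /* bit 3 set */
};

#define DEMO_NRULES (sizeof(demo_rules) / sizeof(demo_rules[0]))

/* Write every event whose rule matches `reg` into out[]; return how many. */
static size_t demo_derive(uint64_t reg, const struct demo_rule *rules,
                          size_t nrules, unsigned long *out, size_t cap)
{
    size_t n = 0;

    for (size_t i = 0; i < nrules && n < cap; i++) {
        if ((reg & rules[i].mask) == rules[i].match)
            out[n++] = rules[i].event;
    }
    return n;
}
```

With these invented rules, a register value of 0x3 satisfies the first three rules, so a single sample produces three events.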
Traditionally, sample file information (odb_t) is stored in
struct sfile::odb_t file[OP_MAX_COUNTER]. Currently, OP_MAX_COUNTER is 8 on
non-Alpha systems and 20 on Alpha-based systems. The event index (the
counter number on which the event is configured) is used to access the
corresponding entry in the array. Unlike traditional performance events,
IBS does not use the actual counter registers
(i.e. /dev/oprofile/0,1,2,3).
Also, the number of performance events generated by IBS could be larger than
OP_MAX_COUNTER (currently up to 13 IBS-fetch and 46 IBS-op events).
Therefore, IBS requires a special data structure and sfile handlers
(struct opd_ext_sfile_handlers) for managing IBS sample files.
IBS sample-file information is stored in memory allocated by the handler
ibs_sfile_create(), which can be accessed through
struct sfile::odb_t * ext_files.
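The split between the fixed per-counter array and the separately allocated extended array might be sketched as follows. The structures are heavily simplified stand-ins, the capacity of 59 is simply the 13 fetch plus 46 op events mentioned above, and the demo_ functions are hypothetical; they are not the actual ibs_sfile_create() or the members of struct opd_ext_sfile_handlers.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define DEMO_MAX_COUNTER      8   /* per the text: 8 on non-Alpha systems */
#define DEMO_MAX_IBS_COUNTERS 59  /* assumed capacity: 13 fetch + 46 op events */

/* Stand-in for odb_t. */
struct demo_odb {
    int fd;
};

/* Stand-in for struct sfile: fixed array plus an extended array. */
struct demo_sfile {
    struct demo_odb files[DEMO_MAX_COUNTER];  /* traditional counters */
    struct demo_odb *ext_files;               /* allocated on demand for IBS */
};

/* Hypothetical create handler: allocate zeroed slots for the IBS files. */
static int demo_ext_create(struct demo_sfile *sf)
{
    sf->ext_files = calloc(DEMO_MAX_IBS_COUNTERS, sizeof(*sf->ext_files));
    return sf->ext_files ? 0 : -1;
}

/* Hypothetical get handler: return the slot for a virtual-counter index. */
static struct demo_odb *demo_ext_get(struct demo_sfile *sf, size_t vci)
{
    if (sf->ext_files == NULL || vci >= DEMO_MAX_IBS_COUNTERS)
        return NULL;
    return &sf->ext_files[vci];
}

/* Hypothetical close handler: release the extended array. */
static void demo_ext_close(struct demo_sfile *sf)
{
    free(sf->ext_files);
    sf->ext_files = NULL;
}
```

Keeping the extended array behind a pointer means the size of the fixed array does not have to grow for processors without IBS.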