5. Extended Feature Reference Implementation

5.1. Instruction-Based Sampling (IBS)

The AMD Instruction-Based Sampling (IBS) support serves as a reference example of how an extended feature is implemented.

5.1.1. IBS Initialization

Instruction-Based Sampling (IBS) is a new performance measurement technique available on AMD Family 10h processors. To enable IBS profiling, specify IBS performance events through the "--event=" option of opcontrol:

opcontrol --event=IBS_FETCH_XXX:<count>:<um>:<kernel>:<user>
opcontrol --event=IBS_OP_XXX:<count>:<um>:<kernel>:<user>

Note: * The count and unitmask must be the same for all IBS fetch events,
	and likewise for all IBS op events.

IBS performance events are listed in the output of opcontrol --list-events. When users specify these events, opcontrol verifies them using ophelp, which checks for the ext:ibs_fetch or ext:ibs_op tag in the events/x86-64/family10/events file. It then configures the driver interface (/dev/oprofile/ibs_fetch/... and /dev/oprofile/ibs_op/...) and starts the OProfile daemon as follows:

oprofiled \
    --ext-feature=ibs:\
	fetch:<IBS_FETCH_EVENT1>,<IBS_FETCH_EVENT2>,...,:<IBS fetch count>:<IBS Fetch um>|\
	op:<IBS_OP_EVENT1>,<IBS_OP_EVENT2>,...,:<IBS op count>:<IBS op um>

Here, the OProfile daemon parses the --ext-feature option and checks the feature name ("ibs") before calling the initialization function, which handles the string containing the IBS events, counts, and unitmasks. The initialization function stores each event in the IBS virtual-counter table (struct opd_event ibs_vc[OP_MAX_IBS_COUNTERS]) and stores the index of that entry in the IBS Virtual Counter Index (VCI) map (ibs_vci_map[OP_MAX_IBS_COUNTERS]), using the IBS event value as the map key.
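The table-and-map bookkeeping described above can be illustrated with a minimal sketch. It is not the daemon's code: every sketch_* name, the numeric form of the event values, and the SKETCH_EVENT_BASE offset used to turn an event value into a map index are assumptions made for illustration. It parses a single group of the form "<ev1>,<ev2>,...,:<count>:<um>", which is the part that follows "fetch:" or "op:" in the option string.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-ins; these are not the OProfile definitions. */
#define SKETCH_MAX_IBS_COUNTERS 16
#define SKETCH_EVENT_BASE 0xf000UL	/* assumed lowest IBS event value */

struct sketch_event {
	unsigned long value;	/* IBS event value */
	unsigned long count;	/* count shared by the whole group */
	unsigned long um;	/* unitmask shared by the whole group */
};

/* Virtual-counter table and the event-value -> table-index map. */
static struct sketch_event sketch_vc[SKETCH_MAX_IBS_COUNTERS];
static int sketch_vci_map[SKETCH_MAX_IBS_COUNTERS];
static size_t sketch_nr_events;

static void sketch_reset(void)
{
	size_t i;

	sketch_nr_events = 0;
	for (i = 0; i < SKETCH_MAX_IBS_COUNTERS; i++)
		sketch_vci_map[i] = -1;	/* -1 marks "event not configured" */
}

/*
 * Parse "<ev1>,<ev2>,...,:<count>:<um>" and record every event.
 * Returns the number of events stored, or -1 on malformed input.
 * On failure, events parsed before the error remain in the table.
 */
static int sketch_parse_group(const char *group)
{
	const char *colon, *p, *um_start;
	char *end;
	unsigned long count, um;
	int stored = 0;

	assert(group != NULL);

	colon = strchr(group, ':');
	if (!colon)
		return -1;

	count = strtoul(colon + 1, &end, 0);
	if (end == colon + 1 || *end != ':')
		return -1;
	um_start = end + 1;
	um = strtoul(um_start, &end, 0);
	if (end == um_start)
		return -1;

	p = group;
	while (p < colon) {
		unsigned long value;

		if (*p == ',') {	/* separator, also tolerates a trailing comma */
			p++;
			continue;
		}
		value = strtoul(p, &end, 0);
		if (end == p)
			return -1;
		if (value < SKETCH_EVENT_BASE ||
		    value - SKETCH_EVENT_BASE >= SKETCH_MAX_IBS_COUNTERS ||
		    sketch_nr_events >= SKETCH_MAX_IBS_COUNTERS)
			return -1;

		sketch_vc[sketch_nr_events].value = value;
		sketch_vc[sketch_nr_events].count = count;
		sketch_vc[sketch_nr_events].um = um;
		/* The event value is the key; the table index is the payload. */
		sketch_vci_map[value - SKETCH_EVENT_BASE] = (int)sketch_nr_events;
		sketch_nr_events++;
		stored++;
		p = end;
	}
	return stored;
}
```

With this layout, a later lookup by event value is a single array access into the map, followed by an access into the table when the map entry is not -1.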

5.1.2. IBS Data Processing

During a profile session, the OProfile daemon identifies IBS samples in the event buffer by the codes "IBS_FETCH_CODE" and "IBS_OP_CODE". These codes trigger the handlers code_ibs_fetch_sample() and code_ibs_op_sample(), which are listed in the handler_t handlers[] vector in daemon/opd_trans.c. The handlers are responsible for processing IBS samples and translating them into IBS performance events.
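A handler vector indexed by code can be sketched as follows. This is only an illustration of the dispatch pattern: the enum values, the handler signature, and all sketch_* names are hypothetical and do not reproduce the definitions in daemon/opd_trans.c.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical code values; the real ones come from the OProfile sources. */
enum sketch_code {
	SKETCH_IBS_FETCH_CODE,
	SKETCH_IBS_OP_CODE,
	SKETCH_LAST_CODE
};

/* Minimal state, just enough to observe which handler ran. */
struct sketch_state {
	unsigned int fetch_samples;
	unsigned int op_samples;
};

typedef void (*sketch_handler_t)(struct sketch_state *);

static void sketch_ibs_fetch_sample(struct sketch_state *s)
{
	s->fetch_samples++;
}

static void sketch_ibs_op_sample(struct sketch_state *s)
{
	s->op_samples++;
}

/* Handler vector: the code read from the buffer is the array index. */
static const sketch_handler_t sketch_handlers[SKETCH_LAST_CODE] = {
	[SKETCH_IBS_FETCH_CODE] = sketch_ibs_fetch_sample,
	[SKETCH_IBS_OP_CODE] = sketch_ibs_op_sample,
};

/* Returns 0 when a handler was invoked, -1 for an unknown code. */
static int sketch_dispatch(struct sketch_state *s, unsigned long code)
{
	assert(s != NULL);

	if (code >= SKETCH_LAST_CODE || !sketch_handlers[code])
		return -1;
	sketch_handlers[code](s);
	return 0;
}
```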

Unlike a sample for a traditional performance event, a single IBS sample can yield multiple IBS performance events. For each event that the user specifies, a combination of bits from the Model-Specific Registers (MSRs) is checked against the bitmask that defines the event. If the condition is met, the event is recorded. The derivation logic is in the files daemon/opd_ibs_macro.h and daemon/opd_ibs_trans.[h,c].
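The one-sample-to-many-events derivation can be sketched as a loop over event definitions, each carrying a bitmask and the bit pattern it must match. The bit positions, event numbers, and all sketch_* names below are invented for illustration; the actual conditions are those in daemon/opd_ibs_macro.h.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bit positions in a sampled register value. */
#define SKETCH_BIT_COMPLETED	(UINT64_C(1) << 0)
#define SKETCH_BIT_ICACHE_MISS	(UINT64_C(1) << 1)
#define SKETCH_BIT_TLB_MISS	(UINT64_C(1) << 2)

struct sketch_ibs_event {
	unsigned long value;	/* event identifier (hypothetical) */
	uint64_t mask;		/* bits of the register to examine */
	uint64_t match;		/* required state of those bits */
};

/* Stands in for the set of events the user selected. */
static const struct sketch_ibs_event sketch_events[] = {
	{ 1, 0, 0 },	/* empty mask: matches every sample */
	{ 2, SKETCH_BIT_COMPLETED, SKETCH_BIT_COMPLETED },
	{ 3, SKETCH_BIT_COMPLETED | SKETCH_BIT_ICACHE_MISS,
	     SKETCH_BIT_COMPLETED | SKETCH_BIT_ICACHE_MISS },
	{ 4, SKETCH_BIT_TLB_MISS, SKETCH_BIT_TLB_MISS },
};

#define SKETCH_NR_EVENTS (sizeof(sketch_events) / sizeof(sketch_events[0]))

/*
 * Check one sampled register value against every event definition.
 * Writes the values of the matching events to out[] (at most out_len
 * entries) and returns how many were written.
 */
static size_t sketch_derive(uint64_t msr, unsigned long *out, size_t out_len)
{
	size_t i, n = 0;

	assert(out != NULL);

	for (i = 0; i < SKETCH_NR_EVENTS && n < out_len; i++) {
		if ((msr & sketch_events[i].mask) == sketch_events[i].match)
			out[n++] = sketch_events[i].value;
	}
	return n;
}
```

A sample with the hypothetical "completed" and "icache miss" bits set matches events 1, 2, and 3 at once, which is the sense in which one IBS sample produces several performance events.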

5.1.3. IBS Sample File

Traditionally, sample file information (odb_t) is stored in struct sfile::odb_t file[OP_MAX_COUNTER]. Currently, OP_MAX_COUNTER is 8 on non-alpha systems and 20 on alpha-based systems. The event index (the number of the counter on which the event is configured) is used to access the corresponding entry in the array. Unlike traditional performance events, IBS does not use the actual counter registers (i.e. /dev/oprofile/0,1,2,3). Also, the number of performance events generated by IBS can be larger than OP_MAX_COUNTER (currently up to 13 IBS-fetch and 46 IBS-op events). Therefore, IBS requires a special data structure and sfile handlers (struct opd_ext_sfile_handlers) for managing IBS sample files. IBS sample file information is stored in memory allocated by the handler ibs_sfile_create() and is accessed through struct sfile::odb_t * ext_files.
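The difference between the fixed per-counter array and the separately allocated IBS storage can be sketched as below. The types and sketch_* functions are simplified stand-ins, not the daemon's struct sfile or its handlers; the table size of 59 is simply the 13 fetch plus 46 op events mentioned above, and the non-alpha value of 8 is assumed for the counter limit.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_MAX_COUNTER	8	/* non-alpha value quoted in the text */
#define SKETCH_MAX_IBS_EVENTS	59	/* 13 IBS-fetch + 46 IBS-op events */

/* Stand-in for odb_t; the real type describes an open sample file. */
typedef struct {
	int opened;
} sketch_odb_t;

struct sketch_sfile {
	sketch_odb_t files[SKETCH_MAX_COUNTER];	/* traditional events */
	sketch_odb_t *ext_files;		/* IBS events, allocated on demand */
};

static void sketch_sfile_init(struct sketch_sfile *sf)
{
	assert(sf != NULL);
	memset(sf, 0, sizeof(*sf));
}

/* Plays the role of the "create" handler: allocate the IBS file table. */
static int sketch_ibs_sfile_create(struct sketch_sfile *sf)
{
	assert(sf != NULL);
	sf->ext_files = calloc(SKETCH_MAX_IBS_EVENTS, sizeof(*sf->ext_files));
	return sf->ext_files ? 0 : -1;
}

/* Release the IBS file table again. */
static void sketch_ibs_sfile_destroy(struct sketch_sfile *sf)
{
	assert(sf != NULL);
	free(sf->ext_files);
	sf->ext_files = NULL;
}

/*
 * Look up the file slot for an event. Traditional events index the
 * fixed array by counter number; IBS events index the extended table.
 * Returns NULL when the index is out of range or the table is missing.
 */
static sketch_odb_t *sketch_get_file(struct sketch_sfile *sf,
				     unsigned int idx, int is_ibs)
{
	assert(sf != NULL);

	if (is_ibs) {
		if (!sf->ext_files || idx >= SKETCH_MAX_IBS_EVENTS)
			return NULL;
		return &sf->ext_files[idx];
	}
	if (idx >= SKETCH_MAX_COUNTER)
		return NULL;
	return &sf->files[idx];
}
```

Keeping the IBS table behind a pointer means that sample files without IBS events carry only the cost of one pointer instead of 59 additional slots.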