Intel P4 with hyperthreading (2 logical processors) events

This is a list of the performance counter event types for all P4-core CPUs with 2 logical processors per physical package. Please see the Intel Architecture 32 Family Developer's Manual, Volume 3, Appendix A. OProfile uses synthesized events and doesn't provide low-level access to the P4 hardware, so the Intel manual is useful mainly for people trying to add new events to OProfile rather than for end users.

| Name | Description | Counters usable | Unit mask options |
|---|---|---|---|
| GLOBAL_POWER_EVENTS | time during which processor is not stopped | 0 | 0x01: mandatory |
| BRANCH_RETIRED | retired branches | 3 | 0x01: branch not-taken predicted |
| | | | 0x02: branch not-taken mispredicted |
| | | | 0x04: branch taken predicted |
| | | | 0x08: branch taken mispredicted |
| MISPRED_BRANCH_RETIRED | retired mispredicted branches | 3 | 0x01: retired instruction is non-bogus |
| BPU_FETCH_REQUEST | instruction fetch requests from the branch predict unit | 0 | 0x01: trace cache lookup miss |
| ITLB_REFERENCE | translations using the instruction translation lookaside buffer | 0 | 0x01: ITLB hit |
| | | | 0x02: ITLB miss |
| | | | 0x04: uncacheable ITLB hit |
| MEMORY_CANCEL | cancelled requests in data cache address control unit | 2 | 0x04: replayed because no store request buffer available |
| | | | 0x08: conflicts due to 64k aliasing |
| MEMORY_COMPLETE | completed split | 2 | 0x01: load split completed, excluding UC/WC loads |
| | | | 0x02: any split stores completed |
| | | | 0x04: uncacheable load split completed |
| | | | 0x08: uncacheable store split completed |
| LOAD_PORT_REPLAY | replayed events at the load port | 2 | 0x02: split load |
| STORE_PORT_REPLAY | replayed events at the store port | 2 | 0x02: split store |
| MOB_LOAD_REPLAY | replayed loads from the memory order buffer | 0 | 0x02: replay cause: unknown store address |
| | | | 0x08: replay cause: unknown store data |
| | | | 0x10: replay cause: partial overlap between load and store |
| | | | 0x20: replay cause: mismatched low 4 bits between load and store addr |
| BSQ_CACHE_REFERENCE | cache references seen by the bus unit | 0 | 0x01: read 2nd level cache hit shared |
| | | | 0x02: read 2nd level cache hit exclusive |
| | | | 0x04: read 2nd level cache hit modified |
| | | | 0x08: read 3rd level cache hit shared |
| | | | 0x10: read 3rd level cache hit exclusive |
| | | | 0x20: read 3rd level cache hit modified |
| | | | 0x100: read 2nd level cache miss |
| | | | 0x200: read 3rd level cache miss |
| | | | 0x400: writeback lookup from DAC misses 2nd level cache |
| X87_ASSIST | retired x87 instructions which required special handling | 3 | 0x01: handle FP stack underflow |
| | | | 0x02: handle FP stack overflow |
| | | | 0x04: handle x87 output overflow |
| | | | 0x08: handle x87 output underflow |
| | | | 0x10: handle x87 input assist |
| MACHINE_CLEAR | cycles with entire machine pipeline cleared | 3 | 0x01: count a portion of cycles the machine is cleared for any cause |
| | | | 0x04: count each time the machine is cleared due to memory ordering issues |
| | | | 0x40: count each time the machine is cleared due to self modifying code |
| TC_MS_XFER | number of times uop delivery changed from TC to MS ROM | 1 | 0x01: count TC to MS transfers |
| UOP_QUEUE_WRITES | number of valid uops written to the uop queue | 1 | 0x01: count uops written to queue from TC build mode |
| | | | 0x02: count uops written to queue from TC deliver mode |
| | | | 0x04: count uops written to queue from microcode ROM |
| INSTR_RETIRED | retired instructions | 3 | 0x01: count non-bogus instructions which are not tagged |
| | | | 0x02: count non-bogus instructions which are tagged |
| | | | 0x04: count bogus instructions which are not tagged |
| | | | 0x08: count bogus instructions which are tagged |
| UOPS_RETIRED | retired uops | 3 | 0x01: count marked uops which are non-bogus |
| | | | 0x02: count marked uops which are bogus |
| UOP_TYPE | type of uop tagged by front-end tagging | 3 | 0x02: count uops which are load operations |
| | | | 0x04: count uops which are store operations |
| RETIRED_MISPRED_BRANCH_TYPE | retired mispredicted branches, selected by type | 1 | 0x01: count unconditional jumps |
| | | | 0x02: count conditional jumps |
| | | | 0x04: count call branches |
| | | | 0x08: count return branches |
| | | | 0x10: count indirect jumps |
| RETIRED_BRANCH_TYPE | retired branches, selected by type | 1 | 0x01: count unconditional jumps |
| | | | 0x02: count conditional jumps |
| | | | 0x04: count call branches |
| | | | 0x08: count return branches |
| | | | 0x10: count indirect jumps |

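
The unit mask values listed for each event are distinct powers of two, which suggests they are bit flags that can be OR'ed together to count several conditions at once. The following is a minimal sketch under that assumption, using the BRANCH_RETIRED masks from the table. The helper names (`combine_masks`, `describe_mask`, `event_spec`) are hypothetical, the sample count of 100000 is arbitrary, and the `name:count:unitmask` string is assumed to follow the form taken by opcontrol's `--event` option.

```python
# Unit mask bits for BRANCH_RETIRED, copied from the table above.
BRANCH_RETIRED_MASKS = {
    0x01: "branch not-taken predicted",
    0x02: "branch not-taken mispredicted",
    0x04: "branch taken predicted",
    0x08: "branch taken mispredicted",
}


def combine_masks(*bits):
    """OR individual unit mask bits into a single value (assumes bit flags)."""
    mask = 0
    for bit in bits:
        if bit not in BRANCH_RETIRED_MASKS:
            raise ValueError(f"unknown unit mask bit: {bit:#x}")
        mask |= bit
    return mask


def describe_mask(mask):
    """List the descriptions of every known bit set in the mask."""
    return [desc for bit, desc in sorted(BRANCH_RETIRED_MASKS.items()) if mask & bit]


def event_spec(name, count, mask):
    """Format an event as name:count:unitmask (assumed opcontrol-style string)."""
    return f"{name}:{count}:0x{mask:02x}"


# Count all taken branches, whether predicted or mispredicted.
taken = combine_masks(0x04, 0x08)
print(event_spec("BRANCH_RETIRED", 100000, taken))
print(describe_mask(taken))
```

If the assumption holds, the resulting string could be passed as the argument to `--event=`. Check the unit mask type OProfile reports for the event before relying on a combined value.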
Measurement is a crucial component of performance improvement since reasoning and intuition are fallible guides and must be supplemented with tools like timing commands and profilers. - The Practice of Programming, Brian W. Kernighan and Rob Pike
2020/07/20