This is a list of all Intel Westmere microarchitecture performance counter event types. For details, see the Intel Architecture Developer's Manual Volume 3B, Appendix A, and the Intel Architecture Optimization Reference Manual (730795-001).
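Event names and unit masks from the table below are normally handed to a profiler as one specification string. The following is a minimal sketch, assuming OProfile's colon-separated `name:count:unitmask` form as accepted by `operf --events`; the helper `event_spec` and the sample counts used with it are hypothetical and do not come from this list.

```python
def event_spec(name, count, unit_mask=None):
    """Build an event specification string of the assumed form
    name:count[:unitmask].

    unit_mask may be an integer (rendered as hex) or a mask name such as
    "walk_cycles"; when omitted, only name:count is produced.
    """
    if count <= 0:
        raise ValueError("sample count must be positive")
    fields = [name, str(count)]
    if unit_mask is not None:
        if isinstance(unit_mask, str):
            # Named unit mask, copied as-is from the "(name=...)" column.
            fields.append(unit_mask)
        else:
            # Numeric unit mask, rendered as two hex digits (0x04, 0xaa, ...).
            fields.append(f"{unit_mask:#04x}")
    return ":".join(fields)


# The count 200000 is an arbitrary illustrative value, not a recommended one.
print(event_spec("DTLB_LOAD_MISSES", 200000, 0x04))
print(event_spec("L2_RQSTS", 200000, "ld_miss"))
```

Minimum sample counts differ per event, so check the profiler's own event listing before reusing the values above.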
Name | Description | Counters usable | Unit mask options |
CPU_CLK_UNHALTED | Clock cycles when not halted | all | |
UNHALTED_REFERENCE_CYCLES | Unhalted reference cycles | all |
0x01: No unit mask
|
INST_RETIRED | Number of instructions retired | all | |
LLC_MISSES | Last level cache demand requests from this core that missed the LLC | all |
0x41: No unit mask
|
LLC_REFS | Last level cache demand requests from this core | all |
0x4f: No unit mask
|
BR_INST_RETIRED | Number of branch instructions retired | all | |
BR_MISS_PRED_RETIRED | Number of mispredicted branches retired (precise) | all | |
LOAD_BLOCK | Loads that partially overlap an earlier store | all |
0x02: No unit mask
|
SB_DRAIN | All Store buffer stall cycles | all |
0x07: No unit mask
|
MISALIGN_MEM_REF | Misaligned store references | all |
0x02: No unit mask
|
STORE_BLOCKS | Loads delayed with at-Retirement block code | all |
0x04: (name=at_ret) Loads delayed with at-Retirement block code
0x08: (name=l1d_block) Cacheable loads delayed with L1D block code |
PARTIAL_ADDRESS_ALIAS | False dependencies due to partial address aliasing | all |
0x01: No unit mask
|
DTLB_LOAD_MISSES | DTLB load misses | all |
0x01: (name=any) DTLB load misses
0x02: (name=walk_completed) DTLB load miss page walks complete
0x04: (name=walk_cycles) DTLB load miss page walk cycles
0x10: (name=stlb_hit) DTLB second level hit
0x20: (name=pde_miss) DTLB load miss caused by low part of address
0x80: (name=large_walk_completed) DTLB load miss large page walks |
MEM_INST_RETIRED | Memory instructions retired above 0 clocks (Precise Event) | all |
0x01: (name=loads) Instructions retired which contains a load (Precise Event)
0x02: (name=stores) Instructions retired which contains a store (Precise Event) |
MEM_STORE_RETIRED | Retired stores that miss the DTLB (Precise Event) | all |
0x01: No unit mask
|
UOPS_ISSUED | Uops issued | all |
0x01: (name=any) Uops issued
0x02: (name=fused) Fused Uops issued |
MEM_UNCORE_RETIRED | Load instructions retired that HIT modified data in sibling core (Precise Event) | all |
0x02: (name=local_hitm) Load instructions retired that HIT modified data in sibling core (Precise Event)
0x04: (name=remote_hitm) Retired loads that hit remote socket in modified state (Precise Event)
0x08: (name=local_dram_and_remote_cache_hit) Load instructions retired local dram and remote cache HIT data sources (Precise Event)
0x10: (name=remote_dram) Load instructions retired remote DRAM and remote home-remote cache HITM (Precise Event)
0x80: (name=uncacheable) Load instructions retired IO (Precise Event) |
FP_COMP_OPS_EXE | MMX Uops | all |
0x01: (name=x87) Computational floating-point operations executed
0x02: (name=mmx) MMX Uops
0x04: (name=sse_fp) SSE and SSE2 FP Uops
0x08: (name=sse2_integer) SSE2 integer Uops
0x10: (name=sse_fp_packed) SSE FP packed Uops
0x20: (name=sse_fp_scalar) SSE FP scalar Uops
0x40: (name=sse_single_precision) SSE* FP single precision Uops
0x80: (name=sse_double_precision) SSE* FP double precision Uops |
SIMD_INT_128 | 128 bit SIMD integer pack operations | all |
0x01: (name=packed_mpy) 128 bit SIMD integer multiply operations
0x02: (name=packed_shift) 128 bit SIMD integer shift operations
0x04: (name=pack) 128 bit SIMD integer pack operations
0x08: (name=unpack) 128 bit SIMD integer unpack operations
0x10: (name=packed_logical) 128 bit SIMD integer logical operations
0x20: (name=packed_arith) 128 bit SIMD integer arithmetic operations
0x40: (name=shuffle_move) 128 bit SIMD integer shuffle/move operations |
LOAD_DISPATCH | All loads dispatched | all |
0x01: (name=rs) Loads dispatched that bypass the MOB
0x02: (name=rs_delayed) Loads dispatched from stage 305
0x04: (name=mob) Loads dispatched from the MOB
0x07: (name=any) All loads dispatched |
ARITH | Cycles the divider is busy | all |
0x01: (name=cycles_div_busy) Cycles the divider is busy
0x02: (name=mul) Multiply operations executed |
INST_QUEUE_WRITES | Instructions written to instruction queue. | all |
0x01: No unit mask
|
INST_DECODED | Instructions that must be decoded by decoder 0 | all |
0x01: No unit mask
|
TWO_UOP_INSTS_DECODED | Two Uop instructions decoded | all |
0x01: No unit mask
|
INST_QUEUE_WRITE_CYCLES | Cycles instructions are written to the instruction queue | all |
0x01: No unit mask
|
LSD_OVERFLOW | Loops that can't stream from the instruction queue | all |
0x01: No unit mask
|
L2_RQSTS | L2 instruction fetch hits | all |
0x01: (name=ld_hit) L2 load hits
0x02: (name=ld_miss) L2 load misses
0x03: (name=loads) L2 requests
0x04: (name=rfo_hit) L2 RFO hits
0x08: (name=rfo_miss) L2 RFO misses
0x0c: (name=rfos) L2 RFO requests
0x10: (name=ifetch_hit) L2 instruction fetch hits
0x20: (name=ifetch_miss) L2 instruction fetch misses
0x30: (name=ifetches) L2 instruction fetches
0x40: (name=prefetch_hit) L2 prefetch hits
0x80: (name=prefetch_miss) L2 prefetch misses
0xaa: (name=miss) All L2 misses
0xc0: (name=prefetches) All L2 prefetches
0xff: (name=references) All L2 requests |
L2_DATA_RQSTS | All L2 data requests | all |
0x01: (name=demand_i_state) L2 data demand loads in I state (misses)
0x02: (name=demand_s_state) L2 data demand loads in S state
0x04: (name=demand_e_state) L2 data demand loads in E state
0x08: (name=demand_m_state) L2 data demand loads in M state
0x0f: (name=demand_mesi) L2 data demand requests
0x10: (name=prefetch_i_state) L2 data prefetches in the I state (misses)
0x20: (name=prefetch_s_state) L2 data prefetches in the S state
0x40: (name=prefetch_e_state) L2 data prefetches in E state
0x80: (name=prefetch_m_state) L2 data prefetches in M state
0xf0: (name=prefetch_mesi) All L2 data prefetches
0xff: (name=any) All L2 data requests |
L2_WRITE | L2 demand lock RFOs in E state | all |
0x01: (name=rfo_i_state) L2 demand store RFOs in I state (misses)
0x02: (name=rfo_s_state) L2 demand store RFOs in S state
0x08: (name=rfo_m_state) L2 demand store RFOs in M state
0x0e: (name=rfo_hit) All L2 demand store RFOs that hit the cache
0x0f: (name=rfo_mesi) All L2 demand store RFOs
0x10: (name=lock_i_state) L2 demand lock RFOs in I state (misses)
0x20: (name=lock_s_state) L2 demand lock RFOs in S state
0x40: (name=lock_e_state) L2 demand lock RFOs in E state
0x80: (name=lock_m_state) L2 demand lock RFOs in M state
0xe0: (name=lock_hit) All demand L2 lock RFOs that hit the cache
0xf0: (name=lock_mesi) All demand L2 lock RFOs |
L1D_WB_L2 | L1 writebacks to L2 in E state | all |
0x01: (name=i_state) L1 writebacks to L2 in I state (misses)
0x02: (name=s_state) L1 writebacks to L2 in S state
0x04: (name=e_state) L1 writebacks to L2 in E state
0x08: (name=m_state) L1 writebacks to L2 in M state
0x0f: (name=mesi) All L1 writebacks to L2 |
LONGEST_LAT_CACHE | Longest latency cache miss | all |
0x01: (name=miss) Longest latency cache miss
0x02: (name=reference) Longest latency cache reference |
CPU_CLK_UNHALTED | Reference base clock (133 MHz) cycles when thread is not halted (programmable counter) | all |
0x00: (name=thread_p) Cycles when thread is not halted (programmable counter)
0x01: (name=ref_p) Reference base clock (133 MHz) cycles when thread is not halted (programmable counter) |
DTLB_MISSES | DTLB misses | all |
0x01: (name=any) DTLB misses
0x02: (name=walk_completed) DTLB miss page walks
0x04: (name=walk_cycles) DTLB miss page walk cycles
0x10: (name=stlb_hit) DTLB first level misses but second level hit
0x20: (name=pde_miss) DTLB misses caused by low part of address
0x80: (name=large_walk_completed) DTLB miss large page walks |
LOAD_HIT_PRE | Load operations conflicting with software prefetches | 0, 1 |
0x01: No unit mask
|
L1D_PREFETCH | L1D hardware prefetch misses | 0, 1 |
0x01: (name=requests) L1D hardware prefetch requests
0x02: (name=miss) L1D hardware prefetch misses
0x04: (name=triggers) L1D hardware prefetch requests triggered |
EPT | Extended Page Table walk cycles | all |
0x10: No unit mask
|
L1D | L1D cache lines replaced in M state | 0, 1 |
0x01: (name=repl) L1 data cache lines allocated
0x02: (name=m_repl) L1D cache lines allocated in the M state
0x04: (name=m_evict) L1D cache lines replaced in M state
0x08: (name=m_snoop_evict) L1D snoop eviction of cache lines in M state |
L1D_CACHE_PREFETCH_LOCK_FB_HIT | L1D prefetch load lock accepted in fill buffer | 0, 1 |
0x01: No unit mask
|
OFFCORE_REQUESTS_OUTSTANDING | Outstanding offcore reads | 0 |
0x01: (name=demand_read_data) Outstanding offcore demand data reads
0x02: (name=demand_read_code) Outstanding offcore demand code reads
0x04: (name=demand_rfo) Outstanding offcore demand RFOs
0x08: (name=any_read) Outstanding offcore reads |
CACHE_LOCK_CYCLES | Cycles L1D locked | 0, 1 |
0x01: (name=l1d_l2) Cycles L1D and L2 locked
0x02: (name=l1d) Cycles L1D locked |
IO_TRANSACTIONS | I/O transactions | all |
0x01: No unit mask
|
L1I | L1I instruction fetch stall cycles | all |
0x01: (name=hits) L1I instruction fetch hits
0x02: (name=misses) L1I instruction fetch misses
0x03: (name=reads) L1I Instruction fetches
0x04: (name=cycles_stalled) L1I instruction fetch stall cycles |
LARGE_ITLB | Large ITLB hit | all |
0x01: No unit mask
|
ITLB_MISSES | ITLB miss | all |
0x01: (name=any) ITLB miss
0x02: (name=walk_completed) ITLB miss page walks
0x04: (name=walk_cycles) ITLB miss page walk cycles
0x80: (name=large_walk_completed) ITLB miss large page walks |
ILD_STALL | Any Instruction Length Decoder stall cycles | all |
0x01: (name=lcp) Length Change Prefix stall cycles
0x02: (name=mru) Stall cycles due to BPU MRU bypass
0x04: (name=iq_full) Instruction Queue full stall cycles
0x08: (name=regen) Regen stall cycles
0x0f: (name=any) Any Instruction Length Decoder stall cycles |
BR_INST_EXEC | Branch instructions executed | all |
0x01: (name=cond) Conditional branch instructions executed
0x02: (name=direct) Unconditional branches executed
0x04: (name=indirect_non_call) Indirect non call branches executed
0x07: (name=non_calls) All non call branches executed
0x08: (name=return_near) Indirect return branches executed
0x10: (name=direct_near_call) Unconditional call branches executed
0x20: (name=indirect_near_call) Indirect call branches executed
0x30: (name=near_calls) Call branches executed
0x40: (name=taken) Taken branches executed
0x7f: (name=any) Branch instructions executed |
BR_MISP_EXEC | Mispredicted branches executed | all |
0x01: (name=cond) Mispredicted conditional branches executed
0x02: (name=direct) Mispredicted unconditional branches executed
0x04: (name=indirect_non_call) Mispredicted indirect non call branches executed
0x07: (name=non_calls) Mispredicted non call branches executed
0x08: (name=return_near) Mispredicted return branches executed
0x10: (name=direct_near_call) Mispredicted non call branches executed
0x20: (name=indirect_near_call) Mispredicted indirect call branches executed
0x30: (name=near_calls) Mispredicted call branches executed
0x40: (name=taken) Mispredicted taken branches executed
0x7f: (name=any) Mispredicted branches executed |
RESOURCE_STALLS | Resource related stall cycles | all |
0x01: (name=any) Resource related stall cycles
0x02: (name=load) Load buffer stall cycles
0x04: (name=rs_full) Reservation Station full stall cycles
0x08: (name=store) Store buffer stall cycles
0x10: (name=rob_full) ROB full stall cycles
0x20: (name=fpcw) FPU control word write stall cycles
0x40: (name=mxcsr) MXCSR rename stall cycles
0x80: (name=other) Other Resource related stall cycles |
MACRO_INSTS_FUSED | Macro-fused instructions decoded | all |
0x01: No unit mask
|
BACLEAR_FORCE_IQ | Instruction queue forced BACLEAR | all |
0x01: No unit mask
|
LSD | Cycles when uops were delivered by the LSD | all |
0x01: No unit mask
|
ITLB_FLUSH | ITLB flushes | all |
0x01: No unit mask
|
OFFCORE_REQUESTS | All offcore requests | all |
0x01: (name=demand_read_data) Offcore demand data read requests
0x02: (name=demand_read_code) Offcore demand code read requests
0x04: (name=demand_rfo) Offcore demand RFO requests
0x08: (name=any_read) Offcore read requests
0x10: (name=any_rfo) Offcore RFO requests
0x40: (name=l1d_writeback) Offcore L1 data cache writebacks
0x80: (name=any) All offcore requests |
UOPS_EXECUTED | Cycles Uops executed on any port (core count) | all |
0x01: (name=port0) Uops executed on port 0
0x02: (name=port1) Uops executed on port 1
0x04: (name=port2_core) Uops executed on port 2 (core count)
0x08: (name=port3_core) Uops executed on port 3 (core count)
0x10: (name=port4_core) Uops executed on port 4 (core count)
0x1f: (name=core_active_cycles_no_port5) Cycles Uops executed on ports 0-4 (core count)
0x20: (name=port5) Uops executed on port 5
0x3f: (name=core_active_cycles) Cycles Uops executed on any port (core count)
0x40: (name=port015) Uops issued on ports 0, 1 or 5
0x80: (name=port234_core) Uops issued on ports 2, 3 or 4 |
OFFCORE_REQUESTS_SQ_FULL | Offcore requests blocked due to Super Queue full | all |
0x01: No unit mask
|
SNOOPQ_REQUESTS_OUTSTANDING | Outstanding snoop code requests | 0 |
0x01: (name=data) Outstanding snoop data requests
0x02: (name=invalidate) Outstanding snoop invalidate requests
0x04: (name=code) Outstanding snoop code requests |
SNOOPQ_REQUESTS | Snoop code requests | all |
0x01: (name=data) Snoop data requests
0x02: (name=invalidate) Snoop invalidate requests
0x04: (name=code) Snoop code requests |
OFFCORE_RESPONSE_ANY_DATA_0 | REQUEST = ANY_DATA read and RESPONSE = ANY_CACHE_DRAM | 2 |
0x01: No unit mask
|
SNOOP_RESPONSE | Thread responded HIT to snoop | all |
0x01: (name=hit) Thread responded HIT to snoop
0x02: (name=hite) Thread responded HITE to snoop
0x04: (name=hitm) Thread responded HITM to snoop |
OFFCORE_RESPONSE_ANY_DATA_1 | REQUEST = ANY_DATA read and RESPONSE = ANY_CACHE_DRAM | 1 |
0x01: No unit mask
|
INST_RETIRED | Instructions retired (Programmable counter and Precise Event) | all |
0x01: (name=any_p) Instructions retired (Programmable counter and Precise Event)
0x02: (name=x87) Retired floating-point operations (Precise Event)
0x04: (name=mmx) Retired MMX instructions (Precise Event) |
UOPS_RETIRED | Cycles Uops are being retired | all |
0x01: (name=active_cycles) Cycles Uops are being retired
0x02: (name=retire_slots) Retirement slots used (Precise Event)
0x04: (name=macro_fused) Macro-fused Uops retired (Precise Event) |
MACHINE_CLEARS | Cycles machine clear asserted | all |
0x01: (name=cycles) Cycles machine clear asserted
0x02: (name=mem_order) Execution pipeline restart due to Memory ordering conflicts
0x04: (name=smc) Self-Modifying Code detected |
BR_INST_RETIRED | Retired branch instructions (Precise Event) | all |
0x01: (name=conditional) Retired conditional branch instructions (Precise Event)
0x02: (name=near_call) Retired near call instructions (Precise Event)
0x04: (name=all_branches) Retired branch instructions (Precise Event) |
BR_MISP_RETIRED | Mispredicted retired branch instructions (Precise Event) | all |
0x01: (name=conditional) Mispredicted conditional retired branches (Precise Event)
0x02: (name=near_call) Mispredicted near retired calls (Precise Event)
0x04: (name=all_branches) Mispredicted retired branch instructions (Precise Event) |
SSEX_UOPS_RETIRED | SIMD Packed-Double Uops retired (Precise Event) | all |
0x01: (name=packed_single) SIMD Packed-Single Uops retired (Precise Event)
0x02: (name=scalar_single) SIMD Scalar-Single Uops retired (Precise Event)
0x04: (name=packed_double) SIMD Packed-Double Uops retired (Precise Event)
0x08: (name=scalar_double) SIMD Scalar-Double Uops retired (Precise Event)
0x10: (name=vector_integer) SIMD Vector Integer Uops retired (Precise Event) |
ITLB_MISS_RETIRED | Retired instructions that missed the ITLB (Precise Event) | all |
0x20: No unit mask
|
MEM_LOAD_RETIRED | Retired loads that miss the DTLB (Precise Event) | all |
0x01: (name=l1d_hit) Retired loads that hit the L1 data cache (Precise Event)
0x02: (name=l2_hit) Retired loads that hit the L2 cache (Precise Event)
0x04: (name=llc_unshared_hit) Retired loads that hit valid versions in the LLC cache (Precise Event)
0x08: (name=other_core_l2_hit_hitm) Retired loads that hit sibling core's L2 in modified or unmodified states (Precise Event)
0x10: (name=llc_miss) Retired loads that miss the LLC cache (Precise Event)
0x40: (name=hit_lfb) Retired loads that miss L1D and hit a previously allocated LFB (Precise Event)
0x80: (name=dtlb_miss) Retired loads that miss the DTLB (Precise Event) |
FP_MMX_TRANS | All Floating Point to and from MMX transitions | all |
0x01: (name=to_fp) Transitions from MMX to Floating Point instructions
0x02: (name=to_mmx) Transitions from Floating Point to MMX instructions
0x03: (name=any) All Floating Point to and from MMX transitions |
MACRO_INSTS | Instructions decoded | all |
0x01: No unit mask
|
UOPS_DECODED | Stack pointer instructions decoded | all |
0x01: (name=stall_cycles) Cycles no Uops are decoded
0x02: (name=ms_cycles_active) Uops decoded by Microcode Sequencer
0x04: (name=esp_folding) Stack pointer instructions decoded
0x08: (name=esp_sync) Stack pointer sync operations |
RAT_STALLS | All RAT stall cycles | all |
0x01: (name=flags) Flag stall cycles
0x02: (name=registers) Partial register stall cycles
0x04: (name=rob_read_port) ROB read port stalls cycles
0x08: (name=scoreboard) Scoreboard stall cycles
0x0f: (name=any) All RAT stall cycles |
SEG_RENAME_STALLS | Segment rename stall cycles | all |
0x01: No unit mask
|
ES_REG_RENAMES | ES segment renames | all |
0x01: No unit mask
|
UOP_UNFUSION | Uop unfusions due to FP exceptions | all |
0x01: No unit mask
|
BR_INST_DECODED | Branch instructions decoded | all |
0x01: No unit mask
|
BPU_MISSED_CALL_RET | Branch prediction unit missed call or return | all |
0x01: No unit mask
|
BACLEAR | BACLEAR asserted with bad target address | all |
0x01: (name=clear) BACLEAR asserted, regardless of cause
0x02: (name=bad_target) BACLEAR asserted with bad target address |
BPU_CLEARS | Early Branch Prediction Unit clears | all |
0x01: (name=early) Early Branch Prediction Unit clears
0x02: (name=late) Late Branch Prediction Unit clears |
L2_TRANSACTIONS | All L2 transactions | all |
0x01: (name=load) L2 Load transactions
0x02: (name=rfo) L2 RFO transactions
0x04: (name=ifetch) L2 instruction fetch transactions
0x08: (name=prefetch) L2 prefetch transactions
0x10: (name=l1d_wb) L1D writeback to L2 transactions
0x20: (name=fill) L2 fill transactions
0x40: (name=wb) L2 writeback to LLC transactions
0x80: (name=any) All L2 transactions |
L2_LINES_IN | L2 lines allocated | all |
0x02: (name=s_state) L2 lines allocated in the S state
0x04: (name=e_state) L2 lines allocated in the E state
0x07: (name=any) L2 lines allocated |
L2_LINES_OUT | L2 lines evicted | all |
0x01: (name=demand_clean) L2 lines evicted by a demand request
0x02: (name=demand_dirty) L2 modified lines evicted by a demand request
0x04: (name=prefetch_clean) L2 lines evicted by a prefetch request
0x08: (name=prefetch_dirty) L2 modified lines evicted by a prefetch request
0x0f: (name=any) L2 lines evicted |
SQ_MISC | Super Queue LRU hints sent to LLC | all |
0x04: (name=lru_hints) Super Queue LRU hints sent to LLC
0x10: (name=split_lock) Super Queue lock splits across a cache line |
SQ_FULL_STALL_CYCLES | Super Queue full stall cycles | all |
0x01: No unit mask
|
FP_ASSIST | X87 Floating point assists (Precise Event) | all |
0x01: (name=all) X87 Floating point assists (Precise Event)
0x02: (name=output) X87 Floating point assists for invalid output value (Precise Event)
0x04: (name=input) X87 Floating point assists for invalid input value (Precise Event) |
SIMD_INT_64 | SIMD integer 64 bit pack operations | all |
0x01: (name=packed_mpy) SIMD integer 64 bit packed multiply operations
0x02: (name=packed_shift) SIMD integer 64 bit shift operations
0x04: (name=pack) SIMD integer 64 bit pack operations
0x08: (name=unpack) SIMD integer 64 bit unpack operations
0x10: (name=packed_logical) SIMD integer 64 bit logical operations
0x20: (name=packed_arith) SIMD integer 64 bit arithmetic operations
0x40: (name=shuffle_move) SIMD integer 64 bit shuffle/move operations |
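For several events in the table, the composite unit masks are the bitwise OR of the single-bit masks: for L2_RQSTS, `loads` (0x03) is `ld_hit | ld_miss`, and `miss` (0xaa) covers the four `*_miss` bits. The sketch below checks this against the L2_RQSTS values listed above. The helper names `combine` and `decode` are hypothetical, and the pattern does not hold for every event (for example, `any` in RESOURCE_STALLS is 0x01, not an OR of the others), so treat it as an illustration rather than a general rule.

```python
from functools import reduce
from operator import or_

# Single-bit unit masks for L2_RQSTS, copied from the table.
L2_RQSTS = {
    "ld_hit": 0x01,
    "ld_miss": 0x02,
    "rfo_hit": 0x04,
    "rfo_miss": 0x08,
    "ifetch_hit": 0x10,
    "ifetch_miss": 0x20,
    "prefetch_hit": 0x40,
    "prefetch_miss": 0x80,
}


def combine(masks, *names):
    """OR together the named single-bit masks."""
    return reduce(or_, (masks[name] for name in names), 0)


def decode(masks, value):
    """List the single-bit mask names that are set in value."""
    return [name for name, bit in masks.items() if value & bit]


# Composite values from the table, rebuilt from their parts.
assert combine(L2_RQSTS, "ld_hit", "ld_miss") == 0x03          # loads
assert combine(L2_RQSTS, "rfo_hit", "rfo_miss") == 0x0C        # rfos
assert combine(L2_RQSTS, "prefetch_hit", "prefetch_miss") == 0xC0  # prefetches
assert combine(L2_RQSTS, *L2_RQSTS) == 0xFF                    # references

print(decode(L2_RQSTS, 0xAA))
# ['ld_miss', 'rfo_miss', 'ifetch_miss', 'prefetch_miss']
```

Decoding a mask this way is a quick check that a hand-built value selects the intended sub-events before it is passed to a profiler.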