Intel Sandy Bridge Microarchitecture events

This is a list of all Intel Sandy Bridge microarchitecture performance counter event types. For details, please see the Intel Architecture Developer's Manual Volume 3B, Appendix A, and the Intel Architecture Optimization Reference Manual (730795-001).

Name   Description   Counters usable   Unit mask options
CPU_CLK_UNHALTED Clock cycles when not halted all
UNHALTED_REFERENCE_CYCLES Unhalted reference cycles all 0x01: No unit mask
INST_RETIRED number of instructions retired all
LLC_MISSES Last level cache demand requests from this core that missed the LLC all 0x41: No unit mask
LLC_REFS Last level cache demand requests from this core all 0x4f: No unit mask
BR_INST_RETIRED number of branch instructions retired all
BR_MISS_PRED_RETIRED number of mispredicted branches retired (precise) all
ld_blocks blocked loads all 0x01: (name=data_unknown) blocked loads due to store buffer blocks with unknown data.
0x02: (name=store_forward) loads blocked by overlapping with store buffer that cannot be forwarded
0x08: (name=no_sr) This event counts the number of times that split load operations are temporarily blocked because all resources for handling the split accesses are in use.
0x10: (name=all_block) Number of cases where any load is blocked but has no DCU miss.
misalign_mem_ref Misaligned memory references all 0x01: (name=loads) Speculative cache-line split load uops dispatched to the L1D.
0x02: (name=stores) Speculative cache-line split Store-address uops dispatched to L1D
ld_blocks_partial Partial loads all 0x01: (name=address_alias) False dependencies in MOB due to partial compare on address
0x08: (name=all_sta_block) This event counts the number of times that load operations are temporarily blocked because of older stores, with addresses that are not yet known. A load operation may incur more than one block of this type.
dtlb_load_misses D-TLB misses all 0x01: (name=miss_causes_a_walk) Miss in all TLB levels causes a page walk of any page size (4K/2M/4M/1G)
0x02: (name=walk_completed) Miss in all TLB levels causes a page walk of any page size (4K/2M/4M/1G) that completes
0x04: (name=walk_duration) Cycles PMH is busy with this walk
0x10: (name=stlb_hit) First level miss but second level hit; no page walk.
int_misc Instruction decoder events all 0x40: (name=rat_stall_cycles) Cycles Resource Allocation Table (RAT) external stall is sent to Instruction Decode Queue (IDQ) for this thread.
0x03: (name=recovery_cycles) Number of cycles waiting to recover after a Nuke due to all cases other than JEClear.
0x03: (name=recovery_stalls_count) Edge applied to recovery_cycles, thus counts occurrences.
uops_issued Number of Uops issued all 0x01: (name=any) Number of Uops issued by the Resource Allocation Table (RAT) to the Reservation Station (RS)
0x01: (name=stall_cycles) cycles no uops issued by this thread.
arith Misc ALU events all 0x01: (name=fpu_div_active) Cycles that the divider is busy with any divide or sqrt operation.
0x01: (name=fpu_div) Number of times that the divider is activated, including INT, SIMD and FP.
insts_written_to_iq Number of instructions written to Instruction Queue (IQ) this cycle. all 0x01: No unit mask
l2_rqsts Requests from L2 cache all 0x01: (name=demand_data_rd_hit) Demand Data Read hit L2, no rejects
0x04: (name=rfo_hit) RFO requests that hit L2 cache
0x08: (name=rfo_miss) RFO requests that miss L2 cache
0x10: (name=code_rd_hit) L2 cache hits when fetching instructions, code reads.
0x20: (name=code_rd_miss) L2 cache misses when fetching instructions
0x40: (name=pf_hit) Requests from the L2 hardware prefetchers that hit L2 cache
0x80: (name=pf_miss) Requests from the L2 hardware prefetchers that miss L2 cache
0x03: (name=all_demand_data_rd) Any data read request to L2 cache
0x0c: (name=all_rfo) Any data RFO request to L2 cache
0x30: (name=all_code_rd) Any code read request to L2 cache
0xc0: (name=all_pf) Any L2 HW prefetch request to L2 cache
l2_store_lock_rqsts L2 cache store lock requests all 0x0f: (name=all) RFOs that access cache lines in any state
0x01: (name=miss) RFO (as a result of regular RFO or Lock request) miss cache - I state
0x04: (name=hit_e) RFO (as a result of regular RFO or Lock request) hits cache in E state
0x08: (name=hit_m) RFO (as a result of regular RFO or Lock request) hits cache in M state
l2_l1d_wb_rqsts writebacks from L1D to the L2 cache all 0x04: (name=hit_e) writebacks from L1D to L2 cache lines in E state
0x08: (name=hit_m) writebacks from L1D to L2 cache lines in M state
l1d_pend_miss Cycles with L1D load Misses outstanding. 2 0x01: (name=pending) Cycles with L1D load Misses outstanding.
0x01: (name=occurences) This event counts the number of occurrences of outstanding L1D misses.
dtlb_store_misses D-TLB store misses all 0x01: (name=miss_causes_a_walk) Miss in all TLB levels causes a page walk of any page size (4K/2M/4M/1G)
0x02: (name=walk_completed) Miss in all TLB levels causes a page walk of any page size (4K/2M/4M/1G) that completes
0x04: (name=walk_duration) Cycles PMH is busy with this walk
0x10: (name=stlb_hit) First level miss but second level hit; no page walk. Only relevant if multiple levels.
load_hit_pre Load dispatches that hit fill buffer all 0x01: (name=sw_pf) Load dispatches that hit fill buffer allocated for S/W prefetch.
0x02: (name=hw_pf) Load dispatches that hit fill buffer allocated for HW prefetch.
hw_pre_req Hardware Prefetch requests all 0x02: No unit mask
l1d L1D cache events all 0x01: (name=replacement) L1D Data line replacements.
0x02: (name=allocated_in_m) L1D M-state Data Cache Lines Allocated
0x04: (name=eviction) L1D M-state Data Cache Lines Evicted due to replacement (only)
0x08: (name=all_m_replacement) All Modified lines evicted out of L1D
partial_rat_stalls Partial RAT stalls all 0x20: (name=flags_merge_uop) Number of perf sensitive flags-merge uops added by Sandy Bridge u-arch.
0x40: (name=slow_lea_window) Number of cycles with at least 1 slow Load Effective Address (LEA) uop being allocated.
0x80: (name=mul_single_uop) Number of Multiply packed/scalar single precision uops allocated
0x20: (name=flags_merge_uop_cycles) Cycles with perf sensitive flags-merge uops added by SandyBridge u-arch.
resource_stalls2 Misc resource stalls all 0x40: (name=bob_full) Cycles Allocator is stalled due to Branch Order Buffer (BOB).
0x0f: (name=all_prf_control) Resource stalls2 control structures full for physical registers
0x0c: (name=all_fl_empty) Cycles in which either free list is empty
0x4f: (name=ooo_rsrc) Resource stalls2 control structures full Physical Register Reclaim Table (PRRT), Physical History Table (PHT), INT or SIMD Free List (FL), Branch Order Buffer (BOB)
cpl_cycles Unhalted core cycles in specific rings all 0x01: (name=ring0) Unhalted core cycles the Thread was in Rings 0.
0x01: (name=ring0_trans) Transitions from ring123 to Ring0.
0x02: (name=ring123) Unhalted core cycles the Thread was in Rings 1/2/3.
rs_events Events for the reservation station all 0x01: No unit mask
offcore_requests_outstanding Offcore outstanding transactions all 0x01: (name=demand_data_rd) Offcore outstanding Demand Data Read transactions in the SuperQueue (SQ), queue to uncore, every cycle. Includes L1D data hardware prefetches.
0x01: (name=cycles_with_demand_data_rd) cycles there are Offcore outstanding RD data transactions in the SuperQueue (SQ), queue to uncore.
0x02: (name=demand_code_rd) Offcore outstanding Code Reads transactions in the SuperQueue (SQ), queue to uncore, every cycle.
0x04: (name=demand_rfo) Offcore outstanding RFO (store) transactions in the SuperQueue (SQ), queue to uncore, every cycle.
0x08: (name=all_data_rd) Offcore outstanding all cacheable Core Data Read transactions in the SuperQueue (SQ), queue to uncore, every cycle.
0x08: (name=cycles_with_data_rd) Cycles there are Offcore outstanding all Data read transactions in the SuperQueue (SQ), queue to uncore, every cycle.
0x02: (name=cycles_with_demand_code_rd) Cycles with offcore outstanding Code Reads transactions in the SuperQueue (SQ), queue to uncore, every cycle.
0x04: (name=cycles_with_demand_rfo) Cycles with offcore outstanding demand RFO Reads transactions in the SuperQueue (SQ), queue to uncore, every cycle.
lock_cycles Cycles due to LOCK prefixes. all 0x01: (name=split_lock_uc_lock_duration) Cycles in which the L1D and L2 are locked, due to a UC lock or split lock
0x02: (name=cache_lock_duration) Cycles that the L1D is locked
idq Instruction Decode Queue events all 0x02: (name=empty) Cycles the Instruction Decode Queue (IDQ) is empty.
0x04: (name=mite_uops) Number of uops delivered to Instruction Decode Queue (IDQ) from MITE path.
0x08: (name=dsb_uops) Number of uops delivered to Instruction Decode Queue (IDQ) from Decode Stream Buffer (DSB) path.
0x10: (name=ms_dsb_uops) Number of Uops delivered into Instruction Decode Queue (IDQ) when MS_Busy, initiated by Decode Stream Buffer (DSB).
0x20: (name=ms_mite_uops) Number of Uops delivered into Instruction Decode Queue (IDQ) when MS_Busy, initiated by MITE.
0x30: (name=ms_uops) Number of Uops delivered into Instruction Decode Queue (IDQ) from MS, initiated by Decode Stream Buffer (DSB) or MITE.
0x30: (name=ms_cycles) Number of cycles that Uops were delivered into Instruction Decode Queue (IDQ) when MS_Busy, initiated by Decode Stream Buffer (DSB) or MITE.
0x04: (name=mite_cycles) Cycles MITE is active
0x08: (name=dsb_cycles) Cycles Decode Stream Buffer (DSB) is active
0x10: (name=ms_dsb_cycles) Cycles Decode Stream Buffer (DSB) Microcode Sequencer (MS) is active
0x10: (name=ms_dsb_occur) Occurrences of Decode Stream Buffer (DSB) Microcode Sequencer (MS) going active
0x18: (name=all_dsb_cycles_any_uops) Cycles Decode Stream Buffer (DSB) is delivering anything
0x18: (name=all_dsb_cycles_4_uops) Cycles Decode Stream Buffer (DSB) is delivering 4 Uops
0x24: (name=all_mite_cycles_any_uops) Cycles MITE is delivering anything
0x24: (name=all_mite_cycles_4_uops) Cycles MITE is delivering 4 Uops
0x3c: (name=mite_all_uops) Number of uops delivered to Instruction Decode Queue (IDQ) from any path.
icache Instruction cache events all 0x02: No unit mask
itlb_misses I-TLB misses all 0x01: (name=miss_causes_a_walk) Miss in all TLB levels causes a page walk of any page size (4K/2M/4M)
0x02: (name=walk_completed) Miss in all TLB levels causes a page walk of any page size (4K/2M/4M) that completes
0x04: (name=walk_duration) Cycles PMH is busy with this walk.
0x10: (name=stlb_hit) First level miss but second level hit; no page walk.
ild_stall Instruction decoding stalls all 0x01: (name=lcp) Stall "occurrences" due to length changing prefixes (LCP).
0x04: (name=iq_full) Stall cycles when instructions cannot be written because the Instruction Queue (IQ) is full.
br_inst_exec Branch instructions all 0xff: (name=all_branches) All branch instructions executed.
0x41: (name=nontaken_conditional) All macro conditional nontaken branch instructions.
0x81: (name=taken_conditional) All macro conditional taken branch instructions.
0x82: (name=taken_direct_jump) All macro unconditional taken branch instructions, excluding calls and indirects.
0x84: (name=taken_indirect_jump_non_call_ret) All taken indirect branches that are not calls nor returns.
0x88: (name=taken_indirect_near_return) All taken indirect branches that have a return mnemonic.
0x90: (name=taken_direct_near_call) All taken non-indirect calls.
0xa0: (name=taken_indirect_near_call) All taken indirect calls, including both register and memory indirect.
0xc1: (name=all_conditional) All macro conditional branch instructions.
0xc2: (name=all_direct_jmp) All macro unconditional branch instructions, excluding calls and indirects
0xc4: (name=all_indirect_jump_non_call_ret) All indirect branches that are not calls nor returns.
0xc8: (name=all_indirect_near_return) All indirect return branches.
0xd0: (name=all_direct_near_call) All non-indirect calls executed.
br_misp_exec Mispredicted branch instructions all 0xff: (name=all_branches) All mispredicted branch instructions executed.
0x41: (name=nontaken_conditional) All nontaken mispredicted macro conditional branch instructions.
0x81: (name=taken_conditional) All taken mispredicted macro conditional branch instructions.
0x84: (name=taken_indirect_jump_non_call_ret) All taken mispredicted indirect branches that are not calls nor returns.
0x88: (name=taken_return_near) All taken mispredicted indirect branches that have a return mnemonic.
0x90: (name=taken_direct_near_call) All taken mispredicted non-indirect calls.
0xa0: (name=taken_indirect_near_call) All taken mispredicted indirect calls, including both register and memory indirect.
0xc1: (name=all_conditional) All mispredicted macro conditional branch instructions.
0xc4: (name=all_indirect_jump_non_call_ret) All mispredicted indirect branches that are not calls nor returns.
0xd0: (name=all_direct_near_call) All mispredicted non-indirect calls
idq_uops_not_delivered uops not delivered to IDQ. all 0x01: (name=core) Count number of non-delivered uops to Resource Allocation Table (RAT).
0x01: (name=cycles_0_uops_deliv.core) Counts the cycles no uops were delivered
0x01: (name=cycles_le_1_uop_deliv.core) Counts the cycles in which 1 or fewer uops were delivered
0x01: (name=cycles_le_2_uop_deliv.core) Counts the cycles in which 2 or fewer uops were delivered
0x01: (name=cycles_le_3_uop_deliv.core) Counts the cycles in which 3 or fewer uops were delivered
0x01: (name=cycles_ge_1_uop_deliv.core) Cycles when 1 or more uops were delivered by the front end.
0x01: (name=cycles_fe_was_ok) Counts cycles FE delivered 4 uops or Resource Allocation Table (RAT) was stalling FE.
uops_dispatched_port Count on which ports uops are dispatched. all 0x01: (name=port_0) Cycles which a Uop is dispatched on port 0
0x02: (name=port_1) Cycles which a Uop is dispatched on port 1
0x04: (name=port_2_ld) Cycles which a load Uop is dispatched on port 2
0x08: (name=port_2_sta) Cycles which a STA Uop is dispatched on port 2
0x10: (name=port_3_ld) Cycles which a load Uop is dispatched on port 3
0x20: (name=port_3_sta) Cycles which a STA Uop is dispatched on port 3
0x40: (name=port_4) Cycles which a Uop is dispatched on port 4
0x80: (name=port_5) Cycles which a Uop is dispatched on port 5
0x0c: (name=port_2) Uops dispatched to port 2, loads and stores (speculative and retired)
0x30: (name=port_3) Uops dispatched to port 3, loads and stores (speculative and retired)
0x0c: (name=port_2_core) Uops dispatched to port 2, loads and stores per core (speculative and retired)
0x30: (name=port_3_core) Uops dispatched to port 3, loads and stores per core (speculative and retired)
resource_stalls Core resource stalls all 0x01: (name=any) Cycles Allocation is stalled due to Resource Related reason.
0x02: (name=lb) Cycles Allocator is stalled due to Load Buffer full
0x04: (name=rs) Stall due to no eligible Reservation Station (RS) entry available.
0x08: (name=sb) Cycles Allocator is stalled due to Store Buffer full (not including draining from synch).
0x10: (name=rob) ROB full cycles.
0x0e: (name=mem_rs) Resource stalls due to LB, SB or Reservation Station (RS) being completely in use
0xf0: (name=ooo_rsrc) Resource stalls due to ROB being full, FCSW, MXCSR and OTHER
0x0a: (name=lb_sb) Resource stalls due to load or store buffers
dsb2mite_switches Number of Decode Stream Buffer (DSB) to MITE switches all 0x01: (name=count) Number of Decode Stream Buffer (DSB) to MITE switches
0x02: (name=penalty_cycles) Decode Stream Buffer (DSB)-to-MITE switch true penalty cycles.
dsb_fill DSB fill events all 0x02: (name=other_cancel) Count number of times a valid DSB fill has been actually cancelled for any reason.
0x08: (name=exceed_dsb_lines) Decode Stream Buffer (DSB) Fill encountered > 3 Decode Stream Buffer (DSB) lines.
0x0a: (name=all_cancel) Count number of times a valid Decode Stream Buffer (DSB) fill has been actually cancelled for any reason.
itlb ITLB events all 0x01: No unit mask
offcore_requests Requests sent outside the core all 0x01: (name=demand_data_rd) Demand Data Read requests sent to uncore
0x02: (name=demand_code_rd) Offcore Code read requests. Includes Cacheable and Un-cacheables.
0x04: (name=demand_rfo) Offcore Demand RFOs. Includes regular RFO, Locks, ItoM.
0x08: (name=all_data_rd) Offcore Demand and prefetch data reads returned to the core.
uops_dispatched uops dispatched all 0x01: (name=thread) Counts total number of uops to be dispatched per-thread each cycle.
0x01: (name=stall_cycles) Counts number of cycles no uops were dispatched to be executed on this thread.
0x02: (name=core) Counts total number of uops dispatched from any thread
offcore_requests_buffer Offcore requests buffer events all 0x01: No unit mask
agu_bypass_cancel AGU bypass cancel all 0x01: No unit mask
tlb_flush TLB flushes all 0x01: (name=dtlb_thread) Count number of DTLB flushes of thread-specific entries.
0x20: (name=stlb_any) Count number of any STLB flushes
l1d_blocks L1D cache blocking events all 0x01: (name=ld_bank_conflict) Any dispatched loads cancelled due to DCU bank conflict
0x05: (name=bank_conflict_cycles) Cycles with l1d blocks due to bank conflicts
inst_retired Instructions retired 1 0x01: No unit mask
other_assists Instructions that needed an assist all 0x02: (name=itlb_miss_retired) Instructions that experienced an ITLB miss. Non Pebs
0x10: (name=avx_to_sse) Number of transitions from AVX-256 to legacy SSE when penalty applicable Non Pebs
0x20: (name=sse_to_avx) Number of transitions from legacy SSE to AVX-256 when penalty applicable Non Pebs
uops_retired uops that actually retired. all 0x01: (name=all) All uops that actually retired.
0x02: (name=retire_slots) number of retirement slots used non PEBS
0x01: (name=stall_cycles) Cycles no executable uops retired
0x01: (name=total_cycles) Number of cycles using always true condition applied to non PEBS uops retired event.
machine_clears Number of Machine Clears detected. all 0x02: (name=memory_ordering) Number of Memory Ordering Machine Clears detected.
0x04: (name=smc) Number of Self-modifying code (SMC) Machine Clears detected.
0x20: (name=maskmov) Number of AVX masked mov Machine Clears detected.
br_inst_retired Counts branch instructions retired all 0x01: (name=conditional) Counts all taken and not taken macro conditional branch instructions.
0x02: (name=near_call) Counts all macro direct and indirect near calls. non PEBS
0x08: (name=near_return) This event counts the number of near ret instructions retired.
0x10: (name=not_taken) Counts all not taken macro branch instructions retired.
0x20: (name=near_taken) Counts the number of near branch taken instructions retired.
0x40: (name=far_branch) Counts the number of far branch instructions retired.
0x04: (name=all_branches_ps) Counts all taken and not taken macro branches including far branches.(Precise Event)
0x02: (name=near_call_r3) Ring123 only near calls (non precise)
0x02: (name=near_call_r3_ps) Ring123 only near calls (precise event)
br_misp_retired Counts mispredicted branch instructions all 0x01: (name=conditional) All mispredicted macro conditional branch instructions.
0x02: (name=near_call) All macro direct and indirect near calls
0x10: (name=not_taken) number of branch instructions retired that were mispredicted and not-taken.
0x20: (name=taken) number of branch instructions retired that were mispredicted and taken.
0x04: (name=all_branches_ps) all macro branches (Precise Event)
fp_assist Counts floating point assists all 0x1e: (name=any) Counts any FP_ASSIST umask was incrementing.
0x02: (name=x87_output) output - Numeric Overflow, Numeric Underflow, Inexact Result
0x04: (name=x87_input) input - Invalid Operation, Denormal Operand, SNaN Operand
0x08: (name=simd_output) Any output SSE* FP Assist - Numeric Overflow, Numeric Underflow.
0x10: (name=simd_input) Any input SSE* FP Assist
hw_interrupts Number of hardware interrupts received by the processor. all 0x01: No unit mask
rob_misc_events Count ROB (Register Reorder Buffer) events. all 0x20: No unit mask
mem_trans_retired Count memory transactions 3 0x02: No unit mask
mem_uops_retired Count uops with memory accessed retired all 0x11: (name=stlb_miss_loads) STLB misses due to retired loads
0x12: (name=stlb_miss_stores) STLB misses due to retired stores
0x21: (name=lock_loads) Locked retired loads
0x41: (name=split_loads) Retired loads causing cacheline splits
0x42: (name=split_stores) Retired stores causing cacheline splits
0x81: (name=all_loads) Any retired loads
0x82: (name=all_stores) Any retired stores
mem_load_uops_retired Memory load uops. all 0x01: (name=l1_hit) Load hit in nearest-level (L1D) cache
0x02: (name=l2_hit) Load hit in mid-level (L2) cache
0x04: (name=llc_hit) Load hit in last-level (L3) cache with no snoop needed
0x40: (name=hit_lfb) A load missed L1D but hit the Fill Buffer
mem_load_uops_llc_hit_retired Memory load uops with LLC (Last level cache) hit all 0x01: (name=xsnp_miss) Load LLC Hit and a cross-core Snoop missed in on-pkg core cache
0x02: (name=xsnp_hit) Load LLC Hit and a cross-core Snoop hits in on-pkg core cache
0x04: (name=xsnp_hitm) Load had HitM Response from a core on same socket (shared LLC).
0x08: (name=xsnp_none) Load hit in last-level (L3) cache with no snoop needed.
mem_load_uops_misc_retired Memory load uops retired all 0x02: No unit mask
l2_trans L2 cache accesses all 0x80: (name=all_requests) Transactions accessing L2 pipe
0x01: (name=demand_data_rd) Demand Data Read requests that access L2 cache, includes L1D prefetches.
0x02: (name=rfo) RFO requests that access L2 cache
0x04: (name=code_rd) L2 cache accesses when fetching instructions including L1D code prefetches
0x08: (name=all_pf) L2 or LLC HW prefetches that access L2 cache
0x10: (name=l1d_wb) L1D writebacks that access L2 cache
0x20: (name=l2_fill) L2 fill requests that access L2 cache
0x40: (name=l2_wb) L2 writebacks that access L2 cache
l2_lines_in L2 cache lines in all 0x07: (name=all) L2 cache lines filling L2
0x01: (name=i) L2 cache lines in I state filling L2
0x02: (name=s) L2 cache lines in S state filling L2
0x04: (name=e) L2 cache lines in E state filling L2
l2_lines_out L2 cache lines out all 0x01: (name=demand_clean) Clean line evicted by a demand
0x02: (name=demand_dirty) Dirty line evicted by a demand
0x04: (name=pf_clean) Clean line evicted by an L2 Prefetch
0x08: (name=pf_dirty) Dirty line evicted by an L2 Prefetch
0x0a: (name=dirty_all) Any Dirty line evicted
sq_misc Store queue misc events all 0x10: No unit mask
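
Several of the composite unit masks in the table above appear to be plain bitwise ORs of the individual masks for the same event (for example, l2_rqsts all_rfo 0x0c is rfo_hit 0x04 combined with rfo_miss 0x08). The following Python sketch illustrates that reading using values copied from the l2_rqsts and resource_stalls rows. The helper names `combine` and `decompose` are hypothetical and not part of any profiling tool. Note that some named masks share the same value (for example `any` and `stall_cycles` under uops_issued); those presumably differ in counter settings that this table does not show, so the sketch does not try to model them.

```python
# Unit mask values copied from the l2_rqsts rows of the table.
L2_RQSTS = {
    "demand_data_rd_hit": 0x01,
    "rfo_hit": 0x04,
    "rfo_miss": 0x08,
    "code_rd_hit": 0x10,
    "code_rd_miss": 0x20,
    "pf_hit": 0x40,
    "pf_miss": 0x80,
    "all_rfo": 0x0C,
    "all_code_rd": 0x30,
    "all_pf": 0xC0,
}

# Unit mask values copied from the resource_stalls rows of the table.
RESOURCE_STALLS = {
    "lb": 0x02,
    "rs": 0x04,
    "sb": 0x08,
    "mem_rs": 0x0E,
    "lb_sb": 0x0A,
}


def combine(masks, *names):
    """OR together the named unit masks (hypothetical helper)."""
    value = 0
    for name in names:
        value |= masks[name]
    return value


def decompose(value):
    """Split an 8-bit unit mask into its individual set bits, lowest first."""
    return [1 << bit for bit in range(8) if value & (1 << bit)]


if __name__ == "__main__":
    # all_rfo should equal rfo_hit | rfo_miss if the masks are simple ORs.
    print(hex(combine(L2_RQSTS, "rfo_hit", "rfo_miss")))
    # mem_rs (0x0e) splits into the lb, rs and sb bits.
    print([hex(bit) for bit in decompose(RESOURCE_STALLS["mem_rs"])])
```

Checking a composite against its parts this way is a quick sanity test when choosing a unit mask by hand; it does not replace the event definitions in the Intel manuals.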
Measurement is a crucial component of performance improvement since reasoning and intuition are fallible guides and must be supplemented with tools like timing commands and profilers. - The Practice of Programming, Brian W. Kernighan and Rob Pike
2020/07/20