This is a list of the performance counter event types for AMD64 family 11h CPUs. See the BIOS and Kernel Developer's Guide for AMD Family 11h Processors (31116.pdf), Section 3.14.
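The unit mask values in the table below appear to be independent bit flags: the combined entries such as 0x1e ("All cache states except Invalid") and 0x1f ("All cache states") equal the bitwise OR of the individual entries listed next to them. The following is a minimal sketch of that arithmetic, assuming masks combine by OR; the dictionary keys and the `combine` helper are hypothetical names chosen for illustration, and only the numeric values come from the table.

```python
# Unit mask bits for DATA_CACHE_REFILLS_FROM_L2_OR_SYSTEM, copied from the table.
# The key names are illustrative labels, not official identifiers.
REFILL_MASKS = {
    "system": 0x01,        # refill from system
    "shared_l2": 0x02,     # (S)hared cache state from L2
    "exclusive_l2": 0x04,  # (E)xclusive cache state from L2
    "owned_l2": 0x08,      # (O)wned cache state from L2
    "modified_l2": 0x10,   # (M)odified cache state from L2
}


def combine(masks, names):
    """Return the bitwise OR of the mask bits selected by name."""
    value = 0
    for name in names:
        value |= masks[name]
    return value


# Selecting every L2 state reproduces the table's combined entry 0x1e.
all_l2_states = combine(
    REFILL_MASKS, ["shared_l2", "exclusive_l2", "owned_l2", "modified_l2"]
)
print(hex(all_l2_states))
```

Whether a given profiling tool accepts arbitrary combinations for every event is not stated here, so check the tool's own documentation before relying on a combined value.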
Name | Description | Counters usable | Unit mask options |
DISPATCHED_FPU_OPS | Dispatched FPU ops | all | 0x01: Add pipe ops; 0x02: Multiply pipe ops; 0x04: Store pipe ops; 0x08: Add pipe load ops; 0x10: Multiply pipe load ops; 0x20: Store pipe load ops |
CYCLES_NO_FPU_OPS_RETIRED | Cycles in which the FPU is empty | all | |
DISPATCHED_FPU_OPS_FAST_FLAG | Dispatched FPU ops that use the fast flag interface | all | |
SEGMENT_REGISTER_LOADS | Segment register loads | all | 0x01: ES register; 0x02: CS register; 0x04: SS register; 0x08: DS register; 0x10: FS register; 0x20: GS register; 0x40: HS register |
PIPELINE_RESTART_DUE_TO_SELF_MODIFYING_CODE | Micro-architectural re-sync caused by self modifying code | all | |
PIPELINE_RESTART_DUE_TO_PROBE_HIT | Micro-architectural re-sync caused by snoop | all | |
LS_BUFFER_2_FULL_CYCLES | Cycles LS Buffer 2 full | all | |
LOCKED_OPS | Locked operations | all | 0x01: The number of locked instructions executed; 0x02: The number of cycles spent in speculative phase; 0x04: The number of cycles spent in non-speculative phase (including cache miss penalty) |
DATA_CACHE_ACCESSES | Data cache accesses | all | |
DATA_CACHE_MISSES | Data cache misses | all | |
DATA_CACHE_REFILLS_FROM_L2_OR_SYSTEM | Data cache refills from L2 or system | all | 0x01: refill from system; 0x02: (S)hared cache state from L2; 0x04: (E)xclusive cache state from L2; 0x08: (O)wned cache state from L2; 0x10: (M)odified cache state from L2; 0x1e: All cache states except Invalid |
DATA_CACHE_REFILLS_FROM_SYSTEM | Data cache refills from system | all | 0x01: (I)nvalid cache state; 0x02: (S)hared cache state; 0x04: (E)xclusive cache state; 0x08: (O)wned cache state; 0x10: (M)odified cache state; 0x1f: All cache states |
DATA_CACHE_LINES_EVICTED | Data cache lines evicted | all | 0x01: (I)nvalid cache state; 0x02: (S)hared cache state; 0x04: (E)xclusive cache state; 0x08: (O)wned cache state; 0x10: (M)odified cache state; 0x1f: All cache states |
L1_DTLB_MISS_AND_L2_DTLB_HIT | L1 DTLB misses and L2 DTLB hits | all | |
L1_DTLB_AND_L2_DTLB_MISS | L1 and L2 DTLB misses | all | |
MISALIGNED_ACCESSES | Misaligned Accesses | all | |
MICROARCHITECTURAL_LATE_CANCEL_OF_AN_ACCESS | Micro-architectural late cancel of an access | all | |
MICROARCHITECTURAL_EARLY_CANCEL_OF_AN_ACCESS | Micro-architectural early cancel of an access | all | |
SCRUBBER_SINGLE_BIT_ECC_ERRORS | One bit ECC error recorded by scrubber | all | 0x01: Scrubber error; 0x02: Piggyback scrubber errors |
PREFETCH_INSTRUCTIONS_DISPATCHED | Prefetch instructions dispatched | all | 0x01: Load; 0x02: Store; 0x04: NTA |
DCACHE_MISS_LOCKED_INSTRUCTIONS | DCACHE misses by locked instructions | all | 0x02: Data cache misses by locked instructions |
MEMORY_REQUESTS | Memory requests by type | all | 0x01: Requests to non-cacheable (UC) memory; 0x02: Requests to write-combining (WC) memory or WC buffer flushes to WB memory; 0x80: Streaming store (SS) requests |
DATA_PREFETCHES | Data prefetcher | all | 0x01: Cancelled prefetches; 0x02: Prefetch attempts |
SYSTEM_READ_RESPONSES | System read responses by coherency state | all | 0x01: Exclusive; 0x02: Modified; 0x04: Shared; 0x08: Data Error |
QUADWORD_WRITE_TRANSFERS | Quadwords written to system | all | 0x01: Quadword write transfer |
REQUESTS_TO_L2 | Requests to L2 cache | all | 0x01: IC fill; 0x02: DC fill; 0x04: TLB fill (page table walk); 0x08: Tag snoop request; 0x10: Cancelled request |
L2_CACHE_MISS | L2 cache misses | all | 0x01: IC fill; 0x02: DC fill; 0x04: TLB page table walk |
L2_CACHE_FILL_WRITEBACK | L2 fill/writeback | all | 0x01: L2 fills (victims from L1 caches, TLB page table walks and data prefetches); 0x02: L2 writebacks to system |
INSTRUCTION_CACHE_FETCHES | Instruction cache fetches | all | |
INSTRUCTION_CACHE_MISSES | Instruction cache misses | all | |
INSTRUCTION_CACHE_REFILLS_FROM_L2 | Instruction cache refills from L2 | all | |
INSTRUCTION_CACHE_REFILLS_FROM_SYSTEM | Instruction cache refills from system | all | |
L1_ITLB_MISS_AND_L2_ITLB_HIT | L1 ITLB miss and L2 ITLB hit | all | |
L1_ITLB_MISS_AND_L2_ITLB_MISS | L1 ITLB miss and L2 ITLB miss | all | |
PIPELINE_RESTART_DUE_TO_INSTRUCTION_STREAM_PROBE | Pipeline restart due to instruction stream probe | all | |
INSTRUCTION_FETCH_STALL | Instruction fetch stall | all | |
RETURN_STACK_HITS | Return stack hits | all | |
RETURN_STACK_OVERFLOWS | Return stack overflows | all | |
RETIRED_CFLUSH | Retired CLFLUSH instructions | all | |
RETIRED_CPUID | Retired CPUID instructions | all | |
CPU_CLK_UNHALTED | Cycles outside of halt state | all | |
RETIRED_INSTRUCTIONS | Retired instructions (includes exceptions, interrupts, re-syncs) | all | |
RETIRED_UOPS | Retired micro-ops | all | |
RETIRED_BRANCH_INSTRUCTIONS | Retired branches (conditional, unconditional, exceptions, interrupts) | all | |
RETIRED_MISPREDICTED_BRANCH_INSTRUCTIONS | Retired mispredicted branch instructions | all | |
RETIRED_TAKEN_BRANCH_INSTRUCTIONS | Retired taken branch instructions | all | |
RETIRED_TAKEN_BRANCH_INSTRUCTIONS_MISPREDICTED | Retired taken branches mispredicted | all | |
RETIRED_FAR_CONTROL_TRANSFERS | Retired far control transfers | all | |
RETIRED_BRANCH_RESYNCS | Retired branch resyncs (only non-control transfer branches) | all | |
RETIRED_NEAR_RETURNS | Retired near returns | all | |
RETIRED_NEAR_RETURNS_MISPREDICTED | Retired near returns mispredicted | all | |
RETIRED_INDIRECT_BRANCHES_MISPREDICTED | Retired indirect branches mispredicted | all | |
RETIRED_MMX_FP_INSTRUCTIONS | Retired MMX/FP instructions | all | 0x01: x87 instructions; 0x02: MMX & 3DNow instructions; 0x04: Packed SSE & SSE2 instructions; 0x08: Packed scalar SSE & SSE2 instructions |
RETIRED_FASTPATH_DOUBLE_OP_INSTRUCTIONS | Retired FastPath double-op instructions | all | 0x01: With low op in position 0; 0x02: With low op in position 1; 0x04: With low op in position 2 |
INTERRUPTS_MASKED_CYCLES | Cycles with interrupts masked (IF=0) | all | |
INTERRUPTS_MASKED_CYCLES_WITH_INTERRUPT_PENDING | Cycles with interrupts masked while interrupt pending | all | |
INTERRUPTS_TAKEN | Number of taken hardware interrupts | all | |
DECODER_EMPTY | Nothing to dispatch (decoder empty) | all | |
DISPATCH_STALLS | Dispatch stalls | all | |
DISPATCH_STALL_FOR_BRANCH_ABORT | Dispatch stall from branch abort to retire | all | |
DISPATCH_STALL_FOR_SERIALIZATION | Dispatch stall for serialization | all | |
DISPATCH_STALL_FOR_SEGMENT_LOAD | Dispatch stall for segment load | all | |
DISPATCH_STALL_FOR_REORDER_BUFFER_FULL | Dispatch stall for reorder buffer full | all | |
DISPATCH_STALL_FOR_RESERVATION_STATION_FULL | Dispatch stall when reservation stations are full | all | |
DISPATCH_STALL_FOR_FPU_FULL | Dispatch stall when FPU is full | all | |
DISPATCH_STALL_FOR_LS_FULL | Dispatch stall when LS is full | all | |
DISPATCH_STALL_WAITING_FOR_ALL_QUIET | Dispatch stall when waiting for all to be quiet | all | |
DISPATCH_STALL_FOR_FAR_TRANSFER_OR_RESYNC | Dispatch stall for far transfer or resync to retire | all | |
FPU_EXCEPTIONS | FPU exceptions | all | 0x01: x87 reclass microfaults; 0x02: SSE retype microfaults; 0x04: SSE reclass microfaults; 0x08: SSE and x87 microtraps |
DR0_BREAKPOINTS | Number of breakpoints for DR0 | all | |
DR1_BREAKPOINTS | Number of breakpoints for DR1 | all | |
DR2_BREAKPOINTS | Number of breakpoints for DR2 | all | |
DR3_BREAKPOINTS | Number of breakpoints for DR3 | all | |
DRAM_ACCESSES | DRAM accesses | all | 0x01: DCT0 Page hit; 0x02: DCT0 Page miss; 0x04: DCT0 Page conflict; 0x08: DCT1 Page hit; 0x10: DCT1 Page miss; 0x20: DCT1 Page conflict; 0x40: Write request; 0x80: Read request |
DRAM_CONTROLLER_PAGE_TABLE_EVENTS | DRAM Controller Page Table Events | all | 0x01: DCT Page Table Overflow; 0x02: Number of stale table entry hits (hit on a page closed too soon); 0x04: Page table idle cycle limit incremented; 0x08: Page table idle cycle limit decremented |
MEMORY_CONTROLLER_TURNAROUNDS | Memory controller turnarounds | all | 0x01: DCT0 Read to write turnaround; 0x02: DCT0 Write to read turnaround; 0x04: DCT0 DIMM (chip select) turnaround; 0x08: DCT1 Read to write turnaround; 0x10: DCT1 Write to read turnaround; 0x20: DCT1 DIMM (chip select) turnaround |
MEMORY_CONTROLLER_RBD_QUEUE_EVENTS | Memory controller RBD queue events | all | 0x04: F2x[1,0]94[DcqBypassMax] counter reached |
THERMAL_STATUS | Thermal status | all | 0x01: Number of clocks MEMHOT_L is asserted; 0x04: Number of times the HTC transitions from inactive to active; 0x20: Number of clocks HTC P-state is inactive; 0x40: Number of clocks HTC P-state is active; 0x80: PROCHOT_L asserted by an external source and P-state change occurred |
CPU_IO_REQUESTS_TO_MEMORY_IO | CPU/IO requests to memory/IO | all | 0xa1: Requests Local I/O to Local I/O; 0xa2: Requests Local I/O to Local Memory; 0xa3: Requests Local I/O to Local (I/O or Mem); 0xa4: Requests Local CPU to Local I/O; 0xa5: Requests Local (CPU or I/O) to Local I/O; 0xa8: Requests Local CPU to Local Memory; 0xaa: Requests Local (CPU or I/O) to Local Memory; 0xac: Requests Local CPU to Local (I/O or Mem); 0xaf: Requests Local (CPU or I/O) to Local (I/O or Mem); 0x91: Requests Local I/O to Remote I/O; 0x92: Requests Local I/O to Remote Memory; 0x93: Requests Local I/O to Remote (I/O or Mem); 0x94: Requests Local CPU to Remote I/O; 0x95: Requests Local (CPU or I/O) to Remote I/O; 0x98: Requests Local CPU to Remote Memory; 0x9a: Requests Local (CPU or I/O) to Remote Memory; 0x9c: Requests Local CPU to Remote (I/O or Mem); 0x9f: Requests Local (CPU or I/O) to Remote (I/O or Mem); 0xb1: Requests Local I/O to Any I/O; 0xb2: Requests Local I/O to Any Memory; 0xb3: Requests Local I/O to Any (I/O or Mem); 0xb4: Requests Local CPU to Any I/O; 0xb5: Requests Local (CPU or I/O) to Any I/O; 0xb8: Requests Local CPU to Any Memory; 0xba: Requests Local (CPU or I/O) to Any Memory; 0xbc: Requests Local CPU to Any (I/O or Mem); 0xbf: Requests Local (CPU or I/O) to Any (I/O or Mem); 0x61: Requests Remote I/O to Local I/O; 0x64: Requests Remote CPU to Local I/O; 0x65: Requests Remote (CPU or I/O) to Local I/O |
CACHE_BLOCK_COMMANDS | Cache block commands | all | 0x01: Victim Block (Writeback); 0x04: Read Block (Dcache load miss refill); 0x08: Read Block Shared (Icache refill); 0x10: Read Block Modified (Dcache store miss refill); 0x20: Change to Dirty (first store to clean block already in cache) |
SIZED_COMMANDS | Sized commands | all | 0x01: Non-posted write byte (1-32 bytes); 0x02: Non-posted write DWORD (1-16 DWORDs); 0x04: Posted write byte (1-32 bytes); 0x08: Posted write DWORD (1-16 DWORDs); 0x10: Read byte (4 bytes); 0x20: Read DWORD (1-16 DWORDs) |
PROBE_RESPONSES_AND_UPSTREAM_REQUESTS | Probe responses and upstream requests | all | 0x01: Probe miss; 0x02: Probe hit clean; 0x04: Probe hit dirty without memory cancel; 0x08: Probe hit dirty with memory cancel; 0x10: Upstream display refresh/ISOC reads; 0x20: Upstream non-display refresh reads; 0x40: Upstream ISOC writes; 0x80: Upstream non-ISOC writes |
DEV_EVENTS | DEV events | all | 0x10: DEV hit; 0x20: DEV miss; 0x40: DEV error |
MEMORY_CONTROLLER_REQUESTS | Memory controller requests | all | 0x08: 32 Bytes Sized Writes; 0x10: 64 Bytes Sized Writes; 0x20: 32 Bytes Sized Reads; 0x40: 64 Bytes Sized Reads |
SIDEBAND_SIGNALS_AND_SPECIAL_CYCLES | Sideband Signals and Special Cycles | all | 0x01: HALT; 0x02: STOPGRANT; 0x04: SHUTDOWN; 0x08: WBINVD; 0x10: INVD |
INTERRUPT_EVENTS | Interrupt Events | all | 0x01: Fixed; 0x02: LPA; 0x04: SMI; 0x08: NMI; 0x10: INIT; 0x20: STARTUP; 0x40: INT; 0x80: EOI |
HYPERTRANSPORT_LINK_0_TRANSMIT_BANDWIDTH | HyperTransport(tm) link 0 transmit bandwidth | all | 0x01: Command DWORD sent; 0x02: Address DWORD sent; 0x04: Data DWORD sent; 0x08: Buffer release DWORD sent; 0x10: Nop DW sent (idle); 0x20: Per packet CRC sent |
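The CPU_IO_REQUESTS_TO_MEMORY_IO unit masks listed above follow a visible bit pattern. The sketch below decodes a mask under the assumption that the low four bits select the request type and the high four bits select source and destination. This bit layout is inferred from the values in the table, not quoted from the BKDG, and the `decode` function and its dictionaries are hypothetical names used only for illustration.

```python
# Bit meanings inferred from the listed mask values (an assumption, see above).
SOURCE_BITS = {0x80: "Local", 0x40: "Remote"}
DEST_BITS = {0x20: "Local", 0x10: "Remote"}
REQUEST_BITS = {
    0x01: "I/O to I/O",
    0x02: "I/O to Memory",
    0x04: "CPU to I/O",
    0x08: "CPU to Memory",
}


def decode(mask):
    """Split a unit mask into source, destination and request-type labels."""
    return {
        "source": [name for bit, name in SOURCE_BITS.items() if mask & bit],
        "destination": [name for bit, name in DEST_BITS.items() if mask & bit],
        "requests": [name for bit, name in REQUEST_BITS.items() if mask & bit],
    }


# 0xaa is listed as "Requests Local (CPU or I/O) to Local Memory".
print(decode(0xAA))
# 0xb1 is listed as "Requests Local I/O to Any I/O": both destination bits set.
print(decode(0xB1))
```

Under this reading, a destination of "Any" corresponds to both destination bits being set, which matches the 0xb1 through 0xbf entries.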
"Measurement is a crucial component of performance improvement since reasoning and intuition are fallible guides and must be supplemented with tools like timing commands and profilers." - The Practice of Programming, Brian W. Kernighan and Rob Pike