This is a list of all performance counter event types for the ARMv8 Cortex-A53. For details, see the Cortex-A53 MPCore Technical Reference Manual (ARM DDI 0500D, revision r0p2).
Name | Description | Counters usable | Unit mask options |
SW_INCR | Instruction architecturally executed, condition code check pass, software increment | all | |
L1I_CACHE_REFILL | Level 1 instruction cache refill | all | |
L1I_TLB_REFILL | Level 1 instruction TLB refill | all | |
L1D_CACHE_REFILL | Level 1 data cache refill | all | |
L1D_CACHE | Level 1 data cache access | all | |
L1D_TLB_REFILL | Level 1 data TLB refill | all | |
LD_RETIRED | Instruction architecturally executed, condition code check pass, load | all | |
ST_RETIRED | Instruction architecturally executed, condition code check pass, store | all | |
INST_RETIRED | Instruction architecturally executed | all | |
EXC_TAKEN | Exception taken | all | |
EXC_RETURN | Instruction architecturally executed, condition code check pass, exception return | all | |
CID_WRITE_RETIRED | Instruction architecturally executed, condition code check pass, write to CONTEXTIDR | all | |
PC_WRITE_RETIRED | Instruction architecturally executed, condition code check pass, software change of the PC | all | |
BR_IMMED_RETIRED | Instruction architecturally executed, immediate branch | all | |
BR_RETURN_RETIRED | Instruction architecturally executed, condition code check pass, procedure return | all | |
UNALIGNED_LDST_RETIRED | Instruction architecturally executed, condition code check pass, unaligned load or store | all | |
BR_MIS_PRED | Mispredicted or not predicted branch speculatively executed | all | |
CPU_CYCLES | Cycle | all | |
BR_PRED | Predictable branch speculatively executed | all | |
MEM_ACCESS | Data memory access | all | |
L1I_CACHE | Level 1 instruction cache access | all | |
L1D_CACHE_WB | Level 1 data cache write-back | all | |
L2D_CACHE | Level 2 data cache access | all | |
L2D_CACHE_REFILL | Level 2 data cache refill | all | |
L2D_CACHE_WB | Level 2 data cache write-back | all | |
BUS_ACCESS | Bus access | all | |
MEMORY_ERROR | Local memory error | all | |
INST_SPEC | Operation speculatively executed | all | |
TTBR_WRITE_RETIRED | Instruction architecturally executed, condition code check pass, write to TTBR | all | |
BUS_CYCLES | Bus cycle | all | |
L1D_CACHE_ALLOCATE | Level 1 data cache allocation without refill | all | |
L2D_CACHE_ALLOCATE | Level 2 data cache allocation without refill | all | |
BUS_ACCESS_LD | Bus access - Read | all | |
BUS_ACCESS_ST | Bus access - Write | all | |
BR_INDIRECT_SPEC | Branch speculatively executed - Indirect branch | all | |
EXC_IRQ | Exception taken, IRQ | all | |
EXC_FIQ | Exception taken, FIQ | all | |
EXT_MEM_REQ | External memory request | all | |
EXT_MEM_REQ_NC | Non-cacheable external memory request | all | |
PREFETCH_LINEFILL | Linefill because of prefetch | all | |
PREFETCH_LINEFILL_DROP | Instruction Cache Throttle occurred | all | |
READ_ALLOC_ENTER | Entering read allocate mode | all | |
READ_ALLOC | Read allocate mode | all | |
PRE_DECODE_ERR | Pre-decode error | all | |
STALL_SB_FULL | Data Write operation that stalls the pipeline because the store buffer is full | all | |
EXT_SNOOP | SCU Snooped data from another CPU for this CPU | all | |
BR_COND | Conditional branch executed | all | |
BR_INDIRECT_MISPRED | Indirect branch mispredicted | all | |
BR_INDIRECT_MISPRED_ADDR | Indirect branch mispredicted because of address miscompare | all | |
BR_COND_MISPRED | Conditional branch mispredicted | all | |
L1I_CACHE_ERR | L1 Instruction Cache (data or tag) memory error | all | |
L1D_CACHE_ERR | L1 Data Cache (data, tag or dirty) memory error, correctable or non-correctable | all | |
TLB_ERR | TLB memory error | all | |
OTHER_IQ_DEP_STALL | Cycles that the DPU IQ is empty and that is not because of a recent micro-TLB miss, instruction cache miss or pre-decode error | all | |
IC_DEP_STALL | Cycles the DPU IQ is empty and there is an instruction cache miss being processed | all | |
IUTLB_DEP_STALL | Cycles the DPU IQ is empty and there is an instruction micro-TLB miss being processed | all | |
DECODE_DEP_STALL | Cycles the DPU IQ is empty and there is a pre-decode error being processed | all | |
OTHER_INTERLOCK_STALL | Cycles there is an interlock other than Advanced SIMD/Floating-point instructions or load/store instruction | all | |
AGU_DEP_STALL | Cycles there is an interlock for a load/store instruction waiting for data to calculate the address in the AGU | all | |
SIMD_DEP_STALL | Cycles there is an interlock for an Advanced SIMD/Floating-point operation | all | |
LD_DEP_STALL | Cycles there is a stall in the Wr stage because of a load miss | all | |
ST_DEP_STALL | Cycles there is a stall in the Wr stage because of a store | all | |
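
Several of the events above are usually read in pairs rather than alone, for example a refill count against the matching access count. The following is a minimal sketch of that arithmetic, assuming the raw event totals have already been collected into a dictionary keyed by the event names in the table. The helper names `ratio` and `derived_metrics`, the metric labels, and the sample numbers are hypothetical and are not part of any profiler's API.

```python
def ratio(counts, numerator, denominator):
    """Return counts[numerator] / counts[denominator], or None if it cannot be computed."""
    den = counts.get(denominator, 0)
    if numerator not in counts or den == 0:
        return None
    return counts[numerator] / den


def derived_metrics(counts):
    """Combine raw event totals into a few commonly quoted ratios."""
    return {
        # Fraction of L1 data cache accesses that caused a refill.
        "l1d_refill_ratio": ratio(counts, "L1D_CACHE_REFILL", "L1D_CACHE"),
        # Fraction of L2 data cache accesses that caused a refill.
        "l2d_refill_ratio": ratio(counts, "L2D_CACHE_REFILL", "L2D_CACHE"),
        # Mispredicted (or not predicted) branches per predictable branch.
        "branch_mispredict_ratio": ratio(counts, "BR_MIS_PRED", "BR_PRED"),
        # Architecturally executed instructions per cycle.
        "ipc": ratio(counts, "INST_RETIRED", "CPU_CYCLES"),
    }


# Hypothetical totals, for illustration only.
sample = {
    "CPU_CYCLES": 2_000_000,
    "INST_RETIRED": 1_000_000,
    "L1D_CACHE": 400_000,
    "L1D_CACHE_REFILL": 8_000,
}
metrics = derived_metrics(sample)
```

Ratios whose events were not collected come back as `None` rather than raising, since the number of events that can be counted in a single run is limited by the available counters. If the totals come from sampled data, each one would first need to be scaled by the sampling interval used for that event.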