This is a list of all IBM POWER9 microarchitecture performance counter event types. Please see the IBM POWER9 Processor User's Manual and the POWER9 Performance Monitor Unit User's Guide.
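These events are exposed through the Linux perf_events subsystem (used by tools such as perf(1) and OProfile's operf). As a minimal sketch of how a counter from this list is programmed, the C program below counts CPU cycles via perf_event_open(2). It is illustrative only: busy_loop() is a hypothetical stand-in workload, and it requests the generic PERF_COUNT_HW_CPU_CYCLES event, which is assumed here to resolve to PM_CYC on this platform; the other PM_* events in the table would instead be requested by name through perf or as raw PMU event codes.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <sys/ioctl.h>
#include <sys/syscall.h>
#include <linux/perf_event.h>

/* Thin wrapper: glibc provides no perf_event_open() symbol. */
static long perf_event_open(struct perf_event_attr *attr, pid_t pid,
                            int cpu, int group_fd, unsigned long flags)
{
    return syscall(__NR_perf_event_open, attr, pid, cpu, group_fd, flags);
}

/* Hypothetical workload standing in for real code. */
static void busy_loop(void)
{
    volatile unsigned long x = 0;
    for (unsigned long i = 0; i < 100000000UL; i++)
        x += i;
}

int main(void)
{
    struct perf_event_attr attr;
    long long count;
    int fd;

    memset(&attr, 0, sizeof(attr));
    attr.size = sizeof(attr);
    attr.type = PERF_TYPE_HARDWARE;
    attr.config = PERF_COUNT_HW_CPU_CYCLES; /* generic cycles; assumed to map to PM_CYC */
    attr.disabled = 1;        /* start stopped; enable around the region of interest */
    attr.exclude_kernel = 1;  /* user-mode cycles only */

    fd = perf_event_open(&attr, 0 /* this process */, -1 /* any CPU */, -1, 0);
    if (fd == -1) {
        perror("perf_event_open");
        return EXIT_FAILURE;
    }

    ioctl(fd, PERF_EVENT_IOC_RESET, 0);
    ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);
    busy_loop();
    ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);

    if (read(fd, &count, sizeof(count)) == sizeof(count))
        printf("cycles: %lld\n", count);
    close(fd);
    return 0;
}
```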
Name | Description | Counters usable | Unit mask options |
CYCLES | Cycles | 0 | |
PM_1PLUS_PPC_CMPL | 1 or more ppc insts finished (completed). | 0 | |
PM_1PLUS_PPC_DISP | Cycles at least one Instr Dispatched. Could be a group with only microcode. | 3 | |
PM_ANY_THRD_RUN_CYC | Any thread in run_cycles (was one thread in run_cycles). | 0 | |
PM_BR_MPRED_CMPL | Number of Branch Mispredicts. | 3 | |
PM_BR_TAKEN_CMPL | Branch Taken. | 1 | |
PM_CYC | Cycles. | 0, 1, 2, 3 | |
PM_DATA_FROM_L2MISS | Demand LD - L2 Miss (not L2 hit). | 1 | |
PM_DATA_FROM_L3MISS | Demand LD - L3 Miss (not L2 hit and not L3 hit). | 2 | |
PM_DATA_FROM_MEM | Data cache reload from memory (including L4). | 3 | |
PM_DTLB_MISS | Data PTEG Reloaded (DTLB Miss). | 2 | |
PM_EXT_INT | external interrupt. | 1 | |
PM_FLOP | Floating Point Operations Finished. | 0 | |
PM_FLUSH | Flush (any type). | 3 | |
PM_GCT_NOSLOT_CYC | Pipeline empty (No itags assigned, no GCT slots used). | 0 | |
PM_IERAT_RELOAD | IERAT Reloaded (Miss). | 0 | |
PM_INST_DISP | PPC Dispatched. | 1 | |
PM_INST_FROM_L3MISS | Inst from L3 miss. | 2 | |
PM_ITLB_MISS | ITLB Reloaded. | 3 | |
PM_L1_DCACHE_RELOAD_VALID | DL1 reloaded due to Demand Load. | 2 | |
PM_L1_ICACHE_MISS | Demand iCache Miss. | 1 | |
PM_LD_MISS_L1 | Load Missed L1. | 2 | |
PM_LSU_DERAT_MISS | DERAT Reloaded (Miss). | 1 | |
PM_MRK_BR_MPRED_CMPL | Marked Branch Mispredicted. | 2 | |
PM_MRK_BR_TAKEN_CMPL | Marked Branch Taken. | 0 | |
PM_MRK_DATA_FROM_L2MISS | Data cache reload L2 miss. | 3 | |
PM_MRK_DATA_FROM_L3MISS | The processor's data cache was reloaded from a location other than the local core's L3 due to a marked load. | 1 | |
PM_MRK_DATA_FROM_MEM | The processor's data cache was reloaded from a memory location, including L4 from local, remote, or distant, due to a marked load. | 1 | |
PM_MRK_DERAT_MISS | ERAT Miss (TLB Access), all page sizes. | 2 | |
PM_MRK_DTLB_MISS | Marked dtlb miss. | 3 | |
PM_MRK_INST_CMPL | marked instruction completed. | 3 | |
PM_MRK_INST_DISP | Marked Instruction dispatched. | 0 | |
PM_MRK_INST_FROM_L3MISS | n/a | 3 | |
PM_MRK_L1_ICACHE_MISS | Marked L1 Icache Miss. | 0 | |
PM_MRK_L1_RELOAD_VALID | Marked demand reload. | 0 | |
PM_MRK_LD_MISS_L1 | Marked DL1 Demand Miss counted at exec time. | 1 | |
PM_MRK_ST_CMPL | Marked store completed. | 0 | |
PM_RUN_CYC | Run_cycles. | 5 | |
PM_RUN_INST_CMPL | Run_Instructions. | 4 | |
PM_RUN_PURR | Run_PURR. | 3 | |
PM_ST_FIN | Store Instructions Finished (store sent to nest). | 1 | |
PM_ST_MISS_L1 | Store Missed L1. | 2 | |
PM_TB_BIT_TRANS | timebase event. | 2 | |
PM_THRD_CONC_RUN_INST | Concurrent Run Instructions. | 2 | |
PM_THRESH_EXC_1024 | Threshold counter exceeded a value of 1024. | 2 | |
PM_THRESH_EXC_128 | Threshold counter exceeded a value of 128. | 3 | |
PM_THRESH_EXC_2048 | Threshold counter exceeded a value of 2048. | 3 | |
PM_THRESH_EXC_256 | Threshold counter exceeded a value of 256. | 0 | |
PM_THRESH_EXC_32 | Threshold counter exceeded a value of 32. | 1 | |
PM_THRESH_EXC_4096 | Threshold counter exceeded a value of 4096. | 0 | |
PM_THRESH_EXC_512 | Threshold counter exceeded a value of 512. | 1 | |
PM_THRESH_EXC_64 | Threshold counter exceeded a value of 64. | 2 | |
PM_THRESH_MET | Threshold exceeded. | 0 | |
PM_1FLOP_CMPL | one flop (fadd, fmul, fsub, fcmp, fsel, fabs, fnabs, fres, fsqrte, fneg) operation completed | 3 | |
PM_1PLUS_PPC_CMPL | 1 or more ppc insts finished | 0 | |
PM_1PLUS_PPC_DISP | Cycles at least one Instr Dispatched | 3 | |
PM_2FLOP_CMPL | DP vector version of fmul, fsub, fcmp, fsel, fabs, fnabs, fres, fsqrte, fneg | 3 | |
PM_4FLOP_CMPL | 4 FLOP instruction completed | 3 | |
PM_8FLOP_CMPL | 8 FLOP instruction completed | 3 | |
PM_ANY_THRD_RUN_CYC | Cycles in which at least one thread has the run latch set | 0 | |
PM_BACK_BR_CMPL | Branch instruction completed with a target address less than current instruction address | 1 | |
PM_BANK_CONFLICT | Read blocked due to interleave conflict. The ifar logic will detect an interleave conflict and kill the data that was read that cycle. | 0, 1, 2, 3 | |
PM_BFU_BUSY | Cycles in which all 4 Binary Floating Point units are busy. The BFU is running at capacity | 2 | |
PM_BR_2PATH | Branches that are not strongly biased | 1, 3 | |
PM_BR_CMPL | Any Branch instruction completed | 3 | |
PM_BR_CORECT_PRED_TAKEN_CMPL | Conditional Branch Completed in which the HW correctly predicted the direction as taken. Counted at completion time | 0, 1, 2, 3 | |
PM_BR_MPRED_CCACHE | Conditional Branch Completed that was Mispredicted due to the Count Cache Target Prediction | 0, 1, 2, 3 | |
PM_BR_MPRED_CMPL | Number of Branch Mispredicts | 3 | |
PM_BR_MPRED_LSTACK | Conditional Branch Completed that was Mispredicted due to the Link Stack Target Prediction | 0, 1, 2, 3 | |
PM_BR_MPRED_PCACHE | Conditional Branch Completed that was Mispredicted due to pattern cache prediction | 0, 1, 2, 3 | |
PM_BR_MPRED_TAKEN_CR | A Conditional Branch that resolved to taken was mispredicted as not taken (due to the BHT Direction Prediction). | 0, 1, 2, 3 | |
PM_BR_MPRED_TAKEN_TA | Conditional Branch Completed that was Mispredicted due to the Target Address Prediction from the Count Cache or Link Stack. Only XL-form branches that resolved Taken set this event. | 0, 1, 2, 3 | |
PM_BR_PRED | Conditional Branch Executed in which the HW predicted the Direction or Target. Includes taken and not taken and is counted at execution time | 0, 1, 2, 3 | |
PM_BR_PRED_CCACHE | Conditional Branch Completed that used the Count Cache for Target Prediction | 0, 1, 2, 3 | |
PM_BR_PRED_LSTACK | Conditional Branch Completed that used the Link Stack for Target Prediction | 0, 1, 2, 3 | |
PM_BR_PRED_PCACHE | Conditional branch completed that used pattern cache prediction | 0, 1, 2, 3 | |
PM_BR_PRED_TA | Conditional Branch Completed that had its target address predicted. Only XL-form branches set this event. This equals the sum of CCACHE, LSTACK, and PCACHE | 0, 1, 2, 3 | |
PM_BR_PRED_TAKEN_CR | Conditional Branch that had its direction predicted. I-form branches do not set this event. In addition, B-form branches which do not use the BHT do not set this event - these are branches with BO-field set to 'always taken' and branches | 0, 1, 2, 3 | |
PM_BR_TAKEN_CMPL | New event for Branch Taken | 1 | |
PM_BRU_FIN | Branch Instruction Finished | 0 | |
PM_BR_UNCOND | Unconditional Branch Completed. HW branch prediction was not used for this branch. This can be an I-form branch, a B-form branch with BO-field set to branch always, or a B-form branch which was converted to a Resolve. | 0, 1, 2, 3 | |
PM_BTAC_BAD_RESULT | BTAC thinks branch will be taken but it is either predicted not-taken by the BHT, or the target address is wrong (less common). In both cases, a redirect will happen | 0, 1, 2, 3 | |
PM_BTAC_GOOD_RESULT | BTAC predicts a taken branch and the BHT agrees, and the target address is correct | 0, 1, 2, 3 | |
PM_CHIP_PUMP_CPRED | Initial and Final Pump Scope was chip pump (prediction=correct) for all data types excluding data prefetch (demand load, inst prefetch, inst fetch, xlate) | 0 | |
PM_CLB_HELD | CLB (control logic block - indicates quadword fetch block) Hold: Any Reason | 0, 1, 2, 3 | |
PM_CMPLU_STALL | Nothing completed and ICT not empty | 0 | |
PM_CMPLU_STALL_ANY_SYNC | Cycles in which the NTC sync instruction (isync, lwsync or hwsync) is not allowed to complete | 0 | |
PM_CMPLU_STALL_BRU | Completion stall due to a Branch Unit | 3 | |
PM_CMPLU_STALL_CRYPTO | Finish stall because the NTF instruction was routed to the crypto execution pipe and was waiting to finish | 3 | |
PM_CMPLU_STALL_DCACHE_MISS | Finish stall because the NTF instruction was a load that missed the L1 and was waiting for the data to return from the nest | 1 | |
PM_CMPLU_STALL_DFLONG | Finish stall because the NTF instruction was a multi-cycle instruction issued to the Decimal Floating Point execution pipe and waiting to finish. Includes decimal floating point instructions + 128 bit binary floating point instructions. Qualified by multicycle | 0 | |
PM_CMPLU_STALL_DFU | Finish stall because the NTF instruction was issued to the Decimal Floating Point execution pipe and waiting to finish. Includes decimal floating point instructions + 128 bit binary floating point instructions. Not qualified by multicycle | 1 | |
PM_CMPLU_STALL_DMISS_L21_L31 | Completion stall by Dcache miss which resolved on chip (excluding local L2/L3) | 1 | |
PM_CMPLU_STALL_DMISS_L2L3 | Completion stall by Dcache miss which resolved in L2/L3 | 0 | |
PM_CMPLU_STALL_DMISS_L2L3_CONFLICT | Completion stall due to cache miss that resolves in the L2 or L3 with a conflict | 3 | |
PM_CMPLU_STALL_DMISS_L3MISS | Completion stall due to cache miss resolving missed the L3 | 3 | |
PM_CMPLU_STALL_DMISS_LMEM | Completion stall due to cache miss that resolves in local memory | 2 | |
PM_CMPLU_STALL_DMISS_REMOTE | Completion stall by Dcache miss which resolved from remote chip (cache or memory) | 1 | |
PM_CMPLU_STALL_DP | Finish stall because the NTF instruction was a scalar instruction issued to the Double Precision execution pipe and waiting to finish. Includes binary floating point instructions in 32 and 64 bit binary floating point format. Not qualified by multicycle. Qualified by NOT vector | 0 | |
PM_CMPLU_STALL_DPLONG | Finish stall because the NTF instruction was a scalar multi-cycle instruction issued to the Double Precision execution pipe and waiting to finish. Includes binary floating point instructions in 32 and 64 bit binary floating point format. Qualified by NOT vector AND multicycle | 2 | |
PM_CMPLU_STALL_EIEIO | Finish stall because the NTF instruction is an EIEIO waiting for response from L2 | 3 | |
PM_CMPLU_STALL_EMQ_FULL | Finish stall because the next to finish instruction suffered an ERAT miss and the EMQ was full | 2 | |
PM_CMPLU_STALL_ERAT_MISS | Finish stall because the NTF instruction was a load or store that suffered a translation miss | 3 | |
PM_CMPLU_STALL_EXCEPTION | Cycles in which the NTC instruction is not allowed to complete because it was interrupted by ANY exception, which has to be serviced before the instruction can complete | 2 | |
PM_CMPLU_STALL_EXEC_UNIT | Completion stall due to execution units (FXU/VSU/CRU) | 1 | |
PM_CMPLU_STALL_FLUSH_ANY_THREAD | Cycles in which the NTC instruction is not allowed to complete because any of the 4 threads in the same core suffered a flush, which blocks completion | 0 | |
PM_CMPLU_STALL_FXLONG | Completion stall due to a long latency scalar fixed point instruction (division, square root) | 3 | |
PM_CMPLU_STALL_FXU | Finish stall due to a scalar fixed point or CR instruction in the execution pipeline. These instructions get routed to the ALU, ALU2, and DIV pipes | 1 | |
PM_CMPLU_STALL_HWSYNC | completion stall due to hwsync | 2 | |
PM_CMPLU_STALL_LARX | Finish stall because the NTF instruction was a larx waiting to be satisfied | 0 | |
PM_CMPLU_STALL_LHS | Finish stall because the NTF instruction was a load that hit on an older store and it was waiting for store data | 1 | |
PM_CMPLU_STALL_LMQ_FULL | Finish stall because the NTF instruction was a load that missed in the L1 and the LMQ was unable to accept this load miss request because it was full | 3 | |
PM_CMPLU_STALL_LOAD_FINISH | Finish stall because the NTF instruction was a load instruction with all its dependencies satisfied just going through the LSU pipe to finish | 3 | |
PM_CMPLU_STALL_LRQ_FULL | Finish stall because the NTF instruction was a load that was held in LSAQ (load-store address queue) because the LRQ (load-reorder queue) was full | 1 | |
PM_CMPLU_STALL_LRQ_OTHER | Finish stall due to LRQ miscellaneous reasons: lost arbitration to LMQ slot, bank collisions, set prediction cleanup, set prediction multihit, and others | 0 | |
PM_CMPLU_STALL_LSAQ_ARB | Finish stall because the NTF instruction was a load or store that was held in LSAQ because an older instruction from SRQ or LRQ won arbitration to the LSU pipe when this instruction tried to launch | 3 | |
PM_CMPLU_STALL_LSU | Completion stall by LSU instruction | 1 | |
PM_CMPLU_STALL_LSU_FIN | Finish stall because the NTF instruction was an LSU op (other than a load or a store) with all its dependencies met and just going through the LSU pipe to finish | 0 | |
PM_CMPLU_STALL_LSU_FLUSH_NEXT | Completion stall of one cycle because the LSU requested to flush the next iop in the sequence. It takes 1 cycle for the ISU to process this request before the LSU instruction is allowed to complete | 1 | |
PM_CMPLU_STALL_LSU_MFSPR | Finish stall because the NTF instruction was a mfspr instruction targeting an LSU SPR and it was waiting for the register data to be returned | 2 | |
PM_CMPLU_STALL_LWSYNC | completion stall due to lwsync | 0 | |
PM_CMPLU_STALL_MTFPSCR | Completion stall because the ISU is updating the register and notifying the Effective Address Table (EAT) | 3 | |
PM_CMPLU_STALL_NESTED_TBEGIN | Completion stall because the ISU is updating the TEXASR to keep track of the nested tbegin. This is a short delay, and it includes ROT | 0 | |
PM_CMPLU_STALL_NESTED_TEND | Completion stall because the ISU is updating the TEXASR to keep track of the nested tend and decrement the TEXASR nested level. This is a short delay | 2 | |
PM_CMPLU_STALL_NTC_DISP_FIN | Finish stall because the NTF instruction was one that must finish at dispatch. | 3 | |
PM_CMPLU_STALL_NTC_FLUSH | Completion stall due to ntc flush | 1 | |
PM_CMPLU_STALL_OTHER_CMPL | Instructions the core completed while this thread was stalled | 2 | |
PM_CMPLU_STALL_PASTE | Finish stall because the NTF instruction was a paste waiting for response from L2 | 1 | |
PM_CMPLU_STALL_PM | Finish stall because the NTF instruction was issued to the Permute execution pipe and waiting to finish. Includes permute and decimal fixed point instructions (128 bit BCD arithmetic) + a few 128 bit fixpoint add/subtract instructions with carry. Not qualified by vector or multicycle | 2 | |
PM_CMPLU_STALL_SLB | Finish stall because the NTF instruction was awaiting L2 response for an SLB | 0 | |
PM_CMPLU_STALL_SPEC_FINISH | Finish stall while waiting for the non-speculative finish of either a stcx waiting for its result or a load waiting for non-critical sectors of data and ECC | 2 | |
PM_CMPLU_STALL_SRQ_FULL | Finish stall because the NTF instruction was a store that was held in LSAQ because the SRQ was full | 2 | |
PM_CMPLU_STALL_STCX | Finish stall because the NTF instruction was a stcx waiting for response from L2 | 1 | |
PM_CMPLU_STALL_ST_FWD | Completion stall due to store forward | 3 | |
PM_CMPLU_STALL_STORE_DATA | Finish stall because the next to finish instruction was a store waiting on data | 2 | |
PM_CMPLU_STALL_STORE_FIN_ARB | Finish stall because the NTF instruction was a store waiting for a slot in the store finish pipe. This means the instruction is ready to finish but there are instructions ahead of it, using the finish pipe | 2 | |
PM_CMPLU_STALL_STORE_FINISH | Finish stall because the NTF instruction was a store with all its dependencies met, just waiting to go through the LSU pipe to finish | 1 | |
PM_CMPLU_STALL_STORE_PIPE_ARB | Finish stall because the NTF instruction was a store waiting for the next relaunch opportunity after an internal reject. This means the instruction is ready to relaunch and tried once but lost arbitration | 3 | |
PM_CMPLU_STALL_SYNC_PMU_INT | Cycles in which the NTC instruction is waiting for a synchronous PMU interrupt | 1 | |
PM_CMPLU_STALL_TEND | Finish stall because the NTF instruction was a tend instruction awaiting response from L2 | 0 | |
PM_CMPLU_STALL_THRD | Completion Stalled because the thread was blocked | 0 | |
PM_CMPLU_STALL_TLBIE | Finish stall because the NTF instruction was a tlbie waiting for response from L2 | 1 | |
PM_CMPLU_STALL_VDP | Finish stall because the NTF instruction was a vector instruction issued to the Double Precision execution pipe and waiting to finish. Includes binary floating point instructions in 32 and 64 bit binary floating point format. Not qualified by multicycle. Qualified by vector | 3 | |
PM_CMPLU_STALL_VDPLONG | Finish stall because the NTF instruction was a vector multi-cycle instruction issued to the Double Precision execution pipe and waiting to finish. Includes binary floating point instructions in 32 and 64 bit binary floating point format. Qualified by vector AND multicycle | 2 | |
PM_CMPLU_STALL_VFXLONG | Completion stall due to a long latency vector fixed point instruction (division, square root) | 1 | |
PM_CMPLU_STALL_VFXU | Finish stall due to a vector fixed point instruction in the execution pipeline. These instructions get routed to the ALU, ALU2, and DIV pipes | 2 | |
PM_CO0_BUSY | CO mach 0 Busy. Used by PMU to sample ave CO lifetime (mach0 used as sample point) | 2, 3 | |
PM_CO_DISP_FAIL | CO dispatch failed due to all CO machines being busy | 0 | |
PM_CO_TM_SC_FOOTPRINT | L2 did a cleanifdirty CO to the L3 (ie created an SC line in the L3) OR L2 TM_store hit dirty HPC line and L3 indicated SC line formed in L3 on RDR bus | 1 | |
PM_CO_USAGE | Continuous 16-cycle (2:1) window where this signal rotates through sampling each CO machine busy. The PMU uses this wave to then do a 16-cycle count to sample the total number of machines running | 1 | |
PM_CYC | Processor cycles | 0, 1, 2, 3 | |
PM_DARQ0_0_3_ENTRIES | Cycles in which 3 or fewer DARQ entries (out of 12) are in use | 3 | |
PM_DARQ0_10_12_ENTRIES | Cycles in which 10 or more DARQ entries (out of 12) are in use | 0 | |
PM_DARQ0_4_6_ENTRIES | Cycles in which 4, 5, or 6 DARQ entries (out of 12) are in use | 2 | |
PM_DARQ0_7_9_ENTRIES | Cycles in which 7, 8, or 9 DARQ entries (out of 12) are in use | 1 | |
PM_DARQ1_0_3_ENTRIES | Cycles in which 3 or fewer DARQ1 entries (out of 12) are in use | 3 | |
PM_DARQ1_10_12_ENTRIES | Cycles in which 10 or more DARQ1 entries (out of 12) are in use | 1 | |
PM_DARQ1_4_6_ENTRIES | Cycles in which 4, 5, or 6 DARQ1 entries (out of 12) are in use | 2 | |
PM_DARQ1_7_9_ENTRIES | Cycles in which 7 to 9 DARQ1 entries (out of 12) are in use | 1 | |
PM_DARQ_STORE_REJECT | The DARQ attempted to transmit a store into an LSAQ or SRQ entry but it was rejected. Divide by PM_DARQ_STORE_XMIT to get reject ratio | 3 | |
PM_DARQ_STORE_XMIT | The DARQ attempted to transmit a store into an LSAQ or SRQ entry. Includes rejects. Not qualified by thread, so it includes counts for the whole core | 2 | |
PM_DATA_CHIP_PUMP_CPRED | Initial and Final Pump Scope was chip pump (prediction=correct) for a demand load | 0 | |
PM_DATA_FROM_DL2L3_MOD | The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant) than this chip due to a demand load | 3 | |
PM_DATA_FROM_DL2L3_SHR | The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant) than this chip due to a demand load | 2 | |
PM_DATA_FROM_DL4 | The processor's data cache was reloaded from another chip's L4 on a different Node or Group (Distant) due to a demand load | 2 | |
PM_DATA_FROM_DMEM | The processor's data cache was reloaded from another chip's memory on the same Node or Group (Distant) due to a demand load | 3 | |
PM_DATA_FROM_L2 | The processor's data cache was reloaded from local core's L2 due to a demand load | 0 | |
PM_DATA_FROM_L21_MOD | The processor's data cache was reloaded with Modified (M) data from another core's L2 on the same chip due to a demand load | 3 | |
PM_DATA_FROM_L21_SHR | The processor's data cache was reloaded with Shared (S) data from another core's L2 on the same chip due to a demand load | 2 | |
PM_DATA_FROM_L2_DISP_CONFLICT_LDHITST | The processor's data cache was reloaded from local core's L2 with load hit store conflict due to a demand load | 2 | |
PM_DATA_FROM_L2_DISP_CONFLICT_OTHER | The processor's data cache was reloaded from local core's L2 with dispatch conflict due to a demand load | 3 | |
PM_DATA_FROM_L2_MEPF | The processor's data cache was reloaded from local core's L2 hit without dispatch conflicts on Mepf state due to a demand load | 1 | |
PM_DATA_FROM_L2MISS | Demand LD - L2 Miss (not L2 hit) | 1 | |
PM_DATA_FROM_L2MISS_MOD | The processor's data cache was reloaded from a location other than the local core's L2 due to a demand load | 0 | |
PM_DATA_FROM_L2_NO_CONFLICT | The processor's data cache was reloaded from local core's L2 without conflict due to a demand load | 0 | |
PM_DATA_FROM_L3 | The processor's data cache was reloaded from local core's L3 due to a demand load | 3 | |
PM_DATA_FROM_L31_ECO_MOD | The processor's data cache was reloaded with Modified (M) data from another core's ECO L3 on the same chip due to a demand load | 3 | |
PM_DATA_FROM_L31_ECO_SHR | The processor's data cache was reloaded with Shared (S) data from another core's ECO L3 on the same chip due to a demand load | 2 | |
PM_DATA_FROM_L31_MOD | The processor's data cache was reloaded with Modified (M) data from another core's L3 on the same chip due to a demand load | 1 | |
PM_DATA_FROM_L31_SHR | The processor's data cache was reloaded with Shared (S) data from another core's L3 on the same chip due to a demand load | 0 | |
PM_DATA_FROM_L3_DISP_CONFLICT | The processor's data cache was reloaded from local core's L3 with dispatch conflict due to a demand load | 2 | |
PM_DATA_FROM_L3_MEPF | The processor's data cache was reloaded from local core's L3 without dispatch conflicts hit on Mepf state due to a demand load | 1 | |
PM_DATA_FROM_L3MISS | Demand LD - L3 Miss (not L2 hit and not L3 hit) | 2 | |
PM_DATA_FROM_L3MISS_MOD | The processor's data cache was reloaded from a location other than the local core's L3 due to a demand load | 3 | |
PM_DATA_FROM_L3_NO_CONFLICT | The processor's data cache was reloaded from local core's L3 without conflict due to a demand load | 0 | |
PM_DATA_FROM_LL4 | The processor's data cache was reloaded from the local chip's L4 cache due to a demand load | 0 | |
PM_DATA_FROM_LMEM | The processor's data cache was reloaded from the local chip's Memory due to a demand load | 1 | |
PM_DATA_FROM_MEMORY | The processor's data cache was reloaded from a memory location, including L4 from local, remote, or distant, due to a demand load | 3 | |
PM_DATA_FROM_OFF_CHIP_CACHE | The processor's data cache was reloaded with either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a demand load | 3 | |
PM_DATA_FROM_ON_CHIP_CACHE | The processor's data cache was reloaded with either shared or modified data from another core's L2/L3 on the same chip due to a demand load | 0 | |
PM_DATA_FROM_RL2L3_MOD | The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote) as this chip due to a demand load | 1 | |
PM_DATA_FROM_RL2L3_SHR | The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote) as this chip due to a demand load | 0 | |
PM_DATA_FROM_RL4 | The processor's data cache was reloaded from another chip's L4 on the same Node or Group (Remote) due to a demand load | 1 | |
PM_DATA_FROM_RMEM | The processor's data cache was reloaded from another chip's memory on the same Node or Group (Remote) due to a demand load | 2 | |
PM_DATA_GRP_PUMP_CPRED | Initial and Final Pump Scope was group pump (prediction=correct) for a demand load | 1 | |
PM_DATA_GRP_PUMP_MPRED | Final Pump Scope (Group) ended up either larger or smaller than Initial Pump Scope for a demand load | 1 | |
PM_DATA_GRP_PUMP_MPRED_RTY | Final Pump Scope (Group) ended up larger than Initial Pump Scope (Chip) for a demand load | 0 | |
PM_DATA_PUMP_CPRED | Pump prediction correct. Counts across all types of pumps for a demand load | 0 | |
PM_DATA_PUMP_MPRED | Pump misprediction. Counts across all types of pumps for a demand load | 3 | |
PM_DATA_STORE | All ops that drain from s2q to L2 containing data | 0, 1, 2, 3 | |
PM_DATA_SYS_PUMP_CPRED | Initial and Final Pump Scope was system pump (prediction=correct) for a demand load | 2 | |
PM_DATA_SYS_PUMP_MPRED | Final Pump Scope (system) mispredicted. Either the original scope was too small (Chip/Group) or the original scope was System and it should have been smaller. Counts for a demand load | 2 | |
PM_DATA_SYS_PUMP_MPRED_RTY | Final Pump Scope (system) ended up larger than Initial Pump Scope (Chip/Group) for a demand load | 3 | |
PM_DATA_TABLEWALK_CYC | Data Tablewalk Cycles. Could be 1 or 2 active tablewalks. Includes data prefetches. | 2 | |
PM_DC_DEALLOC_NO_CONF | A demand load referenced a line in an active fuzzy prefetch stream. The stream could have been allocated through the hardware prefetch mechanism or through software. Fuzzy stream confirm (out of order effects, or pf can't keep up) | 0, 1, 2, 3 | |
PM_DC_PREF_CONF | A demand load referenced a line in an active prefetch stream. The stream could have been allocated through the hardware prefetch mechanism or through software. Includes forwards and backwards streams | 0, 1, 2, 3 | |
PM_DC_PREF_CONS_ALLOC | Prefetch stream allocated in the conservative phase by either the hardware prefetch mechanism or software prefetch | 0, 1, 2, 3 | |
PM_DC_PREF_FUZZY_CONF | A demand load referenced a line in an active fuzzy prefetch stream. The stream could have been allocated through the hardware prefetch mechanism or through software. Fuzzy stream confirm (out of order effects, or pf can't keep up) | 0, 1, 2, 3 | |
PM_DC_PREF_HW_ALLOC | Prefetch stream allocated by the hardware prefetch mechanism | 0, 1, 2, 3 | |
PM_DC_PREF_STRIDED_CONF | A demand load referenced a line in an active strided prefetch stream. The stream could have been allocated through the hardware prefetch mechanism or through software. | 0, 1, 2, 3 | |
PM_DC_PREF_SW_ALLOC | Prefetch stream allocated by software prefetching | 0, 1, 2, 3 | |
PM_DC_PREF_XCONS_ALLOC | Prefetch stream allocated in the Ultra conservative phase by either the hardware prefetch mechanism or software prefetch | 0, 1, 2, 3 | |
PM_DECODE_FUSION_CONST_GEN | 32-bit constant generation | 0, 1, 2, 3 | |
PM_DECODE_FUSION_EXT_ADD | 32-bit extended addition | 0, 1, 2, 3 | |
PM_DECODE_FUSION_LD_ST_DISP | 32-bit displacement D-form and 16-bit displacement X-form | 0, 1, 2, 3 | |
PM_DECODE_FUSION_OP_PRESERV | Destructive op operand preservation | 0, 1, 2, 3 | |
PM_DECODE_HOLD_ICT_FULL | Counts the number of cycles in which the IFU was not able to decode and transmit one or more instructions because all itags were in use. This means the ICT is full for this thread | 0, 1, 2, 3 | |
PM_DECODE_LANES_NOT_AVAIL | Decode has something to transmit but dispatch lanes are not available | 0, 1, 2, 3 | |
PM_DERAT_MISS_16G | Data ERAT Miss (Data TLB Access) page size 16G | 3 | |
PM_DERAT_MISS_16M | Data ERAT Miss (Data TLB Access) page size 16M | 2 | |
PM_DERAT_MISS_1G | Data ERAT Miss (Data TLB Access) page size 1G. Implies radix translation | 1 | |
PM_DERAT_MISS_2M | Data ERAT Miss (Data TLB Access) page size 2M. Implies radix translation | 0 | |
PM_DERAT_MISS_4K | Data ERAT Miss (Data TLB Access) page size 4K | 0 | |
PM_DERAT_MISS_64K | Data ERAT Miss (Data TLB Access) page size 64K | 1 | |
PM_DFU_BUSY | Cycles in which all 4 Decimal Floating Point units are busy. The DFU is running at capacity | 3 | |
PM_DISP_CLB_HELD_BAL | Dispatch/CLB Hold: Balance Flush | 0, 1, 2, 3 | |
PM_DISP_CLB_HELD_SB | Dispatch/CLB Hold: Scoreboard | 0, 1, 2, 3 | |
PM_DISP_CLB_HELD_TLBIE | Dispatch Hold: Due to TLBIE | 0, 1, 2, 3 | |
PM_DISP_HELD | Dispatch Held | 0 | |
PM_DISP_HELD_HB_FULL | Dispatch held due to History Buffer full. Could be GPR/VSR/VMR/FPR/CR/XVF | 2 | |
PM_DISP_HELD_ISSQ_FULL | Dispatch held due to Issue q full. Includes issue queue and branch queue | 1 | |
PM_DISP_HELD_SYNC_HOLD | Cycles in which dispatch is held because of a synchronizing instruction in the pipeline | 3 | |
PM_DISP_HELD_TBEGIN | This outer tbegin transaction cannot be dispatched until the previous tend instruction completes | 0, 1, 2, 3 | |
PM_DISP_STARVED | Dispatched Starved | 2 | |
PM_DP_QP_FLOP_CMPL | Double-Precision or Quad-Precision instruction completed | 3 | |
PM_DPTEG_FROM_DL2L3_MOD | A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant) than this chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 3 | |
PM_DPTEG_FROM_DL2L3_SHR | A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant) than this chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 2 | |
PM_DPTEG_FROM_DL4 | A Page Table Entry was loaded into the TLB from another chip's L4 on a different Node or Group (Distant) due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 2 | |
PM_DPTEG_FROM_DMEM | A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Distant) due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 3 | |
PM_DPTEG_FROM_L2 | A Page Table Entry was loaded into the TLB from local core's L2 due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 0 | |
PM_DPTEG_FROM_L21_MOD | A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L2 on the same chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 3 | |
PM_DPTEG_FROM_L21_SHR | A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L2 on the same chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 2 | |
PM_DPTEG_FROM_L2_MEPF | A Page Table Entry was loaded into the TLB from local core's L2 hit without dispatch conflicts on Mepf state due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 1 | |
PM_DPTEG_FROM_L2MISS | A Page Table Entry was loaded into the TLB from a location other than the local core's L2 due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 0 | |
PM_DPTEG_FROM_L2_NO_CONFLICT | A Page Table Entry was loaded into the TLB from local core's L2 without conflict due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 0 | |
PM_DPTEG_FROM_L3 | A Page Table Entry was loaded into the TLB from local core's L3 due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 3 | |
PM_DPTEG_FROM_L31_ECO_MOD | A Page Table Entry was loaded into the TLB with Modified (M) data from another core's ECO L3 on the same chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 3 | |
PM_DPTEG_FROM_L31_ECO_SHR | A Page Table Entry was loaded into the TLB with Shared (S) data from another core's ECO L3 on the same chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 2 | |
PM_DPTEG_FROM_L31_MOD | A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L3 on the same chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 1 | |
PM_DPTEG_FROM_L31_SHR | A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L3 on the same chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 0 | |
PM_DPTEG_FROM_L3_DISP_CONFLICT | A Page Table Entry was loaded into the TLB from local core's L3 with dispatch conflict due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 2 | |
PM_DPTEG_FROM_L3_MEPF | A Page Table Entry was loaded into the TLB from local core's L3 without dispatch conflicts hit on Mepf state due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 1 | |
PM_DPTEG_FROM_L3MISS | A Page Table Entry was loaded into the TLB from a location other than the local core's L3 due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 3 | |
PM_DPTEG_FROM_L3_NO_CONFLICT | A Page Table Entry was loaded into the TLB from local core's L3 without conflict due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 0 | |
PM_DPTEG_FROM_LL4 | A Page Table Entry was loaded into the TLB from the local chip's L4 cache due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 0 | |
PM_DPTEG_FROM_LMEM | A Page Table Entry was loaded into the TLB from the local chip's Memory due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 1 | |
PM_DPTEG_FROM_MEMORY | A Page Table Entry was loaded into the TLB from a memory location, including L4 from local, remote, or distant, due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 1 | |
PM_DPTEG_FROM_OFF_CHIP_CACHE | A Page Table Entry was loaded into the TLB with either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 3 | |
PM_DPTEG_FROM_ON_CHIP_CACHE | A Page Table Entry was loaded into the TLB with either shared or modified data from another core's L2/L3 on the same chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 0 | |
PM_DPTEG_FROM_RL2L3_MOD | A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote) as this chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 1 | |
PM_DPTEG_FROM_RL2L3_SHR | A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote) as this chip due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 0 | |
PM_DPTEG_FROM_RL4 | A Page Table Entry was loaded into the TLB from another chip's L4 on the same Node or Group (Remote) due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 1 | |
PM_DPTEG_FROM_RMEM | A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Remote) due to a data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 2 | |
PM_DSIDE_L2MEMACC | Valid when first beat of data comes in for a D-side fetch where data came EXCLUSIVELY from memory (excluding hpcread64 accesses), i.e., total memory accesses by RCs | 2 | |
PM_DSIDE_MRU_TOUCH | D-side L2 MRU touch sent to L2 | 1 | |
PM_DSIDE_OTHER_64B_L2MEMACC | Valid when first beat of data comes in for a D-side fetch where data came EXCLUSIVELY from memory that was for hpc_read64 (RC had to fetch the other 64B of a line from MC), i.e., number of times RC had to go to memory to get 'missing' 64B | 2 | |
PM_DSLB_MISS | Data SLB Miss - Total of all segment sizes | 0, 1, 2, 3 | |
PM_DSLB_MISS | gate_and(sd_pc_c0_comp_valid AND sd_pc_c0_comp_thread(0:1)=tid,sd_pc_c0_comp_ppc_count(0:3)) + gate_and(sd_pc_c1_comp_valid AND sd_pc_c1_comp_thread(0:1)=tid,sd_pc_c1_comp_ppc_count(0:3)) | 0 | |
PM_DTLB_MISS | Data PTEG reload | 2 | |
PM_DTLB_MISS_16G | Data TLB Miss page size 16G | 0 | |
PM_DTLB_MISS_16M | Data TLB Miss page size 16M | 3 | |
PM_DTLB_MISS_1G | Data TLB reload (after a miss) page size 1G. Implies radix translation was used | 3 | |
PM_DTLB_MISS_2M | Data TLB reload (after a miss) page size 2M. Implies radix translation was used | 0 | |
PM_DTLB_MISS_4K | Data TLB Miss page size 4k | 1 | |
PM_DTLB_MISS_64K | Data TLB Miss page size 64K | 2 | |
PM_EAT_FORCE_MISPRED | XL-form branch was mispredicted due to the predicted target address missing from EAT. The EAT forces a mispredict in this case since there is no predicted target to validate. This is a rare case that may occur when the EAT is full and a branch is issued | 0, 1, 2, 3 | |
PM_EAT_FULL_CYC | Cycles No room in EAT | 0, 1, 2, 3 | |
PM_EE_OFF_EXT_INT | Cycles MSR[EE] is off and external interrupts are active | 0, 1, 2, 3 | |
PM_EXT_INT | external interrupt | 1 | |
PM_FLOP_CMPL | Floating Point Operation Finished | 3 | |
PM_FLUSH | Flush (any type) | 3 | |
PM_FLUSH_COMPLETION | The instruction that was next to complete did not complete because it suffered a flush | 2 | |
PM_FLUSH_DISP | Dispatch flush | 0, 1, 2, 3 | |
PM_FLUSH_DISP_SB | Dispatch Flush: Scoreboard | 0, 1, 2, 3 | |
PM_FLUSH_DISP_TLBIE | Dispatch Flush: TLBIE | 0, 1, 2, 3 | |
PM_FLUSH_HB_RESTORE_CYC | Cycles in which no new instructions can be dispatched to the ICT after a flush. History buffer recovery | 0, 1, 2, 3 | |
PM_FLUSH_LSU | LSU flushes. Includes all lsu flushes | 0, 1, 2, 3 | |
PM_FLUSH_MPRED | Branch mispredict flushes. Includes target and address misprediction | 0, 1, 2, 3 | |
PM_FMA_CMPL | Two-flop operation completed (fmadd, fnmadd, fmsub, fnmsub). Scalar instructions only. | 3 | |
PM_FORCED_NOP | Instruction was forced to execute as a nop because it was found to behave like a nop (have no effect) at decode time | 0, 1, 2, 3 | |
PM_FREQ_DOWN | Power Management: Below Threshold B | 2 | |
PM_FREQ_UP | Power Management: Above Threshold A | 3 | |
PM_FXU_1PLUS_BUSY | At least one of the 4 FXU units is busy | 2 | |
PM_FXU_BUSY | Cycles in which all 4 FXUs are busy. The FXU is running at capacity | 1 | |
PM_FXU_FIN | The Fixed Point Unit finished an instruction. Instructions that finish may not necessarily complete. | 3 | |
PM_FXU_IDLE | Cycles in which FXU0, FXU1, FXU2, and FXU3 are all idle | 1 | |
PM_GRP_PUMP_CPRED | Initial and Final Pump Scope and data sourced across this scope was group pump for all data types excluding data prefetch (demand load, inst prefetch, inst fetch, xlate) | 1 | |
PM_GRP_PUMP_MPRED | Final Pump Scope (Group) ended up either larger or smaller than Initial Pump Scope for all data types excluding data prefetch (demand load, inst prefetch, inst fetch, xlate) | 1 | |
PM_GRP_PUMP_MPRED_RTY | Final Pump Scope (Group) ended up larger than Initial Pump Scope (Chip) for all data types excluding data prefetch (demand load, inst prefetch, inst fetch, xlate) | 0 | |
PM_HV_CYC | Cycles in which msr_hv is high. Note that this event does not take msr_pr into consideration | 1 | |
PM_HWSYNC | Hwsync instruction decoded and transferred | 0, 1, 2, 3 | |
PM_IBUF_FULL_CYC | Cycles No room in ibuff | 0, 1, 2, 3 | |
PM_IC_DEMAND_CYC | Final Pump Scope (Group) ended up larger than Initial Pump Scope (Chip) for a demand load | 0 | |
PM_IC_DEMAND_L2_BHT_REDIRECT | L2 I cache demand request due to BHT redirect, branch redirect (2 bubbles, 3 cycles) | 0, 1, 2, 3 | |
PM_IC_DEMAND_L2_BR_REDIRECT | L2 I cache demand request due to branch Mispredict (15 cycle path) | 0, 1, 2, 3 | |
PM_IC_DEMAND_REQ | Demand Instruction fetch request | 0, 1, 2, 3 | |
PM_IC_INVALIDATE | Ic line invalidated | 0, 1, 2, 3 | |
PM_IC_MISS_CMPL | Non-speculative icache miss, counted at completion | 3 | |
PM_IC_MISS_ICBI | Threaded version; IC misses where we got an EA dir hit but no sector valids were on. ICBI took the line out | 0, 1, 2, 3 | |
PM_IC_PREF_CANCEL_HIT | Prefetch Canceled due to icache hit | 0, 1, 2, 3 | |
PM_IC_PREF_CANCEL_L2 | L2 Squashed a demand or prefetch request | 0, 1, 2, 3 | |
PM_IC_PREF_CANCEL_PAGE | Prefetch Canceled due to page boundary | 0, 1, 2, 3 | |
PM_IC_PREF_REQ | Instruction prefetch requests | 0, 1, 2, 3 | |
PM_IC_PREF_WRITE | Instruction prefetch written into IL1 | 0, 1, 2, 3 | |
PM_IC_RELOAD_PRIVATE | Reloading line was brought in private for a specific thread. Most lines are brought in shared for all eight threads. If the RA does not match, it invalidates and then brings it shared to the other thread. In P7, the line was brought in private, then the line was invalidated | 0, 1, 2, 3 | |
PM_ICT_EMPTY_CYC | Cycles in which the ICT is completely empty. No itags are assigned to any thread | 1 | |
PM_ICT_NOSLOT_BR_MPRED | Ict empty for this thread due to branch mispred | 3 | |
PM_ICT_NOSLOT_BR_MPRED_ICMISS | Ict empty for this thread due to Icache Miss and branch mispred | 2 | |
PM_ICT_NOSLOT_CYC | Number of cycles the ICT has no itags assigned to this thread | 0 | |
PM_ICT_NOSLOT_DISP_HELD | Cycles in which the NTC instruction is held at dispatch for any reason | 3 | |
PM_ICT_NOSLOT_DISP_HELD_HB_FULL | Ict empty for this thread due to dispatch holds because the History Buffer was full. Could be GPR/VSR/VMR/FPR/CR/XVF | 2 | |
PM_ICT_NOSLOT_DISP_HELD_ISSQ | Ict empty for this thread due to dispatch hold on this thread due to Issue q full, BRQ full, XVCF Full, Count cache, Link, Tar full | 1 | |
PM_ICT_NOSLOT_DISP_HELD_SYNC | Dispatch held due to a synchronizing instruction at dispatch | 3 | |
PM_ICT_NOSLOT_DISP_HELD_TBEGIN | The NTC instruction is being held at dispatch because it is a tbegin instruction and there is an older tbegin in the pipeline that must complete before the younger tbegin can dispatch | 0 | |
PM_ICT_NOSLOT_IC_L3 | Ict empty for this thread due to icache misses that were sourced from the local L3 | 2 | |
PM_ICT_NOSLOT_IC_L3MISS | Ict empty for this thread due to icache misses that were sourced from beyond the local L3. The source could be local/remote/distant memory or another core's cache | 3 | |
PM_ICT_NOSLOT_IC_MISS | Ict empty for this thread due to Icache Miss | 1 | |
PM_IERAT_RELOAD | Number of I-ERAT reloads | 0 | |
PM_IERAT_RELOAD_16M | IERAT Reloaded (Miss) for a 16M page | 3 | |
PM_IERAT_RELOAD_4K | IERAT reloaded (after a miss) for 4K pages | 1 | |
PM_IERAT_RELOAD_64K | IERAT Reloaded (Miss) for a 64k page | 2 | |
PM_IFETCH_THROTTLE | Cycles in which Instruction fetch throttle was active. | 2 | |
PM_INST_CHIP_PUMP_CPRED | Initial and Final Pump Scope was chip pump (prediction=correct) for an instruction fetch | 0 | |
PM_INST_CMPL | Number of PowerPC Instructions that completed. | 0, 1, 2, 3 | |
PM_INST_DISP | # PPC Dispatched | 1, 2 | |
PM_INST_FROM_DL2L3_MOD | The processor's Instruction cache was reloaded with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant) than this chip due to an instruction fetch (not prefetch) | 3 | |
PM_INST_FROM_DL2L3_SHR | The processor's Instruction cache was reloaded with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant) than this chip due to an instruction fetch (not prefetch) | 2 | |
PM_INST_FROM_DL4 | The processor's Instruction cache was reloaded from another chip's L4 on a different Node or Group (Distant) due to an instruction fetch (not prefetch) | 2 | |
PM_INST_FROM_DMEM | The processor's Instruction cache was reloaded from another chip's memory on the same Node or Group (Distant) due to an instruction fetch (not prefetch) | 3 | |
PM_INST_FROM_L1 | Instruction fetches from L1. L1 instruction hit | 0, 1, 2, 3 | |
PM_INST_FROM_L2 | The processor's Instruction cache was reloaded from local core's L2 due to an instruction fetch (not prefetch) | 0 | |
PM_INST_FROM_L21_MOD | The processor's Instruction cache was reloaded with Modified (M) data from another core's L2 on the same chip due to an instruction fetch (not prefetch) | 3 | |
PM_INST_FROM_L21_SHR | The processor's Instruction cache was reloaded with Shared (S) data from another core's L2 on the same chip due to an instruction fetch (not prefetch) | 2 | |
PM_INST_FROM_L2_DISP_CONFLICT_LDHITST | The processor's Instruction cache was reloaded from local core's L2 with load hit store conflict due to an instruction fetch (not prefetch) | 2 | |
PM_INST_FROM_L2_DISP_CONFLICT_OTHER | The processor's Instruction cache was reloaded from local core's L2 with dispatch conflict due to an instruction fetch (not prefetch) | 3 | |
PM_INST_FROM_L2_MEPF | The processor's Instruction cache was reloaded from local core's L2 hit without dispatch conflicts on Mepf state due to an instruction fetch (not prefetch) | 1 | |
PM_INST_FROM_L2MISS | The processor's Instruction cache was reloaded from a location other than the local core's L2 due to an instruction fetch (not prefetch) | 0 | |
PM_INST_FROM_L2_NO_CONFLICT | The processor's Instruction cache was reloaded from local core's L2 without conflict due to an instruction fetch (not prefetch) | 0 | |
PM_INST_FROM_L3 | The processor's Instruction cache was reloaded from local core's L3 due to an instruction fetch (not prefetch) | 3 | |
PM_INST_FROM_L31_ECO_MOD | The processor's Instruction cache was reloaded with Modified (M) data from another core's ECO L3 on the same chip due to an instruction fetch (not prefetch) | 3 | |
PM_INST_FROM_L31_ECO_SHR | The processor's Instruction cache was reloaded with Shared (S) data from another core's ECO L3 on the same chip due to an instruction fetch (not prefetch) | 2 | |
PM_INST_FROM_L31_MOD | The processor's Instruction cache was reloaded with Modified (M) data from another core's L3 on the same chip due to an instruction fetch (not prefetch) | 1 | |
PM_INST_FROM_L31_SHR | The processor's Instruction cache was reloaded with Shared (S) data from another core's L3 on the same chip due to an instruction fetch (not prefetch) | 0 | |
PM_INST_FROM_L3_DISP_CONFLICT | The processor's Instruction cache was reloaded from local core's L3 with dispatch conflict due to an instruction fetch (not prefetch) | 2 | |
PM_INST_FROM_L3_MEPF | The processor's Instruction cache was reloaded from local core's L3 without dispatch conflicts hit on Mepf state due to an instruction fetch (not prefetch) | 1 | |
PM_INST_FROM_L3MISS | Marked instruction was reloaded from a location beyond the local chiplet | 2 | |
PM_INST_FROM_L3MISS_MOD | The processor's Instruction cache was reloaded from a location other than the local core's L3 due to an instruction fetch | 3 | |
PM_INST_FROM_L3_NO_CONFLICT | The processor's Instruction cache was reloaded from local core's L3 without conflict due to an instruction fetch (not prefetch) | 0 | |
PM_INST_FROM_LL4 | The processor's Instruction cache was reloaded from the local chip's L4 cache due to an instruction fetch (not prefetch) | 0 | |
PM_INST_FROM_LMEM | The processor's Instruction cache was reloaded from the local chip's Memory due to an instruction fetch (not prefetch) | 1 | |
PM_INST_FROM_MEMORY | The processor's Instruction cache was reloaded from a memory location, including L4 from local, remote, or distant, due to an instruction fetch (not prefetch) | 1 | |
PM_INST_FROM_OFF_CHIP_CACHE | The processor's Instruction cache was reloaded with either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to an instruction fetch (not prefetch) | 3 | |
PM_INST_FROM_ON_CHIP_CACHE | The processor's Instruction cache was reloaded with either shared or modified data from another core's L2/L3 on the same chip due to an instruction fetch (not prefetch) | 0 | |
PM_INST_FROM_RL2L3_MOD | The processor's Instruction cache was reloaded with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote) as this chip due to an instruction fetch (not prefetch) | 1 | |
PM_INST_FROM_RL2L3_SHR | The processor's Instruction cache was reloaded with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote) as this chip due to an instruction fetch (not prefetch) | 0 | |
PM_INST_FROM_RL4 | The processor's Instruction cache was reloaded from another chip's L4 on the same Node or Group (Remote) due to an instruction fetch (not prefetch) | 1 | |
PM_INST_FROM_RMEM | The processor's Instruction cache was reloaded from another chip's memory on the same Node or Group (Remote) due to an instruction fetch (not prefetch) | 2 | |
PM_INST_GRP_PUMP_CPRED | Initial and Final Pump Scope was group pump (prediction=correct) for an instruction fetch (demand only) | 1 | |
PM_INST_GRP_PUMP_MPRED | Final Pump Scope (Group) ended up either larger or smaller than Initial Pump Scope for an instruction fetch (demand only) | 1 | |
PM_INST_GRP_PUMP_MPRED_RTY | Final Pump Scope (Group) ended up larger than Initial Pump Scope (Chip) for an instruction fetch | 0 | |
PM_INST_IMC_MATCH_CMPL | IMC Match Count | 3 | |
PM_INST_PUMP_CPRED | Pump prediction correct. Counts across all types of pumps for an instruction fetch | 0 | |
PM_INST_PUMP_MPRED | Pump misprediction. Counts across all types of pumps for an instruction fetch | 3 | |
PM_INST_SYS_PUMP_CPRED | Initial and Final Pump Scope was system pump (prediction=correct) for an instruction fetch | 2 | |
PM_INST_SYS_PUMP_MPRED | Final Pump Scope (system) mispredicted. Either the original scope was too small (Chip/Group) or the original scope was System and it should have been smaller. Counts for an instruction fetch | 2 | |
PM_INST_SYS_PUMP_MPRED_RTY | Final Pump Scope (system) ended up larger than Initial Pump Scope (Chip/Group) for an instruction fetch | 3 | |
PM_IOPS_CMPL | Internal Operations completed | 1 | |
PM_IPTEG_FROM_DL2L3_MOD | A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant) than this chip due to an instruction side request | 3 | |
PM_IPTEG_FROM_DL2L3_SHR | A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant) than this chip due to an instruction side request | 2 | |
PM_IPTEG_FROM_DL4 | A Page Table Entry was loaded into the TLB from another chip's L4 on a different Node or Group (Distant) due to an instruction side request | 2 | |
PM_IPTEG_FROM_DMEM | A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Distant) due to an instruction side request | 3 | |
PM_IPTEG_FROM_L2 | A Page Table Entry was loaded into the TLB from local core's L2 due to an instruction side request | 0 | |
PM_IPTEG_FROM_L21_MOD | A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L2 on the same chip due to an instruction side request | 3 | |
PM_IPTEG_FROM_L21_SHR | A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L2 on the same chip due to an instruction side request | 2 | |
PM_IPTEG_FROM_L2_MEPF | A Page Table Entry was loaded into the TLB from local core's L2 hit without dispatch conflicts on Mepf state due to an instruction side request | 1 | |
PM_IPTEG_FROM_L2MISS | A Page Table Entry was loaded into the TLB from a location other than the local core's L2 due to an instruction side request | 0 | |
PM_IPTEG_FROM_L2_NO_CONFLICT | A Page Table Entry was loaded into the TLB from local core's L2 without conflict due to an instruction side request | 0 | |
PM_IPTEG_FROM_L3 | A Page Table Entry was loaded into the TLB from local core's L3 due to an instruction side request | 3 | |
PM_IPTEG_FROM_L31_ECO_MOD | A Page Table Entry was loaded into the TLB with Modified (M) data from another core's ECO L3 on the same chip due to an instruction side request | 3 | |
PM_IPTEG_FROM_L31_ECO_SHR | A Page Table Entry was loaded into the TLB with Shared (S) data from another core's ECO L3 on the same chip due to an instruction side request | 2 | |
PM_IPTEG_FROM_L31_MOD | A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L3 on the same chip due to an instruction side request | 1 | |
PM_IPTEG_FROM_L31_SHR | A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L3 on the same chip due to an instruction side request | 0 | |
PM_IPTEG_FROM_L3_DISP_CONFLICT | A Page Table Entry was loaded into the TLB from local core's L3 with dispatch conflict due to an instruction side request | 2 | |
PM_IPTEG_FROM_L3_MEPF | A Page Table Entry was loaded into the TLB from local core's L3 without dispatch conflicts hit on Mepf state due to an instruction side request | 1 | |
PM_IPTEG_FROM_L3MISS | A Page Table Entry was loaded into the TLB from a location other than the local core's L3 due to an instruction side request | 3 | |
PM_IPTEG_FROM_L3_NO_CONFLICT | A Page Table Entry was loaded into the TLB from local core's L3 without conflict due to an instruction side request | 0 | |
PM_IPTEG_FROM_LL4 | A Page Table Entry was loaded into the TLB from the local chip's L4 cache due to an instruction side request | 0 | |
PM_IPTEG_FROM_LMEM | A Page Table Entry was loaded into the TLB from the local chip's Memory due to an instruction side request | 1 | |
PM_IPTEG_FROM_MEMORY | A Page Table Entry was loaded into the TLB from a memory location, including L4 from local, remote, or distant, due to an instruction side request | 1 | |
PM_IPTEG_FROM_OFF_CHIP_CACHE | A Page Table Entry was loaded into the TLB with either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to an instruction side request | 3 | |
PM_IPTEG_FROM_ON_CHIP_CACHE | A Page Table Entry was loaded into the TLB with either shared or modified data from another core's L2/L3 on the same chip due to an instruction side request | 0 | |
PM_IPTEG_FROM_RL2L3_MOD | A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote) as this chip due to an instruction side request | 1 | |
PM_IPTEG_FROM_RL2L3_SHR | A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote) as this chip due to an instruction side request | 0 | |
PM_IPTEG_FROM_RL4 | A Page Table Entry was loaded into the TLB from another chip's L4 on the same Node or Group (Remote) due to an instruction side request | 1 | |
PM_IPTEG_FROM_RMEM | A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Remote) due to an instruction side request | 2 | |
PM_ISIDE_DISP | All I-side dispatch attempts for this thread (excludes i_l2mru_tch_reqs) | 0 | |
PM_ISIDE_DISP_FAIL_ADDR | All I-side dispatch attempts for this thread that failed due to an address collision with another machine (excludes i_l2mru_tch_reqs) | 1 | |
PM_ISIDE_DISP_FAIL_OTHER | All I-side dispatch attempts for this thread that failed due to a reason other than an address collision (excludes i_l2mru_tch_reqs) | 1 | |
PM_ISIDE_L2MEMACC | Valid when first beat of data comes in for an I-side fetch where data came from memory | 1 | |
PM_ISIDE_MRU_TOUCH | I-side L2 MRU touch sent to L2 for this thread | 3 | |
PM_ISLB_MISS | Instruction SLB Miss - Total of all segment sizes | 0, 1, 2, 3 | |
PM_ISLB_MISS | Number of ISLB misses for this thread | 3 | |
PM_ISQ_0_8_ENTRIES | Cycles in which 8 or fewer Issue Queue entries are in use. This is a shared event, not per thread | 2 | |
PM_ISQ_36_44_ENTRIES | Cycles in which 36 or more Issue Queue entries are in use. This is a shared event, not per thread. There are 44 issue queue entries across 4 slices in the whole core | 3 | |
PM_ISU0_ISS_HOLD_ALL | All ISU rejects | 0, 1, 2, 3 | |
PM_ISU1_ISS_HOLD_ALL | All ISU rejects | 0, 1, 2, 3 | |
PM_ISU2_ISS_HOLD_ALL | All ISU rejects | 0, 1, 2, 3 | |
PM_ISU3_ISS_HOLD_ALL | All ISU rejects | 0, 1, 2, 3 | |
PM_ISYNC | Isync completion count per thread | 0, 1, 2, 3 | |
PM_ITLB_MISS | ITLB Reloaded. Counts 1 per ITLB miss for HPT but multiple for radix depending on the number of levels traversed | 3 | |
PM_L1_DCACHE_RELOADED_ALL | L1 data cache reloaded for demand. If MMCR1[16] is 1, prefetches will be included as well | 0 | |
PM_L1_DCACHE_RELOAD_VALID | DL1 reloaded due to Demand Load | 2 | |
PM_L1_DEMAND_WRITE | Instruction Demand sectors written into IL1 | 0, 1, 2, 3 | |
PM_L1_ICACHE_MISS | Demand iCache Miss | 1 | |
PM_L1_ICACHE_RELOADED_ALL | Counts all Icache reloads: includes demand, prefetch, prefetch turned into demand, and demand turned into prefetch | 3 | |
PM_L1_ICACHE_RELOADED_PREF | Counts all Icache prefetch reloads (includes demand turned into prefetch) | 2 | |
PM_L1PF_L2MEMACC | Valid when first beat of data comes in for an L1PF where data came from memory | 0 | |
PM_L1_PREF | A data line was written to the L1 due to a hardware or software prefetch | 1 | |
PM_L1_SW_PREF | Software L1 Prefetches, including SW Transient Prefetches | 0, 1, 2, 3 | |
PM_L2_CASTOUT_MOD | L2 Castouts - Modified (M,Mu,Me) | 0 | |
PM_L2_CASTOUT_SHR | L2 Castouts - Shared (Tx,Sx) | 0 | |
PM_L2_CHIP_PUMP | RC requests that were local (aka chip) pump attempts | 3 | |
PM_L2_DC_INV | D-cache invalidates sent over the reload bus to the core | 1 | |
PM_L2_DISP_ALL_L2MISS | All successful Ld/St dispatches for this thread that were an L2 miss (excludes i_l2mru_tch_reqs) | 3 | |
PM_L2_GROUP_PUMP | RC requests that were group (aka node) pump attempts | 3 | |
PM_L2_GRP_GUESS_CORRECT | L2 guess grp (GS or NNS) and guess was correct (data intra-group AND not on-chip) | 1 | |
PM_L2_GRP_GUESS_WRONG | L2 guess grp (GS or NNS) and guess was not correct (i.e., data on-chip OR beyond-group) | 1 | |
PM_L2_IC_INV | I-cache Invalidates sent over the reload bus to the core | 1 | |
PM_L2_INST | All successful I-side dispatches for this thread (excludes i_l2mru_tch reqs) | 2 | |
PM_L2_INST_MISS | All successful I-side dispatches that were an L2 miss for this thread (excludes i_l2mru_tch reqs) | 2 | |
PM_L2_INST_MISS | All successful I-side dispatches that were an L2 miss for this thread (excludes i_l2mru_tch reqs) | 3 | |
PM_L2_LD | All successful D-side Load dispatches for this thread (L2 miss + L2 hits) | 0 | |
PM_L2_LD_DISP | All successful D-side load dispatches for this thread (L2 miss + L2 hits) | 0 | |
PM_L2_LD_DISP | All successful I-or-D side load dispatches for this thread (excludes i_l2mru_tch_reqs) | 2 | |
PM_L2_LD_HIT | All successful D-side load dispatches that were L2 hits for this thread | 1 | |
PM_L2_LD_HIT | All successful I-or-D side load dispatches for this thread that were L2 hits (excludes i_l2mru_tch_reqs) | 2 | |
PM_L2_LD_MISS | All successful D-Side Load dispatches that were an L2 miss for this thread | 1 | |
PM_L2_LD_MISS_128B | All successful D-side load dispatches that were an L2 miss (NOT Sx,Tx,Mx) for this thread and the RC calculated the request should be for 128B (i.e., M=0) | 0 | |
PM_L2_LD_MISS_64B | All successful D-side load dispatches that were an L2 miss (NOT Sx,Tx,Mx) for this thread and the RC calculated the request should be for 64B (i.e., M=1) | 1 | |
PM_L2_LOC_GUESS_CORRECT | L2 guess local (LNS) and guess was correct (i.e., data local) | 0 | |
PM_L2_LOC_GUESS_WRONG | L2 guess local (LNS) and guess was not correct (i.e., data not on chip) | 0 | |
PM_L2_RCLD_DISP | All I-or-D side load dispatch attempts for this thread (excludes i_l2mru_tch_reqs) | 0 | |
PM_L2_RCLD_DISP_FAIL_ADDR | All I-or-D side load dispatch attempts for this thread that failed due to address collision with RC/CO/SN/SQ machine (excludes i_l2mru_tch_reqs) | 0 | |
PM_L2_RCLD_DISP_FAIL_OTHER | All I-or-D side load dispatch attempts for this thread that failed due to reason other than address collision (excludes i_l2mru_tch_reqs) | 1 | |
PM_L2_RCST_DISP | All D-side store dispatch attempts for this thread | 2 | |
PM_L2_RCST_DISP_FAIL_ADDR | All D-side store dispatch attempts for this thread that failed due to address collision with RC/CO/SN/SQ | 2 | |
PM_L2_RCST_DISP_FAIL_OTHER | All D-side store dispatch attempts for this thread that failed due to reason other than address collision | 3 | |
PM_L2_RC_ST_DONE | RC did store to line that was Tx or Sx | 2 | |
PM_L2_RTY_LD | RC retries on PB for any load from core (excludes DCBFs) | 2 | |
PM_L2_RTY_ST | RC retries on PB for any store from core (excludes DCBFs) | 2 | |
PM_L2_RTY_ST | RC retries on PB for any store from core (excludes DCBFs) | 3 | |
PM_L2_SN_M_RD_DONE | SNP dispatched for a read and was M (true M) | 3 | |
PM_L2_SN_M_WR_DONE | SNP dispatched for a write and was M (true M) | 0 | |
PM_L2_SN_M_WR_DONE | SNP dispatched for a write and was M (true M) | 3 | |
PM_L2_SN_SX_I_DONE | SNP dispatched and went from Sx to Ix | 2 | |
PM_L2_ST | All successful D-side store dispatches for this thread (L2 miss + L2 hits) | 0 | |
PM_L2_ST_DISP | All successful D-side store dispatches for this thread (L2 miss + L2 hits) | 0 | |
PM_L2_ST_DISP | All successful D-side store dispatches for this thread | 3 | |
PM_L2_ST_HIT | All successful D-side store dispatches that were L2 hits for this thread | 1 | |
PM_L2_ST_HIT | All successful D-side store dispatches for this thread that were L2 hits | 3 | |
PM_L2_ST_MISS | All successful D-Side Store dispatches that were an L2 miss for this thread | 1 | |
PM_L2_ST_MISS_128B | All successful D-side store dispatches that were an L2 miss (NOT Sx,Tx,Mx) for this thread and the RC calculated the request should be for 128B (i.e., M=0) | 0 | |
PM_L2_ST_MISS_64B | All successful D-side store dispatches that were an L2 miss (NOT Sx,Tx,Mx) for this thread and the RC calculated the request should be for 64B (i.e., M=1) | 1 | |
PM_L2_SYS_GUESS_CORRECT | L2 guess system (VGS or RNS) and guess was correct (i.e., data beyond-group) | 2 | |
PM_L2_SYS_GUESS_WRONG | L2 guess system (VGS or RNS) and guess was not correct (i.e., data not beyond-group) | 2 | |
PM_L2_SYS_PUMP | RC requests that were system pump attempts | 3 | |
PM_L3_CI_HIT | L3 Castins Hit (total count) | 1 | |
PM_L3_CI_MISS | L3 castins miss (total count) | 1 | |
PM_L3_CINJ | L3 castin of cache inject | 2 | |
PM_L3_CI_USAGE | Rotating sample of 16 CI or CO actives | 0 | |
PM_L3_CO | L3 castout occurring (does not include casthrough or log writes (cinj/dmaw)) | 2 | |
PM_L3_CO0_BUSY | Lifetime, sample of CO machine 0 valid | 2 | |
PM_L3_CO0_BUSY | Lifetime, sample of CO machine 0 valid | 3 | |
PM_L3_CO_L31 | L3 CO to L3.1 OR of port 0 and 1 (lossy = may undercount if two cresps come in the same cyc) | 1 | |
PM_L3_CO_LCO | Total L3 COs occurred on LCO L3.1 (good cresp, may end up in mem on a retry) | 2 | |
PM_L3_CO_MEM | L3 CO to memory OR of port 0 and 1 (lossy = may undercount if two cresps come in the same cyc) | 1 | |
PM_L3_CO_MEPF | L3 CO of line in Mep state (includes casthrough to memory). The Mepf state indicates that a line was brought in to satisfy an L3 prefetch request | 0 | |
PM_L3_CO_MEPF | L3 castouts in Mepf state for this thread | 2 | |
PM_L3_GRP_GUESS_CORRECT | Initial scope=group (GS or NNS) and data from same group (near) (pred successful) | 0 | |
PM_L3_GRP_GUESS_WRONG_HIGH | Initial scope=group (GS or NNS) but data from local node. Prediction too high | 2 | |
PM_L3_GRP_GUESS_WRONG_LOW | Initial scope=group (GS or NNS) but data from outside group (far or rem). Prediction too low | 2 | |
PM_L3_HIT | L3 Hits (L2 miss hitting L3, including data/instrn/xlate) | 0 | |
PM_L3_L2_CO_HIT | L2 CO hits | 2 | |
PM_L3_L2_CO_MISS | L2 CO miss | 2 | |
PM_L3_LAT_CI_HIT | L3 Lateral Castins Hit | 3 | |
PM_L3_LAT_CI_MISS | L3 Lateral Castins Miss | 3 | |
PM_L3_LD_HIT | L3 Hits for demand LDs | 1 | |
PM_L3_LD_MISS | L3 Misses for demand LDs | 1 | |
PM_L3_LD_PREF | L3 load prefetch, sourced from a hardware or software stream, was sent to the nest | 0, 1, 2, 3 | |
PM_L3_LOC_GUESS_CORRECT | Initial scope=node/chip (LNS) and data from local node (local) (pred successful) - always PFs only | 0 | |
PM_L3_LOC_GUESS_WRONG | Initial scope=node (LNS) but data from outside the local node (near or far or rem). Prediction too low | 1 | |
PM_L3_MISS | L3 Misses (L2 miss also missing L3, including data/instrn/xlate) | 0 | |
PM_L3_P0_CO_L31 | L3 CO to L3.1 (LCO) port 0 with or without data | 3 | |
PM_L3_P0_CO_MEM | L3 CO to memory port 0 with or without data | 2 | |
PM_L3_P0_CO_RTY | L3 CO received retry port 0 (memory only), every retry counted | 2 | |
PM_L3_P0_CO_RTY | L3 CO received retry port 2 (memory only), every retry counted | 3 | |
PM_L3_P0_GRP_PUMP | L3 PF sent with grp scope port 0, counts even retried requests | 1 | |
PM_L3_P0_LCO_DATA | LCO sent with data port 0 | 1 | |
PM_L3_P0_LCO_NO_DATA | Dataless L3 LCO sent port 0 | 0 | |
PM_L3_P0_LCO_RTY | L3 initiated LCO received retry on port 0 (can try 4 times) | 0 | |
PM_L3_P0_NODE_PUMP | L3 PF sent with nodal scope port 0, counts even retried requests | 0 | |
PM_L3_P0_PF_RTY | L3 PF received retry port 0, every retry counted | 0 | |
PM_L3_P0_PF_RTY | L3 PF received retry port 2, every retry counted | 1 | |
PM_L3_P0_SYS_PUMP | L3 PF sent with sys scope port 0, counts even retried requests | 2 | |
PM_L3_P1_CO_L31 | L3 CO to L3.1 (LCO) port 1 with or without data | 3 | |
PM_L3_P1_CO_MEM | L3 CO to memory port 1 with or without data | 2 | |
PM_L3_P1_CO_RTY | L3 CO received retry port 1 (memory only), every retry counted | 2 | |
PM_L3_P1_CO_RTY | L3 CO received retry port 3 (memory only), every retry counted | 3 | |
PM_L3_P1_GRP_PUMP | L3 PF sent with grp scope port 1, counts even retried requests | 1 | |
PM_L3_P1_LCO_DATA | LCO sent with data port 1 | 1 | |
PM_L3_P1_LCO_NO_DATA | Dataless L3 LCO sent port 1 | 0 | |
PM_L3_P1_LCO_RTY | L3 initiated LCO received retry on port 1 (can try 4 times) | 0 | |
PM_L3_P1_NODE_PUMP | L3 PF sent with nodal scope port 1, counts even retried requests | 0 | |
PM_L3_P1_PF_RTY | L3 PF received retry port 1, every retry counted | 0 | |
PM_L3_P1_PF_RTY | L3 PF received retry port 3, every retry counted | 1 | |
PM_L3_P1_SYS_PUMP | L3 PF sent with sys scope port 1, counts even retried requests | 2 | |
PM_L3_P2_LCO_RTY | L3 initiated LCO received retry on port 2 (can try 4 times) | 1 | |
PM_L3_P3_LCO_RTY | L3 initiated LCO received retry on port 3 (can try 4 times) | 1 | |
PM_L3_PF0_BUSY | Lifetime, sample of PF machine 0 valid | 2 | |
PM_L3_PF0_BUSY | Lifetime, sample of PF machine 0 valid | 3 | |
PM_L3_PF_HIT_L3 | L3 PF hit in L3 (abandoned) | 1 | |
PM_L3_PF_MISS_L3 | L3 PF missed in L3 | 0 | |
PM_L3_PF_OFF_CHIP_CACHE | L3 PF from Off chip cache | 2 | |
PM_L3_PF_OFF_CHIP_MEM | L3 PF from Off chip memory | 3 | |
PM_L3_PF_ON_CHIP_CACHE | L3 PF from On chip cache | 2 | |
PM_L3_PF_ON_CHIP_MEM | L3 PF from On chip memory | 3 | |
PM_L3_PF_USAGE | Rotating sample of 32 PF actives | 1 | |
PM_L3_RD0_BUSY | Lifetime, sample of RD machine 0 valid | 2 | |
PM_L3_RD0_BUSY | Lifetime, sample of RD machine 0 valid | 3 | |
PM_L3_RD_USAGE | Rotating sample of 16 RD actives | 1 | |
PM_L3_SN0_BUSY | Lifetime, sample of snooper machine 0 valid | 2 | |
PM_L3_SN0_BUSY | Lifetime, sample of snooper machine 0 valid | 3 | |
PM_L3_SN_USAGE | Rotating sample of 16 snoop valids | 0 | |
PM_L3_SW_PREF | L3 load prefetch, sourced from a software prefetch stream, was sent to the nest | 0, 1, 2, 3 | |
PM_L3_SYS_GUESS_CORRECT | Initial scope=system (VGS or RNS) and data from outside group (far or rem)(pred successful) | 1 | |
PM_L3_SYS_GUESS_WRONG | Initial scope=system (VGS or RNS) but data from local or near. Prediction too high | 3 | |
PM_L3_TRANS_PF | L3 Transient prefetch received from L2 | 3 | |
PM_L3_WI0_BUSY | Rotating sample of 8 WI valid | 0 | |
PM_L3_WI0_BUSY | Rotating sample of 8 WI valid (duplicate) | 1 | |
PM_L3_WI_USAGE | Lifetime, sample of Write Inject machine 0 valid | 0 | |
PM_LARX_FIN | Larx finished | 2 | |
PM_LD_CMPL | count of Loads completed | 3 | |
PM_LD_L3MISS_PEND_CYC | Cycles L3 miss was pending for this thread | 0 | |
PM_LD_MISS_L1 | Load Missed L1, counted at execution time (can be greater than loads finished). LMQ merges are not included in this count (i.e., if a load instruction misses on an address that is already allocated on the LMQ, this event will not increment for that load). Note that this count is per slice, so if a load spans multiple slices this event will increment multiple times for a single load. | 2 | |
PM_LD_MISS_L1 | Load Missed L1, counted at execution time (can be greater than loads finished). LMQ merges are not included in this count (i.e., if a load instruction misses on an address that is already allocated on the LMQ, this event will not increment for that load). Note that this count is per slice, so if a load spans multiple slices this event will increment multiple times for a single load. | 3 | |
PM_LD_MISS_L1_FIN | Number of load instructions that finished with an L1 miss. Note that even if a load spans multiple slices this event will increment only once per load op. | 1 | |
PM_LD_REF_L1 | All L1 D cache load references counted at finish, gated by reject | 0 | |
PM_LINK_STACK_CORRECT | Link stack predicts right address | 0, 1, 2, 3 | |
PM_LINK_STACK_INVALID_PTR | The link stack pointer was invalid. This is most often caused by certain types of flush where the pointer is not available, and can result in the data in the link stack becoming unusable | 0, 1, 2, 3 | |
PM_LINK_STACK_WRONG_ADD_PRED | Link stack predicts wrong address, because of link stack design limitation or software violating the coding conventions | 0, 1, 2, 3 | |
PM_LMQ_EMPTY_CYC | Cycles in which the LMQ has no pending load misses for this thread | 1 | |
PM_LMQ_MERGE | A demand miss collides with a prefetch for the same line | 0 | |
PM_LRQ_REJECT | Internal LSU reject from LRQ. Rejects cause the load to go back to LRQ, but it stays contained within the LSU once it gets issued. This event counts the number of times the LRQ attempts to relaunch an instruction after a reject. Any load can suffer multiple rejects | 1 | |
PM_LS0_DC_COLLISIONS | Read-write data cache collisions | 0, 1, 2, 3 | |
PM_LS0_ERAT_MISS_PREF | LS0 Erat miss due to prefetch | 0, 1, 2, 3 | |
PM_LS0_LAUNCH_HELD_PREF | Number of times a load or store instruction was unable to launch/relaunch because a high priority prefetch used that relaunch cycle | 0, 1, 2, 3 | |
PM_LS0_PTE_TABLEWALK_CYC | Cycles when a tablewalk is pending on this thread on table 0 | 0, 1, 2, 3 | |
PM_LS0_TM_DISALLOW | A TM-ineligible instruction tries to execute inside a transaction and the LSU disallows it | 0, 1, 2, 3 | |
PM_LS0_UNALIGNED_LD | Load instructions whose data crosses a double-word boundary, which causes the load to require an additional slice beyond what would normally be required for a load of that size. If the load wraps from slice 3 to slice 0, there is an additional 3-cycle penalty | 0, 1, 2, 3 | |
PM_LS0_UNALIGNED_ST | Store instructions whose data crosses a double-word boundary, which causes the store to require an additional slice beyond what would normally be required for a store of that size. If the store wraps from slice 3 to slice 0, there is an additional 3-cycle penalty | 0, 1, 2, 3 | |
PM_LS1_DC_COLLISIONS | Read-write data cache collisions | 0, 1, 2, 3 | |
PM_LS1_ERAT_MISS_PREF | LS1 Erat miss due to prefetch | 0, 1, 2, 3 | |
PM_LS1_LAUNCH_HELD_PREF | Number of times a load or store instruction was unable to launch/relaunch because a high priority prefetch used that relaunch cycle | 0, 1, 2, 3 | |
PM_LS1_PTE_TABLEWALK_CYC | Cycles when a tablewalk is pending on this thread on table 1 | 0, 1, 2, 3 | |
PM_LS1_TM_DISALLOW | A TM-ineligible instruction tries to execute inside a transaction and the LSU disallows it | 0, 1, 2, 3 | |
PM_LS1_UNALIGNED_LD | Load instructions whose data crosses a double-word boundary, which causes the load to require an additional slice beyond what would normally be required for a load of that size. If the load wraps from slice 3 to slice 0, there is an additional 3-cycle penalty | 0, 1, 2, 3 | |
PM_LS1_UNALIGNED_ST | Store instructions whose data crosses a double-word boundary, which causes the store to require an additional slice beyond what would normally be required for a store of that size. If the store wraps from slice 3 to slice 0, there is an additional 3-cycle penalty | 0, 1, 2, 3 | |
PM_LS2_DC_COLLISIONS | Read-write data cache collisions | 0, 1, 2, 3 | |
PM_LS2_ERAT_MISS_PREF | LS2 Erat miss due to prefetch | 0, 1, 2, 3 | |
PM_LS2_TM_DISALLOW | A TM-ineligible instruction tries to execute inside a transaction and the LSU disallows it | 0, 1, 2, 3 | |
PM_LS2_UNALIGNED_LD | Load instructions whose data crosses a double-word boundary, which causes the load to require an additional slice beyond what would normally be required for a load of that size. If the load wraps from slice 3 to slice 0, there is an additional 3-cycle penalty | 0, 1, 2, 3 | |
PM_LS2_UNALIGNED_ST | Store instructions whose data crosses a double-word boundary, which causes the store to require an additional slice beyond what would normally be required for a store of that size. If the store wraps from slice 3 to slice 0, there is an additional 3-cycle penalty | 0, 1, 2, 3 | |
PM_LS3_DC_COLLISIONS | Read-write data cache collisions | 0, 1, 2, 3 | |
PM_LS3_ERAT_MISS_PREF | LS3 Erat miss due to prefetch | 0, 1, 2, 3 | |
PM_LS3_TM_DISALLOW | A TM-ineligible instruction tries to execute inside a transaction and the LSU disallows it | 0, 1, 2, 3 | |
PM_LS3_UNALIGNED_LD | Load instructions whose data crosses a double-word boundary, which causes the load to require an additional slice beyond what would normally be required for a load of that size. If the load wraps from slice 3 to slice 0, there is an additional 3-cycle penalty | 0, 1, 2, 3 | |
PM_LS3_UNALIGNED_ST | Store instructions whose data crosses a double-word boundary, which causes the store to require an additional slice beyond what would normally be required for a store of that size. If the store wraps from slice 3 to slice 0, there is an additional 3-cycle penalty | 0, 1, 2, 3 | |
PM_LSU0_1_LRQF_FULL_CYC | Counts the number of cycles the LRQF is full. LRQF is the queue that holds loads between finish and completion. If it fills up, instructions stay in LRQ until completion, potentially backing up the LRQ | 0, 1, 2, 3 | |
PM_LSU0_ERAT_HIT | Primary ERAT hit. There is no secondary ERAT | 0, 1, 2, 3 | |
PM_LSU0_FALSE_LHS | False LHS match detected | 0, 1, 2, 3 | |
PM_LSU0_L1_CAM_CANCEL | ls0 l1 tm cam cancel | 0, 1, 2, 3 | |
PM_LSU0_LDMX_FIN | New P9 instruction LDMX. The definition of this new PMU event is (from the ldmx RFC02491): "The thread has executed an ldmx instruction that accessed a doubleword that contains an effective address within an enabled section of the Load Monitored region." This event, therefore, should not occur if the FSCR has disabled the load monitored facility (FSCR[52]) or disabled the EBB facility (FSCR[56]). | 0, 1, 2, 3 | |
PM_LSU0_LMQ_S0_VALID | Slot 0 of LMQ valid | 0, 1, 2, 3 | |
PM_LSU0_LRQ_S0_VALID_CYC | Slot 0 of LRQ valid | 0, 1, 2, 3 | |
PM_LSU0_SET_MPRED | Set prediction (set-p) miss. The entry was not found in the Set prediction table | 0, 1, 2, 3 | |
PM_LSU0_SRQ_S0_VALID_CYC | Slot 0 of SRQ valid | 0, 1, 2, 3 | |
PM_LSU0_STORE_REJECT | All internal store rejects cause the instruction to go back to the SRQ and go to sleep until woken up to try again after the condition has been met | 0, 1, 2, 3 | |
PM_LSU0_TM_L1_HIT | Load tm hit in L1 | 0, 1, 2, 3 | |
PM_LSU0_TM_L1_MISS | Load tm L1 miss | 0, 1, 2, 3 | |
PM_LSU1_ERAT_HIT | Primary ERAT hit. There is no secondary ERAT | 0, 1, 2, 3 | |
PM_LSU1_FALSE_LHS | False LHS match detected | 0, 1, 2, 3 | |
PM_LSU1_L1_CAM_CANCEL | ls1 l1 tm cam cancel | 0, 1, 2, 3 | |
PM_LSU1_LDMX_FIN | New P9 instruction LDMX. The definition of this new PMU event is (from the ldmx RFC02491): "The thread has executed an ldmx instruction that accessed a doubleword that contains an effective address within an enabled section of the Load Monitored region." This event, therefore, should not occur if the FSCR has disabled the load monitored facility (FSCR[52]) or disabled the EBB facility (FSCR[56]). | 0, 1, 2, 3 | |
PM_LSU1_SET_MPRED | Set prediction (set-p) miss. The entry was not found in the Set prediction table | 0, 1, 2, 3 | |
PM_LSU1_STORE_REJECT | All internal store rejects cause the instruction to go back to the SRQ and go to sleep until woken up to try again after the condition has been met | 0, 1, 2, 3 | |
PM_LSU1_TM_L1_HIT | Load tm hit in L1 | 0, 1, 2, 3 | |
PM_LSU1_TM_L1_MISS | Load tm L1 miss | 0, 1, 2, 3 | |
PM_LSU2_3_LRQF_FULL_CYC | Counts the number of cycles the LRQF is full. LRQF is the queue that holds loads between finish and completion. If it fills up, instructions stay in LRQ until completion, potentially backing up the LRQ | 0, 1, 2, 3 | |
PM_LSU2_ERAT_HIT | Primary ERAT hit. There is no secondary ERAT | 0, 1, 2, 3 | |
PM_LSU2_FALSE_LHS | False LHS match detected | 0, 1, 2, 3 | |
PM_LSU2_L1_CAM_CANCEL | ls2 l1 tm cam cancel | 0, 1, 2, 3 | |
PM_LSU2_LDMX_FIN | New P9 instruction LDMX. The definition of this new PMU event is (from the ldmx RFC02491): "The thread has executed an ldmx instruction that accessed a doubleword that contains an effective address within an enabled section of the Load Monitored region." This event, therefore, should not occur if the FSCR has disabled the load monitored facility (FSCR[52]) or disabled the EBB facility (FSCR[56]). | 0, 1, 2, 3 | |
PM_LSU2_SET_MPRED | Set prediction (set-p) miss. The entry was not found in the Set prediction table | 0, 1, 2, 3 | |
PM_LSU2_STORE_REJECT | All internal store rejects cause the instruction to go back to the SRQ and go to sleep until woken up to try again after the condition has been met | 0, 1, 2, 3 | |
PM_LSU2_TM_L1_HIT | Load tm hit in L1 | 0, 1, 2, 3 | |
PM_LSU2_TM_L1_MISS | Load tm L1 miss | 0, 1, 2, 3 | |
PM_LSU3_ERAT_HIT | Primary ERAT hit. There is no secondary ERAT | 0, 1, 2, 3 | |
PM_LSU3_FALSE_LHS | False LHS match detected | 0, 1, 2, 3 | |
PM_LSU3_L1_CAM_CANCEL | ls3 l1 tm cam cancel | 0, 1, 2, 3 | |
PM_LSU3_LDMX_FIN | New P9 instruction LDMX. The definition of this new PMU event is (from the ldmx RFC02491): "The thread has executed an ldmx instruction that accessed a doubleword that contains an effective address within an enabled section of the Load Monitored region." This event, therefore, should not occur if the FSCR has disabled the load monitored facility (FSCR[52]) or disabled the EBB facility (FSCR[56]). | 0, 1, 2, 3 | |
PM_LSU3_SET_MPRED | Set prediction (set-p) miss. The entry was not found in the Set prediction table | 0, 1, 2, 3 | |
PM_LSU3_STORE_REJECT | All internal store rejects cause the instruction to go back to the SRQ and go to sleep until woken up to try again after the condition has been met | 0, 1, 2, 3 | |
PM_LSU3_TM_L1_HIT | Load tm hit in L1 | 0, 1, 2, 3 | |
PM_LSU3_TM_L1_MISS | Load tm L1 miss | 0, 1, 2, 3 | |
PM_LSU_DERAT_MISS | DERAT Reloaded due to a DERAT miss | 1 | |
PM_LSU_FIN | LSU Finished a PPC instruction (up to 4 per cycle) | 2 | |
PM_LSU_FLUSH_ATOMIC | Quad-word loads (lq) are considered atomic because they always span at least 2 slices. If a snoop or store from another thread changes the data the load is accessing between the 2 or 3 pieces of the lq instruction, the lq will be flushed | 0, 1, 2, 3 | |
PM_LSU_FLUSH_CI | Load was not issued to LSU as a cache inhibited (non-cacheable) load but it was later determined to be cache inhibited | 0, 1, 2, 3 | |
PM_LSU_FLUSH_EMSH | An ERAT miss was detected after a set-p hit. Erat tracker indicates fail due to tlbmiss and the instruction gets flushed because the instruction was working on the wrong address | 0, 1, 2, 3 | |
PM_LSU_FLUSH_LARX_STCX | A larx is flushed because an older larx has an LMQ reservation for the same thread. A stcx is flushed because an older stcx is in the LMQ. The flush happens when the older larx/stcx relaunches | 0, 1, 2, 3 | |
PM_LSU_FLUSH_LHL_SHL | The instruction was flushed because of a sequential load/store consistency violation: a load or store hit on an older load that has either been snooped (for loads) or has stale data (for stores). | 0, 1, 2, 3 | |
PM_LSU_FLUSH_LHS | Effective Address alias flush: no EA match but Real Address match. If the data has not yet been returned for this load, the instruction will just be rejected, but if it has returned data, it will be flushed | 0, 1, 2, 3 | |
PM_LSU_FLUSH_NEXT | LSU flush next reported at flush time. Sometimes these also come with an exception | 0, 1, 2, 3 | |
PM_LSU_FLUSH_OTHER | Other LSU flushes, including: Sync (sync ack from L2 caused a search of the LRQ for the oldest snooped load; this will either signal a Precise Flush of the oldest snooped load or a Flush Next PPC) | 0, 1, 2, 3 | |
PM_LSU_FLUSH_RELAUNCH_MISS | If a load that has already returned data has to relaunch for any reason and then gets a miss (erat, setp, data cache), it will often be flushed at relaunch time because the data might be inconsistent | 0, 1, 2, 3 | |
PM_LSU_FLUSH_SAO | A load-hit-load condition with Strong Address Ordering will have address compare disabled and flush | 0, 1, 2, 3 | |
PM_LSU_FLUSH_UE | Correctable ECC error on reload data, reported at critical data forward time | 0, 1, 2, 3 | |
PM_LSU_FLUSH_WRK_ARND | LSU workaround flush. These flushes are set up with programmable scan-only latches to perform various actions when the flush macro receives a trigger from the dbg macros. These actions include things like flushing the next op encountered for a particular thread or flushing the next NTC op encountered on a particular slice. The kind of flush that the workaround is set up to perform is highly variable. | 0, 1, 2, 3 | |
PM_LSU_LMQ_FULL_CYC | Counts the number of cycles the LMQ is full | 0, 1, 2, 3 | |
PM_LSU_LMQ_SRQ_EMPTY_CYC | Cycles in which the LSU is empty for all threads (lmq and srq are completely empty) | 1 | |
PM_LSU_NCST | Asserts when an i=1 store op is sent to the nest. No record of the issue pipe (LS0/LS1) is maintained, so this covers both pipes; separate LS0 and LS1 counts are probably unnecessary | 0, 1, 2, 3 | |
PM_LSU_REJECT_ERAT_MISS | LSU Reject due to ERAT (up to 4 per cycle) | 1 | |
PM_LSU_REJECT_LHS | LSU Reject due to LHS (up to 4 per cycle) | 3 | |
PM_LSU_REJECT_LMQ_FULL | LSU Reject due to LMQ full (up to 4 per cycle) | 2 | |
PM_LSU_SRQ_FULL_CYC | Cycles in which the Store Queue is full on all 4 slices. This event is not per thread; all the threads will see the same count for this core resource | 0 | |
PM_LSU_STCX | STCX sent to nest, i.e. total | 0, 1, 2, 3 | |
PM_LSU_STCX_FAIL | STCX instructions that failed | 0, 1, 2, 3 | |
PM_LWSYNC | Lwsync instruction decoded and transferred | 0, 1, 2, 3 | |
PM_MATH_FLOP_CMPL | Math flop instruction completed | 3 | |
PM_MEM_CO | Memory castouts from this thread | 3 | |
PM_MEM_LOC_THRESH_IFU | Local Memory above threshold for IFU speculation control | 0 | |
PM_MEM_LOC_THRESH_LSU_HIGH | Local memory above threshold for LSU medium | 3 | |
PM_MEM_LOC_THRESH_LSU_MED | Local memory above threshold for data prefetch | 0 | |
PM_MEM_PREF | Memory prefetch for this thread. Includes L4 | 1 | |
PM_MEM_READ | Reads from Memory from this thread (includes data/inst/xlate/l1prefetch/inst prefetch). Includes L4 | 0 | |
PM_MEM_RWITM | Memory Read With Intent to Modify for this thread | 2 | |
PM_MRK_BACK_BR_CMPL | Marked branch instruction completed with a target address less than current instruction address | 2 | |
PM_MRK_BR_2PATH | marked branches which are not strongly biased | 0 | |
PM_MRK_BR_CMPL | Branch Instruction completed | 0 | |
PM_MRK_BR_MPRED_CMPL | Marked Branch Mispredicted | 2 | |
PM_MRK_BR_TAKEN_CMPL | Marked Branch Taken completed | 0 | |
PM_MRK_BRU_FIN | bru marked instr finish | 1 | |
PM_MRK_DATA_FROM_DL2L3_MOD | The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant) from this chip due to a marked load | 2 | |
PM_MRK_DATA_FROM_DL2L3_MOD_CYC | Duration in cycles to reload with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant) from this chip due to a marked load | 3 | |
PM_MRK_DATA_FROM_DL2L3_SHR | The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant) from this chip due to a marked load | 0 | |
PM_MRK_DATA_FROM_DL2L3_SHR_CYC | Duration in cycles to reload with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant) from this chip due to a marked load | 1 | |
PM_MRK_DATA_FROM_DL4 | The processor's data cache was reloaded from another chip's L4 on a different Node or Group (Distant) due to a marked load | 0 | |
PM_MRK_DATA_FROM_DL4_CYC | Duration in cycles to reload from another chip's L4 on a different Node or Group (Distant) due to a marked load | 1 | |
PM_MRK_DATA_FROM_DMEM | The processor's data cache was reloaded from another chip's memory on a different Node or Group (Distant) due to a marked load | 2 | |
PM_MRK_DATA_FROM_DMEM_CYC | Duration in cycles to reload from another chip's memory on a different Node or Group (Distant) due to a marked load | 3 | |
PM_MRK_DATA_FROM_L2 | The processor's data cache was reloaded from local core's L2 due to a marked load | 1 | |
PM_MRK_DATA_FROM_L21_MOD | The processor's data cache was reloaded with Modified (M) data from another core's L2 on the same chip due to a marked load | 3 | |
PM_MRK_DATA_FROM_L21_MOD_CYC | Duration in cycles to reload with Modified (M) data from another core's L2 on the same chip due to a marked load | 2 | |
PM_MRK_DATA_FROM_L21_SHR | The processor's data cache was reloaded with Shared (S) data from another core's L2 on the same chip due to a marked load | 1 | |
PM_MRK_DATA_FROM_L21_SHR_CYC | Duration in cycles to reload with Shared (S) data from another core's L2 on the same chip due to a marked load | 0 | |
PM_MRK_DATA_FROM_L2_CYC | Duration in cycles to reload from local core's L2 due to a marked load | 0 | |
PM_MRK_DATA_FROM_L2_DISP_CONFLICT_LDHITST | The processor's data cache was reloaded from local core's L2 with load hit store conflict due to a marked load | 1 | |
PM_MRK_DATA_FROM_L2_DISP_CONFLICT_LDHITST_CYC | Duration in cycles to reload from local core's L2 with load hit store conflict due to a marked load | 0 | |
PM_MRK_DATA_FROM_L2_DISP_CONFLICT_OTHER | The processor's data cache was reloaded from local core's L2 with dispatch conflict due to a marked load | 1 | |
PM_MRK_DATA_FROM_L2_DISP_CONFLICT_OTHER_CYC | Duration in cycles to reload from local core's L2 with dispatch conflict due to a marked load | 2 | |
PM_MRK_DATA_FROM_L2_MEPF | The processor's data cache was reloaded from local core's L2 hit without dispatch conflicts in Mepf state due to a marked load | 3 | |
PM_MRK_DATA_FROM_L2_MEPF_CYC | Duration in cycles to reload from local core's L2 hit without dispatch conflicts in Mepf state due to a marked load | 2 | |
PM_MRK_DATA_FROM_L2MISS | The processor's data cache was reloaded from a location other than the local core's L2 due to a marked load | 3 | |
PM_MRK_DATA_FROM_L2MISS_CYC | Duration in cycles to reload from a location other than the local core's L2 due to a marked load | 2 | |
PM_MRK_DATA_FROM_L2_NO_CONFLICT | The processor's data cache was reloaded from local core's L2 without conflict due to a marked load | 1 | |
PM_MRK_DATA_FROM_L2_NO_CONFLICT_CYC | Duration in cycles to reload from local core's L2 without conflict due to a marked load | 0 | |
PM_MRK_DATA_FROM_L3 | The processor's data cache was reloaded from local core's L3 due to a marked load | 3 | |
PM_MRK_DATA_FROM_L31_ECO_MOD | The processor's data cache was reloaded with Modified (M) data from another core's ECO L3 on the same chip due to a marked load | 3 | |
PM_MRK_DATA_FROM_L31_ECO_MOD_CYC | Duration in cycles to reload with Modified (M) data from another core's ECO L3 on the same chip due to a marked load | 2 | |
PM_MRK_DATA_FROM_L31_ECO_SHR | The processor's data cache was reloaded with Shared (S) data from another core's ECO L3 on the same chip due to a marked load | 1 | |
PM_MRK_DATA_FROM_L31_ECO_SHR_CYC | Duration in cycles to reload with Shared (S) data from another core's ECO L3 on the same chip due to a marked load | 0 | |
PM_MRK_DATA_FROM_L31_MOD | The processor's data cache was reloaded with Modified (M) data from another core's L3 on the same chip due to a marked load | 1 | |
PM_MRK_DATA_FROM_L31_MOD_CYC | Duration in cycles to reload with Modified (M) data from another core's L3 on the same chip due to a marked load | 0 | |
PM_MRK_DATA_FROM_L31_SHR | The processor's data cache was reloaded with Shared (S) data from another core's L3 on the same chip due to a marked load | 3 | |
PM_MRK_DATA_FROM_L31_SHR_CYC | Duration in cycles to reload with Shared (S) data from another core's L3 on the same chip due to a marked load | 2 | |
PM_MRK_DATA_FROM_L3_CYC | Duration in cycles to reload from local core's L3 due to a marked load | 2 | |
PM_MRK_DATA_FROM_L3_DISP_CONFLICT | The processor's data cache was reloaded from local core's L3 with dispatch conflict due to a marked load | 0 | |
PM_MRK_DATA_FROM_L3_DISP_CONFLICT_CYC | Duration in cycles to reload from local core's L3 with dispatch conflict due to a marked load | 1 | |
PM_MRK_DATA_FROM_L3_MEPF | The processor's data cache was reloaded from local core's L3 without dispatch conflicts, hit in Mepf state, due to a marked load | 1 | |
PM_MRK_DATA_FROM_L3_MEPF_CYC | Duration in cycles to reload from local core's L3 without dispatch conflicts, hit in Mepf state, due to a marked load | 0 | |
PM_MRK_DATA_FROM_L3MISS | The processor's data cache was reloaded from a location other than the local core's L3 due to a marked load | 1 | |
PM_MRK_DATA_FROM_L3MISS_CYC | Duration in cycles to reload from a location other than the local core's L3 due to a marked load | 0 | |
PM_MRK_DATA_FROM_L3_NO_CONFLICT | The processor's data cache was reloaded from local core's L3 without conflict due to a marked load | 2 | |
PM_MRK_DATA_FROM_L3_NO_CONFLICT_CYC | Duration in cycles to reload from local core's L3 without conflict due to a marked load | 3 | |
PM_MRK_DATA_FROM_LL4 | The processor's data cache was reloaded from the local chip's L4 cache due to a marked load | 0 | |
PM_MRK_DATA_FROM_LL4_CYC | Duration in cycles to reload from the local chip's L4 cache due to a marked load | 1 | |
PM_MRK_DATA_FROM_LMEM | The processor's data cache was reloaded from the local chip's Memory due to a marked load | 2 | |
PM_MRK_DATA_FROM_LMEM_CYC | Duration in cycles to reload from the local chip's Memory due to a marked load | 3 | |
PM_MRK_DATA_FROM_MEMORY | The processor's data cache was reloaded from a memory location including L4 from local remote or distant due to a marked load | 1 | |
PM_MRK_DATA_FROM_MEMORY_CYC | Duration in cycles to reload from a memory location including L4 from local remote or distant due to a marked load | 0 | |
PM_MRK_DATA_FROM_OFF_CHIP_CACHE | The processor's data cache was reloaded with either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a marked load | 1 | |
PM_MRK_DATA_FROM_OFF_CHIP_CACHE_CYC | Duration in cycles to reload with either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a marked load | 0 | |
PM_MRK_DATA_FROM_ON_CHIP_CACHE | The processor's data cache was reloaded with either shared or modified data from another core's L2/L3 on the same chip due to a marked load | 3 | |
PM_MRK_DATA_FROM_ON_CHIP_CACHE_CYC | Duration in cycles to reload with either shared or modified data from another core's L2/L3 on the same chip due to a marked load | 2 | |
PM_MRK_DATA_FROM_RL2L3_MOD | The processor's data cache was reloaded with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote) as this chip due to a marked load | 0 | |
PM_MRK_DATA_FROM_RL2L3_MOD_CYC | Duration in cycles to reload with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote) as this chip due to a marked load | 1 | |
PM_MRK_DATA_FROM_RL2L3_SHR | The processor's data cache was reloaded with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote) as this chip due to a marked load | 2 | |
PM_MRK_DATA_FROM_RL2L3_SHR_CYC | Duration in cycles to reload with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote) as this chip due to a marked load | 3 | |
PM_MRK_DATA_FROM_RL4 | The processor's data cache was reloaded from another chip's L4 on the same Node or Group (Remote) due to a marked load | 2 | |
PM_MRK_DATA_FROM_RL4_CYC | Duration in cycles to reload from another chip's L4 on the same Node or Group (Remote) due to a marked load | 3 | |
PM_MRK_DATA_FROM_RMEM | The processor's data cache was reloaded from another chip's memory on the same Node or Group (Remote) due to a marked load | 0 | |
PM_MRK_DATA_FROM_RMEM_CYC | Duration in cycles to reload from another chip's memory on the same Node or Group (Remote) due to a marked load | 1 | |
PM_MRK_DCACHE_RELOAD_INTV | Combined Intervention event | 3 | |
PM_MRK_DERAT_MISS | Erat Miss (TLB Access) All page sizes | 2 | |
PM_MRK_DERAT_MISS_16G | Marked Data ERAT Miss (Data TLB Access) page size 16G | 3 | |
PM_MRK_DERAT_MISS_16M | Marked Data ERAT Miss (Data TLB Access) page size 16M | 2 | |
PM_MRK_DERAT_MISS_1G | Marked Data ERAT Miss (Data TLB Access) page size 1G. Implies radix translation | 2 | |
PM_MRK_DERAT_MISS_2M | Marked Data ERAT Miss (Data TLB Access) page size 2M. Implies radix translation | 1 | |
PM_MRK_DERAT_MISS_4K | Marked Data ERAT Miss (Data TLB Access) page size 4K | 1 | |
PM_MRK_DERAT_MISS_64K | Marked Data ERAT Miss (Data TLB Access) page size 64K | 1 | |
PM_MRK_DFU_FIN | Decimal Unit marked Instruction Finish | 1 | |
PM_MRK_DPTEG_FROM_DL2L3_MOD | A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on a different Node or Group (Distant) from this chip due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 3 | |
PM_MRK_DPTEG_FROM_DL2L3_SHR | A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on a different Node or Group (Distant) from this chip due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 2 | |
PM_MRK_DPTEG_FROM_DL4 | A Page Table Entry was loaded into the TLB from another chip's L4 on a different Node or Group (Distant) due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 2 | |
PM_MRK_DPTEG_FROM_DMEM | A Page Table Entry was loaded into the TLB from another chip's memory on a different Node or Group (Distant) due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 3 | |
PM_MRK_DPTEG_FROM_L2 | A Page Table Entry was loaded into the TLB from local core's L2 due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 0 | |
PM_MRK_DPTEG_FROM_L21_MOD | A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L2 on the same chip due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 3 | |
PM_MRK_DPTEG_FROM_L21_SHR | A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L2 on the same chip due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 2 | |
PM_MRK_DPTEG_FROM_L2_MEPF | A Page Table Entry was loaded into the TLB from local core's L2 hit without dispatch conflicts in Mepf state due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 1 | |
PM_MRK_DPTEG_FROM_L2MISS | A Page Table Entry was loaded into the TLB from a location other than the local core's L2 due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 0 | |
PM_MRK_DPTEG_FROM_L2_NO_CONFLICT | A Page Table Entry was loaded into the TLB from local core's L2 without conflict due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 0 | |
PM_MRK_DPTEG_FROM_L3 | A Page Table Entry was loaded into the TLB from local core's L3 due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 3 | |
PM_MRK_DPTEG_FROM_L31_ECO_MOD | A Page Table Entry was loaded into the TLB with Modified (M) data from another core's ECO L3 on the same chip due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 3 | |
PM_MRK_DPTEG_FROM_L31_ECO_SHR | A Page Table Entry was loaded into the TLB with Shared (S) data from another core's ECO L3 on the same chip due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 2 | |
PM_MRK_DPTEG_FROM_L31_MOD | A Page Table Entry was loaded into the TLB with Modified (M) data from another core's L3 on the same chip due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 1 | |
PM_MRK_DPTEG_FROM_L31_SHR | A Page Table Entry was loaded into the TLB with Shared (S) data from another core's L3 on the same chip due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 0 | |
PM_MRK_DPTEG_FROM_L3_DISP_CONFLICT | A Page Table Entry was loaded into the TLB from local core's L3 with dispatch conflict due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 2 | |
PM_MRK_DPTEG_FROM_L3_MEPF | A Page Table Entry was loaded into the TLB from local core's L3 without dispatch conflicts, hit in Mepf state, due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 1 | |
PM_MRK_DPTEG_FROM_L3MISS | A Page Table Entry was loaded into the TLB from a location other than the local core's L3 due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 3 | |
PM_MRK_DPTEG_FROM_L3_NO_CONFLICT | A Page Table Entry was loaded into the TLB from local core's L3 without conflict due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 0 | |
PM_MRK_DPTEG_FROM_LL4 | A Page Table Entry was loaded into the TLB from the local chip's L4 cache due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 0 | |
PM_MRK_DPTEG_FROM_LMEM | A Page Table Entry was loaded into the TLB from the local chip's Memory due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 1 | |
PM_MRK_DPTEG_FROM_MEMORY | A Page Table Entry was loaded into the TLB from a memory location including L4 from local remote or distant due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 1 | |
PM_MRK_DPTEG_FROM_OFF_CHIP_CACHE | A Page Table Entry was loaded into the TLB with either shared or modified data from another core's L2/L3 on a different chip (remote or distant) due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 3 | |
PM_MRK_DPTEG_FROM_ON_CHIP_CACHE | A Page Table Entry was loaded into the TLB with either shared or modified data from another core's L2/L3 on the same chip due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 0 | |
PM_MRK_DPTEG_FROM_RL2L3_MOD | A Page Table Entry was loaded into the TLB with Modified (M) data from another chip's L2 or L3 on the same Node or Group (Remote) as this chip due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 1 | |
PM_MRK_DPTEG_FROM_RL2L3_SHR | A Page Table Entry was loaded into the TLB with Shared (S) data from another chip's L2 or L3 on the same Node or Group (Remote) as this chip due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 0 | |
PM_MRK_DPTEG_FROM_RL4 | A Page Table Entry was loaded into the TLB from another chip's L4 on the same Node or Group (Remote) due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 1 | |
PM_MRK_DPTEG_FROM_RMEM | A Page Table Entry was loaded into the TLB from another chip's memory on the same Node or Group (Remote) due to a marked data side request. When using Radix Page Translation, this count excludes PDE reloads. Only PTE reloads are included | 2 | |
PM_MRK_DTLB_MISS | Marked dtlb miss | 3 | |
PM_MRK_DTLB_MISS_16G | Marked Data TLB Miss page size 16G | 1 | |
PM_MRK_DTLB_MISS_16M | Marked Data TLB Miss page size 16M | 3 | |
PM_MRK_DTLB_MISS_1G | Marked Data TLB reload (after a miss) page size 1G. Implies radix translation was used | 0 | |
PM_MRK_DTLB_MISS_4K | Marked Data TLB Miss page size 4k | 1 | |
PM_MRK_DTLB_MISS_64K | Marked Data TLB Miss page size 64K | 2 | |
PM_MRK_FAB_RSP_BKILL | Marked store had to do a bkill | 3 | |
PM_MRK_FAB_RSP_BKILL_CYC | cycles L2 RC took for a bkill | 0 | |
PM_MRK_FAB_RSP_CLAIM_RTY | Sampled store did a dclaim and got a rty | 2 | |
PM_MRK_FAB_RSP_DCLAIM | Marked store had to do a dclaim | 2 | |
PM_MRK_FAB_RSP_DCLAIM_CYC | cycles L2 RC took for a dclaim | 1 | |
PM_MRK_FAB_RSP_RD_RTY | Sampled L2 reads retry count | 3 | |
PM_MRK_FAB_RSP_RD_T_INTV | Sampled Read got a T intervention | 0 | |
PM_MRK_FAB_RSP_RWITM_CYC | cycles L2 RC took for a rwitm | 3 | |
PM_MRK_FAB_RSP_RWITM_RTY | Sampled store did a rwitm and got a rty | 1 | |
PM_MRK_FXU_FIN | fxu marked instr finish | 1 | |
PM_MRK_IC_MISS | Marked instruction experienced I cache miss | 3 | |
PM_MRK_INST | An instruction was marked. Includes both Random Instruction Sampling (RIS) at decode time and Random Event Sampling (RES) at the time the configured event happens | 1 | |
PM_MRK_INST_CMPL | marked instruction completed | 3 | |
PM_MRK_INST_DECODED | An instruction was marked at decode time. Random Instruction Sampling (RIS) only | 1 | |
PM_MRK_INST_DISP | The thread has dispatched a randomly sampled marked instruction | 0 | |
PM_MRK_INST_FIN | marked instruction finished | 2 | |
PM_MRK_INST_FROM_L3MISS | Marked instruction was reloaded from a location beyond the local chiplet | 3 | |
PM_MRK_INST_ISSUED | Marked instruction issued | 0 | |
PM_MRK_INST_TIMEO | marked Instruction finish timeout (instruction lost) | 3 | |
PM_MRK_L1_ICACHE_MISS | sampled Instruction suffered an icache Miss | 0 | |
PM_MRK_L1_RELOAD_VALID | Marked demand reload | 0 | |
PM_MRK_L2_RC_DISP | Marked Instruction RC dispatched in L2 | 1 | |
PM_MRK_L2_RC_DONE | Marked RC done | 2 | |
PM_MRK_L2_TM_REQ_ABORT | TM abort | 0 | |
PM_MRK_L2_TM_ST_ABORT_SISTER | TM marked store abort for this thread | 2 | |
PM_MRK_LARX_FIN | Larx finished | 3 | |
PM_MRK_LD_MISS_EXPOSED_CYC | Marked Load exposed Miss (use edge detect to count #) | 0 | |
PM_MRK_LD_MISS_L1 | Marked DL1 Demand Miss counted at exec time. Note that this count is per slice, so if a load spans multiple slices this event will increment multiple times for a single load. | 1 | |
PM_MRK_LD_MISS_L1_CYC | Marked ld latency | 0 | |
PM_MRK_LSU_DERAT_MISS | Marked derat reload (miss) for any page size | 2 | |
PM_MRK_LSU_FIN | lsu marked instr PPC finish | 3 | |
PM_MRK_LSU_FLUSH_ATOMIC | Quad-word loads (lq) are considered atomic because they always span at least 2 slices. If a snoop or store from another thread changes the data the load is accessing between the 2 or 3 pieces of the lq instruction, the lq will be flushed | 0, 1, 2, 3 | |
PM_MRK_LSU_FLUSH_EMSH | An ERAT miss was detected after a set-p hit. Erat tracker indicates fail due to tlbmiss and the instruction gets flushed because the instruction was working on the wrong address | 0, 1, 2, 3 | |
PM_MRK_LSU_FLUSH_LARX_STCX | A larx is flushed because an older larx has an LMQ reservation for the same thread. A stcx is flushed because an older stcx is in the LMQ. The flush happens when the older larx/stcx relaunches | 0, 1, 2, 3 | |
PM_MRK_LSU_FLUSH_LHL_SHL | The instruction was flushed because of a sequential load/store consistency violation: a load or store hit on an older load that has either been snooped (for loads) or has stale data (for stores). | 0, 1, 2, 3 | |
PM_MRK_LSU_FLUSH_LHS | Effective Address alias flush: no EA match but Real Address match. If the data has not yet been returned for this load, the instruction will just be rejected, but if it has returned data, it will be flushed | 0, 1, 2, 3 | |
PM_MRK_LSU_FLUSH_RELAUNCH_MISS | If a load that has already returned data has to relaunch for any reason and then gets a miss (erat, setp, data cache), it will often be flushed at relaunch time because the data might be inconsistent | 0, 1, 2, 3 | |
PM_MRK_LSU_FLUSH_SAO | A load-hit-load condition with Strong Address Ordering will have address compare disabled and flush | 0, 1, 2, 3 | |
PM_MRK_LSU_FLUSH_UE | Correctable ECC error on reload data, reported at critical data forward time | 0, 1, 2, 3 | |
PM_MRK_NTC_CYC | Cycles during which the marked instruction is next to complete (completion is held up because the marked instruction hasn't completed yet) | 1 | |
PM_MRK_NTF_FIN | Marked next to finish instruction finished | 1 | |
PM_MRK_PROBE_NOP_CMPL | Marked probeNops completed | 0 | |
PM_MRK_RUN_CYC | Run cycles in which a marked instruction is in the pipeline | 0 | |
PM_MRK_STALL_CMPLU_CYC | Number of cycles the marked instruction is experiencing a stall while it is next to complete (NTC) | 2 | |
PM_MRK_ST_CMPL | Marked store completed and sent to nest | 2 | |
PM_MRK_ST_CMPL_INT | marked store finished with intervention | 2 | |
PM_MRK_STCX_FAIL | marked stcx failed | 2 | |
PM_MRK_STCX_FIN | Number of marked stcx instructions finished. This includes instructions in the speculative path of a branch that may be flushed | 1 | |
PM_MRK_ST_DONE_L2 | marked store completed in L2 ( RC machine done) | 0 | |
PM_MRK_ST_DRAIN_TO_L2DISP_CYC | cycles to drain st from core to L2 | 2 | |
PM_MRK_ST_FWD | Marked st forwards | 2 | |
PM_MRK_ST_L2DISP_TO_CMPL_CYC | cycles from L2 RC dispatch to L2 RC completion | 0 | |
PM_MRK_ST_NEST | Marked store sent to nest | 1 | |
PM_MRK_TEND_FAIL | Nested or not nested tend failed for a marked tend instruction | 0, 1, 2, 3 | |
PM_MRK_VSU_FIN | VSU marked instr finish | 2 | |
PM_MULT_MRK | mult marked instr | 2 | |
PM_NEST_REF_CLK | Multiply by 4 to obtain the number of PB cycles | 2 | |
PM_NON_DATA_STORE | All ops that drain from s2q to L2 and contain no data | 0, 1, 2, 3 | |
PM_NON_FMA_FLOP_CMPL | Non-FMA instruction completed | 3 | |
PM_NON_MATH_FLOP_CMPL | Non-FLOP operation completed | 3 | |
PM_NON_TM_RST_SC | Non-TM snp rst TM SC | 1 | |
PM_NTC_ALL_FIN | Cycles from when all instructions have finished until the group completed | 1 | |
PM_NTC_FIN | Cycles in which the oldest instruction in the pipeline (NTC) finishes. This event is used to account for cycles in which work is being completed in the CPI stack | 1 | |
PM_NTC_ISSUE_HELD_ARB | The NTC instruction is being held at dispatch because it lost arbitration onto the issue pipe to another instruction (from the same thread or a different thread) | 1 | |
PM_NTC_ISSUE_HELD_DARQ_FULL | The NTC instruction is being held at dispatch because there are no slots in the DARQ for it | 0 | |
PM_NTC_ISSUE_HELD_OTHER | The NTC instruction is being held at dispatch during regular pipeline cycles, or because the VSU is busy with multi-cycle instructions, or because of a write-back collision with VSU | 2 | |
PM_PARTIAL_ST_FIN | Any store finished by an LSU slice | 2 | |
PM_PMC1_OVERFLOW | Overflow from counter 1 | 1 | |
PM_PMC1_REWIND | PMC1 rewind event | 3 | |
PM_PMC1_SAVED | PMC1 Rewind Value saved | 3 | |
PM_PMC2_OVERFLOW | Overflow from counter 2 | 2 | |
PM_PMC2_REWIND | PMC2 Rewind Event (did not match condition) | 2 | |
PM_PMC2_SAVED | PMC2 Rewind Value saved | 0 | |
PM_PMC3_OVERFLOW | Overflow from counter 3 | 3 | |
PM_PMC3_REWIND | PMC3 rewind event. A rewind happens when a speculative event (such as latency or CPI stack) is selected on PMC3 and the stall reason or reload source did not match the one programmed in PMC3. When this occurs, the count in PMC3 will not change. | 0 | |
PM_PMC3_SAVED | PMC3 Rewind Value saved | 3 | |
PM_PMC4_OVERFLOW | Overflow from counter 4 | 0 | |
PM_PMC4_REWIND | PMC4 Rewind Event | 0 | |
PM_PMC4_SAVED | PMC4 Rewind Value saved (matched condition) | 2 | |
PM_PMC5_OVERFLOW | Overflow from counter 5 | 0 | |
PM_PMC6_OVERFLOW | Overflow from counter 6 | 2 | |
PM_PROBE_NOP_DISP | ProbeNops dispatched | 3 | |
PM_PTE_PREFETCH | PTE prefetches | 0, 1, 2, 3 | |
PM_PTESYNC | ptesync instruction counted when the instruction is decoded and transmitted | 0, 1, 2, 3 | |
PM_PUMP_CPRED | Pump prediction correct. Counts across all types of pumps for all data types excluding data prefetch (demand load, inst prefetch, inst fetch, xlate) | 0 | |
PM_PUMP_MPRED | Pump misprediction. Counts across all types of pumps for all data types excluding data prefetch (demand load, inst prefetch, inst fetch, xlate) | 3 | |
PM_RADIX_PWC_L1_HIT | A radix translation attempt missed in the TLB and only the first level page walk cache was a hit. | 0 | |
PM_RADIX_PWC_L1_PDE_FROM_L2 | A Page Directory Entry was reloaded to a level 1 page walk cache from the core's L2 data cache | 1 | |
PM_RADIX_PWC_L1_PDE_FROM_L3 | A Page Directory Entry was reloaded to a level 1 page walk cache from the core's L3 data cache | 2 | |
PM_RADIX_PWC_L1_PDE_FROM_L3MISS | A Page Directory Entry was reloaded to a level 1 page walk cache from beyond the core's L3 data cache. The source could be local/remote/distant memory or another core's cache | 3 | |
PM_RADIX_PWC_L2_HIT | A radix translation attempt missed in the TLB but hit on both the first and second levels of page walk cache. | 1 | |
PM_RADIX_PWC_L2_PDE_FROM_L2 | A Page Directory Entry was reloaded to a level 2 page walk cache from the core's L2 data cache | 1 | |
PM_RADIX_PWC_L2_PDE_FROM_L3 | A Page Directory Entry was reloaded to a level 2 page walk cache from the core's L3 data cache | 2 | |
PM_RADIX_PWC_L2_PTE_FROM_L2 | A Page Table Entry was reloaded to a level 2 page walk cache from the core's L2 data cache. This implies that level 3 and level 4 PWC accesses were not necessary for this translation | 0 | |
PM_RADIX_PWC_L2_PTE_FROM_L3 | A Page Table Entry was reloaded to a level 2 page walk cache from the core's L3 data cache. This implies that level 3 and level 4 PWC accesses were not necessary for this translation | 3 | |
PM_RADIX_PWC_L2_PTE_FROM_L3MISS | A Page Table Entry was reloaded to a level 2 page walk cache from beyond the core's L3 data cache. This implies that level 3 and level 4 PWC accesses were not necessary for this translation. The source could be local/remote/distant memory or another core's cache | 3 | |
PM_RADIX_PWC_L3_HIT | A radix translation attempt missed in the TLB but hit on the first, second, and third levels of page walk cache. | 2 | |
PM_RADIX_PWC_L3_PDE_FROM_L2 | A Page Directory Entry was reloaded to a level 3 page walk cache from the core's L2 data cache | 1 | |
PM_RADIX_PWC_L3_PDE_FROM_L3 | A Page Directory Entry was reloaded to a level 3 page walk cache from the core's L3 data cache | 0 | |
PM_RADIX_PWC_L3_PTE_FROM_L2 | A Page Table Entry was reloaded to a level 3 page walk cache from the core's L2 data cache. This implies that a level 4 PWC access was not necessary for this translation | 1 | |
PM_RADIX_PWC_L3_PTE_FROM_L3 | A Page Table Entry was reloaded to a level 3 page walk cache from the core's L3 data cache. This implies that a level 4 PWC access was not necessary for this translation | 2 | |
PM_RADIX_PWC_L3_PTE_FROM_L3MISS | A Page Table Entry was reloaded to a level 3 page walk cache from beyond the core's L3 data cache. This implies that a level 4 PWC access was not necessary for this translation. The source could be local/remote/distant memory or another core's cache | 3 | |
PM_RADIX_PWC_L4_PTE_FROM_L2 | A Page Table Entry was reloaded to a level 4 page walk cache from the core's L2 data cache. This is the deepest level of PWC possible for a translation | 0 | |
PM_RADIX_PWC_L4_PTE_FROM_L3 | A Page Table Entry was reloaded to a level 4 page walk cache from the core's L3 data cache. This is the deepest level of PWC possible for a translation | 3 | |
PM_RADIX_PWC_L4_PTE_FROM_L3MISS | A Page Table Entry was reloaded to a level 4 page walk cache from beyond the core's L3 data cache. This is the deepest level of PWC possible for a translation. The source could be local/remote/distant memory or another core's cache | 2 | |
PM_RADIX_PWC_MISS | A radix translation attempt missed in the TLB and all levels of page walk cache. | 3 | |
PM_RC0_BUSY | RC mach 0 Busy. Used by PMU to sample average RC lifetime (mach0 used as sample point) | 0 | |
PM_RC0_BUSY | RC mach 0 Busy. Used by PMU to sample average RC lifetime (mach0 used as sample point) | 1 | |
PM_RC_USAGE | Continuous 16-cycle (2-to-1) window in which this signal rotates through sampling each RC machine's busy state. The PMU uses this wave to do a 16-cycle count and sample the total number of machines running | 0 | |
PM_RD_CLEARING_SC | Read clearing SC | 3 | |
PM_RD_FORMING_SC | Read forming SC | 3 | |
PM_RD_HIT_PF | RD machine hit L3 PF machine | 1 | |
PM_RUN_CYC | Run_cycles | 1 | |
PM_RUN_CYC_SMT2_MODE | Cycles in which this thread's run latch is set and the core is in SMT2 mode | 2 | |
PM_RUN_CYC_SMT4_MODE | Cycles in which this thread's run latch is set and the core is in SMT4 mode | 1 | |
PM_RUN_CYC_ST_MODE | Cycles in which this thread's run latch is set and the core is in ST mode | 0 | |
PM_RUN_INST_CMPL | Run_Instructions | 3 | |
PM_RUN_PURR | Run_PURR | 3 | |
PM_RUN_SPURR | Run SPURR | 0 | |
PM_S2Q_FULL | Cycles during which the S2Q is full | 0, 1, 2, 3 | |
PM_SCALAR_FLOP_CMPL | Scalar flop operation completed | 3 | |
PM_SHL_CREATED | Store-Hit-Load Table Entry Created | 0, 1, 2, 3 | |
PM_SHL_ST_DEP_CREATED | Store-Hit-Load Table Read Hit with entry Enabled | 0, 1, 2, 3 | |
PM_SHL_ST_DISABLE | Store-Hit-Load Table Read Hit with entry Disabled (the entry was disabled because it was shown not to prevent the flush) | 0, 1, 2, 3 | |
PM_SLB_TABLEWALK_CYC | Cycles when a tablewalk is pending on this thread on the SLB table | 0, 1, 2, 3 | |
PM_SN0_BUSY | SN mach 0 Busy. Used by PMU to sample average SN lifetime (mach0 used as sample point) | 0 | |
PM_SN0_BUSY | SN mach 0 Busy. Used by PMU to sample average SN lifetime (mach0 used as sample point) | 1 | |
PM_SN_HIT | Any port snooper hit L3. Up to 4 can happen in a cycle, but only 1 is counted | 3 | |
PM_SN_INVL | Any port snooper detects a store to a line in the Sx state and invalidates the line. Up to 4 can happen in a cycle, but only 1 is counted | 2 | |
PM_SN_MISS | Any port snooper L3 miss or collision. Up to 4 can happen in a cycle, but only 1 is counted | 3 | |
PM_SNOOP_TLBIE | TLBIE snoop | 0, 1, 2, 3 | |
PM_SNP_TM_HIT_M | Snooped TM store hit a line in the M/Mu state | 2 | |
PM_SNP_TM_HIT_T | Snooped TM store hit a line in the T/Tn/Te state | 2 | |
PM_SN_USAGE | Continuous 16-cycle (2-to-1) window in which this signal rotates through sampling each SN machine's busy state. The PMU uses this wave to do a 16-cycle count and sample the total number of machines running | 2 | |
PM_SPACEHOLDER_0x0000040062 | SPACE_HOLDER for event 0x0000040062 | 3 | |
PM_SPACEHOLDER_0x0000040064 | SPACE_HOLDER for event 0x0000040064 | 3 | |
PM_SP_FLOP_CMPL | Single-precision FP instruction completed | 3 | |
PM_SRQ_EMPTY_CYC | Cycles in which the SRQ has at least one (out of four) empty slice | 3 | |
PM_SRQ_SYNC_CYC | A sync is in the S2Q (edge detect to count) | 0, 1, 2, 3 | |
PM_STALL_END_ICT_EMPTY | The number of times the core transitioned from a stall to ICT-empty for this thread | 0 | |
PM_ST_CAUSED_FAIL | Non-TM Store caused any thread to fail | 0 | |
PM_ST_CMPL | Stores completed from S2Q (2nd-level store queue). | 1 | |
PM_STCX_FAIL | stcx failed | 0 | |
PM_STCX_FIN | Number of stcx instructions finished. This includes instructions in the speculative path of a branch that may be flushed | 1 | |
PM_STCX_SUCCESS_CMPL | Number of stcx instructions that completed successfully | 0, 1, 2, 3 | |
PM_ST_FIN | Store finish count. Includes speculative activity | 1 | |
PM_ST_FWD | Store forwards that finished | 1 | |
PM_ST_MISS_L1 | Store Missed L1 | 2 | |
PM_STOP_FETCH_PENDING_CYC | Fetching is stopped due to an incoming instruction that will result in a flush | 0, 1, 2, 3 | |
PM_SUSPENDED | Counter OFF | 0 | |
PM_SUSPENDED | Counter OFF | 1 | |
PM_SUSPENDED | Counter OFF | 2 | |
PM_SUSPENDED | Counter OFF | 3 | |
PM_SYNC_MRK_BR_LINK | Marked Branch and link branch that can cause a synchronous interrupt | 0 | |
PM_SYNC_MRK_BR_MPRED | Marked Branch mispredict that can cause a synchronous interrupt | 0 | |
PM_SYNC_MRK_FX_DIVIDE | Marked fixed point divide that can cause a synchronous interrupt | 0 | |
PM_SYNC_MRK_L2HIT | Marked L2 Hits that can throw a synchronous interrupt | 0 | |
PM_SYNC_MRK_L2MISS | Marked L2 Miss that can throw a synchronous interrupt | 0 | |
PM_SYNC_MRK_L3MISS | Marked L3 misses that can throw a synchronous interrupt | 0 | |
PM_SYNC_MRK_PROBE_NOP | Marked probeNops which can cause synchronous interrupts | 0 | |
PM_SYS_PUMP_CPRED | Initial and Final Pump Scope was system pump for all data types excluding data prefetch (demand load, inst prefetch, inst fetch, xlate) | 2 | |
PM_SYS_PUMP_MPRED | Final Pump Scope (system) mispredicted. Either the original scope was too small (Chip/Group) or the original scope was System and it should have been smaller. Counts for all data types excluding data prefetch (demand load, inst prefetch, inst fetch, xlate) | 2 | |
PM_SYS_PUMP_MPRED_RTY | Final Pump Scope (system) ended up larger than Initial Pump Scope (Chip/Group) for all data types excluding data prefetch (demand load, inst prefetch, inst fetch, xlate) | 3 | |
PM_TABLEWALK_CYC | Cycles when an instruction tablewalk is active | 0 | |
PM_TABLEWALK_CYC_PREF | Tablewalk qualified for PTE prefetches | 0, 1, 2, 3 | |
PM_TAGE_CORRECT | The TAGE overrode BHT direction prediction and it was correct. Includes taken and not taken and is counted at execution time | 0, 1, 2, 3 | |
PM_TAGE_CORRECT_TAKEN_CMPL | The TAGE overrode BHT direction prediction and it was correct. Counted at completion for taken branches only | 0, 1, 2, 3 | |
PM_TAGE_OVERRIDE_WRONG | The TAGE overrode BHT direction prediction but it was incorrect. Counted at completion for taken branches only | 0, 1, 2, 3 | |
PM_TAGE_OVERRIDE_WRONG_SPEC | The TAGE overrode BHT direction prediction but it was incorrect. Includes taken and not taken and is counted at execution time | 0, 1, 2, 3 | |
PM_TAKEN_BR_MPRED_CMPL | Total number of taken branches that were incorrectly predicted as not-taken. This event counts branches completed and does not include speculative instructions | 1 | |
PM_TB_BIT_TRANS | timebase event | 2 | |
PM_TEND_PEND_CYC | TEND latency per thread | 0, 1, 2, 3 | |
PM_THRD_ALL_RUN_CYC | Cycles in which all the threads have the run latch set | 1 | |
PM_THRD_CONC_RUN_INST | PPC Instructions Finished by this thread when all threads in the core had the run-latch set | 2 | |
PM_THRD_PRIO_0_1_CYC | Cycles thread running at priority level 0 or 1 | 0, 1, 2, 3 | |
PM_THRD_PRIO_2_3_CYC | Cycles thread running at priority level 2 or 3 | 0, 1, 2, 3 | |
PM_THRD_PRIO_4_5_CYC | Cycles thread running at priority level 4 or 5 | 0, 1, 2, 3 | |
PM_THRD_PRIO_6_7_CYC | Cycles thread running at priority level 6 or 7 | 0, 1, 2, 3 | |
PM_THRESH_ACC | This event increments every time the threshold event counter ticks. Thresholding must be enabled (via MMCRA) and the thresholding start event must occur for this counter to increment. It will stop incrementing when the thresholding stop event occurs or when thresholding is disabled, until the next time a configured thresholding start event occurs. | 1 | |
PM_THRESH_EXC_1024 | Threshold counter exceeded a value of 1024 | 2 | |
PM_THRESH_EXC_128 | Threshold counter exceeded a value of 128 | 3 | |
PM_THRESH_EXC_2048 | Threshold counter exceeded a value of 2048 | 3 | |
PM_THRESH_EXC_256 | Threshold counter exceeded a value of 256 | 0 | |
PM_THRESH_EXC_32 | Threshold counter exceeded a value of 32 | 1 | |
PM_THRESH_EXC_4096 | Threshold counter exceeded a value of 4096 | 0 | |
PM_THRESH_EXC_512 | Threshold counter exceeded a value of 512 | 1 | |
PM_THRESH_EXC_64 | Threshold counter exceeded a value of 64 | 2 | |
PM_THRESH_MET | threshold exceeded | 0 | |
PM_THRESH_NOT_MET | Threshold counter did not meet threshold | 3 | |
PM_TLB_HIT | Number of times the TLB had the data required by the instruction. Applies to both HPT and RPT | 0 | |
PM_TLBIE_FIN | tlbie finished | 2 | |
PM_TLB_MISS | TLB Miss (I + D) | 1 | |
PM_TM_ABORTS | Number of TM transactions aborted | 2 | |
PM_TMA_REQ_L2 | Address-only request to L2, sent only on the first one; an indication that the load footprint is not expanding | 0, 1, 2, 3 | |
PM_TM_CAM_OVERFLOW | L3 TM CAM overflow during L2 castout of SC | 0 | |
PM_TM_CAP_OVERFLOW | TM Footprint Capacity Overflow | 3 | |
PM_TM_FAIL_CONF_NON_TM | TM aborted because a conflict occurred with a non-transactional access by another processor | 0, 1, 2, 3 | |
PM_TM_FAIL_CONF_TM | TM aborted because a conflict occurred with another transaction. | 0, 1, 2, 3 | |
PM_TM_FAIL_FOOTPRINT_OVERFLOW | TM aborted because the tracking limit for transactional storage accesses was exceeded. Asynchronous | 0, 1, 2, 3 | |
PM_TM_FAIL_NON_TX_CONFLICT | Non transactional conflict from LSU, gets reported to TEXASR | 0, 1, 2, 3 | |
PM_TM_FAIL_SELF | TM aborted because a self-induced conflict occurred in Suspended state, caused by a store to a storage location that was previously accessed transactionally | 0, 1, 2, 3 | |
PM_TM_FAIL_TLBIE | Transaction failed because there was a TLBIE hit in the bloom filter | 0, 1, 2, 3 | |
PM_TM_FAIL_TX_CONFLICT | Transactional conflict from LSU, gets reported to TEXASR | 0, 1, 2, 3 | |
PM_TM_FAV_CAUSED_FAIL | TM Load (fav) caused another thread to fail | 1 | |
PM_TM_FAV_TBEGIN | Dispatch time Favored tbegin | 0, 1, 2, 3 | |
PM_TM_LD_CAUSED_FAIL | Non-TM Load caused any thread to fail | 0 | |
PM_TM_LD_CONF | TM Load (fav or non-fav) ran into conflict (failed) | 1 | |
PM_TM_NESTED_TBEGIN | Completion time nested tbegin | 0, 1, 2, 3 | |
PM_TM_NESTED_TEND | Completion time nested tend | 0, 1, 2, 3 | |
PM_TM_NON_FAV_TBEGIN | Dispatch time non favored tbegin | 0, 1, 2, 3 | |
PM_TM_OUTER_TBEGIN | Completion time outer tbegin | 0, 1, 2, 3 | |
PM_TM_OUTER_TBEGIN_DISP | Number of outer tbegin instructions dispatched. The dispatch unit determines whether the tbegin instruction is outer or nested. This is a speculative count, which includes flushed instructions | 3 | |
PM_TM_OUTER_TEND | Completion time outer tend | 0, 1, 2, 3 | |
PM_TM_PASSED | Number of TM transactions that passed | 1 | |
PM_TM_RST_SC | TM-snp rst RM SC | 1 | |
PM_TM_SC_CO | L3 castout TM SC line | 0 | |
PM_TM_ST_CAUSED_FAIL | TM Store (fav or non-fav) caused another thread to fail | 2 | |
PM_TM_ST_CONF | TM Store (fav or non-fav) ran into conflict (failed) | 2 | |
PM_TM_TABORT_TRECLAIM | Completion time tabortnoncd, tabortcd, treclaim | 0, 1, 2, 3 | |
PM_TM_TRANS_RUN_CYC | Run cycles in transactional state | 0 | |
PM_TM_TRANS_RUN_INST | Run instructions completed in transactional state (gated by the run latch) | 2 | |
PM_TM_TRESUME | TM resume instruction completed | 0, 1, 2, 3 | |
PM_TM_TSUSPEND | TM suspend instruction completed | 0, 1, 2, 3 | |
PM_TM_TX_PASS_RUN_CYC | Cycles spent in successful transactions | 1 | |
PM_TM_TX_PASS_RUN_INST | Run instructions spent in successful transactions | 3 | |
PM_VECTOR_FLOP_CMPL | Vector FP instruction completed | 3 | |
PM_VECTOR_LD_CMPL | Number of vector load instructions completed | 3 | |
PM_VECTOR_ST_CMPL | Number of vector store instructions completed | 3 | |
PM_VSU_DP_FSQRT_FDIV | Vector versions of fdiv, fsqrt | 2 | |
PM_VSU_FIN | VSU instruction finished. Up to 4 per cycle | 1 | |
PM_VSU_FSQRT_FDIV | Four-flop operations (fdiv, fsqrt); scalar instructions only | 3 | |
PM_VSU_NON_FLOP_CMPL | Non-FLOP operation completed | 3 | |
PM_XLATE_HPT_MODE | LSU reports every cycle the thread is in HPT translation mode (as opposed to radix mode) | 0, 1, 2, 3 | |
PM_XLATE_MISS | The LSU requested a line from L2 for translation. It may be satisfied from any source beyond L2. Includes speculative instructions | 0, 1, 2, 3 | |
PM_XLATE_RADIX_MODE | LSU reports every cycle the thread is in radix translation mode (as opposed to HPT mode) | 0, 1, 2, 3 | |
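The counters above are not read directly; on Linux they are programmed through the kernel's perf_event subsystem. The following C program is a minimal sketch of counting one event around a workload with perf_event_open(2). The raw event code 0x600F4 is a placeholder assumption standing in for PM_RUN_CYC; the real code for any event in this table must be taken from the processor's performance monitoring reference, and each event can only be scheduled on the hardware counters listed in its "Counters usable" column.

    /* Sketch: count one raw PMU event (placeholder code 0x600F4,
     * standing in for PM_RUN_CYC) around a workload on Linux. */
    #include <linux/perf_event.h>
    #include <sys/ioctl.h>
    #include <sys/syscall.h>
    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>
    #include <unistd.h>

    static long perf_event_open(struct perf_event_attr *attr, pid_t pid,
                                int cpu, int group_fd, unsigned long flags)
    {
        /* glibc provides no wrapper for this syscall */
        return syscall(SYS_perf_event_open, attr, pid, cpu, group_fd, flags);
    }

    int main(void)
    {
        struct perf_event_attr attr;
        memset(&attr, 0, sizeof(attr));
        attr.size = sizeof(attr);
        attr.type = PERF_TYPE_RAW;  /* raw PMU event, not a generic alias */
        attr.config = 0x600F4;      /* placeholder: substitute the documented code */
        attr.disabled = 1;          /* start counting only when enabled below */
        attr.exclude_kernel = 1;
        attr.exclude_hv = 1;

        int fd = perf_event_open(&attr, 0 /* this thread */, -1 /* any cpu */, -1, 0);
        if (fd == -1) {
            perror("perf_event_open");
            return 1;
        }

        ioctl(fd, PERF_EVENT_IOC_RESET, 0);
        ioctl(fd, PERF_EVENT_IOC_ENABLE, 0);

        /* workload under measurement goes here */
        for (volatile long i = 0; i < 10000000; i++)
            ;

        ioctl(fd, PERF_EVENT_IOC_DISABLE, 0);

        uint64_t count = 0;
        if (read(fd, &count, sizeof(count)) != sizeof(count)) {
            perror("read");
            return 1;
        }
        printf("event count: %llu\n", (unsigned long long)count);
        close(fd);
        return 0;
    }

Opening a second event the same way and dividing the two counts yields the usual derived ratios, e.g. PM_RUN_INST_CMPL over PM_RUN_CYC for run-latch IPC, or PM_TM_TX_PASS_RUN_CYC over PM_TM_TRANS_RUN_CYC for the fraction of transactional cycles spent in transactions that eventually passed.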