8-bit floating-point dot product to single-precision (vector, by element). This instruction computes the fused sum-of-products of a group of four 8-bit floating-point values held in each 32-bit element of the first source vector and a group of four 8-bit floating-point values in an indexed 32-bit element of the second source vector. The single-precision sum-of-products are scaled by 2^{-UInt(FPMR.LSCALE)}, before being destructively added without intermediate rounding to the corresponding single-precision elements of the destination vector.
The 8-bit floating-point groups within the second source vector are specified using an immediate index.
The 8-bit floating-point encoding format for the elements of the first source vector is selected by FPMR.F8S1. The 8-bit floating-point encoding format for the elements of the second source vector is selected by FPMR.F8S2.
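The arithmetic described above can be sketched in Python for a single destination element. This is a simplified model, not the architectural definition. It assumes the two candidate encodings are the OFP8 formats E5M2 and E4M3 (the text only says that FPMR.F8S1 and FPMR.F8S2 select the format), it assumes the scaling is a down-scale by a power of two, and it uses Python double-precision floats, so it does not reproduce single-precision rounding, FPCR controls, or the architectural NaN and overflow behaviour. The names `decode_fp8` and `fp8_dot_add` are hypothetical.

```python
import math

def decode_fp8(byte, fmt):
    """Decode one 8-bit value to a Python float.

    fmt is "E5M2" or "E4M3" (assumed OFP8 layouts, hypothetical parameter).
    """
    if fmt == "E5M2":
        exp_bits, man_bits, bias = 5, 2, 15
    elif fmt == "E4M3":
        exp_bits, man_bits, bias = 4, 3, 7
    else:
        raise ValueError("unknown FP8 format: " + fmt)

    sign = -1.0 if byte & 0x80 else 1.0
    exp = (byte >> man_bits) & ((1 << exp_bits) - 1)
    man = byte & ((1 << man_bits) - 1)
    exp_max = (1 << exp_bits) - 1

    # E5M2 follows the IEEE pattern: all-ones exponent is infinity or NaN.
    if fmt == "E5M2" and exp == exp_max:
        return sign * math.inf if man == 0 else math.nan
    # E4M3 has no infinities; only the all-ones pattern is NaN.
    if fmt == "E4M3" and exp == exp_max and man == (1 << man_bits) - 1:
        return math.nan
    if exp == 0:
        # Zero and subnormal values.
        return sign * man * 2.0 ** (1 - bias - man_bits)
    return sign * (1 + man / (1 << man_bits)) * 2.0 ** (exp - bias)

def fp8_dot_add(acc, group1, group2, fmt1, fmt2, lscale):
    """Add the scaled sum of four products to acc.

    group1 and group2 are sequences of four bytes; lscale stands in for
    UInt(FPMR.LSCALE). The down-scale direction is an assumption.
    """
    total = sum(decode_fp8(a, fmt1) * decode_fp8(b, fmt2)
                for a, b in zip(group1, group2))
    return acc + total * 2.0 ** (-lscale)

# E4M3 bytes 0x38, 0x40, 0x3C, 0x00 decode to 1.0, 2.0, 1.5, 0.0.
# E5M2 byte 0x40 decodes to 2.0.
result = fp8_dot_add(1.0, [0x38, 0x40, 0x3C, 0x00], [0x40] * 4,
                     "E4M3", "E5M2", lscale=1)
print(result)  # 5.5
```

The example sums 2 + 4 + 3 + 0 = 9, halves it for `lscale=1`, and adds the accumulator value 1.0.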
Bits    31  30  29  28-24  23-22  21  20  19-16  15-12   11  10  9-5  4-0
Value   0   Q   0   01111  00     L   M   Rm     0000    H   0   Rn   Rd
Field           U          size                  opcode
if !IsFeatureImplemented(FEAT_FP8DOT4) then UNDEFINED;
integer n = UInt(Rn);
integer m = UInt(M:Rm);
integer d = UInt(Rd);
integer i = UInt(H:L);
constant integer datasize = if Q == '1' then 128 else 64;
constant integer esize = 32;
constant integer elements = datasize DIV esize;
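As an illustration of the decode step, the following Python sketch extracts the same fields from a 32-bit instruction word, using the bit positions from the encoding diagram. The function name `decode_fdot_fields` and the dictionary layout are hypothetical. The sketch does not check the fixed opcode bits or the FEAT_FP8DOT4 feature test, so it only makes sense for a word already known to be this instruction.

```python
def decode_fdot_fields(insn):
    """Extract the operand fields of the by-element FDOT encoding.

    insn is the 32-bit instruction word as an int. Fixed bits and the
    feature check are deliberately not validated in this sketch.
    """
    q = (insn >> 30) & 1
    l_bit = (insn >> 21) & 1
    m_bit = (insn >> 20) & 1
    rm = (insn >> 16) & 0xF
    h_bit = (insn >> 11) & 1
    rn = (insn >> 5) & 0x1F
    rd = insn & 0x1F

    datasize = 128 if q else 64
    esize = 32
    return {
        "n": rn,
        "m": (m_bit << 4) | rm,       # UInt(M:Rm)
        "d": rd,
        "index": (h_bit << 1) | l_bit,  # UInt(H:L)
        "datasize": datasize,
        "elements": datasize // esize,
    }

# Word built by hand with Q=1, L=1, M=1, Rm=3, H=1, Rn=5, Rd=2.
fields = decode_fdot_fields(0x4F3308A2)
print(fields["n"], fields["m"], fields["d"], fields["index"], fields["elements"])
# 5 19 2 3 4
```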
<Vd> 
Is the name of the SIMD&FP destination register, encoded in the "Rd" field. 
<Ta> 
Is an arrangement specifier, encoded in "Q":
Q = 0: 2S
Q = 1: 4S
<Vn> 
Is the name of the first SIMD&FP source register, encoded in the "Rn" field. 
<Tb> 
Is an arrangement specifier, encoded in "Q":
Q = 0: 8B
Q = 1: 16B
<Vm> 
Is the name of the second SIMD&FP source register, encoded in the "M:Rm" fields. 
<index> 
Is the immediate index of a 32-bit group of four 8-bit values, in the range 0 to 3, encoded in the "H:L" fields. 
CheckFPMREnabled();
CheckFPAdvSIMDEnabled64();
bits(datasize) operand1 = V[n, datasize];
bits(128) operand2 = V[m, 128];
bits(datasize) operand3 = V[d, datasize];
bits(datasize) result;

for e = 0 to elements-1
    bits(esize) op1 = Elem[operand1, e, esize];
    bits(esize) op2 = Elem[operand2, i, esize];
    bits(esize) sum = Elem[operand3, e, esize];
    sum = FP8DotAddFP(sum, op1, op2, FPCR, FPMR);
    Elem[result, e, esize] = sum;

V[d, datasize] = result;
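The element loop can be mirrored in Python to show how the index is applied: the same 32-bit element of the second source is reused for every destination element, and all four elements of the second source are addressable even when the destination is 64 bits wide. The sketch below models registers as lists of 32-bit integers. The names `fdot_by_element` and `int_dot_add` are hypothetical, and `int_dot_add` is a stand-in that treats each byte as an unsigned integer rather than an 8-bit floating-point value, so it only demonstrates the data movement, not the arithmetic of FP8DotAddFP.

```python
def split_bytes(word):
    """Return the four bytes of a 32-bit word, least significant first."""
    return [(word >> (8 * k)) & 0xFF for k in range(4)]

def int_dot_add(acc, op1, op2):
    """Toy stand-in for FP8DotAddFP: integer dot product of four byte pairs."""
    return acc + sum(a * b for a, b in zip(split_bytes(op1), split_bytes(op2)))

def fdot_by_element(vn, vm, vd, index, dot_add):
    """Model of the by-element loop.

    vn, vd: lists of 2 or 4 32-bit elements (64-bit or 128-bit datasize).
    vm:     list of 4 32-bit elements (always read as 128 bits).
    index:  0 to 3, selects one element of vm for every lane.
    """
    op2 = vm[index]
    return [dot_add(acc, op1, op2) for op1, acc in zip(vn, vd)]

# 64-bit case: two lanes, index 2 selects vm[2] for both of them.
vn = [0x01010101, 0x02020202]
vm = [0, 0, 0x01020304, 0]
vd = [10, 20]
print(fdot_by_element(vn, vm, vd, 2, int_dot_add))  # [20, 40]
```

Passing the dot-and-accumulate step as a parameter keeps the lane selection separate from the numeric behaviour, which in the real instruction depends on FPCR and FPMR.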
Internal version only: aarchmrs v2024-03_relA, pseudocode v2024-03_rel, sve v2024-03_rel ; Build timestamp: 2024-03-26T09:45
Copyright © 2010-2024 Arm Limited or its affiliates. All rights reserved. This document is Non-Confidential.