Does Suppress All Exceptions also suppress MXCSR updates?

The question

Many x86 instructions from the AVX-512 and AVX10 families (those encoded with the EVEX prefix) allow a “suppress all exceptions” (SAE) flag to be specified in the instruction encoding (the EVEX.b field). The intent of not raising exceptions on certain floating-point inputs is clear, but it is not clear what happens to the individual exception status bits in the MXCSR register. Are the status bits updated when, for example, an invalid input, a result overflow, or a precision loss occurs in an instruction with SAE? Or do they remain untouched, as if nothing suspicious was observed?

Intel SDM Volume 2, section 2.7.8 says this:

When EVEX.b is set, “suppress all exceptions” is implied. The processor behaves as if all MXCSR masking controls are set

OK, masking controls are explained. But what about MXCSR status bits?

If a numerical exception occurs while its matching MXCSR mask bit is set, the processor records the event in the matching MXCSR status flag and continues without raising an exception. Later, the program may inspect the status flags, discover that at least one recent FP instruction encountered a problem, and decide to reevaluate the results of the current FP computation.

With SAE set, it is not clear whether the information about recent FP exceptions is stored anywhere. Not all applications would benefit from such deliberate sloppiness.
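The masked case is easy to observe directly. The following is a minimal sketch, not part of the original experiment: it assumes an x86-64 build where double arithmetic goes through SSE, and the helper name `flags_after_masked_divide_by_zero` and the `MXCSR_*` macros are made up for illustration. It leaves every exception masked, divides by zero, and returns the sticky status bits.

```c
#include <stdint.h>
#include <xmmintrin.h>  /* _mm_getcsr, _mm_setcsr */

#define MXCSR_STATUS_MASK 0x003Fu  /* bits 0..5: IE, DE, ZE, OE, UE, PE */
#define MXCSR_ALL_MASKED  0x1F80u  /* all exceptions masked, all flags clear */
#define MXCSR_ZE          0x0004u  /* divide-by-zero status flag */

/* Hypothetical helper: run one masked divide-by-zero and return the
 * MXCSR status flags that it left behind. */
uint32_t flags_after_masked_divide_by_zero(void)
{
    uint32_t saved = _mm_getcsr();
    volatile double one = 1.0, zero = 0.0;  /* volatile blocks constant folding */
    volatile double result;

    _mm_setcsr(MXCSR_ALL_MASKED);           /* mask everything, clear the flags */
    result = one / zero;                    /* no trap is taken */
    uint32_t flags = _mm_getcsr() & MXCSR_STATUS_MASK;

    _mm_setcsr(saved);                      /* restore the caller's MXCSR */
    (void)result;
    return flags;
}
```

The returned value should have the ZE bit set even though the program was never interrupted, which is exactly the record that a later check of the status flags relies on.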

The experiment

I wrote this small, silly program to check the hardware behavior on my system.

The program uses an EVEX-encoded machine instruction to add two vectors of doubles. In both vectors, I pass a signaling not-a-number (SNaN) as input, which should raise the FP Invalid Operation exception. MXCSR is set up with all FP exceptions unmasked, so each one is delivered as an architectural exception unless something else (such as SAE) suppresses it.
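For reference, what makes a bit pattern a signaling NaN in the binary64 format can be sketched as below. The function and macro names are hypothetical. The check works on raw 64-bit integers rather than on `double` values, so no FP instruction ever touches the pattern.

```c
#include <stdbool.h>
#include <stdint.h>

/* Binary64 layout: 1 sign bit, 11 exponent bits, 52 fraction bits. */
#define F64_EXP_MASK  0x7ff0000000000000ULL
#define F64_FRAC_MASK 0x000fffffffffffffULL
#define F64_QUIET_BIT 0x0008000000000000ULL  /* most significant fraction bit */

/* NaN: exponent all ones and a non-zero fraction. */
bool f64_bits_is_nan(uint64_t bits)
{
    return (bits & F64_EXP_MASK) == F64_EXP_MASK
        && (bits & F64_FRAC_MASK) != 0;
}

/* Signaling NaN on x86: a NaN whose quiet bit is clear. */
bool f64_bits_is_snan(uint64_t bits)
{
    return f64_bits_is_nan(bits) && (bits & F64_QUIET_BIT) == 0;
}
```

The constant `0x7ff0000000000001` used in the program passes this check: the exponent is all ones, the fraction is non-zero, and the quiet bit is clear.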

The {rn-sae} part in the inline assembly section instructs the assembler to set the SAE bit (EVEX.b) in the instruction encoding.

As the final step, the program reads and prints the value of MXCSR after the operation. The hypothesis is that if the MXCSR status flags are updated despite SAE, a non-zero value will be printed; otherwise, zero will be printed. Because the program loads MXCSR with zero beforehand, any non-zero bit read back must have been set by the hardware.

Finally, if an exception is raised after all, the program will not reach its end and will be terminated by the OS with SIGFPE or another signal.
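Should the printed value turn out to be non-zero, it helps to know which flags it contains. The sketch below decodes the six status bits (bits 0 to 5 of MXCSR) into their usual mnemonics; the helper name `mxcsr_describe_flags` is an assumption made for this example and is not used by the program.

```c
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: write the names of the status flags set in mxcsr
 * into buf, separated by spaces, and return how many names were written.
 * It stops early if buf runs out of space. */
int mxcsr_describe_flags(uint32_t mxcsr, char *buf, size_t len)
{
    /* Bit 0..5: Invalid, Denormal, Zero-divide, Overflow, Underflow, Precision */
    static const char *const names[6] = {"IE", "DE", "ZE", "OE", "UE", "PE"};
    size_t used = 0;
    int count = 0;

    if (len == 0)
        return 0;
    buf[0] = '\0';

    for (int bit = 0; bit < 6; bit++) {
        if (!(mxcsr & (1u << bit)))
            continue;
        int n = snprintf(buf + used, len - used, "%s%s",
                         count ? " " : "", names[bit]);
        if (n < 0 || (size_t)n >= len - used)
            break;                      /* out of space */
        used += (size_t)n;
        count++;
    }
    return count;
}
```

Mask and rounding-control bits are ignored, so passing a full MXCSR value such as 0x1f81 yields just "IE".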

// sae.c
// Compile with gcc -g -mavx512f sae.c
#include <stdint.h>
#include <stdio.h>

int main()
{
    const uint64_t snan = 0x7ff0000000000001ULL;
    uint64_t zmm1[8] = {snan, };
    uint64_t zmm2[8] = {snan, };
    uint64_t zmm3[8] = {0};

    const uint32_t all_unmasked = 0;
    __builtin_ia32_ldmxcsr(all_unmasked);
    __asm__ __volatile__(
    "vmovdqu64 (%0), %%zmm1;" // load source registers with SNaN
    "vmovdqu64 (%1), %%zmm2;"
    "vmovdqu64 (%2), %%zmm3;"
    "vaddpd %{rn-sae%},%%zmm1,%%zmm2,%%zmm3;" // Invalid Operation suppressed
//     "vaddpd %%zmm1,%%zmm2,%%zmm3;" // Invalid Operation causes SIGFPE
    :
    :"r"(&zmm1), "r"(&zmm2), "r"(&zmm3)
    :"zmm1", "zmm2", "zmm3", "memory" // "memory": the asm reads the arrays via pointers
    );

    uint32_t final_mxcsr = __builtin_ia32_stmxcsr();    
    printf("MXCSR = %#x\n", final_mxcsr);
    return 0;
}

Results

On my system, the program terminates normally and prints MXCSR = 0. This tells us that SAE both suppresses the exception and prevents the status flags from being updated.

Out of curiosity, I reran the program with the VADDPD instruction encoded without SAE (see the commented-out line inside the assembly block), and it did crash with SIGFPE. So the assumption that an SNaN input raises an FP exception was correct.
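The same comparison can also be made without risking a crash, by keeping all exceptions masked and looking only at the status flags. The sketch below is an assumption-laden variant, not the program used above: it relies on the `_mm512_add_round_pd` intrinsic with `_MM_FROUND_NO_EXC` to request SAE, on the GCC/Clang `target` attribute and `__builtin_cpu_supports`, and on helper names invented for this example.

```c
#include <immintrin.h>  /* AVX-512 intrinsics, _mm_getcsr, _mm_setcsr */
#include <stdint.h>

#define MXCSR_STATUS_MASK 0x003Fu  /* bits 0..5: IE, DE, ZE, OE, UE, PE */
#define MXCSR_ALL_MASKED  0x1F80u  /* all exceptions masked, all flags clear */

/* Non-zero when the compiler runtime reports AVX-512F as usable.
 * Call this before either helper below. */
int cpu_has_avx512f(void)
{
    return __builtin_cpu_supports("avx512f");
}

/* Hypothetical helper: SNaN + SNaN with SAE requested; returns status flags. */
__attribute__((target("avx512f")))
uint32_t flags_after_snan_add_sae(void)
{
    volatile uint64_t snan_bits = 0x7ff0000000000001ULL; /* volatile: no folding */
    volatile uint64_t sink;
    uint64_t out[8];
    uint32_t saved = _mm_getcsr();

    _mm_setcsr(MXCSR_ALL_MASKED);
    __m512d a = _mm512_castsi512_pd(_mm512_set1_epi64((long long)snan_bits));
    __m512d sum = _mm512_add_round_pd(
        a, a, _MM_FROUND_TO_NEAREST_INT | _MM_FROUND_NO_EXC);
    _mm512_storeu_si512(out, _mm512_castpd_si512(sum));
    sink = out[0];                  /* keeps the addition before the MXCSR read */
    uint32_t flags = _mm_getcsr() & MXCSR_STATUS_MASK;

    _mm_setcsr(saved);              /* restore the caller's MXCSR */
    (void)sink;
    return flags;
}

/* Hypothetical helper: the same addition without SAE; returns status flags. */
__attribute__((target("avx512f")))
uint32_t flags_after_snan_add_plain(void)
{
    volatile uint64_t snan_bits = 0x7ff0000000000001ULL;
    volatile uint64_t sink;
    uint64_t out[8];
    uint32_t saved = _mm_getcsr();

    _mm_setcsr(MXCSR_ALL_MASKED);
    __m512d a = _mm512_castsi512_pd(_mm512_set1_epi64((long long)snan_bits));
    __m512d sum = _mm512_add_pd(a, a);
    _mm512_storeu_si512(out, _mm512_castpd_si512(sum));
    sink = out[0];
    uint32_t flags = _mm_getcsr() & MXCSR_STATUS_MASK;

    _mm_setcsr(saved);
    (void)sink;
    return flags;
}
```

If the MXCSR = 0 result above carries over, the SAE helper should return 0 while the plain helper should return 0x1 (the Invalid Operation flag). The two additions live in separate functions on purpose, so that the compiler has no opportunity to evaluate both and merely select one result.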

Conclusion

In the end, I found this statement about the interaction between SAE and the MXCSR flags in chapter 15 of the SDM:

The SAE effect is as if all the MXCSR mask bits are set, and none of the MXCSR flags will be updated

The behavior I observed with my program matches the SDM. Weird but OK.


Written by Grigory Rechistov in Uncategorized on 26.01.2024. Tags: sae, avx512, evex, mxcsr


Copyright © 2024 Grigory Rechistov