Compiler generating breakpoint instructions for simple maths

If you read this expecting a compiler bug then prepare to be disappointed! This is just an “I didn’t know the compiler did that” observation that I stumbled across today whilst handling an oops report. I thought it was interesting.

So…

If we hand the following C to the GNU C compiler (including the RISCstar toolchain) then it will generate breakpoint instructions (or “undefined” instructions on x86-64):

int f(int x, int y) {
    return x / (y ? __builtin_clz(y) : 0);
}

The ternary would normally be an inline function but has been fully expanded in the example above. It has been defensively coded to avoid calling __builtin_clz(0) (which is undefined). Sadly, in our case the result is used as a divisor, meaning that, although the ternary has a defined value, there is still a divide-by-zero if y is zero… and that is also undefined.
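
For comparison, here is a minimal sketch of one way the divide could be guarded. The names safe_clz and f_checked are hypothetical, and returning 0 for a zero divisor is purely an assumption made for illustration; the right fallback depends on the caller. The sketch tests the divisor rather than y because __builtin_clz also returns 0 when the top bit of its argument is set (for example, any negative y).

```c
/* Hypothetical helper: the ternary from f(), written as an inline function. */
static inline int safe_clz(unsigned int y)
{
    return y ? __builtin_clz(y) : 0;
}

/* Hypothetical variant of f() that checks the divisor itself. */
int f_checked(int x, int y)
{
    int d = safe_clz((unsigned int)y);

    /* d is 0 when y == 0, and also when y has its top bit set. */
    if (d == 0)
        return 0;   /* assumed fallback value, chosen only for this sketch */

    return x / d;
}
```

With a 32-bit unsigned int, f_checked(100, 1) divides by 31, whilst f_checked(100, 0) and f_checked(100, -1) both take the fallback path.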

Compilers can do pretty much anything when there is undefined behaviour (including removing the zero check entirely) but gcc’s behaviour is interesting. It recognises there is no point in generating any real code to implement behaviour that is known to be undefined so it generates a breakpoint instead!

f:
        beq     a1,zero,.L7  # Branch to .L7 if y (held in register a1) is zero
        clzw    a1,a1        # Count Leading Zeros in Word
        divw    a0,a0,a1     # Divide Word
        ret
.L7:
        ebreak               # Breakpoint
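
Read back into C, the assembly above behaves roughly like the sketch below. This is only an approximation of the generated code, not a description of gcc's internal transformation, and the name f_as_compiled is hypothetical. __builtin_trap() is the GCC builtin that emits the target's trap instruction.

```c
/* Hypothetical C rendering of the assembly gcc 7 and later emit for f(). */
int f_as_compiled(int x, int y)
{
    if (y == 0)
        __builtin_trap();   /* the path known to be undefined becomes a trap */

    /* Only the y != 0 case is known to be safe at compile time; a y with
     * its top bit set still yields a divisor of zero at run time. */
    return x / __builtin_clz((unsigned int)y);
}
```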

Doing a little archaeology led me to discover that gcc has been doing this since gcc 7.1 was released in 2017. However, I’ve never seen it before. Perhaps I just don’t write enough undefined behaviour bugs!

Anyhow, if you actually do find this interesting, this is the Compiler Explorer sandpit I used to examine things a little more:


What happens if you throw this code at a gcc from before 2017? Will it generate the div/0 code? :wink:

Pretty much, yes. We actually have to switch to A64 assembler to see what the codegen looks like for older gcc versions since the (upstream) RISC-V support doesn’t go back that far.

So for A64 recent gcc compilers compile that fragment to:

f:
        cbz     w1, .L6
        clz     w1, w1
        sdiv    w0, w0, w1
        ret
.L6:
        brk     #1000

Whilst gcc 6 does the divide-by-zero via a conditional select:

f:
        clz     w2, w1
        cmp     w1, 0
        csel    w1, w2, w1, ne
        sdiv    w0, w0, w1
        ret

x86-64 is similar to A64, except that gcc 6 uses a branch there rather than a conditional select.

Finally, if you are curious, clang simply assumes it doesn’t need to worry about the undefined case (which it doesn’t, it is undefined) and removes the zero check entirely. That means it ends up dividing by whatever the clz (or equivalent) instruction returns on 0 (which varies depending on instruction set).

Interesting. I tried to understand the Arm assembly code, especially what happens when ‘sdiv’ divides by zero. The pseudocode on Arm’s website actually says that the behaviour of ‘sdiv’ on divide-by-zero is implementation dependent: it either raises a zero-divide exception or returns 0. GCC’s approach of inserting a trap (such as ‘ebreak’) unifies the behaviour, which is pretty good.

if ConditionPassed() then
    EncodingSpecificOperations();
    if SInt(R[m]) == 0 then
        if IntegerZeroDivideTrappingEnabled() then
            GenerateIntegerZeroDivide();
        else
            result = 0;
    else
        result = RoundTowardsZero(SInt(R[n]) / SInt(R[m]));
    R[d] = result<31:0>;
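
To make the two outcomes concrete, here is a rough C model of the pseudocode above. The name sdiv_model and the zero_divide_traps parameter are hypothetical stand-ins for the instruction and for IntegerZeroDivideTrappingEnabled(), and abort() merely stands in for GenerateIntegerZeroDivide(). It is a sketch for reasoning about results, not a faithful emulation.

```c
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical model of the 32-bit 'sdiv' pseudocode quoted above. */
int32_t sdiv_model(int32_t n, int32_t m, bool zero_divide_traps)
{
    if (m == 0) {
        if (zero_divide_traps)
            abort();        /* stand-in for GenerateIntegerZeroDivide() */
        return 0;           /* non-trapping case: result = 0 */
    }

    /* INT32_MIN / -1 overflows in C (undefined behaviour), whereas the
     * pseudocode keeps the low 32 bits of 2^31, which is INT32_MIN. */
    if (n == INT32_MIN && m == -1)
        return INT32_MIN;

    return n / m;           /* C division truncates towards zero */
}
```

With trapping disabled, sdiv_model(7, 0, false) gives 0 rather than faulting, which is why the gcc 6 code shown earlier would quietly return 0 on such a core.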