
> But you typically can’t prove that. There’s lots of code where you could prove it might happen at runtime for some inputs, but proving that such inputs occur would, at least, require whole-program analysis. The moment a program reads outside data at runtime, chances are it becomes impossible.

No, I specifically ruled out doing that in my comment.

I was referring to the situation where a null check was deleted because the compiler found UB through static analysis.

(Or specifically, placing a null check after a possibly-null usage. It is wrong to assume that after a possibly-null usage the possibly-null variable is definitely not null.)



As I recall, the compiler didn't know it had found undefined behaviour. An optimisation pass saw "this pointer is dereferenced", and from that inferred that if execution continued, the pointer couldn't have been null.

If the pointer can't be null, then code that only executes when it is null is dead code that can be pruned.

Voila, null check removed. And most relevantly, it didn't at any point know "this is undefined behaviour". At worst it assumed that dereferencing a null would mean it wouldn't keep executing.
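
Roughly the pattern being described, as a sketch (function and names invented for illustration):

    int deref_then_check(int *p) {
        int v = *p;     /* the pass records: execution passed a dereference of p, */
                        /* so p "cannot" have been null here                      */
        if (!p)         /* therefore this branch is unreachable...                */
            return -1;
        return v;       /* ...and the whole check is deleted as dead code         */
    }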


It removed a redundant check then, so why not warn about that? gcc -Wpedantic even warns about empty statements, for crying out loud.


The compiler didn't find UB. What it saw was a pointer dereference, followed by some code later on that checked if the pointer was null.

Various optimisation phases in compilers try to establish the possible values (or ranges) of variables, and later phases can then use this to improve calculations and comparisons. It's very generic, and useful in many circumstances. For example, if the compiler can see that an integer variable 'i' can only take the values 0-5, it could optimise away a later check of 'i<10'.
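
A toy sketch of that kind of range propagation (names invented):

    int range_demo(unsigned x) {
        unsigned i = x % 6;    /* value-range analysis: i is in [0, 5]   */
        if (i < 10)            /* always true for that range...          */
            return 1;          /* ...so the check folds away to "true"   */
        return 0;
    }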

In this specific case, the compiler reasoned that the pointer variable could not be zero, and so checks for it being zero were pointless.


Yes - and the original post here is the same:

    if (x < 0)
        return 0;
The compiler now knows x's possible range is non-negative.

    int32_t i = x * 0x1ff / 0xffff;
A non-negative multiplied and divided by positive numbers means that i's possible range is also non-negative (this is where the undefinedness of integer overflow comes in - x * 0x1ff can't have a negative result without overflow occurring).

    if (i >= 0 && i < sizeof(tab)) {
The first conditional is trivially true now, because of our established bounds on i, so it can just be replaced with "true". This is what causes the code to behave contrary to the OP's expectations: in his execution environment, the overflow case can still leave a negative value in i at runtime.
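
Putting those fragments together, the code under discussion has roughly this shape (tab's size and the surrounding function are my own assumptions, purely for illustration):

    #include <stdint.h>

    uint8_t tab[0x200];                     /* size assumed for illustration */

    uint8_t lookup(int32_t x) {
        if (x < 0)
            return 0;                       /* compiler now knows x >= 0 */
        /* Signed overflow is UB, so the compiler concludes i >= 0 as well,
           even though at runtime a large x can overflow and leave i negative. */
        int32_t i = x * 0x1ff / 0xffff;
        if (i >= 0 && i < sizeof(tab))      /* the "i >= 0" part folds to true */
            return tab[i];
        return 0;
    }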


It is probably more precise to say “if the pointer is null, then it doesn’t matter what I do here, so I am permitted to eliminate this” than to say that it can’t be null here. (It can’t be both null and defined behavior.)


I'm not sure that's right. The compiler isn't tracking undefined behaviour, it is tracking possible values. It just happens that one specific input into determining these values is the fact "a valid program can't dereference a null pointer", so if the source code ever dereferences a pointer, the compiler is free to reason that the pointer cannot therefore be null.

In essence, the compiler is allowed to assume that your code is valid and will only do valid things.


Consider function inlining, or use of a macro for some generic code. For safety, we include a null check in the inlined code. But then we call it from a site where the variable is known not to be null.

The compiler hasn't found UB through static analysis, it has found a redundant null check.
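
A sketch of that situation (all names invented):

    #include <string.h>

    static inline size_t safe_len(const char *s) {
        if (!s)                  /* defensive null check inside the helper      */
            return 0;
        return strlen(s);
    }

    size_t caller(void) {
        char buf[16] = "hello";
        return safe_len(buf);    /* buf is a local array, provably non-null,    */
                                 /* so after inlining the null check is dead    */
    }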


> I was referring to the situation where a null check was deleted because the compiler found UB through static analysis.

You can say that, but in practice -O0 is fairly close to what you're asking for already. Most people are 100% unwilling to live with that performance tradeoff. We know that because almost no one builds production software without optimizations enabled.

The compiler is not intelligent. It just tries to make deductions that let it optimize programs to run faster. 99.999% of the time, when it removes a "useless" null check (i.e. a branch that has to be predicted, eats up branch-prediction buffer space, and bloats the instruction count), it really is useless. The compiler can't tell the difference between the useless ones and the security-critical ones, because all of them look the same and are illegal by the rules of the language.

Even if you mandate that null checks can't be removed, that doesn't fix all the other situations where inserting the relevant safety checks has huge perf costs, or where making something safe reduces to the halting problem.

FWIW I agree that the committee should undertake an effort to convert UB to implementation-defined where possible... for example, just mandate two's complement integer representations and make signed integer overflow ID.

To illustrate the complexity: most loops end up using an int, which is 32-bit on most 64-bit platforms. If you require signed integer wrapping, that slows down all loops, because the compiler must insert artificial checks to make a 64-bit register perform 32-bit wrapping, and we can't change the size of int at this point.
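
For instance, in a loop like the sketch below (names invented), the undefinedness of signed overflow lets the compiler assume i never wraps, so it can keep the induction variable in a 64-bit register and vectorise freely; with mandated 32-bit wrapping it would have to preserve the wraparound behaviour:

    void scale(float *a, int n) {
        for (int i = 0; i < n; i++)   /* 32-bit signed induction variable       */
            a[i] = a[i] * 2.0f;       /* indexing is a 64-bit address compute   */
    }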


> FWIW I agree that the committee should undertake an effort to convert UB to implementation-defined where possible... for example, just mandate two's complement integer representations and make signed integer overflow ID.

To accommodate trapping implementations you'd have to make it "implementation-defined or an implementation-defined signal is raised", which happens to be exactly the wording for when an out-of-range value is assigned to a signed type. In practice it means you have to avoid it in your code anyway, because "an implementation-defined signal is raised" means "your program may abort and you can't stop it".


But again, the compiler did not find UB through static analysis. The compiler inferred that the pointer could not be null and removed a redundant check.

For example, would you not expect a compiler to remove a redundant bounds check if it can infer that an index can't be out of range?


The compiler made a dangerous assumption that the standard permits ("the author surely has guaranteed, through means I can't analyze, that this pointer will never be null").

Then it encountered evidence explicitly contradicting that assumption (a meaningless null check), and it handled it not by changing its assumption, but by quietly removing the evidence.

> For example you would you not expect a compiler to remove a redundant bound check if it can infer that an index can't be out of range?

If it can infer it from actually good evidence, sure. But using "a pointer was dereferenced" as evidence "this pointer is safe to dereference" is comically bad evidence that only the C standard could come up with.


> using "a pointer was dereferenced" as evidence "this pointer is safe to dereference" is comically bad evidence

Do you think the compiler would be right to remove the second check here?

   if (!x) std::abort();
   if (!x) return;
   ... = *x;
What about replacing std::abort with the following?

   [[noreturn]] void my_abort();
How's that different from a check after dereferencing a pointer? In both cases the check can be removed because of dataflow or control-flow analysis.

What if my_abort returns instead? Or another thread changes x after the fact?


If I had written the above code, I would clearly have done something wrong. I would not want the compiler to remove the second check. I'd want it to (at the very least) warn me about an unreachable return statement, so that I could remove the actual meaningless code.

It's been long enough since I wrote C that I'm not familiar with that noreturn syntax or the contract I guess it implies, but any control-flow analysis that can prove the code will never run should ideally warn me about it, so that I can remove it in the source code, not have it quietly removed from the object code.

I'm not demanding that it happen in every case, but in the cases where it's undecidable whether a statement is reachable, it's obviously also undecidable for purposes of optimizing the statement away.


The first check might be in a completely different function in another module (for example a postcondition check before a return). Removing dead code is completely normal and desirable; warning every time it happens would be completely pointless and wrong.


   if (!x) std::abort();
   if (!x) return;
In this case, the compiler should warn that the second statement will never be executed, instead of just silently removing it.


   int *x = libX_foo();
   if (!x) {
      return;
   }
   ...
libX_foo from libX gets updated at some point to abort if the return value would be null. After interprocedural analysis (possibly during LTO), the compiler infers that the if statement is redundant.

Should the compiler complain? Should you remove the check?

Consider that libX_foo returning non-null might not be part of the contract, but just an implementation detail of this version.


> Should the compiler complain? Should you remove the check?

Yes and yes.

> Consider that libX_foo returning not-null might not be part of the contract and just an implementation detail of this version.

How is it an “implementation detail” whether a procedure can return null? That's always an important part of its interface.


> How is it an “implementation detail” whether a procedure can return null? That's always an important part of its interface.

In gpderetta's example, the interface contract for that function says "it can return null" (which is why the calling code has to check for null). The implementation for this particular version of the libX code, however, never returns null. That is, when the calling code is linked together with that particular version of the libX interface, and the compiler can see both the caller and the implementation (due to link-time optimization or similar), it can remove the null check in the caller. But it shouldn't complain, because the null check is correct, and will be used when the program is linked with a different version of the libX code which happens to be able to return null.

For a more concrete example: libX_foo is a function which does some calculations, and allocates temporary memory for these calculations, and this temporary allocation can fail. A later version of libX_foo changes the code so it no longer needs a temporary memory allocation, so it no longer can fail.

And LTO is not even necessary. It could be an inline function defined in a header coming from libX (this kind of thing is very common in C++ with template-heavy code). The program still cannot assume a particular version of libX, so it still needs the null check, even though in some versions of libX the compiler will remove it.
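
For instance, something along these lines in a libX header (every name here is invented for illustration):

    /* hypothetical libX.h, version N (all names invented) */
    int *libX_foo_impl(void);
    [[noreturn]] void libX_fatal(void);

    static inline int *libX_foo(void) {
        int *p = libX_foo_impl();
        if (!p)
            libX_fatal();   /* this version never lets callers observe null,   */
        return p;           /* so their own 'if (!p)' checks become dead code, */
                            /* yet the checks still belong in the source       */
    }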


Thanks for elaborating on this.

I mentioned LTO because compilation units were seen in the past as safe optimization barriers.


The contract is that libX_foo can return null. But a specific implementation might not. Now you need to remove the caller-side check to shut up the compiler, which will leave you exposed to a future update that makes full use of the contract.

Also consider code that calls libX_foo via a pointer. After specialization the compiler might see that the check is redundant, but you can't remove the check, because the function might still be called with other function pointers that make full use of the contract.
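
Something like this, continuing the hypothetical libX_foo example (names invented):

    typedef int *(*provider_fn)(void);

    int first_value(provider_fn f) {
        int *p = f();
        if (!p)            /* can't be dropped from the source: other callers  */
            return 0;      /* may pass providers that really do return null    */
        return *p;
    }

    /* If the compiler clones first_value for f == libX_foo and can see that
       this particular libX_foo never returns null, the check is dead in that
       specialised clone; the generic version, and the source, still need it. */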


> The contract is that libX_foo can return null.

I'd expect any reasonable library to say “libX_foo returns null if [something happens]”. What use is there in a procedure that can just return null whenever it feels like it?


It returns null when it fails to do its task for some reason. It is not unreasonable for the failure condition to be complex enough, or to change over time, such that it doesn't make sense to spell it out in the interface contract.



