
> - "Undefined behavior" means that C implementations are allowed to assume that the respective runtime condition does not ever occur, and for example can generate optimized code based on that assumption.

Please note that the article argues specifically that this interpretation of UB is incorrect. The author is arguing that you, me, and the LLVM and GCC teams are wrong to interpret UB that way.

Linux had a bug in it a few years ago: the code would dereference a pointer, then check whether it was null, and then either return an error state if it was null or continue with the important part of the function. The compiler reasoned that a null pointer at the point of the dereference would already be UB, so the later null check was unnecessary, and optimized it out. The trouble was that in that context, a null pointer dereference didn't trap, (because it was kernel code? not sure.) so the bug was not detected. It ended up being an exploitable security vulnerability in the kernel, I think a local privilege escalation.
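To make the pattern concrete, here is a minimal sketch of the dereference-then-check shape described above. The `struct device` type and both function names are hypothetical stand-ins for illustration, not the actual kernel code.

```c
#include <stddef.h>

/* Hypothetical stand-in for a kernel structure; not the real Linux type. */
struct device {
    int flags;
};

/* Buggy shape: the pointer is dereferenced before it is checked.
 * Dereferencing a null pointer is undefined behavior, so a compiler may
 * assume dev != NULL after the first line and delete the check below. */
int get_flags_buggy(struct device *dev)
{
    int flags = dev->flags;   /* dereference happens first */
    if (dev == NULL)          /* an optimizer is allowed to remove this */
        return -1;
    return flags;
}

/* Fixed shape: check first, dereference afterwards. */
int get_flags_checked(struct device *dev)
{
    if (dev == NULL)
        return -1;
    return dev->flags;
}
```

Whether the check in `get_flags_buggy` is actually removed depends on the compiler, the optimization level, and flags such as -fno-delete-null-pointer-checks.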

The article is making the argument that the compiler should not be free to optimize out the null check before subsequent dereferences. The compiler is permitted to summon nasal demons where the pointer is dereferenced the first time, but should not be free to summon nasal demons at later lines of code, after the no-nasal-demons-please check.

(The linux kernel now uses -fno-delete-null-pointer-checks to ensure that doesn't happen again. The idea is that even though it was a bug that UB was invoked, the failure behavior should be safe instead of granting root privileges to an unprivileged user.)

Fun with NULL pointers part 1 https://lwn.net/Articles/342330/

Fun with NULL pointers part 2 https://lwn.net/Articles/342420/



> The trouble was that in that context, a null pointer dereference didn't trap, (because it was kernel code? not sure.) so the bug was not detected.

Yes, because it was kernel code. Because that dereference is completely legal in kernel code. The C code was fine, assuming that it was compiled with appropriate kernel flags. This was not a bug in Linux, at least not on the level of the C code itself.

> The linux kernel now uses -fno-delete-null-pointer-checks to ensure that doesn't happen again.

I also seem to remember that it was already using other "please compile this as kernel code" flags that should have implied "no-delete-null-pointer-checks" behavior, and that the lack of this implication was considered a bug in GCC and fixed.


By the way, dereferencing NULL is a well-defined behaviour on every computer architecture: you are basically reading at address 0 of memory. It just causes a crash if you have an operating system, since it will cause a page fault, but in kernel mode or on devices without an OS it is a legit thing to do (and even useful in some cases).

Why should C compilers make it undefined? The standard doesn't mandate that undefined behaviour should change the semantics of the program. Just define all the undefined behaviour that you can; to me, keeping it undefined makes no sense (even from the standard's point of view: everyone knows that if you overflow an int it wraps around, so why should it be undefined?)
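For reference, a small sketch of how the overflow case looks in code under the current standard, where unsigned arithmetic wraps but signed overflow is undefined. All function names here are made up for illustration.

```c
#include <limits.h>
#include <stdbool.h>

/* For signed int, x + 1 overflows when x == INT_MAX, which is undefined.
 * A compiler may therefore fold this function to "return true". */
bool next_is_greater(int x)
{
    return x + 1 > x;
}

/* Unsigned arithmetic is defined to wrap modulo UINT_MAX + 1. */
unsigned wrap_unsigned(unsigned x)
{
    return x + 1u;
}

/* Testing before adding avoids the signed overflow entirely. */
bool add_would_overflow(int a, int b)
{
    if (b > 0)
        return a > INT_MAX - b;   /* a + b would exceed INT_MAX */
    return a < INT_MIN - b;       /* a + b would fall below INT_MIN */
}
```

GCC and Clang also accept -fwrapv, which makes signed overflow wrap in two's complement, at the cost of the optimizations that rely on it being undefined.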


NULL is not required to have a bit representation of all zeroes. If you are programming for a low-level hardware device, it might be worth your while to get a C implementation that does not represent the NULL pointer this way.


Pointers are hardly even required to have a bit representation at all! [1]

This is one of the most common impedance mismatches between programmers and the C spec. C does not say how the machine should handle memory; variables and pointers are just abstract constructs there. Many programmers, meanwhile, think of their program as a giant char* over their DRAM stick that can be fiddled with at any time and in any way they please.

([1] Actually this is not entirely true, but it is better to think of them this way so as not to make more assumptions about memory that are UB. Pointers are allowed to be converted to integer types, but with the caveat that a lot of the behaviour around this is implementation-defined and, of course, surrounded by a big dose of UB as well!)
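A minimal sketch of the one pointer-to-integer round trip the standard does pin down, assuming the implementation provides the optional `uintptr_t` type. The function name is hypothetical.

```c
#include <stdint.h>

/* Converts a valid pointer to uintptr_t and back again.
 * The integer value itself is implementation-defined; the standard only
 * promises that the converted-back pointer compares equal to the original. */
int roundtrip_ok(void)
{
    int value = 7;
    void *p = &value;
    uintptr_t bits = (uintptr_t)p;   /* implementation-defined mapping */
    int *q = (int *)(void *)bits;    /* back to the original pointer */
    return q == &value && *q == 7;
}
```

Doing arithmetic on `bits` and converting the result back to a pointer is where the implementation-defined and undefined territory starts.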


I’m not totally sure, but I think dereferencing a zero pointer is theoretically not undefined behavior in C, just so long as you didn’t obtain the zero pointer by initializing a pointer with a null pointer constant.



