
If C is just a portable assembler, then what if the assembly itself has undefined behaviour? :)


This exists, but the effect of undefined behavior in CPU architectures is a little bit more forgiving than the interpretation of UB in C to mean "literally the entire program has no meaning". Instead, usually the program will execute correctly up to the invalid instruction, and then something happens, and then the CPU will continue executing from that state. It's actually fairly difficult to build an instruction with undefined behavior that contaminates unrelated parts of the program.

Though it HAS happened: notably, Bruce Dawson explains here [1] that the Xbox 360 had an instruction so badly thought out that merely having it in an executable page was enough to corrupt an otherwise-correct program, due to speculative execution.

[1] https://randomascii.wordpress.com/2018/01/07/finding-a-cpu-d...


> This exists, but the effect of undefined behavior in CPU architectures is a little bit more forgiving than the interpretation of UB in C to mean "literally the entire program has no meaning".

That is not quite the interpretation of UB in C, AFAIK. UB in C is generally interpreted as meaning that any execution path which would trigger UB is invalid: if execution is invalid once the UB is reached, and we know for sure that the instruction before it leads unavoidably to that UB, then we can say we're already in UB territory.

Whole-program invalidity can occur when the compiler manages to infer that no execution path is UB-free, in which case yes, the program is meaningless. More generally, a program will go off the rails as far ahead of the UB as the compiler managed to determine that the UB would unfailingly be reached.

And it's usually because the compiler works backwards: if a path would trigger UB, that path cannot legally be taken, therefore it can be deleted. That's why e.g. `if (a > a + 1)` gets deleted: that expression makes sense if you assume signed integers can overflow, but the compiler assumes signed integers can't overflow, therefore the condition can never be true, therefore it's dead code.

This is important, because many such UBs get generated from macro expansion and optimisations (mainly inlining), so the assumption that UB is impossible (and thus that the code is dead) enables not just specific optimisations but a fair amount of dead-code elimination, which reduces function size, which triggers further inlining, and thus the optimisations build upon one another.


The situation is different, because a CPU is by definition an interpreter. It doesn't perform code transformation, at least not at as high a level as a compiler. The CPU only looks at the next few instructions and performs them. A compiler, however, is responsible for taking a large compilation unit and producing an efficient transformation of it. That process requires reasoning about what code is invalid and acting on that.


Wow! Interesting to see hints that Meltdown existed years before it was officially published.


I skimmed the article and didn't see a reference to this, but you may be interested to know that our good friends and protectors at the NSA may have stumbled onto Meltdown-like issues in the mid '90s:

https://en.wikipedia.org/wiki/Meltdown_(security_vulnerabili...


I see the NSA strategy for 'securing' the nation against technology threats in their 'unique' way was going strong back in 1995.


They "secure" the country by exploiting vulnerabilities and leaving everyone else in the dark. They see the world as just a game between them and other foreign surveillance institutions.


There actually is a fair amount of truly undefined behavior in CPUs, but for security reasons it's always at system/kernel level rather than in userspace. You can search an ARM ISA manual for "UNPREDICTABLE" to see examples.


I seem to recall that Intel and AMD CPUs will behave in strange and unusual ways, particularly when it comes to things like the flags set by bit-shift ops, if you shift by out-of-range values, or by 0 or 1. So I guess undefined behaviors in C are somewhat consistent with CPUs. But as other people mentioned, Intel is much more forgiving than X3J11. If you ever want to find all the dirty corner cases that exist between ANSI and hardware, I swear, try writing C functions that emulate the hardware, and then fuzz the two in lockstep. It's harder than you'd think. [don't click here if you intend to try that: https://github.com/jart/cosmopolitan/blob/master/tool/build/...]


It does: reading uninitialized memory, simultaneous writes from multiple threads, using memory below the stack pointer with interrupts enabled, ...

Some of C's UB is due to this, some of it is due to the compiler.


It's not, though, is it?



