I go back and forth on this argument when it comes to codegen. Like, you could make the same argument that protobuf shouldn't output C code. It should output an object file that you can link into whatever compiled language you want. C, fortran, C++, rust, who cares. As you say, the linking model is simple and works well.
Why do we generate big C/C++ strings instead, and compile those? Because object files have lots of compiler/platform/architecture specific stuff in them. Outputting C (or C++) then compiling it is a much more convenient way to generate those object files, because it works on every OS, compiler and architecture. Even systems that you don't know about, or that don't exist yet.
I hear what you're saying and I'm torn about it. I mean, aren't binary blobs just a simpler version of the problem protobuf faces? C code is already the most portable way to make object files. Why wouldn't we use it?
> Like, you could make the same argument that protobuf shouldn't output C code.
The case at hand is an 8MB static array of integers. Obviously yes, of course, absolutely: you choose the correct/simple/obvious/trivialest implementation strategy for the problem. That's exactly what I'm saying!
In the case of protobuf (static generation of an otherwise arbitrarily complicated data structure with reasonably bounded size), code generation makes a ton of sense.
And the simple/obvious/trivialest solution is to write an array literal with 4 million integers in it, while fighting with Microsoft's link.exe is anything but. Even using rc.exe and loading that data from the resource section is a non-trivial amount of additional work.
Come on. Both binutils and nasm can generate perfectly working PE object files. I don't know the answer off the top of my head, but I bet anything even pure MSVC has a simple answer here. Dealing with the nonsense in the linked article is something you do for a quick hack or to test compiler performance, but (as demonstrated!) it scales poorly. It's terrible engineering, period. Use the right tools, even if they aren't ISO-specified. And if you can't or won't, please don't get into fights on the internet justifying the resulting hackery.
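For what it's worth, the binutils route doesn't even need an assembler source: objcopy can wrap a raw file into an object directly, and the C side just declares the resulting symbols. A rough sketch follows; the filenames are hypothetical, the flags are from memory (check your objcopy manual), the symbol names follow objcopy's `_binary_<mangled filename>_start` convention, and this fragment obviously won't link without the generated object:

```c
/* One-time build step (not C):
 *   objcopy -I binary -O pe-x86-64 -B i386:x86-64 data.bin data.obj
 * objcopy derives the symbol names from the input filename. */
#include <stddef.h>

extern const unsigned char _binary_data_bin_start[];
extern const unsigned char _binary_data_bin_end[];

/* The blob's size is the distance between the two symbols. */
static size_t blob_size(void)
{
    return (size_t)(_binary_data_bin_end - _binary_data_bin_start);
}
```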
> I bet anything even pure MSVC has a simple answer here
You lose your bet, because it doesn't: neither its inline assembler nor the actual MASM shipped with Visual Studio supports any "incbin"-like directive. I guess you can generate an .asm with a huge db/dd, if you don't like a large literal array in .c files, but that's it.
> Come on. Both binutils and nasm can generate perfectly working PE object files. I don't know the answer off the top of my head, but I bet anything even pure MSVC has a simple answer here.
Right, meaning you have to implement N solutions instead of just one. It's a common enough and useful enough feature for the language to support it. I think it would be a different story if linkers were covered by the language specification.
I remain shocked at how controversial this is. Yes. Yes, implementing N trivial and easily maintained solutions is clearly better than one portable hack.
Clearly many people disagree. And given that C now has #embed, I don't even think I'd consider it to be a hack.
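For reference, a sketch of what #embed looks like in use. "logo.bin" is a hypothetical input file, and this needs a C23-capable toolchain (recent GCC or Clang), so it's a fragment rather than something you can build everywhere today:

```c
/* C23 #embed: the preprocessor splices the file's bytes
 * directly into the initializer at compile time. */
static const unsigned char logo[] = {
#embed "logo.bin"
};
static const unsigned int logo_len = sizeof logo;
```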
> I remain shocked at how controversial this is.
I am a bit shocked that you think the right solution to making data statically available to the rest of your program is somehow outside the scope of the programming language.
The comparison wasn't to #embed[1], but to an 8MB static array. You're winning an argument against a strawman, not me. For the record, I think #embed (given tooling that supports it) would be an excellent choice! That's not a defense of the technique under discussion though.
[1] Which FWIW is much less portable as an issue of practical engineering than assembler or binutils tooling!