
> So RAII isn't the big evil monster, and we need to stop talking about RAII, globals, etc, in these ways.

I disagree. I place RAII as the dividing line on programming language complexity; it is THE "Big Evil Monster(tm)".

Once your compiled language gains RAII, a cascading and interlocking set of language features now need to accrete around it to make it ... not excruciatingly painful. This practically defines the difference between a "large" language (Rust or C++) and a "small" language (C, Zig, C3, etc.).
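
To make the "cascade" concrete, here is a minimal C++ sketch (a hypothetical type, not from any real codebase): the moment a type owns a resource via its constructor/destructor, the language drags in the rest of the rule of five, move semantics, and noexcept annotations just to keep destruction correct.

    #include <cstddef>

    // Owning a resource (RAII) forces decisions about copying and moving too.
    struct Buffer {
        char* data;
        std::size_t len;

        explicit Buffer(std::size_t n) : data(new char[n]), len(n) {}
        ~Buffer() { delete[] data; }                      // RAII release

        Buffer(const Buffer&) = delete;                   // copy: delete or deep-copy
        Buffer& operator=(const Buffer&) = delete;

        Buffer(Buffer&& o) noexcept : data(o.data), len(o.len) {  // move ctor
            o.data = nullptr; o.len = 0;
        }
        Buffer& operator=(Buffer&& o) noexcept {          // move assignment
            if (this != &o) {
                delete[] data;
                data = o.data; len = o.len;
                o.data = nullptr; o.len = 0;
            }
            return *this;
        }
    };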

For me, the next programming language innovation is getting the garbage collected/managed memory languages to finally quit ceding so much of the performance programming language space to the compiled languages. A managed runtime doesn't have to be so stupidly slow. It doesn't have to be so stupidly non-deterministic. It doesn't have to have a pathetic FFI that is super complex. I see the "strong typing everywhere" as the first step along this path. Fil-C might become an interesting existence proof in this space.

I view having to pull out any of C, Zig, C++, Rust, etc. as a higher-level programming language failure. There will always be a need for something like them at the bottom, but I really want their scope to be super small. I don't want to operate at their level if I can avoid them. And I say all this as someone who has slung more than 100KLoC of Zig code lately.

For a concrete example, let's look at Ghostty, which was written in Zig. There is no strong performance reason to be in Zig (except that implementations in every language other than Rust seem to be so much slower). There is no strong memory reason to be in Zig (except that implementations in every language other than Rust chewed up vast amounts of it). And yet a relatively new, unstable, low-level programming language was chosen to greenfield Ghostty. And all the other useful terminal emulators seem to be using Rust.

Every adherent of managed memory languages should take it as a personal insult that people are choosing to write modern terminal emulators in Rust and Zig.


> Every adherent of managed memory languages should take it as a personal insult that people are choosing to write modern terminal emulators in Rust and Zig.

How so? Garbage collection has inherent performance overhead wrt. manual memory management, and Rust now addresses this by providing the desired guarantees of managed memory without the overhead of GC.

A modern terminal emulator is not going to involve complex reference graphs where objects may cyclically reference one another with no clearly defined "owner", which is the one key scenario where GC is an actual necessity even in a low-level systems language. What do they even need GC for? Rather, they should tweak the high-level design of their program to ensure that object lifetimes are properly accounted for without that costly runtime support.
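
As a hypothetical sketch (not any particular emulator's design), the data model can be a strict ownership tree, which needs nothing beyond scoped destruction:

    #include <cstdint>
    #include <vector>

    // Sketch: every object has exactly one owner; no cycles, no shared references.
    struct Cell { char32_t ch; std::uint16_t style; };
    struct Row  { std::vector<Cell> cells; };      // a row owns its cells
    struct Grid { std::vector<Row> rows; };        // the grid owns its rows
    struct Terminal {
        Grid screen;                               // the terminal owns the grid
        std::vector<Row> scrollback;               // and the scrollback buffer
    };  // everything is released bottom-up when Terminal goes out of scope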


> How so? Garbage collection has inherent performance overhead wrt. manual memory management, and Rust now addresses this by providing the desired guarantees of managed memory without the overhead of GC.

I somewhat disagree, specifically on the implicit claim that all GC has overhead and alternatives do not. Rust does a decent job of giving you some ergonomics to get started, but it is still quite unergonomic to fix once you have multiple different allocation problems to deal with. Zig flips that a bit on its head: it's more painful to get started, but the pain level stays more consistent as the problems get deeper. Ideally, though, I want a better blend of both. To give a still-not-super-concrete version of what I mean: I want something that can be set up by the systems-oriented developer, say, near the top of a request path, and then becomes an implicit dependency for most downstream code, with low ceremony and progressive understanding for contributors way down the call chain who in most cases don't need to care - while still providing an easy escape hatch when it matters.
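
C++'s std::pmr machinery gestures at roughly that shape, so here's a rough sketch of the idea (illustrative only, not the design I actually want): the arena is set up once at the top of a request, and downstream code just uses whatever resource it's handed.

    #include <memory_resource>
    #include <string>
    #include <vector>

    // Downstream code takes a memory_resource and doesn't care whether it is
    // an arena, a pool, or the plain heap.
    std::pmr::string render_response(std::pmr::memory_resource* mem) {
        std::pmr::vector<std::pmr::string> parts{mem};
        parts.emplace_back("hello, ");
        parts.emplace_back("world");
        std::pmr::string out{mem};
        for (const auto& p : parts) out += p;
        return out;
    }

    void handle_request() {
        char buf[64 * 1024];
        // One arena per request; dropped wholesale when the request ends.
        std::pmr::monotonic_buffer_resource arena{buf, sizeof(buf)};
        auto body = render_response(&arena);
        // ... write body; the arena releases everything at end of scope.
    }

The escape hatch is that any function that does care can still take or swap the memory resource explicitly; everything else just inherits it.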

I think people make far too much of a distinction between a GC and an allocator; the reality is that all allocators in common use in high-level OS environments are a form of GC. That's of course not what they're talking about, but it's also a critical distinction.

The main difference between what people _call a GC_ and those allocators is that a typical "GC" pauses the program "badly" at malloc time, while a typical allocator pauses the program "badly" at free time (more often than not). It's a bit of a common oddity, really: both "GC's" and "allocators" could do things "the other way around" as a common code path. Both models otherwise pool memory and, in higher-performance tunings, have to over-allocate. There are lots of commonly used "faster" allocators today that bypass their own duties at smarter allocation by simply using mmap pools, but those scale poorly: mmap stalls can be pretty unpredictable and have cross-thread side effects that are often undesirable too.

The second difference which I think is more commonly internalized is that typically "the GC" is wired into the runtime in various ways, such as into the scheduler (Go, most dynlangs, etc), and has significant implications at the FFI boundary.

It would be possible to be more explicit about a general-purpose allocator that has more GC-like semantics but also provides the system-level malloc/free style API, as well as a language-assisted, more automated API with clever semantics or additional integrations. I guess Fil-C has one such system (I've not studied their implementation). I'm not aware of any inherent constraints dictating that there are only two kinds of APIs: fully implicit and intertwined logarithmic GCs, or general-purpose allocators which do most of their smart work in free.

My point is I don't really like the GC vs. not-GC arguments very much - I think it's one of the many over-generalizations we have as an industry that people rally hard around, and it has been implicitly limiting how far we try to reach for new designs at this boundary. I do stand by a lot of the reasoning that the fully implicitly integrated GC's (Java, Go, various dynlangs) are generally far too opaque for scalable (either very big or very small) systems work, and they're unpleasant to deal with once you're forced to. At the same time, for that same scalable work, you still don't get to ignore the GC you are actually using in the allocator you're using. You don't get to ignore issues like the fact that restarting a program with a 200+GB heap has huge page allocation costs, no matter what middleware set that up. Similarly, you don't want a logarithmic allocation strategy on most embedded or otherwise resource-constrained systems; those designs are only OK for servers, and they're bad for batteries and other parts of total system financial cost in many deployments.

I'd like to see more work explicitly blending these lines; logarithmically allocating GC's scale poorly in many of the same ways as more naive mmap-based allocators. There are practical issues you run into with over-allocation, and the solution is to do something more complex than the classical literature describes. I'd like to see more of this work implemented as standalone modules rather than almost always being implicitly baked into the language/runtime. It's an area where we implicitly couple things too much, and again, good on Zig for pushing the boundary on a few of these in the standard language and library model it has (and seemingly now also taking the same approach for IO scheduling - that's great).


> I somewhat disagree, specifically on the implicit claim that all GC has overhead and alternatives do not.

Not a claim I made. Obviously there are memory management styles (such as stack allocation, pure static memory or pluggable "arenas"/local allocators) that are even lower overhead than a generic heap allocator, and the Rust project does its best to try and support these styles wherever they might be relevant, especially in deep embedded code.

In principle it ought to be also possible to make GC's themselves a "pluggable" feature (the design space is so huge and complex that picking a one-size-fits-all implementation and making it part of the language itself is just not very sensible) to be used only when absolutely required - a bit like allocators in Zig - but this does require some careful design work because the complete systems-level interface to a full tracing GC (including requirements wrt. any invariants that might be involved in correct tracing, read-write barriers, pauses, concurrency etc. etc.) is vastly more complex than one to a simple allocator.
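
To put the asymmetry in code: a pluggable allocator interface is essentially two functions, whereas even a skeletal pluggable tracing-GC interface (purely hypothetical names below) already has to talk about roots, per-object tracing, write barriers and safepoints.

    #include <cstddef>

    // A pluggable allocator is basically this:
    struct Allocator {
        virtual void* allocate(std::size_t size, std::size_t align) = 0;
        virtual void  deallocate(void* p, std::size_t size, std::size_t align) = 0;
        virtual ~Allocator() = default;
    };

    // A pluggable tracing GC (hypothetical sketch) also needs the mutator's help:
    struct Tracer  { virtual void visit(void** slot) = 0; virtual ~Tracer() = default; };
    struct GcHooks {
        virtual void enumerate_roots(Tracer&) = 0;           // stacks, globals, handles
        virtual void trace_object(void* obj, Tracer&) = 0;   // per-type field layout
        virtual void write_barrier(void* obj, void** slot, void* newval) = 0;
        virtual void safepoint() = 0;                        // where pauses may happen
        virtual ~GcHooks() = default;
    };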


I agree precise tracing GC's are a huge mess of integration, so making them pluggable is likely a pain. I think I'm fairly down on the notion of tracing GC's, though, and I'd rather have other lifetime-tracking mechanisms.

> I somewhat disagree, specifically on the implicit claim that all GC has overhead and alternatives do not.

All GCs have overhead, but for specific allocation patterns, arena allocators have as little overhead as possible and are orders of magnitude faster because of it. Feel free to tell me why this statement is wrong.
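
The core of that claim is easy to see in a toy bump arena (a sketch, not production code): allocation degenerates to an aligned pointer bump, and "free" is resetting a single counter for the whole batch, so there is nothing for a collector or a free-list to do per object.

    #include <cstddef>
    #include <cstdint>

    // Toy bump arena: per-allocation cost is an add and a compare.
    struct Arena {
        std::uint8_t* base;
        std::size_t   cap;
        std::size_t   used = 0;

        void* alloc(std::size_t n, std::size_t align) {     // align: power of two
            std::size_t p = (used + align - 1) & ~(align - 1);  // align up
            if (p + n > cap) return nullptr;                    // out of space
            used = p + n;
            return base + p;
        }
        void reset() { used = 0; }   // frees everything at once
    };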


Go ahead, invent a GC that doesn’t require at least 2-4x the program’s working set of memory, and that doesn’t drizzle the code with little branches and memory barriers.

You will be very rich.


Can you give some examples of " ... not excruciatingly painful" and why you think they're inherent to RAII?


