People have complained about the build time of proc macros for ages in the commu...

mmastrac · on Aug 19, 2023

Yeah, there are already binaries in the crates.io ecosystem, and I'm certain that almost none of these people have audited a `build.rs` file or a proc macro implementation which effectively runs as you, completely unsandboxed.

EDIT: I was wrong, this is not actually `watt` -- it may have been re-using code from the project.

This one of those pile-ons where everyone gets excited about having a cause-du-jour to feel passionate about, while simultaneously ignoring issues that are far more pressing.

Arnavion · on Aug 19, 2023

You keep saying this but I suggest you actually look at the code. The precompiled binary is not a sandboxed WASM binary. Despite the name "watt" it has nothing to do with https://github.com/dtolnay/watt . `watt::bytecode` refers to the serialization protocol used by the proc macro shim and the precompiled binary to transfer the token stream over stdio, not anything related to WASM.

Also it's worth noting that even if it was a sandboxed binary ala https://github.com/dtolnay/watt , it's not obvious that distributions or users would be satisfied with that. For example Zig had this discussion with the own WASM blob compiler that they use as part of bootstrapping. https://news.ycombinator.com/item?id=33915321 . As I suggested there, distributions might be okay with building their own golden blobs that they maintain themselves instead of using upstream's, and that could even work in this Rust case for distributions that only care about a single copy of serde for compiling everything. But it's hard for the average user doing `cargo build` for their own projects with cargo registry in `~/.cargo` to do the same replacement.

amluto · on Aug 19, 2023

A really nice (IMO) solution would be to build a wasm blob reproducibly and to ship the blob’s hash, along with a way to download the blob, as part of a release. Then distros could build the blob, confirm that the hash matches, and ship a package that is built from source and nonetheless bit-for-bit identical to the upstream binary.

mmastrac · on Aug 19, 2023

You're right. I misread that code and edited the parent post.

monocasa · on Aug 19, 2023

> almost none of these people have audited a `build.rs` file or a proc macro implementation which effectively runs as you, completely unsandboxed

The biggest orgs tend to run all of their builds sandboxed, including what happens with build.rs. It's part of how they enforce dependency management day to day, but also helps protect against supply chain attacks.

So not everyone does, but enough people do that you can rely on their complaints for the more well trodden parts of the ecosystem.

mmastrac · on Aug 19, 2023

> The biggest orgs tend to run all of their builds sandboxed, including what happens with build.rs. It's part of how they enforce dependency management day to day, but also helps protect against supply chain attacks.

Sandboxing doesn't completely prevent supply-chain attacks. You can avoid getting persistent malware on your CI machines, sure. Exfiltration of tokens and secrets during build? Maybe in a small handful of CI setups where admins have carefully split the source fetch, or limited CI network access to Nexus. Exfiltration of tokens and secrets and/or backdooring after the software has been deployed to production? No, build-time sandboxing doesn't help here at all.

On all developer machines as well? No. Very few big orgs do this and only for mission-critical stuff. Some very important ones have docker-based sandboxed workflows or SSH-to-sandboxed-cluster workflows, or air-gapped laptops but that's very, very rare (I worked in air-gapped environment for a bit and it was a massive pain).

monocasa · on Aug 19, 2023

> Sandboxing doesn't completely prevent supply-chain attacks.

Correct, it's more a defense in depth technique, not a complete defense.

> On all developer machines as well? No. Very few big orgs do this and only for mission-critical stuff.

All builds at Google for instance use the model I laid out including 'developer builds'.

mmastrac · on Aug 19, 2023

Oh wow. I'd be very interested in hearing how they sandbox rust-analyzer. I found a discussion of supporting the analyzer itself by generating config files [1][2], but not how you can sandbox it.

That would be extremely useful as the analyzer is a pretty juicy target and also runs proc-macros/build.rs scripts.

[1] https://github.com/bazelbuild/rules_rust/pull/384

[2] https://bazelbuild.github.io/rules_rust/rust_analyzer.html

mike-cardwell · on Aug 20, 2023

Sandboxing rust-analyzer is fairly easy. Here's how I did it:

https://www.grepular.com/Sandbox_Rust_Development_with_Rust_...

Well, that's a bit out of date now as I use podman, to get around the sudo issues. But that's the basic idea.

fnordpiglet · on Aug 19, 2023

Yeah everywhere I’ve been at back to 1995 does this for high security environments, and if we detected an issue we had security response teams that worked with maintainers and others to remediate.

For lower environments we generally used 3 month or more embargoes on precompiled stuff we couldn’t easily compile ourselves to mitigate some of the supply chain issues if we weren’t directly managing the chain of trust for the binary.

mvolfik · on Aug 19, 2023

I'm quite ambivalent on this issue overall, but let me just point out:

build.rs absolutely is a glaring security hole in the sense you say, but compared to that, this is much worse. You can verify the build.rs code that you download (at least in theory, and some people in banks or distro packages probably actually do), but binaries are orders of magnitude more difficult to inspect, and with the current Rust build system pretty much irreproducible.

Ygg2 · on Aug 19, 2023

> build.rs absolutely is a glaring security hole in the sense you say, but compared to that, this is much worse. You can verify the build.rs code that you download

In theory you can compile your own blob, but you'll need musl and whatnot to make a universal Linux build. Code for making the blob is there in the repo.

build.rs is at best equal. It can access your locally available DB, and transmit your data.

WirelessGigabit · on Aug 19, 2023

The problem here is that nothing in that build is pinned. It builds with nightly, but doesn't define which version / date.

We also don't know how it's build. Ideally there is a Docker container out there that does just an import of source code and then builds. No apk install or apt install (you'd do that in a base published layer). Referenced with an SHA256.

We then use this Docker container to pull in the source code AND its dependencies based on a Cargo.lock. Which... isn't there. So we don't know the exact dependencies that went in.

(Even if there were a Cargo.lock, we need to make sure we actually respect it. I believe cargo install by default ignores the lock file and tries to get the latest version that matches).

darthdeus · on Aug 19, 2023

build.rs is a source file that you can audit. A binary that has no reproducible build is not auditable even if anyone wanted to.

A single person does not audit all of their dependency tree, but many people do read the source code of some if not many of their dependencies, and as a community we can figure out when something is fishy, like in this case.

But when there are binaries involved, nobody can do anything.

This isn't the same as installing a signed binary from a linux package manager that has a checksum and a verified build system. It's a random binary blob someone made in a way that nobody else can check, and it's just "trust me bro there's nothing bad in it".

nullc · on Aug 20, 2023

> and it's just "trust me bro there's nothing bad in it".

The developer should be very concerned about what happens if his system(s) are compromised and the attacker slips a backdoor into these binaries-- it will be difficult to impossible to convince people that the developer himself didn't do it intentionally. Their opacity and immediacy make them much more interesting targets for attack than the source itself (and its associated build scripts).

Saving a few seconds on the first compile on the some other developers computer hardly seems worth that risk.

And at the meta level, we should probably worry about the security practices of someone who isn't worrying about that risk-- what else aren't they worrying about?

amluto · on Aug 19, 2023

At least Linux, OpenBSD, and (with more annoyance) Windows make it relatively straightforward to run things like build.rs in a sandbox. I wonder why Cargo doesn’t do this.

There was also a project (that dtolnay was involved in, I believe!) a few years ago to compile proc macros to wasm.

mmastrac · on Aug 19, 2023

EDIT: I'm wrong, it's not watt -- just shared the same package name.

amluto · on Aug 19, 2023

This looks like a better link:

https://github.com/dtolnay/watt

This would be a lot more compelling if it were integrated into rustc and cargo.

mmastrac · on Aug 19, 2023

I suspect this move is partly to light a fire and force the Rust project to work on this sooner rather than later. We've needed this for years.

MarkMarine · on Aug 19, 2023

It’s probably true that most people commenting don’t audit the builds of transitive deps, but the original issue was a distro that couldn’t distribute precompiled binaries, I’m going to guess this has something to do with their license.

I think having an exit path for those that want to compile from source is important, and I can’t understand the reluctance to provide that.

__jem · on Aug 19, 2023

Well, there is an exit path for those who want to compile from source. If you mean build from source for Cargo users, I believe there's issues with how feature flags interact with transitive dependencies that make this difficult. At least, there's comments on the issue that speak to this. Maybe someone more familiar with Cargo can chime in.