
>Wasm has verification specification. This verified subset makes security exploits seen in those older technologies outright impossible

Both Java and .NET verify their bytecode.

>Wasm bytecode is trivial (as it gets) to turn into machine code

JVM and .NET bytecodes aren't supercomplicated either.

Probably the only real differences are: 1) WASM was designed to be more modular and slimmer from the start, while Java and .NET were designed to be fat (there are modularization efforts now, but it's too late); 2) WASM has been an open standard from the start, so browser vendors implement it without plugins.

Other than that, it feels like WASM is a reinvention of what already existed before.



AFAIK the big new thing in WASM is that it enforces 'structured control flow' - so it's a bit more like a high level AST than an assembly-style virtual ISA. Not sure how much of that matters in practice, but AFAIK that was the one important feature that enabled the proper validation of WASM bytecode.


I don't think there's any significant advance in the bytecode beyond e.g. JVM bytecode.

The difference is in the surface area of the standard library -- Java applets exposed a lot of stuff that turned out to have a lot of security holes, and it was basically impossible to guarantee there weren't further holes. In WASM, the linear memory and very simple OS interface makes the sandboxing much more tractable.


I worked on JVM bytecode for a significant number of years before working on Wasm. JVM bytecode verification is non-trivial, not only to specify, but to implement efficiently. In Java 6 the class file format introduced stack maps to tame a worst-case O(n^3) bytecode verification overhead, which had become a DoS attack vector. Structured control flow makes Wasm validation effectively linear and vastly simpler to understand and vet. Wasm cleaned up a number of JVM bytecode issues, such as massive redundancy between class files (duplicate constant pool entries), length limitations (Wasm uses LEBs everywhere), typing of locals, more arithmetic instructions, with signedness and floating point that closer matches hardware, addition of SIMD, explicit tail calls, and now first-class functions and a lower-level object model.
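
For anyone unfamiliar with the "LEBs" mentioned above: LEB128 is a variable-length integer encoding that stores 7 payload bits per byte and uses the top bit as a "more bytes follow" flag, so small values take one byte and larger ones grow as needed. A minimal Python sketch of the unsigned variant (the function names are made up for illustration and not taken from any Wasm implementation):

```python
def encode_uleb128(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    if value < 0:
        raise ValueError("unsigned LEB128 needs a non-negative value")
    out = bytearray()
    while True:
        byte = value & 0x7F          # take the low 7 bits
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit set: more bytes follow
        else:
            out.append(byte)         # last byte: continuation bit clear
            return bytes(out)


def decode_uleb128(data: bytes) -> int:
    """Decode an unsigned LEB128 value from the start of data."""
    result = 0
    shift = 0
    for byte in data:
        result |= (byte & 0x7F) << shift
        if not (byte & 0x80):
            return result
        shift += 7
    raise ValueError("truncated LEB128 input")


print(encode_uleb128(127).hex())     # 7f
print(encode_uleb128(128).hex())     # 8001
print(encode_uleb128(624485).hex())  # e58e26
```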


Are they validating code to the same degree though? Like, there are obviously learned lessons in how WASM is designed, but at the same time JVM byte code being at a slightly higher level of abstraction can outright make certain incorrect code impossible to express, so it may not be an apples-to-apples comparison.

What I’m thinking of is simply memory corruption issues from the linear memory model, and while these can only corrupt the given process, not anything outside, it is still not something the JVM allows.


Wasm bytecode verification is more strict than JVM bytecode verification. For example, JVM locals don't have declared types; they are inferred by the abstract interpretation algorithm (one of the reasons for the aforementioned O(n^3) worst case). In Wasm bytecode, all locals have declared types.
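
To illustrate why declared local types help, here is a toy Python sketch of a one-pass type check over straight-line stack code. It is not the real Wasm validation algorithm (no blocks, branches, or calls), and the instruction tuples and the `validate` function are invented for the example. The point is only that with declared types each instruction can be checked once against a type stack, with nothing to infer.

```python
def validate(local_types, code, result_types):
    """Check a straight-line instruction list in a single pass.

    local_types:  declared type of each local, e.g. ["i32", "f64"]
    code:         list of tuples such as ("local.get", 0) or ("i32.add",)
    result_types: types expected on the stack at the end
    """
    stack = []

    def pop(expected):
        if not stack or stack[-1] != expected:
            found = stack[-1] if stack else "empty stack"
            raise TypeError(f"expected {expected}, found {found}")
        stack.pop()

    for op, *args in code:
        if op == "i32.const":
            stack.append("i32")
        elif op == "f64.const":
            stack.append("f64")
        elif op == "local.get":
            stack.append(local_types[args[0]])  # type is declared, not inferred
        elif op == "local.set":
            pop(local_types[args[0]])
        elif op == "i32.add":
            pop("i32")
            pop("i32")
            stack.append("i32")
        elif op == "f64.add":
            pop("f64")
            pop("f64")
            stack.append("f64")
        else:
            raise ValueError(f"unknown instruction {op}")

    if stack != list(result_types):
        raise TypeError(f"expected results {list(result_types)}, found {stack}")
    return True


# local 0 is declared i32: increment it by one.
ok = [("local.get", 0), ("i32.const", 1), ("i32.add",), ("local.set", 0)]
print(validate(["i32"], ok, []))  # True
```

Storing an f64 into the i32 local, e.g. `[("f64.const", 1.0), ("local.set", 0)]`, is rejected with a `TypeError` as soon as the `local.set` is reached.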

Wasm GC also introduces non-null reference types, and the validation algorithm guarantees that locals of declared non-null type cannot be used before being initialized. That's also done as part of the single-pass verification.

Wasm GC has a lower-level object model and type system than the JVM (basically structs, arrays, and first-class functions, to which object models are lowered), so it's possible that a higher-level type system, when lowered to Wasm GC, may not be enforceable at the bytecode level. So you could, e.g. screw up the virtual dispatch sequence of a Java method call and end up with a Wasm runtime type error.


Thx for this perspective and info. Regarding "signedness and floating point that closer matches hardware", I'm not seeing unsigned integers. Are they supported? I see only:

> Two’s complement signed integers in 32 bits and optionally 64 bits.

https://webassembly.org/docs/portability/#assumptions-for-ef...

And nothing suggesting unsigned ints here:

https://webassembly.org/features/


Signed and unsigned are just different views on the same bits. CPU registers don't carry signedness either, after all: the value they carry is neither signed nor unsigned until you look at the bits and decide to "view" them as a signed or unsigned number.

With the two's complement convention, the concept of 'signedness' only matters when a narrow integer value needs to be extended to a wider value (e.g. 8-bit to 16-bit), specifically whether the new bits need to be replicated from the narrow value's topmost bit (for signed extension) or set to zero (for unsigned extension).
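
A small Python sketch of that 8-bit to 16-bit case, with made-up helper names. Python integers are unbounded, so the fixed widths are modeled with explicit masks:

```python
def zero_extend_8_to_16(bits: int) -> int:
    """Unsigned extension: the new upper 8 bits are all zero."""
    return bits & 0xFF


def sign_extend_8_to_16(bits: int) -> int:
    """Signed extension: the new upper 8 bits copy bit 7 of the input."""
    bits &= 0xFF
    return bits | 0xFF00 if bits & 0x80 else bits


# Same 8 input bits, two different 16-bit results.
print(hex(zero_extend_8_to_16(0xF0)))  # 0xf0   (240 when read as unsigned)
print(hex(sign_extend_8_to_16(0xF0)))  # 0xfff0 (-16 when read as signed 16-bit)

# When the top bit is clear, both extensions agree.
print(hex(zero_extend_8_to_16(0x7F)))  # 0x7f
print(hex(sign_extend_8_to_16(0x7F)))  # 0x7f
```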

It would be interesting to speculate what a high-level language would look like with such sign-agnostic "Schroedinger's integer types".


CPU instruction sets do account for signed vs unsigned integers. SHR vs SAR for example. It's part of the ISAs. I'm calling this out as AFAIK, the JVM has no support for unsigned ints and so that in turn makes WASM a little more compelling.

https://en.wikibooks.org/wiki/X86_Assembly/Shift_and_Rotate


Yes some instructions do - but surprisingly few (for instance there's signed/unsigned mul/div instructions, but add/sub are 'sign-agnostic'). The important part is that any 'signedness' is associated with the operation, and not with the operands or results.
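
A rough Python model of this on 32-bit values, assuming two's complement and using made-up helper names: addition produces the same bits whichever way you read the operands, while a right shift comes in a logical (SHR-like) and an arithmetic (SAR-like) flavor.

```python
MASK32 = 0xFFFFFFFF


def as_signed32(bits: int) -> int:
    """Read a 32-bit pattern as a two's complement signed value."""
    bits &= MASK32
    return bits - (1 << 32) if bits & 0x80000000 else bits


def add32(a: int, b: int) -> int:
    """Wrapping 32-bit add: one operation serves both views."""
    return (a + b) & MASK32


def shr_u32(a: int, n: int) -> int:
    """Logical shift right: fills with zeros."""
    return (a & MASK32) >> (n & 31)


def shr_s32(a: int, n: int) -> int:
    """Arithmetic shift right: fills with copies of the sign bit."""
    return (as_signed32(a) >> (n & 31)) & MASK32


# 0xFFFFFFFE is -2 as signed and 4294967294 as unsigned; adding 3 gives 1 either way.
print(hex(add32(0xFFFFFFFE, 3)))      # 0x1

# Shifting the same bits right by one depends on the chosen operation.
print(hex(shr_u32(0x80000000, 1)))    # 0x40000000
print(hex(shr_s32(0x80000000, 1)))    # 0xc0000000
```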


Well, it has compiler intrinsics for unsigned numbers, for what it’s worth.


Wasm makes no distinction between signed and unsigned integers as variables, only calling them integers. The relevant operations are split between signed and unsigned.

https://webassembly.github.io/spec/core/appendix/index-instr...

See how there's only i32.load and i32.eq, but there's i32.lt_u and i32.lt_s. Loading bits from memory or comparing them for equality is the same operation bit for bit for each of signed and unsigned. However, less than requires knowing the desired signedness, and is split between signed and unsigned.
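
As a sketch of that split, here are Python stand-ins named after the Wasm instructions. They only model the comparison on 32-bit patterns and return Python booleans rather than an i32 0 or 1:

```python
MASK32 = 0xFFFFFFFF


def as_signed32(bits: int) -> int:
    """Read a 32-bit pattern as a two's complement signed value."""
    bits &= MASK32
    return bits - (1 << 32) if bits & 0x80000000 else bits


def i32_eq(a: int, b: int) -> bool:
    """Equality only compares bits, so one version is enough."""
    return (a & MASK32) == (b & MASK32)


def i32_lt_u(a: int, b: int) -> bool:
    """Less-than with both operands viewed as unsigned."""
    return (a & MASK32) < (b & MASK32)


def i32_lt_s(a: int, b: int) -> bool:
    """Less-than with both operands viewed as signed."""
    return as_signed32(a) < as_signed32(b)


x, y = 0xFFFFFFFF, 0x00000001  # x is -1 as signed, 4294967295 as unsigned
print(i32_eq(x, x))    # True
print(i32_lt_u(x, y))  # False
print(i32_lt_s(x, y))  # True
```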


I stand corrected! That’s great information, thanks. I didn’t know JVM bytecode had so many problems.



