mirror of
https://github.com/superseriousbusiness/gotosocial.git
synced 2024-12-22 18:22:11 +00:00
1e7b32490d
This allows for building GoToSocial with [SQLite transpiled to WASM](https://github.com/ncruces/go-sqlite3) and accessed through [Wazero](https://wazero.io/).
1588 lines
84 KiB
Markdown
1588 lines
84 KiB
Markdown
# Notable rationale of wazero
|
||
|
||
## Zero dependencies
|
||
|
||
Wazero has zero dependencies to differentiate itself from other runtimes which
|
||
have heavy impact usually due to CGO. By avoiding CGO, wazero avoids
|
||
prerequisites such as shared libraries or libc, and lets users keep features
|
||
like cross compilation.
|
||
|
||
Avoiding go.mod dependencies reduces interference on Go version support, and
|
||
size of a statically compiled binary. However, doing so brings some
|
||
responsibility into the project.
|
||
|
||
Go's native platform support is good: We don't need platform-specific code to
|
||
get monotonic time, nor do we need much work to implement certain features
|
||
needed by our compiler such as `mmap`. That said, Go does not support all
|
||
common operating systems to the same degree. For example, Go 1.18 includes
|
||
`Mprotect` on Linux and Darwin, but not FreeBSD.
|
||
|
||
The general tradeoff the project takes from a zero dependency policy is more
|
||
explicit support of platforms (in the compiler runtime), as well a larger and
|
||
more technically difficult codebase.
|
||
|
||
At some point, we may allow extensions to supply their own platform-specific
|
||
hooks. Until then, one end user impact/tradeoff is some glitches trying
|
||
untested platforms (with the Compiler runtime).
|
||
|
||
### Why do we use CGO to implement system calls on darwin?
|
||
|
||
wazero is dependency and CGO free by design. In some cases, we have code that
|
||
can optionally use CGO, but retain a fallback for when that's disabled. The only
|
||
operating system (`GOOS`) we use CGO by default in is `darwin`.
|
||
|
||
Unlike other operating systems, regardless of `CGO_ENABLED`, Go always uses
|
||
"CGO" mechanisms in the runtime layer of `darwin`. This is explained in
|
||
[Statically linked binaries on Mac OS X](https://developer.apple.com/library/archive/qa/qa1118/_index.html#//apple_ref/doc/uid/DTS10001666):
|
||
|
||
> Apple does not support statically linked binaries on Mac OS X. A statically
|
||
> linked binary assumes binary compatibility at the kernel system call
|
||
> interface, and we do not make any guarantees on that front. Rather, we strive
|
||
> to ensure binary compatibility in each dynamically linked system library and
|
||
> framework.
|
||
|
||
This plays to our advantage for system calls that aren't yet exposed in the Go
|
||
standard library, notably `futimens` for nanosecond-precision timestamp
|
||
manipulation.
|
||
|
||
### Why not x/sys
|
||
|
||
Going beyond Go's SDK limitations can be accomplished with their [x/sys library](https://pkg.go.dev/golang.org/x/sys/unix).
|
||
For example, this includes `zsyscall_freebsd_amd64.go` missing from the Go SDK.
|
||
|
||
However, like all dependencies, x/sys is a source of conflict. For example,
|
||
x/sys had to be in order to upgrade to Go 1.18.
|
||
|
||
If we depended on x/sys, we could get more precise functionality needed for
|
||
features such as clocks or more platform support for the compiler runtime.
|
||
|
||
That said, formally supporting an operating system may still require testing as
|
||
even use of x/sys can require platform-specifics. For example, [mmap-go](https://github.com/edsrzf/mmap-go)
|
||
uses x/sys, but also mentions limitations, some not surmountable with x/sys
|
||
alone.
|
||
|
||
Regardless, we may at some point introduce a separate go.mod for users to use
|
||
x/sys as a platform plugin without forcing all users to maintain that
|
||
dependency.
|
||
|
||
## Project structure
|
||
|
||
wazero uses internal packages extensively to balance API compatibility desires for end users with the need to safely
|
||
share internals between compilers.
|
||
|
||
End-user packages include `wazero`, with `Config` structs, `api`, with shared types, and the built-in `wasi` library.
|
||
Everything else is internal.
|
||
|
||
We put the main program for wazero into a directory of the same name to match conventions used in `go install`,
|
||
notably the name of the folder becomes the binary name. We chose to use `cmd/wazero` as it is common practice
|
||
and less surprising than `wazero/wazero`.
|
||
|
||
### Internal packages
|
||
|
||
Most code in wazero is internal, and it is acknowledged that this prevents external implementation of facets such as
|
||
compilers or decoding. It also prevents splitting this code into separate repositories, resulting in a larger monorepo.
|
||
This also adds work as more code needs to be centrally reviewed.
|
||
|
||
However, the alternative is neither secure nor viable. To allow external implementation would require exporting symbols
|
||
public, such as the `CodeSection`, which can easily create bugs. Moreover, there's a high drift risk for any attempt at
|
||
external implementations, compounded not just by wazero's code organization, but also the fast moving Wasm and WASI
|
||
specifications.
|
||
|
||
For example, implementing a compiler correctly requires expertise in Wasm, Golang and assembly. This requires deep
|
||
insight into how internals are meant to be structured and the various tiers of testing required for `wazero` to result
|
||
in a high quality experience. Even if someone had these skills, supporting external code would introduce variables which
|
||
are constants in the central one. Supporting an external codebase is harder on the project team, and could starve time
|
||
from the already large burden on the central codebase.
|
||
|
||
The tradeoffs of internal packages are a larger codebase and responsibility to implement all standard features. It also
|
||
implies thinking about extension more as forking is not viable for reasons above also. The primary mitigation of these
|
||
realities are friendly OSS licensing, high rigor and a collaborative spirit which aim to make contribution in the shared
|
||
codebase productive.
|
||
|
||
### Avoiding cyclic dependencies
|
||
|
||
wazero shares constants and interfaces with internal code by a sharing pattern described below:
|
||
* shared interfaces and constants go in one package under root: `api`.
|
||
* user APIs and structs depend on `api` and go into the root package `wazero`.
|
||
* e.g. `InstantiateModule` -> `/wasm.go` depends on the type `api.Module`.
|
||
* implementation code can also depend on `api` in a corresponding package under `/internal`.
|
||
* Ex package `wasm` -> `/internal/wasm/*.go` and can depend on the type `api.Module`.
|
||
|
||
The above guarantees no cyclic dependencies at the cost of having to re-define symbols that exist in both packages.
|
||
For example, if `wasm.Store` is a type the user needs access to, it is narrowed by a cover type in the `wazero`:
|
||
|
||
```go
|
||
type runtime struct {
|
||
s *wasm.Store
|
||
}
|
||
```
|
||
|
||
This is not as bad as it sounds as mutations are only available via configuration. This means exported functions are
|
||
limited to only a few functions.
|
||
|
||
### Avoiding security bugs
|
||
|
||
In order to avoid security flaws such as code insertion, nothing in the public API is permitted to write directly to any
|
||
mutable symbol in the internal package. For example, the package `api` is shared with internal code. To ensure
|
||
immutability, the `api` package cannot contain any mutable public symbol, such as a slice or a struct with an exported
|
||
field.
|
||
|
||
In practice, this means shared functionality like memory mutation need to be implemented by interfaces.
|
||
|
||
Here are some examples:
|
||
* `api.Memory` protects access by exposing functions like `WriteFloat64Le` instead of exporting a buffer (`[]byte`).
|
||
* There is no exported symbol for the `[]byte` representing the `CodeSection`
|
||
|
||
Besides security, this practice prevents other bugs and allows centralization of validation logic such as decoding Wasm.
|
||
|
||
## API Design
|
||
|
||
### Why is `context.Context` inconsistent?
|
||
|
||
It may seem strange that only certain API have an initial `context.Context`
|
||
parameter. We originally had a `context.Context` for anything that might be
|
||
traced, but it turned out to be only useful for lifecycle and host functions.
|
||
|
||
For instruction-scoped aspects like memory updates, a context parameter is too
|
||
fine-grained and also invisible in practice. For example, most users will use
|
||
the compiler engine, and its memory, global or table access will never use go's
|
||
context.
|
||
|
||
### Why does `api.ValueType` map to uint64?
|
||
|
||
WebAssembly allows functions to be defined either by the guest or the host,
|
||
with signatures expressed as WebAssembly types. For example, `i32` is a 32-bit
|
||
type which might be interpreted as signed. Function signatures can have zero or
|
||
more parameters or results even if WebAssembly 1.0 allows up to one result.
|
||
|
||
The guest can export functions, so that the host can call it. In the case of
|
||
wazero, the host is Go and an exported function can be called via
|
||
`api.Function`. `api.Function` allows users to supply parameters and read
|
||
results as a slice of uint64. For example, if there are no results, an empty
|
||
slice is returned. The user can learn the signature via `FunctionDescription`,
|
||
which returns the `api.ValueType` corresponding to each parameter or result.
|
||
`api.ValueType` defines the mapping of WebAssembly types to `uint64` values for
|
||
reason described in this section. The special case of `v128` is also mentioned
|
||
below.
|
||
|
||
wazero maps each value type to a uint64 values because it holds the largest
|
||
type in WebAssembly 1.0 (i64). A slice allows you to express empty (e.g. a
|
||
nullary signature), for example a start function.
|
||
|
||
Here's an example of calling a function, noting this syntax works for both a
|
||
signature `(param i32 i32) (result i32)` and `(param i64 i64) (result i64)`
|
||
```go
|
||
x, y := uint64(1), uint64(2)
|
||
results, err := mod.ExportedFunction("add").Call(ctx, x, y)
|
||
if err != nil {
|
||
log.Panicln(err)
|
||
}
|
||
fmt.Printf("%d + %d = %d\n", x, y, results[0])
|
||
```
|
||
|
||
WebAssembly does not define an encoding strategy for host defined parameters or
|
||
results. This means the encoding rules above are defined by wazero instead. To
|
||
address this, we clarified mapping both in `api.ValueType` and added helper
|
||
functions like `api.EncodeF64`. This allows users conversions typical in Go
|
||
programming, and utilities to avoid ambiguity and edge cases around casting.
|
||
|
||
Alternatively, we could have defined a byte buffer based approach and a binary
|
||
encoding of value types in and out. For example, an empty byte slice would mean
|
||
no values, while a non-empty could use a binary encoding for supported values.
|
||
This could work, but it is more difficult for the normal case of i32 and i64.
|
||
It also shares a struggle with the current approach, which is that value types
|
||
were added after WebAssembly 1.0 and not all of them have an encoding. More on
|
||
this below.
|
||
|
||
In summary, wazero chose an approach for signature mapping because there was
|
||
none, and the one we chose biases towards simplicity with integers and handles
|
||
the rest with documentation and utilities.
|
||
|
||
#### Post 1.0 value types
|
||
|
||
Value types added after WebAssembly 1.0 stressed the current model, as some
|
||
have no encoding or are larger than 64 bits. While problematic, these value
|
||
types are not commonly used in exported (extern) functions. However, some
|
||
decisions were made and detailed below.
|
||
|
||
For example `externref` has no guest representation. wazero chose to map
|
||
references to uint64 as that's the largest value needed to encode a pointer on
|
||
supported platforms. While there are two reference types, `externref` and
|
||
`functype`, the latter is an internal detail of function tables, and the former
|
||
is rarely if ever used in function signatures as of the end of 2022.
|
||
|
||
The only value larger than 64 bits is used for SIMD (`v128`). Vectorizing via
|
||
host functions is not used as of the end of 2022. Even if it were, it would be
|
||
inefficient vs guest vectorization due to host function overhead. In other
|
||
words, the `v128` value type is unlikely to be in an exported function
|
||
signature. That it requires two uint64 values to encode is an internal detail
|
||
and not worth changing the exported function interface `api.Function`, as doing
|
||
so would break all users.
|
||
|
||
### Interfaces, not structs
|
||
|
||
All exported types in public packages, regardless of configuration vs runtime, are interfaces. The primary benefits are
|
||
internal flexibility and avoiding people accidentally mis-initializing by instantiating the types on their own vs using
|
||
the `NewXxx` constructor functions. In other words, there's less support load when things can't be done incorrectly.
|
||
|
||
Here's an example:
|
||
```go
|
||
rt := &RuntimeConfig{} // not initialized properly (fields are nil which shouldn't be)
|
||
rt := RuntimeConfig{} // not initialized properly (should be a pointer)
|
||
rt := wazero.NewRuntimeConfig() // initialized properly
|
||
```
|
||
|
||
There are a few drawbacks to this, notably some work for maintainers.
|
||
* Interfaces are decoupled from the structs implementing them, which means the signature has to be repeated twice.
|
||
* Interfaces have to be documented and guarded at time of use, that 3rd party implementations aren't supported.
|
||
* As of Golang 1.21, interfaces are still [not well supported](https://github.com/golang/go/issues/5860) in godoc.
|
||
|
||
## Config
|
||
|
||
wazero configures scopes such as Runtime and Module using `XxxConfig` types. For example, `RuntimeConfig` configures
|
||
`Runtime` and `ModuleConfig` configure `Module` (instantiation). In all cases, config types begin defaults and can be
|
||
customized by a user, e.g., selecting features or a module name override.
|
||
|
||
### Why don't we make each configuration setting return an error?
|
||
No config types create resources that would need to be closed, nor do they return errors on use. This helps reduce
|
||
resource leaks, and makes chaining easier. It makes it possible to parse configuration (ex by parsing yaml) independent
|
||
of validating it.
|
||
|
||
Instead of:
|
||
```
|
||
cfg, err = cfg.WithFS(fs)
|
||
if err != nil {
|
||
return err
|
||
}
|
||
cfg, err = cfg.WithName(name)
|
||
if err != nil {
|
||
return err
|
||
}
|
||
mod, err = rt.InstantiateModuleWithConfig(ctx, code, cfg)
|
||
if err != nil {
|
||
return err
|
||
}
|
||
```
|
||
|
||
There's only one call site to handle errors:
|
||
```
|
||
cfg = cfg.WithFS(fs).WithName(name)
|
||
mod, err = rt.InstantiateModuleWithConfig(ctx, code, cfg)
|
||
if err != nil {
|
||
return err
|
||
}
|
||
```
|
||
|
||
This allows users one place to look for errors, and also the benefit that if anything internally opens a resource, but
|
||
errs, there's nothing they need to close. In other words, users don't need to track which resources need closing on
|
||
partial error, as that is handled internally by the only code that can read configuration fields.
|
||
|
||
### Why are configuration immutable?
|
||
While it seems certain scopes like `Runtime` won't repeat within a process, they do, possibly in different goroutines.
|
||
For example, some users create a new runtime for each module, and some re-use the same base module configuration with
|
||
only small updates (ex the name) for each instantiation. Making configuration immutable allows them to be safely used in
|
||
any goroutine.
|
||
|
||
Since config are immutable, changes apply via return val, similar to `append` in a slice.
|
||
|
||
For example, both of these are the same sort of error:
|
||
```go
|
||
append(slice, element) // bug as only the return value has the updated slice.
|
||
cfg.WithName(next) // bug as only the return value has the updated name.
|
||
```
|
||
|
||
Here's an example of correct use: re-assigning explicitly or via chaining.
|
||
```go
|
||
cfg = cfg.WithName(name) // explicit
|
||
|
||
mod, err = rt.InstantiateModuleWithConfig(ctx, code, cfg.WithName(name)) // implicit
|
||
if err != nil {
|
||
return err
|
||
}
|
||
```
|
||
|
||
### Why aren't configuration assigned with option types?
|
||
The option pattern is a familiar one in Go. For example, someone defines a type `func (x X) err` and uses it to update
|
||
the target. For example, you could imagine wazero could choose to make `ModuleConfig` from options vs chaining fields.
|
||
|
||
Ex instead of:
|
||
```go
|
||
type ModuleConfig interface {
|
||
WithName(string) ModuleConfig
|
||
WithFS(fs.FS) ModuleConfig
|
||
}
|
||
|
||
struct moduleConfig {
|
||
name string
|
||
fs fs.FS
|
||
}
|
||
|
||
func (c *moduleConfig) WithName(name string) ModuleConfig {
|
||
ret := *c // copy
|
||
ret.name = name
|
||
return &ret
|
||
}
|
||
|
||
func (c *moduleConfig) WithFS(fs fs.FS) ModuleConfig {
|
||
ret := *c // copy
|
||
ret.setFS("/", fs)
|
||
return &ret
|
||
}
|
||
|
||
config := r.NewModuleConfig().WithFS(fs)
|
||
configDerived := config.WithName("name")
|
||
```
|
||
|
||
An option function could be defined, then refactor each config method into an name prefixed option function:
|
||
```go
|
||
type ModuleConfig interface {
|
||
}
|
||
struct moduleConfig {
|
||
name string
|
||
fs fs.FS
|
||
}
|
||
|
||
type ModuleConfigOption func(c *moduleConfig)
|
||
|
||
func ModuleConfigName(name string) ModuleConfigOption {
|
||
return func(c *moduleConfig) {
|
||
c.name = name
|
||
}
|
||
}
|
||
|
||
func ModuleConfigFS(fs fs.FS) ModuleConfigOption {
|
||
return func(c *moduleConfig) {
|
||
c.fs = fs
|
||
}
|
||
}
|
||
|
||
func (r *runtime) NewModuleConfig(opts ...ModuleConfigOption) ModuleConfig {
|
||
ret := newModuleConfig() // defaults
|
||
for _, opt := range opts {
|
||
opt(&ret.config)
|
||
}
|
||
return ret
|
||
}
|
||
|
||
func (c *moduleConfig) WithOptions(opts ...ModuleConfigOption) ModuleConfig {
|
||
ret := *c // copy base config
|
||
for _, opt := range opts {
|
||
opt(&ret.config)
|
||
}
|
||
return ret
|
||
}
|
||
|
||
config := r.NewModuleConfig(ModuleConfigFS(fs))
|
||
configDerived := config.WithOptions(ModuleConfigName("name"))
|
||
```
|
||
|
||
wazero took the path of the former design primarily due to:
|
||
* interfaces provide natural namespaces for their methods, which is more direct than functions with name prefixes.
|
||
* parsing config into function callbacks is more direct vs parsing config into a slice of functions to do the same.
|
||
* in either case derived config is needed and the options pattern is more awkward to achieve that.
|
||
|
||
There are other reasons such as test and debug being simpler without options: the above list is constrained to conserve
|
||
space. It is accepted that the options pattern is common in Go, which is the main reason for documenting this decision.
|
||
|
||
### Why aren't config types deeply structured?
|
||
wazero's configuration types cover the two main scopes of WebAssembly use:
|
||
* `RuntimeConfig`: This is the broadest scope, so applies also to compilation
|
||
and instantiation. e.g. This controls the WebAssembly Specification Version.
|
||
* `ModuleConfig`: This affects modules instantiated after compilation and what
|
||
resources are allowed. e.g. This defines how or if STDOUT is captured. This
|
||
also allows sub-configuration of `FSConfig`.
|
||
|
||
These default to a flat definition each, with lazy sub-configuration only after
|
||
proven to be necessary. A flat structure is easier to work with and is also
|
||
easy to discover. Unlike the option pattern described earlier, more
|
||
configuration in the interface doesn't taint the package namespace, only
|
||
`ModuleConfig`.
|
||
|
||
We default to a flat structure to encourage simplicity. If we eagerly broke out
|
||
all possible configurations into sub-types (e.g. ClockConfig), it would be hard
|
||
to notice configuration sprawl. By keeping the config flat, it is easy to see
|
||
the cognitive load we may be adding to our users.
|
||
|
||
In other words, discomfort adding more configuration is a feature, not a bug.
|
||
We should only add new configuration rarely, and before doing so, ensure it
|
||
will be used. In fact, this is why we support using context fields for
|
||
experimental configuration. By letting users practice, we can find out if a
|
||
configuration was a good idea or not before committing to it, and potentially
|
||
sprawling our types.
|
||
|
||
In reflection, this approach worked well for the nearly 1.5 year period leading
|
||
to version 1.0. We've only had to create a single sub-configuration, `FSConfig`,
|
||
and it was well understood why when it occurred.
|
||
|
||
## Why does `ModuleConfig.WithStartFunctions` default to `_start`?
|
||
|
||
We formerly had functions like `StartWASICommand` that would verify
|
||
preconditions and start WASI's `_start` command. However, this caused confusion
|
||
because both many languages compiled a WASI dependency, and many did so
|
||
inconsistently.
|
||
|
||
The conflict is that exported functions need to use features the language
|
||
runtime provides, such as garbage collection. There's a "chicken-egg problem"
|
||
where `_start` needs to complete in order for exported behavior to work.
|
||
|
||
For example, unlike `GOOS=wasip1` in Go 1.21, TinyGo's "wasi" target supports
|
||
function exports. So, the only way to use FFI style is via the "wasi" target.
|
||
Not explicitly calling `_start` before an ABI such as wapc-go, would crash, due
|
||
to setup not happening (e.g. to implement `panic`). Other embedders such as
|
||
Envoy also called `_start` for the same reason. To avoid a common problem for
|
||
users unaware of WASI, and also to simplify normal use of WASI (e.g. `main`),
|
||
we added `_start` to `ModuleConfig.WithStartFunctions`.
|
||
|
||
In cases of multiple initializers, such as in wapc-go, users can override this
|
||
to add the others *after* `_start`. Users who want to explicitly control
|
||
`_start`, such as some of our unit tests, can clear the start functions and
|
||
remove it.
|
||
|
||
This decision was made in 2022, and holds true in 2023, even with the
|
||
introduction of "wasix". It holds because "wasix" is backwards compatible with
|
||
"wasip1". In the future, there will be other ways to start applications, and
|
||
may not be backwards compatible with "wasip1".
|
||
|
||
Most notably WASI "Preview 2" is not implemented in a way compatible with
|
||
wasip1. Its start function is likely to be different, and defined in the
|
||
wasi-cli "world". When the design settles, and it is implemented by compilers,
|
||
wazero will attempt to support "wasip2". However, it won't do so in a way that
|
||
breaks existing compilers.
|
||
|
||
In other words, we won't remove `_start` if "wasip2" continues a path of an
|
||
alternate function name. If we did, we'd break existing users despite our
|
||
compatibility promise saying we don't. The most likely case is that when we
|
||
build-in something incompatible with "wasip1", that start function will be
|
||
added to the start functions list in addition to `_start`.
|
||
|
||
See http://wasix.org
|
||
See https://github.com/WebAssembly/wasi-cli
|
||
|
||
## Runtime == Engine+Store
|
||
wazero defines a single user-type which combines the specification concept of `Store` with the unspecified `Engine`
|
||
which manages them.
|
||
|
||
### Why not multi-store?
|
||
Multi-store isn't supported as the extra tier complicates lifecycle and locking. Moreover, in practice it is unusual for
|
||
there to be an engine that has multiple stores which have multiple modules. More often, it is the case that there is
|
||
either 1 engine with 1 store and multiple modules, or 1 engine with many stores, each having 1 non-host module. In worst
|
||
case, a user can use multiple runtimes until "multi-store" is better understood.
|
||
|
||
If later, we have demand for multiple stores, that can be accomplished by overload. e.g. `Runtime.InstantiateInStore` or
|
||
`Runtime.Store(name) Store`.
|
||
|
||
## Exit
|
||
|
||
### Why do we only return a `sys.ExitError` on a non-zero exit code?
|
||
|
||
It is reasonable to think an exit error should be returned, even if the code is
|
||
success (zero). Even on success, the module is no longer functional. For
|
||
example, function exports would error later. However, wazero does not. The only
|
||
time `sys.ExitError` is on error (non-zero).
|
||
|
||
This decision was to improve performance and ergonomics for guests that both
|
||
use WASI (have a `_start` function), and also allow custom exports.
|
||
Specifically, Rust, TinyGo and normal wasi-libc, don't exit the module during
|
||
`_start`. If they did, it would invalidate their function exports. This means
|
||
it is unlikely most compilers will change this behavior.
|
||
|
||
`GOOS=waspi1` from Go 1.21 does exit during `_start`. However, it doesn't
|
||
support other exports besides `_start`, and `_start` is not defined to be
|
||
called multiple times anyway.
|
||
|
||
Since `sys.ExitError` is not always returned, we added `Module.IsClosed` for
|
||
defensive checks. This helps integrators avoid calling functions which will
|
||
always fail.
|
||
|
||
### Why panic with `sys.ExitError` after a host function exits?
|
||
|
||
Currently, the only portable way to stop processing code is via panic. For
|
||
example, WebAssembly "trap" instructions, such as divide by zero, are
|
||
implemented via panic. This ensures code isn't executed after it.
|
||
|
||
When code reaches the WASI `proc_exit` instruction, we need to stop processing.
|
||
Regardless of the exit code, any code invoked after exit would be in an
|
||
inconsistent state. This is likely why unreachable instructions are sometimes
|
||
inserted after exit: https://github.com/emscripten-core/emscripten/issues/12322
|
||
|
||
## WASI
|
||
|
||
Unfortunately, (WASI Snapshot Preview 1)[https://github.com/WebAssembly/WASI/blob/snapshot-01/phases/snapshot/docs.md] is not formally defined enough, and has APIs with ambiguous semantics.
|
||
This section describes how Wazero interprets and implements the semantics of several WASI APIs that may be interpreted differently by different wasm runtimes.
|
||
Those APIs may affect the portability of a WASI application.
|
||
|
||
### Why don't we attempt to pass wasi-testsuite on user-defined `fs.FS`?
|
||
|
||
While most cases work fine on an `os.File` based implementation, we won't
|
||
promise wasi-testsuite compatibility on user defined wrappers of `os.DirFS`.
|
||
The only option for real systems is to use our `sysfs.FS`.
|
||
|
||
There are a lot of areas where windows behaves differently, despite the
|
||
`os.File` abstraction. This goes well beyond file locking concerns (e.g.
|
||
`EBUSY` errors on open files). For example, errors like `ACCESS_DENIED` aren't
|
||
properly mapped to `EPERM`. There are trickier parts too. `FileInfo.Sys()`
|
||
doesn't return enough information to build inodes needed for WASI. To rebuild
|
||
them requires the full path to the underlying file, not just its directory
|
||
name, and there's no way for us to get that information. At one point we tried,
|
||
but in practice things became tangled and functionality such as read-only
|
||
wrappers became untenable. Finally, there are version-specific behaviors which
|
||
are difficult to maintain even in our own code. For example, go 1.20 opens
|
||
files in a different way than versions before it.
|
||
|
||
### Why aren't WASI rules enforced?
|
||
|
||
The [snapshot-01](https://github.com/WebAssembly/WASI/blob/snapshot-01/phases/snapshot/docs.md) version of WASI has a
|
||
number of rules for a "command module", but only the memory export rule is enforced. If a "_start" function exists, it
|
||
is enforced to be the correct signature and succeed, but the export itself isn't enforced. It follows that this means
|
||
exports are not required to be contained to a "_start" function invocation. Finally, the "__indirect_function_table"
|
||
export is also not enforced.
|
||
|
||
The reason for the exceptions are that implementations aren't following the rules. For example, TinyGo doesn't export
|
||
"__indirect_function_table", so crashing on this would make wazero unable to run TinyGo modules. Similarly, modules
|
||
loaded by wapc-go don't always define a "_start" function. Since "snapshot-01" is not a proper version, and certainly
|
||
not a W3C recommendation, there's no sense in breaking users over matters like this.
|
||
|
||
### Why is I/O configuration not coupled to WASI?
|
||
|
||
WebAssembly System Interfaces (WASI) is a formalization of a practice that can be done anyway: Define a host function to
|
||
access a system interface, such as writing to STDOUT. WASI stalled at snapshot-01 and as of early 2023, is being
|
||
rewritten entirely.
|
||
|
||
This instability implies a need to transition between WASI specs, which places wazero in a position that requires
|
||
decoupling. For example, if code uses two different functions to call `fd_write`, the underlying configuration must be
|
||
centralized and decoupled. Otherwise, calls using the same file descriptor number will end up writing to different
|
||
places.
|
||
|
||
In short, wazero defined system configuration in `ModuleConfig`, not a WASI type. This allows end-users to switch from
|
||
one spec to another with minimal impact. This has other helpful benefits, as centralized resources are simpler to close
|
||
coherently (ex via `Module.Close`).
|
||
|
||
In reflection, this worked well as more ABI became usable in wazero.
|
||
|
||
### Background on `ModuleConfig` design
|
||
|
||
WebAssembly 1.0 (20191205) specifies some aspects to control isolation between modules ([sandboxing](https://en.wikipedia.org/wiki/Sandbox_(computer_security))).
|
||
For example, `wasm.Memory` has size constraints and each instance of it is isolated from each other. While `wasm.Memory`
|
||
can be shared, by exporting it, it is not exported by default. In fact a WebAssembly Module (Wasm) has no memory by
|
||
default.
|
||
|
||
While memory is defined in WebAssembly 1.0 (20191205), many aspects are not. Let's use an example of `exec.Cmd` as for
|
||
example, a WebAssembly System Interfaces (WASI) command is implemented as a module with a `_start` function, and in many
|
||
ways acts similar to a process with a `main` function.
|
||
|
||
To capture "hello world" written to the console (stdout a.k.a. file descriptor 1) in `exec.Cmd`, you would set the
|
||
`Stdout` field accordingly, perhaps to a buffer. In WebAssembly 1.0 (20191205), the only way to perform something like
|
||
this is via a host function (ex `HostModuleFunctionBuilder`) and internally copy memory corresponding to that string
|
||
to a buffer.
|
||
|
||
WASI implements system interfaces with host functions. Concretely, to write to console, a WASI command `Module` imports
|
||
"fd_write" from "wasi_snapshot_preview1" and calls it with the `fd` parameter set to 1 (STDOUT).
|
||
|
||
The [snapshot-01](https://github.com/WebAssembly/WASI/blob/snapshot-01/phases/snapshot/docs.md) version of WASI has no
|
||
means to declare configuration, although its function definitions imply configuration for example if fd 1 should exist,
|
||
and if so where should it write. Moreover, snapshot-01 was last updated in late 2020 and the specification is being
|
||
completely rewritten as of early 2022. This means WASI as defined by "snapshot-01" will not clarify aspects like which
|
||
file descriptors are required. While it is possible a subsequent version may, it is too early to tell as no version of
|
||
WASI has reached a stage near W3C recommendation. Even if it did, module authors are not required to only use WASI to
|
||
write to console, as they can define their own host functions, such as they did before WASI existed.
|
||
|
||
wazero aims to serve Go developers as a primary function, and help them transition between WASI specifications. In
|
||
order to do this, we have to allow top-level configuration. To ensure isolation by default, `ModuleConfig` has WithXXX
|
||
that override defaults to no-op or empty. One `ModuleConfig` instance is used regardless of how many times the same WASI
|
||
functions are imported. The nil defaults allow safe concurrency in these situations, as well lower the cost when they
|
||
are never used. Finally, a one-to-one mapping with `Module` allows the module to close the `ModuleConfig` instead of
|
||
confusing users with another API to close.
|
||
|
||
Naming, defaults and validation rules of aspects like `STDIN` and `Environ` are intentionally similar to other Go
|
||
libraries such as `exec.Cmd` or `syscall.SetEnv`, and differences called out where helpful. For example, there's no goal
|
||
to emulate any operating system primitive specific to Windows (such as a 'c:\' drive). Moreover, certain defaults
|
||
working with real system calls are neither relevant nor safe to inherit: For example, `exec.Cmd` defaults to read STDIN
|
||
from a real file descriptor ("/dev/null"). Defaulting to this, vs reading `io.EOF`, would be unsafe as it can exhaust
|
||
file descriptors if resources aren't managed properly. In other words, blind copying of defaults isn't wise as it can
|
||
violate isolation or endanger the embedding process. In summary, we try to be similar to normal Go code, but often need
|
||
act differently and document `ModuleConfig` is more about emulating, not necessarily performing real system calls.
|
||
|
||
## File systems
|
||
|
||
### Motivation on `sys.FS`
|
||
|
||
The `sys.FS` abstraction in wazero was created because of limitations in
|
||
`fs.FS`, and `fs.File` in Go. Compilers targeting `wasip1` may access
|
||
functionality that writes new files. The ability to overcome this was requested
|
||
even before wazero was named this, via issue #21 in March 2021.
|
||
|
||
A month later, golang/go#45757 was raised by someone else on the same topic. As
|
||
of July 2023, this has not resolved to a writeable file system abstraction.
|
||
|
||
Over the next year more use cases accumulated, consolidated in March 2022 into
|
||
#390. This closed in January 2023 with a milestone of providing more
|
||
functionality, limited to users giving a real directory. This didn't yet expose
|
||
a file abstraction for general purpose use. Internally, this used `os.File`.
|
||
However, a wasm module instance is a virtual machine. Only supporting `os.File`
|
||
breaks sand-boxing use cases. Moreover, `os.File` is not an interface. Even
|
||
though this abstracts functionality, it does allow interception use cases.
|
||
|
||
Hence, a few days later in January 2023, we had more issues asking to expose an
|
||
abstraction, #1013 and later #1532, on use cases like masking access to files.
|
||
In other words, the use case requests never stopped, and aren't solved by
|
||
exposing only real files.
|
||
|
||
In summary, the primary motivation for exposing a replacement for `fs.FS` and
|
||
`fs.File` was around repetitive use case requests for years, around
|
||
interception and the ability to create new files, both virtual and real files.
|
||
While some use cases are solved with real files, not all are. Regardless, an
|
||
interface approach is necessary to ensure users can intercept I/O operations.
|
||
|
||
### Why doesn't `sys.File` have a `Fd()` method?
|
||
|
||
There are many features we could expose. We could make File expose underlying
|
||
file descriptors in case they are supported, for integration of system calls
|
||
that accept multiple ones, namely `poll` for multiplexing. This special case is
|
||
described in a subsequent section.
|
||
|
||
As noted above, users have been asking for a file abstraction for over two
|
||
years, and a common answer was to wait. Making users wait is a problem,
|
||
especially so long. Good reasons to make people wait are stabilization. Edge
|
||
case features are not a great reason to hold abstractions from users.
|
||
|
||
Another reason is implementation difficulty. Go did not attempt to abstract
|
||
file descriptors. For example, unlike `fs.ReadFile` there is no `fs.FdFile`
|
||
interface. Most likely, this is because file descriptors are an implementation
|
||
detail of common features. Programming languages, including Go, do not require
|
||
end users to know about file descriptors. Types such as `fs.File` can be used
|
||
without any knowledge of them. Implementations may or may not have file
|
||
descriptors. For example, in Go, `os.DirFS` has underlying file descriptors
|
||
while `embed.FS` does not.
|
||
|
||
Despite this, some may want to expose a non-standard interface because
|
||
`os.File` has `Fd() uintptr` to return a file descriptor. Mainly, this is
|
||
handy to integrate with `syscall` package functions (on `GOOS` values that
|
||
declare them). Notice, though that `uintptr` is unsafe and not an abstraction.
|
||
Close inspection will find some `os.File` types internally use `poll.FD`
|
||
instead, yet this is not possible to use abstractly because that type is not
|
||
exposed. For example, `plan9` uses a different type than `poll.FD`. In other
|
||
words, even in real files, `Fd()` is not wholly portable, despite it being
|
||
useful on many operating systems with the `syscall` package.
|
||
|
||
The reasons above, why Go doesn't abstract `FdFile` interface are a subset of
|
||
reasons why `sys.File` does not. If we exposed `File.Fd()` we not only would
|
||
have to declare all the edge cases that Go describes including impact of
|
||
finalizers, we would have to describe these in terms of virtualized files.
|
||
Then, we would have to reason with this value vs our existing virtualized
|
||
`sys.FileTable`, mapping whatever type we return to keys in that table, also
|
||
in consideration of garbage collection impact. The combination of issues like
|
||
this could lead down a path of not implementing a file system abstraction at
|
||
all, and instead a weak key mapped abstraction of the `syscall` package. Once
|
||
we finished with all the edge cases, we would have lost context of the original
|
||
reason why we started.. simply to allow file write access!
|
||
|
||
When wazero attempts to do more than what the Go programming language team, it
|
||
has to be carefully evaluated, to:
|
||
* Be possible to implement at least for `os.File` backed files
|
||
* Not be confusing or cognitively hard for virtual file systems and normal use.
|
||
* Affordable: custom code is solely the responsible by the core team, a much
|
||
smaller group of individuals than who maintain the Go programming language.
|
||
|
||
Due to problems well known in Go, consideration of the end users who constantly
|
||
ask for basic file system functionality, and the difficulty virtualizing file
|
||
descriptors at multiple levels, we don't expose `Fd()` and likely won't ever
|
||
expose `Fd()` on `sys.File`.
|
||
|
||
### Why does `sys.File` have a `Poll()` method, while `sys.FS` does not?
|
||
|
||
wazero exposes `File.Poll` which allows one-at-a-time poll use cases,
|
||
requested by multiple users. This not only includes abstract tests such as
|
||
Go 1.21 `GOOS=wasip1`, but real use cases including python and container2wasm
|
||
repls, as well listen sockets. The main use cases is non-blocking poll on a
|
||
single file. Being a single file, this has no risk of problems such as
|
||
head-of-line blocking, even when emulated.
|
||
|
||
The main use case of multi-poll are bidirectional network services, something
|
||
not used in `GOOS=wasip1` standard libraries, but could be in the future.
|
||
Moving forward without a multi-poller allows wazero to expose its file system
|
||
abstraction instead of continuing to hold back it back for edge cases. We'll
|
||
continue discussion below regardless, as rationale was requested.
|
||
|
||
You can loop through multiple `sys.File`, using `File.Poll` to see if an event
|
||
is ready, but there is a head-of-line blocking problem. If a long timeout is
|
||
used, bad luck could have a file that has nothing to read or write before one
|
||
that does. This could cause more blocking than necessary, even if you could
|
||
poll the others just after with a zero timeout. What's worse than this is if
|
||
unlimited blocking was used (`timeout=-1`). The host implementations could use
|
||
goroutines to avoid this, but interrupting a "forever" poll is problematic. All
|
||
of these are reasons to consider a multi-poll API, but do not require exporting
|
||
`File.Fd()`.
|
||
|
||
Should multi-poll becomes critical, `sys.FS` could expose a `Poll` function
|
||
like below, despite it being the non-portable, complicated if possible to
|
||
implement on all platforms and virtual file systems.
|
||
```go
|
||
ready, errno := fs.Poll([]sys.PollFile{{f1, sys.POLLIN}, {f2, sys.POLLOUT}}, timeoutMillis)
|
||
```
|
||
|
||
A real filesystem could handle this by using an approach like the internal
|
||
`unix.Poll` function in Go, passing file descriptors on unix platforms, or
|
||
returning `sys.ENOSYS` for unsupported operating systems. Implementation for
|
||
virtual files could have a strategy around timeout to avoid the worst case of
|
||
head-of-line blocking (unlimited timeout).
|
||
|
||
Let's remember that when designing abstractions, it is not best to add an
|
||
interface for everything. Certainly, Go doesn't, as evidenced by them not
|
||
exposing `poll.FD` in `os.File`! Such a multi-poll could be limited to
|
||
built-in filesystems in the wazero repository, avoiding complexity of trying to
|
||
support and test this abstractly. This would still permit multiplexing for CLI
|
||
users, and also permit single file polling as exists now.
|
||
|
||
### Why doesn't wazero implement the working directory?
|
||
|
||
An early design of wazero's API included a `WithWorkDirFS` which allowed
|
||
control over which file a relative path such as "./config.yml" resolved to,
|
||
independent of the root file system. This intended to help separate concerns
|
||
like mutability of files, but it didn't work and was removed.
|
||
|
||
Compilers that target wasm act differently with regard to the working
|
||
directory. For example, wasi-libc, used by TinyGo,
|
||
tracks working directory changes in compiled wasm instead: initially "/" until
|
||
code calls `chdir`. Zig assumes the first pre-opened file descriptor is the
|
||
working directory.
|
||
|
||
The only place wazero can standardize a layered concern is via a host function.
|
||
Since WASI doesn't use host functions to track the working directory, we can't
|
||
standardize the storage and initial value of it.
|
||
|
||
Meanwhile, code may be able to affect the working directory by compiling
|
||
`chdir` into their main function, using an argument or ENV for the initial
|
||
value (possibly `PWD`). Those unable to control the compiled code should only
|
||
use absolute paths in configuration.
|
||
|
||
See
|
||
* https://github.com/golang/go/blob/go1.20/src/syscall/fs_js.go#L324
|
||
* https://github.com/WebAssembly/wasi-libc/pull/214#issue-673090117
|
||
* https://github.com/ziglang/zig/blob/53a9ee699a35a3d245ab6d1dac1f0687a4dcb42c/src/main.zig#L32
|
||
|
||
### Why ignore the error returned by io.Reader when n > 1?
|
||
|
||
Per https://pkg.go.dev/io#Reader, if we receive an error, any bytes read should
|
||
be processed first. At the syscall abstraction (`fd_read`), the caller is the
|
||
processor, so we can't process the bytes inline and also return the error (as
|
||
`EIO`).
|
||
|
||
Let's assume we want to return the bytes read on error to the caller. This
|
||
implies we at least temporarily ignore the error alongside them. The choice
|
||
remaining is whether to persist the error returned with the read until a
|
||
possible next call, or ignore the error.
|
||
|
||
If we persist an error returned, it would be coupled to a file descriptor, but
|
||
effectively it is boolean as this case coerces to `EIO`. If we track a "last
|
||
error" on a file descriptor, it could be complicated for a couple reasons
|
||
including whether the error is transient or permanent, or if the error would
|
||
apply to any FD operation, or just read. Finally, there may never be a
|
||
subsequent read as perhaps the bytes leading up to the error are enough to
|
||
satisfy the processor.
|
||
|
||
This decision boils down to whether or not to track an error bit per file
|
||
descriptor or not. If not, the assumption is that a subsequent operation would
|
||
also error, this time without reading any bytes.
|
||
|
||
The current opinion is to go with the simplest path, which is to return the
|
||
bytes read and ignore the error the there were any. Assume a subsequent
|
||
operation will err if it needs to. This helps reduce the complexity of the code
|
||
in wazero and also accommodates the scenario where the bytes read are enough to
|
||
satisfy its processor.
|
||
|
||
### File descriptor allocation strategy
|
||
|
||
File descriptor allocation currently uses a strategy similar the one implemented
|
||
by unix systems: when opening a file, the lowest unused number is picked.
|
||
|
||
The WASI standard documents that programs cannot expect that file descriptor
|
||
numbers will be allocated with a lowest-first strategy, and they should instead
|
||
assume the values will be random. Since _random_ is a very imprecise concept in
|
||
computers, we technically satisfying the implementation with the descriptor
|
||
allocation strategy we use in Wazero. We could imagine adding more _randomness_
|
||
to the descriptor selection process, however this should never be used as a
|
||
security measure to prevent applications from guessing the next file number so
|
||
there are no strong incentives to complicate the logic.
|
||
|
||
### Why does `FSConfig.WithDirMount` not match behaviour with `os.DirFS`?
|
||
|
||
It may seem that we should require any feature that seems like a standard
|
||
library in Go, to behave the same way as the standard library. Doing so would
|
||
present least surprise to Go developers. In the case of how we handle
|
||
filesystems, we break from that as it is incompatible with the expectations of
|
||
WASI, the most commonly implemented filesystem ABI.
|
||
|
||
The main reason is that `os.DirFS` is a virtual filesystem abstraction while
|
||
WASI is an abstraction over syscalls. For example, the signature of `fs.Open`
|
||
does not permit use of flags. This creates conflict on what default behaviors
|
||
to take when Go implemented `os.DirFS`. On the other hand, `path_open` can pass
|
||
flags, and in fact tests require them to be honored in specific ways.
|
||
|
||
This conflict requires us to choose what to be more compatible with, and which
|
||
type of user to surprise the least. We assume there will be more developers
|
||
compiling code to wasm than developers of custom filesystem plugins, and those
|
||
compiling code to wasm will be better served if we are compatible with WASI.
|
||
Hence on conflict, we prefer WASI behavior vs the behavior of `os.DirFS`.
|
||
|
||
See https://github.com/WebAssembly/wasi-testsuite
|
||
See https://github.com/golang/go/issues/58141
|
||
|
||
## Why is our `Readdir` function more like Go's `os.File` than POSIX `readdir`?
|
||
|
||
At one point we attempted to move from a bulk `Readdir` function to something
|
||
more like the POSIX `DIR` struct, exposing functions like `telldir`, `seekdir`
|
||
and `readdir`. However, we chose the design more like `os.File.Readdir`,
|
||
because it performs and fits wasip1 better.
|
||
|
||
### wasip1/wasix
|
||
|
||
`fd_readdir` in wasip1 (and so also wasix) is like `getdents` in Linux, not
|
||
`readdir` in POSIX. `getdents` is more like Go's `os.File.Readdir`.
|
||
|
||
We currently have an internal type `sys.DirentCache` which only is used by
|
||
wasip1 or wasix. When `HostModuleBuilder` adds support for instantiation state,
|
||
we could move this to the `wasi_snapshot_preview1` package. Meanwhile, all
|
||
filesystem code is internal anyway, so this special-case is acceptable.
|
||
|
||
### wasip2
|
||
|
||
`directory-entry-stream` in wasi-filesystem preview2 is defined in component
|
||
model, not an ABI, but in wasmtime it is a consuming iterator. A consuming
|
||
iterator is easy to support with anything (like `Readdir(1)`), even if it is
|
||
inefficient as you can neither bulk read nor skip. The implementation of the
|
||
preview1 adapter (uses preview2) confirms this. They use a dirent cache similar
|
||
in some ways to our `sysfs.DirentCache`. As there is no seek concept in
|
||
preview2, they interpret the cookie as numeric and read on repeat entries when
|
||
a cache wasn't available. Note: we currently do not skip-read like this as it
|
||
risks buffering large directories, and no user has requested entries before the
|
||
cache, yet.
|
||
|
||
Regardless, wasip2 is not complete until the end of 2023. We can defer design
|
||
discussion until after it is stable and after the reference impl wasmtime
|
||
implements it.
|
||
|
||
See
|
||
* https://github.com/WebAssembly/wasi-filesystem/blob/ef9fc87c07323a6827632edeb6a7388b31266c8e/example-world.md#directory_entry_stream
|
||
* https://github.com/bytecodealliance/wasmtime/blob/b741f7c79d72492d17ab8a29c8ffe4687715938e/crates/wasi/src/preview2/preview2/filesystem.rs#L286-L296
|
||
* https://github.com/bytecodealliance/preview2-prototyping/blob/e4c04bcfbd11c42c27c28984948d501a3e168121/crates/wasi-preview1-component-adapter/src/lib.rs#L2131-L2137
|
||
* https://github.com/bytecodealliance/preview2-prototyping/blob/e4c04bcfbd11c42c27c28984948d501a3e168121/crates/wasi-preview1-component-adapter/src/lib.rs#L936
|
||
|
||
### wasip3
|
||
|
||
`directory-entry-stream` is documented to change significantly in wasip3 moving
|
||
from synchronous to synchronous streams. This is dramatically different than
|
||
POSIX `readdir` which is synchronous.
|
||
|
||
Regardless, wasip3 is not complete until after wasip2, which means 2024 or
|
||
later. We can defer design discussion until after it is stable and after the
|
||
reference impl wasmtime implements it.
|
||
|
||
See
|
||
* https://github.com/WebAssembly/WASI/blob/ddfe3d1dda5d1473f37ecebc552ae20ce5fd319a/docs/WitInWasi.md#Streams
|
||
* https://docs.google.com/presentation/d/1MNVOZ8hdofO3tI0szg_i-Yoy0N2QPU2C--LzVuoGSlE/edit#slide=id.g1270ef7d5b6_0_662
|
||
|
||
### How do we implement `Pread` with an `fs.File`?
|
||
|
||
`ReadAt` is the Go equivalent to `pread`: it does not affect, and is not
|
||
affected by, the underlying file offset. Unfortunately, `io.ReaderAt` is not
|
||
implemented by all `fs.File`. For example, as of Go 1.19, `embed.openFile` does
|
||
not.
|
||
|
||
The initial implementation of `fd_pread` instead used `Seek`. To avoid a
|
||
regression, we fall back to `io.Seeker` when `io.ReaderAt` is not supported.
|
||
|
||
This requires obtaining the initial file offset, seeking to the intended read
|
||
offset, and resetting the file offset the initial state. If this final seek
|
||
fails, the file offset is left in an undefined state. This is not thread-safe.
|
||
|
||
While seeking per read seems expensive, the common case of `embed.openFile` is
|
||
only accessing a single int64 field, which is cheap.
|
||
|
||
### Pre-opened files
|
||
|
||
WASI includes `fd_prestat_get` and `fd_prestat_dir_name` functions used to
|
||
learn any directory paths for file descriptors open at initialization time.
|
||
|
||
For example, `__wasilibc_register_preopened_fd` scans any file descriptors past
|
||
STDERR (1) and invokes `fd_prestat_dir_name` to learn any path prefixes they
|
||
correspond to. Zig's `preopensAlloc` does similar. These pre-open functions are
|
||
not used again after initialization.
|
||
|
||
wazero supports stdio pre-opens followed by any mounts e.g `.:/`. The guest
|
||
path is a directory and its name, e.g. "/" is returned by `fd_prestat_dir_name`
|
||
for file descriptor 3 (STDERR+1). The first longest match wins on multiple
|
||
pre-opens, which allows a path like "/tmp" to match regardless of order vs "/".
|
||
|
||
See
|
||
* https://github.com/WebAssembly/wasi-libc/blob/a02298043ff551ce1157bc2ee7ab74c3bffe7144/libc-bottom-half/sources/preopens.c
|
||
* https://github.com/ziglang/zig/blob/9cb06f3b8bf9ea6b5e5307711bc97328762d6a1d/lib/std/fs/wasi.zig#L50-L53
|
||
|
||
### fd_prestat_dir_name
|
||
|
||
`fd_prestat_dir_name` is a WASI function to return the path of the pre-opened
|
||
directory of a file descriptor. It has the following three parameters, and the
|
||
third `path_len` has ambiguous semantics.
|
||
|
||
* `fd`: a file descriptor
|
||
* `path`: the offset for the result path
|
||
* `path_len`: In wazero, `FdPrestatDirName` writes the result path string to
|
||
`path` offset for the exact length of `path_len`.
|
||
|
||
Wasmer considers `path_len` to be the maximum length instead of the exact
|
||
length that should be written.
|
||
See https://github.com/wasmerio/wasmer/blob/3463c51268ed551933392a4063bd4f8e7498b0f6/lib/wasi/src/syscalls/mod.rs#L764
|
||
|
||
The semantics in wazero follows that of wasmtime.
|
||
See https://github.com/bytecodealliance/wasmtime/blob/2ca01ae9478f199337cf743a6ab543e8c3f3b238/crates/wasi-common/src/snapshots/preview_1.rs#L578-L582
|
||
|
||
Their semantics match when `path_len` == the length of `path`, so in practice
|
||
this difference won't matter match.
|
||
|
||
## fd_readdir
|
||
|
||
### Why does "wasi_snapshot_preview1" require dot entries when POSIX does not?
|
||
|
||
In October 2019, WASI project knew requiring dot entries ("." and "..") was not
|
||
documented in preview1, not required by POSIX and problematic to synthesize.
|
||
For example, Windows runtimes backed by `FindNextFileW` could not return these.
|
||
A year later, the tag representing WASI preview 1 (`snapshot-01`) was made.
|
||
This did not include the requested change of making dot entries optional.
|
||
|
||
The `phases/snapshot/docs.md` document was altered in subsequent years in
|
||
significant ways, often in lock-step with wasmtime or wasi-libc. In January
|
||
2022, `sock_accept` was added to `phases/snapshot/docs.md`, a document later
|
||
renamed to later renamed to `legacy/preview1/docs.md`.
|
||
|
||
As a result, the ABI and behavior remained unstable: The `snapshot-01` tag was
|
||
not an effective basis of portability. A test suite was requested well before
|
||
this tag, in April 2019. Meanwhile, compliance had no meaning. Developers had
|
||
to track changes to the latest doc, while clarifying with wasi-libc or wasmtime
|
||
behavior. This lack of stability could have permitted a fix to the dot entries
|
||
problem, just as it permitted changes desired by other users.
|
||
|
||
In November 2022, the wasi-testsuite project began and started solidifying
|
||
expectations. This quickly led to changes in runtimes and the spec doc. WASI
|
||
began importing tests from wasmtime as required behaviors for all runtimes.
|
||
Some changes implied changes to wasi-libc. For example, `readdir` began to
|
||
imply inode fan-outs, which caused performance regressions. Most notably a
|
||
test merged in January required dot entries. Tests were merged without running
|
||
against any runtime, and even when run ad-hoc only against Linux. Hence,
|
||
portability issues mentioned over three years earlier did not trigger any
|
||
failure until wazero (which tests Windows) noticed.
|
||
|
||
In the same month, wazero requested to revert this change primarily because
|
||
Go does not return them from `os.ReadDir`, and materializing them is
|
||
complicated due to tests also requiring inodes. Moreover, they are discarded by
|
||
not just Go, but other common programming languages. This was rejected by the
|
||
WASI lead for preview1, but considered for the completely different ABI named
|
||
preview2.
|
||
|
||
In February 2023, the WASI chair declared that new rule requiring preview1 to
|
||
return dot entries "was decided by the subgroup as a whole", citing meeting
|
||
notes. According to these notes, the WASI lead stated incorrectly that POSIX
|
||
conformance required returning dot entries, something it explicitly says are
|
||
optional. In other words, he said filtering them out would make Preview1
|
||
non-conforming, and asked if anyone objects to this. The co-chair was noted to
|
||
say "Because there are existing P1 programs, we shouldn’t make changes like
|
||
this." No other were recorded to say anything.
|
||
|
||
In summary, preview1 was changed retrospectively to require dot entries and
|
||
preview2 was changed to require their absence. This rule was reverse engineered
|
||
from wasmtime tests, and affirmed on two false premises:
|
||
|
||
* POSIX compliance requires dot entries
|
||
* POSIX literally says these are optional
|
||
* WASI cannot make changes because there are existing P1 programs.
|
||
* Changes to Preview 1 happened before and after this topic.
|
||
|
||
As of June 2023, wasi-testsuite still only runs on Linux, so compliance of this
|
||
rule on Windows is left to runtimes to decide to validate. The preview2 adapter
|
||
uses fake cookies zero and one to refer to dot dirents, uses a real inode for
|
||
the dot(".") entry and zero inode for dot-dot("..").
|
||
|
||
See https://github.com/WebAssembly/wasi-filesystem/issues/3
|
||
See https://github.com/WebAssembly/WASI/tree/snapshot-01
|
||
See https://github.com/WebAssembly/WASI/issues/9
|
||
See https://github.com/WebAssembly/WASI/pull/458
|
||
See https://github.com/WebAssembly/wasi-testsuite/pull/32
|
||
See https://github.com/WebAssembly/wasi-libc/pull/345
|
||
See https://github.com/WebAssembly/wasi-testsuite/issues/52
|
||
See https://github.com/WebAssembly/WASI/pull/516
|
||
See https://github.com/WebAssembly/meetings/blob/main/wasi/2023/WASI-02-09.md#should-preview1-fd_readdir-filter-out--and-
|
||
See https://github.com/bytecodealliance/preview2-prototyping/blob/e4c04bcfbd11c42c27c28984948d501a3e168121/crates/wasi-preview1-component-adapter/src/lib.rs#L1026-L1041
|
||
|
||
### Why are dot (".") and dot-dot ("..") entries problematic?
|
||
|
||
When reading a directory, dot (".") and dot-dot ("..") entries are problematic.
|
||
For example, Go does not return them from `os.ReadDir`, and materializing them
|
||
is complicated (at least dot-dot is).
|
||
|
||
A directory entry has stat information in it. The stat information includes
|
||
inode which is used for comparing file equivalence. In the simple case of dot,
|
||
we could materialize a special entry to expose the same info as stat on the fd
|
||
would return. However, doing this and not doing dot-dot would cause confusion,
|
||
and dot-dot is far more tricky. To back-fill inode information about a parent
|
||
directory would be costly and subtle. For example, the pre-open (mount) of the
|
||
directory may be different than its logical parent. This is easy to understand
|
||
when considering the common case of mounting "/" and "/tmp" as pre-opens. To
|
||
implement ".." from "/tmp" requires information from a separate pre-open, this
|
||
includes state to even know the difference. There are easier edge cases as
|
||
well, such as the decision to not return ".." from a root path. In any case,
|
||
this should start to explain that faking entries when underlying stdlib doesn't
|
||
return them is tricky and requires quite a lot of state.
|
||
|
||
Another issue is around the `Dirent.Off` value of a directory entry, sometimes
|
||
called a "cookie" in Linux man pagers. When the host operating system or
|
||
library function does not return dot entries, to support functions such as
|
||
`seekdir`, you still need a value for `Dirent.Off`. Naively, you can synthesize
|
||
these by choosing sequential offsets zero and one. However, POSIX strictly says
|
||
offsets should be treated opaquely. The backing filesystem could use these to
|
||
represent real entries. For example, a directory with one entry could use zero
|
||
as the `Dirent.Off` value. If you also used zero for the "." dirent, there
|
||
would be a clash. This means if you synthesize `Dirent.Off` for any entry, you
|
||
need to synthesize this value for all entries. In practice, the simplest way is
|
||
using an incrementing number, such as done in the WASI preview2 adapter.
|
||
|
||
Working around these issues causes expense to all users of wazero, so we'd
|
||
then look to see if that would be justified or not. However, the most common
|
||
compilers involved in end user questions, as of early 2023 are TinyGo, Rust and
|
||
Zig. All of these compile code which ignores dot and dot-dot entries. In other
|
||
words, faking these entries would not only cost our codebase with complexity,
|
||
but it would also add unnecessary overhead as the values aren't commonly used.
|
||
|
||
The final reason why we might do this, is an end users or a specification
|
||
requiring us to. As of early 2023, no end user has raised concern over Go and
|
||
by extension wazero not returning dot and dot-dot. The snapshot-01 spec of WASI
|
||
does not mention anything on this point. Also, POSIX has the following to say,
|
||
which summarizes to "these are optional"
|
||
|
||
> The readdir() function shall not return directory entries containing empty names. If entries for dot or dot-dot exist, one entry shall be returned for dot and one entry shall be returned for dot-dot; otherwise, they shall not be returned.
|
||
|
||
Unfortunately, as described above, the WASI project decided in early 2023 to
|
||
require dot entries in both the spec and the wasi-testsuite. For only this
|
||
reason, wazero adds overhead to synthesize dot entries despite it being
|
||
unnecessary for most users.
|
||
|
||
See https://pubs.opengroup.org/onlinepubs/9699919799/functions/readdir.html
|
||
See https://github.com/golang/go/blob/go1.20/src/os/dir_unix.go#L108-L110
|
||
See https://github.com/bytecodealliance/preview2-prototyping/blob/e4c04bcfbd11c42c27c28984948d501a3e168121/crates/wasi-preview1-component-adapter/src/lib.rs#L1026-L1041
|
||
|
||
### Why don't we pre-populate an inode for the dot-dot ("..") entry?
|
||
|
||
We only populate an inode for dot (".") because wasi-testsuite requires it, and
|
||
we likely already have it (because we cache it). We could attempt to populate
|
||
one for dot-dot (".."), but chose not to.
|
||
|
||
Firstly, wasi-testsuite does not require the inode of dot-dot, possibly because
|
||
the wasip2 adapter doesn't populate it (but we don't really know why).
|
||
|
||
The only other reason to populate it would be to avoid wasi-libc's stat fanout
|
||
when it is missing. However, wasi-libc explicitly doesn't fan-out to lstat on
|
||
the ".." entry on a zero ino.
|
||
|
||
Fetching dot-dot's inode despite the above not only doesn't help wasi-libc, but
|
||
it also hurts languages that don't use it, such as Go. These languages would
|
||
pay a stat syscall penalty even if they don't need the inode. In fact, Go
|
||
discards both dot entries!
|
||
|
||
In summary, there are no significant upsides in attempting to pre-fetch
|
||
dot-dot's inode, and there are downsides to doing it anyway.
|
||
|
||
See
|
||
* https://github.com/WebAssembly/wasi-libc/blob/bd950eb128bff337153de217b11270f948d04bb4/libc-bottom-half/cloudlibc/src/libc/dirent/readdir.c#L87-L94
|
||
* https://github.com/WebAssembly/wasi-testsuite/blob/main/tests/rust/src/bin/fd_readdir.rs#L108
|
||
* https://github.com/bytecodealliance/preview2-prototyping/blob/e4c04bcfbd11c42c27c28984948d501a3e168121/crates/wasi-preview1-component-adapter/src/lib.rs#L1037
|
||
|
||
### Why don't we require inodes to be non-zero?
|
||
|
||
We don't require a non-zero value for `Dirent.Ino` because doing so can prevent
|
||
a real one from resolving later via `Stat_t.Ino`.
|
||
|
||
We define `Ino` like `d_ino` in POSIX which doesn't special-case zero. It can
|
||
be zero for a few reasons:
|
||
|
||
* The file is not a regular file or directory.
|
||
* The underlying filesystem does not support inodes. e.g. embed:fs
|
||
* A directory doesn't include inodes, but a later stat can. e.g. Windows
|
||
* The backend is based on wasi-filesystem (a.k.a wasip2), which has
|
||
`directory_entry.inode` optional, and might remove it entirely.
|
||
|
||
There are other downsides to returning a zero inode in widely used compilers:
|
||
|
||
* File equivalence utilities, like `os.SameFile` will not work.
|
||
* wasi-libc's `wasip1` mode will call `lstat` and attempt to retrieve a
|
||
non-zero value (unless the entry is named "..").
|
||
|
||
A new compiler may accidentally skip a `Dirent` with a zero `Ino` if emulating
|
||
a non-POSIX function and re-using `Dirent.Ino` for `d_fileno`.
|
||
|
||
* Linux `getdents` doesn't define `d_fileno` must be non-zero
|
||
* BSD `getdirentries` is implementation specific. For example, OpenBSD will
|
||
return dirents with a zero `d_fileno`, but Darwin will skip them.
|
||
|
||
The above shouldn't be a problem, even in the case of BSD, because `wasip1` is
|
||
defined more in terms of `getdents` than `getdirentries`. The bottom half of
|
||
either should treat `wasip1` (or any similar ABI such as wasix or wasip2) as a
|
||
different operating system and either use different logic that doesn't skip, or
|
||
synthesize a fake non-zero `d_fileno` when `d_ino` is zero.
|
||
|
||
However, this has been a problem. Go's `syscall.ParseDirent` utility is shared
|
||
for all `GOOS=unix`. For simplicity, this abstracts `direntIno` with data from
|
||
`d_fileno` or `d_ino`, and drops if either are zero, even if `d_fileno` is the
|
||
only field with zero explicitly defined. This led to a change to special case
|
||
`GOOS=wasip1` as otherwise virtual files would be unconditionally skipped.
|
||
|
||
In practice, this problem is rather unique due to so many compilers relying on
|
||
wasi-libc, which tolerates a zero inode. For example, while issues were
|
||
reported about the performance regression when wasi-libc began doing a fan-out
|
||
on zero `Dirent.Ino`, no issues were reported about dirents being dropped as a
|
||
result.
|
||
|
||
In summary, rather than complicating implementation and forcing non-zero inodes
|
||
for a rare case, we permit zero. We instead document this topic thoroughly, so
|
||
that emerging compilers can re-use the research and reference it on conflict.
|
||
We also document that `Ino` should be non-zero, so that users implementing that
|
||
field will attempt to get it.
|
||
|
||
See
|
||
* https://github.com/WebAssembly/wasi-filesystem/pull/81
|
||
* https://github.com/WebAssembly/wasi-libc/blob/bd950eb128bff337153de217b11270f948d04bb4/libc-bottom-half/cloudlibc/src/libc/dirent/readdir.c#L87-L94
|
||
* https://linux.die.net/man/3/getdents
|
||
* https://www.unix.com/man-page/osx/2/getdirentries/
|
||
* https://man.openbsd.org/OpenBSD-5.4/getdirentries.2
|
||
* https://github.com/golang/go/blob/go1.20/src/syscall/dirent.go#L60-L102
|
||
* https://go-review.googlesource.com/c/go/+/507915
|
||
|
||
## sys.Walltime and Nanotime
|
||
|
||
The `sys` package has two function types, `Walltime` and `Nanotime` for real
|
||
and monotonic clock exports. The naming matches conventions used in Go.
|
||
|
||
```go
|
||
func time_now() (sec int64, nsec int32, mono int64) {
|
||
sec, nsec = walltime()
|
||
return sec, nsec, nanotime()
|
||
}
|
||
```
|
||
|
||
Splitting functions for wall and clock time allow implementations to choose
|
||
whether to implement the clock once (as in Go), or split them out.
|
||
|
||
Each can be configured with a `ClockResolution`, although is it usually
|
||
incorrect as detailed in a sub-heading below. The only reason for exposing this
|
||
is to satisfy WASI:
|
||
|
||
See https://github.com/WebAssembly/wasi-clocks
|
||
|
||
### Why default to fake time?
|
||
|
||
WebAssembly has an implicit design pattern of capabilities based security. By
|
||
defaulting to a fake time, we reduce the chance of timing attacks, at the cost
|
||
of requiring configuration to opt-into real clocks.
|
||
|
||
See https://gruss.cc/files/fantastictimers.pdf for an example attacks.
|
||
|
||
### Why does fake time increase on reading?
|
||
|
||
Both the fake nanotime and walltime increase by 1ms on reading. Particularly in
|
||
the case of nanotime, this prevents spinning.
|
||
|
||
### Why not `time.Clock`?
|
||
|
||
wazero can't use `time.Clock` as a plugin for clock implementation as it is
|
||
only substitutable with build flags (`faketime`) and conflates wall and
|
||
monotonic time in the same call.
|
||
|
||
Go's `time.Clock` was added monotonic time after the fact. For portability with
|
||
prior APIs, a decision was made to combine readings into the same API call.
|
||
|
||
See https://go.googlesource.com/proposal/+/master/design/12914-monotonic.md
|
||
|
||
WebAssembly time imports do not have the same concern. In fact even Go's
|
||
imports for clocks split walltime from nanotime readings.
|
||
|
||
See https://github.com/golang/go/blob/go1.20/misc/wasm/wasm_exec.js#L243-L255
|
||
|
||
Finally, Go's clock is not an interface. WebAssembly users who want determinism
|
||
or security need to be able to substitute an alternative clock implementation
|
||
from the host process one.
|
||
|
||
### `ClockResolution`
|
||
|
||
A clock's resolution is hardware and OS dependent so requires a system call to retrieve an accurate value.
|
||
Go does not provide a function for getting resolution, so without CGO we don't have an easy way to get an actual
|
||
value. For now, we return fixed values of 1us for realtime and 1ns for monotonic, assuming that realtime clocks are
|
||
often lower precision than monotonic clocks. In the future, this could be improved by having OS+arch specific assembly
|
||
to make syscalls.
|
||
|
||
For example, Go implements time.Now for linux-amd64 with this [assembly](https://github.com/golang/go/blob/go1.20/src/runtime/time_linux_amd64.s).
|
||
Because retrieving resolution is not generally called often, unlike getting time, it could be appropriate to only
|
||
implement the fallback logic that does not use VDSO (executing syscalls in user mode). The syscall for clock_getres
|
||
is 229 and should be usable. https://pkg.go.dev/syscall#pkg-constants.
|
||
|
||
If implementing similar for Windows, [mingw](https://github.com/mirror/mingw-w64/blob/6a0e9165008f731bccadfc41a59719cf7c8efc02/mingw-w64-libraries/winpthreads/src/clock.c#L77
|
||
) is often a good source to find the Windows API calls that correspond
|
||
to a POSIX method.
|
||
|
||
Writing assembly would allow making syscalls without CGO, but comes with the cost that it will require implementations
|
||
across many combinations of OS and architecture.
|
||
|
||
## sys.Nanosleep
|
||
|
||
All major programming languages have a `sleep` mechanism to block for a
|
||
duration. Sleep is typically implemented by a WASI `poll_oneoff` relative clock
|
||
subscription.
|
||
|
||
For example, the below ends up calling `wasi_snapshot_preview1.poll_oneoff`:
|
||
|
||
```zig
|
||
const std = @import("std");
|
||
pub fn main() !void {
|
||
std.time.sleep(std.time.ns_per_s * 5);
|
||
}
|
||
```
|
||
|
||
Besides Zig, this is also the case with TinyGo (`-target=wasi`) and Rust
|
||
(`--target wasm32-wasi`).
|
||
|
||
We decided to expose `sys.Nanosleep` to allow overriding the implementation
|
||
used in the common case, even if it isn't used by Go, because this gives an
|
||
easy and efficient closure over a common program function. We also documented
|
||
`sys.Nanotime` to warn users that some compilers don't optimize sleep.
|
||
|
||
## sys.Osyield
|
||
|
||
We expose `sys.Osyield`, to allow users to control the behavior of WASI's
|
||
`sched_yield` without a new build of wazero. This is mainly for parity with
|
||
all other related features which we allow users to implement, including
|
||
`sys.Nanosleep`. Unlike others, we don't provide an out-of-box implementation
|
||
primarily because it will cause performance problems when accessed.
|
||
|
||
For example, the below implementation uses CGO, which might result in a 1us
|
||
delay per invocation depending on the platform.
|
||
|
||
See https://github.com/golang/go/issues/19409#issuecomment-284788196
|
||
```go
|
||
//go:noescape
|
||
//go:linkname osyield runtime.osyield
|
||
func osyield()
|
||
```
|
||
|
||
In practice, a request to customize this is unlikely to happen until other
|
||
thread based functions are implemented. That said, as of early 2023, there are
|
||
a few signs of implementation interest and cross-referencing:
|
||
|
||
See https://github.com/WebAssembly/stack-switching/discussions/38
|
||
See https://github.com/WebAssembly/wasi-threads#what-can-be-skipped
|
||
See https://slinkydeveloper.com/Kubernetes-controllers-A-New-Hope/
|
||
|
||
## sys.Stat_t
|
||
|
||
We expose `stat` information as `sys.Stat_t`, like `syscall.Stat_t` except
|
||
defined without build constraints. For example, you can use `sys.Stat_t` on
|
||
`GOOS=windows` which doesn't define `syscall.Stat_t`.
|
||
|
||
The first use case of this is to return inodes from `fs.FileInfo` without
|
||
relying on platform-specifics. For example, a user could return `*sys.Stat_t`
|
||
from `info.Sys()` and define a non-zero inode for a virtual file, or map a
|
||
real inode to a virtual one.
|
||
|
||
Notable choices per field are listed below, where `sys.Stat_t` is unlike
|
||
`syscall.Stat_t` on `GOOS=linux`, or needs clarification. One common issue
|
||
not repeated below is that numeric fields are 64-bit when at least one platform
|
||
defines it that large. Also, zero values are equivalent to nil or absent.
|
||
|
||
* `Dev` and `Ino` (`Inode`) are both defined unsigned as they are defined
|
||
opaque, and most `syscall.Stat_t` also defined them unsigned. There are
|
||
separate sections in this document discussing the impact of zero in `Ino`.
|
||
* `Mode` is defined as a `fs.FileMode` even though that is not defined in POSIX
|
||
and will not map to all possible values. This is because the current use is
|
||
WASI, which doesn't define any types or features not already supported. By
|
||
using `fs.FileMode`, we can re-use routine experience in Go.
|
||
* `NLink` is unsigned because it is defined that way in `syscall.Stat_t`: there
|
||
can never be less than zero links to a file. We suggest defaulting to 1 in
|
||
conversions when information is not knowable because at least that many links
|
||
exist.
|
||
* `Size` is signed because it is defined that way in `syscall.Stat_t`: while
|
||
regular files and directories will always be non-negative, irregular files
|
||
are possibly negative or not defined. Notably sparse files are known to
|
||
return negative values.
|
||
* `Atim`, `Mtim` and `Ctim` are signed because they are defined that way in
|
||
`syscall.Stat_t`: Negative values are time before 1970. The resolution is
|
||
nanosecond because that's the maximum resolution currently supported in Go.
|
||
|
||
### Why do we use `sys.EpochNanos` instead of `time.Time` or similar?
|
||
|
||
To simplify documentation, we defined a type alias `sys.EpochNanos` for int64.
|
||
`time.Time` is a data structure, and we could have used this for
|
||
`syscall.Stat_t` time values. The most important reason we do not is conversion
|
||
penalty deriving time from common types.
|
||
|
||
The most common ABI used in `wasip2`. This, and compatible ABI such as `wasix`,
|
||
encode timestamps in memory as a 64-bit number. If we used `time.Time`, we
|
||
would have to convert an underlying type like `syscall.Timespec` to `time.Time`
|
||
only to later have to call `.UnixNano()` to convert it back to a 64-bit number.
|
||
|
||
In the future, the component model module "wasi-filesystem" may represent stat
|
||
timestamps with a type shared with "wasi-clocks", abstractly structured similar
|
||
to `time.Time`. However, component model intentionally does not define an ABI.
|
||
It is likely that the canonical ABI for timestamp will be in two parts, but it
|
||
is not required for it to be intermediately represented this way. A utility
|
||
like `syscall.NsecToTimespec` could split an int64 so that it could be written
|
||
to memory as 96 bytes (int64, int32), without allocating a struct.
|
||
|
||
Finally, some may confuse epoch nanoseconds with 32-bit epoch seconds. While
|
||
32-bit epoch seconds has "The year 2038" problem, epoch nanoseconds has
|
||
"The Year 2262" problem, which is even less concerning for this library. If
|
||
the Go programming language and wazero exist in the 2200's, we can make a major
|
||
version increment to adjust the `sys.EpochNanos` approach. Meanwhile, we have
|
||
faster code.
|
||
|
||
## poll_oneoff
|
||
|
||
`poll_oneoff` is a WASI API for waiting for I/O events on multiple handles.
|
||
It is conceptually similar to the POSIX `poll(2)` syscall.
|
||
The name is not `poll`, because it references [“the fact that this function is not efficient
|
||
when used repeatedly with the same large set of handles”][poll_oneoff].
|
||
|
||
We chose to support this API in a handful of cases that work for regular files
|
||
and standard input. We currently do not support other types of file descriptors such
|
||
as socket handles.
|
||
|
||
### Clock Subscriptions
|
||
|
||
As detailed above in [sys.Nanosleep](#sysnanosleep), `poll_oneoff` handles
|
||
relative clock subscriptions. In our implementation we use `sys.Nanosleep()`
|
||
for this purpose in most cases, except when polling for interactive input
|
||
from `os.Stdin` (see more details below).
|
||
|
||
### FdRead and FdWrite Subscriptions
|
||
|
||
When subscribing a file descriptor (except `Stdin`) for reads or writes,
|
||
the implementation will generally return immediately with success, unless
|
||
the file descriptor is unknown. The file descriptor is not checked further
|
||
for new incoming data. Any timeout is cancelled, and the API call is able
|
||
to return, unless there are subscriptions to `Stdin`: these are handled
|
||
separately.
|
||
|
||
### FdRead and FdWrite Subscription to Stdin
|
||
|
||
Subscribing `Stdin` for reads (writes make no sense and cause an error),
|
||
requires extra care: wazero allows to configure a custom reader for `Stdin`.
|
||
|
||
In general, if a custom reader is found, the behavior will be the same
|
||
as for regular file descriptors: data is assumed to be present and
|
||
a success is written back to the result buffer.
|
||
|
||
However, if the reader is detected to read from `os.Stdin`,
|
||
a special code path is followed, invoking `sysfs.poll()`.
|
||
|
||
`sysfs.poll()` is a wrapper for `poll(2)` on POSIX systems,
|
||
and it is emulated on Windows.
|
||
|
||
### Poll on POSIX
|
||
|
||
On POSIX systems, `poll(2)` allows to wait for incoming data on a file
|
||
descriptor, and block until either data becomes available or the timeout
|
||
expires.
|
||
|
||
Usage of `syfs.poll()` is currently only reserved for standard input, because
|
||
|
||
1. it is really only necessary to handle interactive input: otherwise,
|
||
there is no way in Go to peek from Standard Input without actually
|
||
reading (and thus consuming) from it;
|
||
|
||
2. if `Stdin` is connected to a pipe, it is ok in most cases to return
|
||
with success immediately;
|
||
|
||
3. `syfs.poll()` is currently a blocking call, irrespective of goroutines,
|
||
because the underlying syscall is; thus, it is better to limit its usage.
|
||
|
||
So, if the subscription is for `os.Stdin` and the handle is detected
|
||
to correspond to an interactive session, then `sysfs.poll()` will be
|
||
invoked with a the `Stdin` handle *and* the timeout.
|
||
|
||
This also means that in this specific case, the timeout is uninterruptible,
|
||
unless data becomes available on `Stdin` itself.
|
||
|
||
### Select on Windows
|
||
|
||
On Windows `sysfs.poll()` cannot be delegated to a single
|
||
syscall, because there is no single syscall to handle sockets,
|
||
pipes and regular files.
|
||
|
||
Instead, we emulate its behavior for the cases that are currently
|
||
of interest.
|
||
|
||
- For regular files, we _always_ report them as ready, as
|
||
[most operating systems do anyway][async-io-windows].
|
||
|
||
- For pipes, we invoke [`PeekNamedPipe`][peeknamedpipe]
|
||
for each file handle we detect is a pipe open for reading.
|
||
We currently ignore pipes open for writing.
|
||
|
||
- Notably, we include also support for sockets using the [WinSock
|
||
implementation of `poll`][wsapoll], but instead
|
||
of relying on the timeout argument of the `WSAPoll` function,
|
||
we set a 0-duration timeout so that it behaves like a peek.
|
||
|
||
This way, we can check for regular files all at once,
|
||
at the beginning of the function, then we poll pipes and
|
||
sockets periodically using a cancellable `time.Tick`,
|
||
which plays nicely with the rest of the Go runtime.
|
||
|
||
### Impact of blocking
|
||
|
||
Because this is a blocking syscall, it will also block the carrier thread of
|
||
the goroutine, preventing any means to support context cancellation directly.
|
||
|
||
There are ways to obviate this issue. We outline here one idea, that is however
|
||
not currently implemented. A common approach to support context cancellation is
|
||
to add a signal file descriptor to the set, e.g. the read-end of a pipe or an
|
||
eventfd on Linux. When the context is canceled, we may unblock a Select call by
|
||
writing to the fd, causing it to return immediately. This however requires to
|
||
do a bit of housekeeping to hide the "special" FD from the end-user.
|
||
|
||
[poll_oneoff]: https://github.com/WebAssembly/wasi-poll#why-is-the-function-called-poll_oneoff
|
||
[async-io-windows]: https://tinyclouds.org/iocp_links
|
||
[peeknamedpipe]: https://learn.microsoft.com/en-us/windows/win32/api/namedpipeapi/nf-namedpipeapi-peeknamedpipe
|
||
[wsapoll]: https://learn.microsoft.com/en-us/windows/win32/api/winsock2/nf-winsock2-wsapoll
|
||
|
||
## Signed encoding of integer global constant initializers
|
||
|
||
wazero treats integer global constant initializers signed as their interpretation is not known at declaration time. For
|
||
example, there is no signed integer [value type](https://www.w3.org/TR/2019/REC-wasm-core-1-20191205/#value-types%E2%91%A0).
|
||
|
||
To get at the problem, let's use an example.
|
||
```
|
||
(global (export "start_epoch") i64 (i64.const 1620216263544))
|
||
```
|
||
|
||
In both signed and unsigned LEB128 encoding, this value is the same bit pattern. The problem is that some numbers are
|
||
not. For example, 16256 is `807f` encoded as unsigned, but `80ff00` encoded as signed.
|
||
|
||
While the specification mentions uninterpreted integers are in abstract [unsigned values](https://www.w3.org/TR/2019/REC-wasm-core-1-20191205/#integers%E2%91%A0),
|
||
the binary encoding is clear that they are encoded [signed](https://www.w3.org/TR/2019/REC-wasm-core-1-20191205/#integers%E2%91%A4).
|
||
|
||
For consistency, we go with signed encoding in the special case of global constant initializers.
|
||
|
||
## Implementation limitations
|
||
|
||
WebAssembly 1.0 (20191205) specification allows runtimes to [limit certain aspects of Wasm module or execution](https://www.w3.org/TR/2019/REC-wasm-core-1-20191205/#a2-implementation-limitations).
|
||
|
||
wazero limitations are imposed pragmatically and described below.
|
||
|
||
### Number of functions in a module
|
||
|
||
The possible number of function instances in [a module](https://www.w3.org/TR/2019/REC-wasm-core-1-20191205/#module-instances%E2%91%A0) is not specified in the WebAssembly specifications since [`funcaddr`](https://www.w3.org/TR/2019/REC-wasm-core-1-20191205/#syntax-funcaddr) corresponding to a function instance in a store can be arbitrary number.
|
||
wazero limits the maximum function instances to 2^27 as even that number would occupy 1GB in function pointers.
|
||
|
||
That is because not only we _believe_ that all use cases are fine with the limitation, but also we have no way to test wazero runtimes under these unusual circumstances.
|
||
|
||
### Number of function types in a store
|
||
|
||
There's no limitation on the number of function types in [a store](https://www.w3.org/TR/2019/REC-wasm-core-1-20191205/#store%E2%91%A0) according to the spec. In wazero implementation, we assign each function type to a unique ID, and choose to use `uint32` to represent the IDs.
|
||
Therefore the maximum number of function types a store can have is limited to 2^27 as even that number would occupy 512MB just to reference the function types.
|
||
|
||
This is due to the same reason for the limitation on the number of functions above.
|
||
|
||
### Number of values on the stack in a function
|
||
|
||
While the the spec does not clarify a limitation of function stack values, wazero limits this to 2^27 = 134,217,728.
|
||
The reason is that we internally represent all the values as 64-bit integers regardless of its types (including f32, f64), and 2^27 values means
|
||
1 GiB = (2^30). 1 GiB is the reasonable for most applications [as we see a Goroutine has 250 MB as a limit on the stack for 32-bit arch](https://github.com/golang/go/blob/go1.20/src/runtime/proc.go#L152-L159), considering that WebAssembly is (currently) 32-bit environment.
|
||
|
||
All the functions are statically analyzed at module instantiation phase, and if a function can potentially reach this limit, an error is returned.
|
||
|
||
### Number of globals in a module
|
||
|
||
Theoretically, a module can declare globals (including imports) up to 2^32 times. However, wazero limits this to 2^27(134,217,728) per module.
|
||
That is because internally we store globals in a slice with pointer types (meaning 8 bytes on 64-bit platforms), and therefore 2^27 globals
|
||
means that we have 1 GiB size of slice which seems large enough for most applications.
|
||
|
||
### Number of tables in a module
|
||
|
||
While the the spec says that a module can have up to 2^32 tables, wazero limits this to 2^27 = 134,217,728.
|
||
One of the reasons is even that number would occupy 1GB in the pointers tables alone. Not only that, we access tables slice by
|
||
table index by using 32-bit signed offset in the compiler implementation, which means that the table index of 2^27 can reach 2^27 * 8 (pointer size on 64-bit machines) = 2^30 offsets in bytes.
|
||
|
||
We _believe_ that all use cases are fine with the limitation, but also note that we have no way to test wazero runtimes under these unusual circumstances.
|
||
|
||
If a module reaches this limit, an error is returned at the compilation phase.
|
||
|
||
## Compiler engine implementation
|
||
|
||
### Why it's safe to execute runtime-generated machine codes against async Goroutine preemption
|
||
|
||
Goroutine preemption is the mechanism of the Go runtime to switch goroutines contexts on an OS thread.
|
||
There are two types of preemption: cooperative preemption and async preemption. The former happens, for example,
|
||
when making a function call, and it is not an issue for our runtime-generated functions as they do not make
|
||
direct function calls to Go-implemented functions. On the other hand, the latter, async preemption, can be problematic
|
||
since it tries to interrupt the execution of Goroutine at any point of function, and manipulates CPU register states.
|
||
|
||
Fortunately, our runtime-generated machine codes do not need to take the async preemption into account.
|
||
All the assembly codes are entered via the trampoline implemented as Go Assembler Function (e.g. [arch_amd64.s](./arch_amd64.s)),
|
||
and as of Go 1.20, these assembler functions are considered as _unsafe_ for async preemption:
|
||
- https://github.com/golang/go/blob/go1.20rc1/src/runtime/preempt.go#L406-L407
|
||
- https://github.com/golang/go/blob/9f0234214473dfb785a5ad84a8fc62a6a395cbc3/src/runtime/traceback.go#L227
|
||
|
||
From the Go runtime point of view, the execution of runtime-generated machine codes is considered as a part of
|
||
that trampoline function. Therefore, runtime-generated machine code is also correctly considered unsafe for async preemption.
|
||
|
||
## Why context cancellation is handled in Go code rather than native code
|
||
|
||
Since [wazero v1.0.0-pre.9](https://github.com/tetratelabs/wazero/releases/tag/v1.0.0-pre.9), the runtime
|
||
supports integration with Go contexts to interrupt execution after a timeout, or in response to explicit cancellation.
|
||
This support is internally implemented as a special opcode `builtinFunctionCheckExitCode` that triggers the execution of
|
||
a Go function (`ModuleInstance.FailIfClosed`) that atomically checks a sentinel value at strategic points in the code.
|
||
|
||
[It _is indeed_ possible to check the sentinel value directly, without leaving the native world][native_check], thus sparing some cycles;
|
||
however, because native code never preempts (see section above), this may lead to a state where the other goroutines
|
||
never get the chance to run, and thus never get the chance to set the sentinel value; effectively preventing
|
||
cancellation from taking place.
|
||
|
||
[native_check]: https://github.com/tetratelabs/wazero/issues/1409
|
||
|
||
## Golang patterns
|
||
|
||
### Hammer tests
|
||
Code that uses concurrency primitives, such as locks or atomics, should include "hammer tests", which run large loops
|
||
inside a bounded amount of goroutines, run by half that many `GOMAXPROCS`. These are named consistently "hammer", so
|
||
they are easy to find. The name inherits from some existing tests in [golang/go](https://github.com/golang/go/search?q=hammer&type=code).
|
||
|
||
Here is an annotated description of the key pieces of a hammer test:
|
||
1. `P` declares the count of goroutines to use, defaulting to 8 or 4 if `testing.Short`.
|
||
* Half this amount are the cores used, and 4 is less than a modern laptop's CPU. This allows multiple "hammer" tests to run in parallel.
|
||
2. `N` declares the scale of work (loop) per goroutine, defaulting to value that finishes in ~0.1s on a modern laptop.
|
||
* When in doubt, try 1000 or 100 if `testing.Short`
|
||
* Remember, there are multiple hammer tests and CI nodes are slow. Slower tests hurt feedback loops.
|
||
3. `defer runtime.GOMAXPROCS(runtime.GOMAXPROCS(P/2))` makes goroutines switch cores, testing visibility of shared data.
|
||
4. To ensure goroutines execute at the same time, block them with `sync.WaitGroup`, initialized to `Add(P)`.
|
||
* `sync.WaitGroup` internally uses `runtime_Semacquire` not available in any other library.
|
||
* `sync.WaitGroup.Add` with a negative value can unblock many goroutines at the same time, e.g. without a for loop.
|
||
5. Track goroutines progress via `finished := make(chan int)` where each goroutine in `P` defers `finished <- 1`.
|
||
1. Tests use `require.XXX`, so `recover()` into `t.Fail` in a `defer` function before `finished <- 1`.
|
||
* This makes it easier to spot larger concurrency problems as you see each failure, not just the first.
|
||
2. After the `defer` function, await unblocked, then run the stateful function `N` times in a normal loop.
|
||
* This loop should trigger shared state problems as locks or atomics are contended by `P` goroutines.
|
||
6. After all `P` goroutines launch, atomically release all of them with `WaitGroup.Add(-P)`.
|
||
7. Block the runner on goroutine completion, by (`<-finished`) for each `P`.
|
||
8. When all goroutines complete, `return` if `t.Failed()`, otherwise perform follow-up state checks.
|
||
|
||
This is implemented in wazero in [hammer.go](internal/testing/hammer/hammer.go)
|
||
|
||
### Lock-free, cross-goroutine observations of updates
|
||
|
||
How to achieve cross-goroutine reads of a variable are not explicitly defined in https://go.dev/ref/mem. wazero uses
|
||
atomics to implement this following unofficial practice. For example, a `Close` operation can be guarded to happen only
|
||
once via compare-and-swap (CAS) against a zero value. When we use this pattern, we consistently use atomics to both
|
||
read and update the same numeric field.
|
||
|
||
In lieu of formal documentation, we infer this pattern works from other sources (besides tests):
|
||
* `sync.WaitGroup` by definition must support calling `Add` from other goroutines. Internally, it uses atomics.
|
||
* rsc in golang/go#5045 writes "atomics guarantee sequential consistency among the atomic variables".
|
||
|
||
See https://github.com/golang/go/blob/go1.20/src/sync/waitgroup.go#L64
|
||
See https://github.com/golang/go/issues/5045#issuecomment-252730563
|
||
See https://www.youtube.com/watch?v=VmrEG-3bWyM
|