Thanks for the detailed explanation. Yocto has offline builds, but is missing host filesystem isolation.
Are you familiar with StageX, https://codeberg.org/stagex/stagex/#comparison? There's a comparison chart on that page which claims that Nix(OS?) is not fully reproducible. It would be useful to know which subset, if any, is not reproducible.
It is correct that NixOS is not fully reproducible.
Regarding Nix vs NixOS vs Nixpkgs:
- Nix is the name of the programming language (that Nix derivations are written in) and the package manager/build tool. (Though at times the term 'Nix' is just used to refer to the broader ecosystem including NixOS and Nixpkgs.)
- NixOS is an operating system that is built on top of Nix derivations.
- Nixpkgs is a giant repository of Nix derivations, including NixOS. (NixOS and Nixpkgs used to be separate, but they were merged at some point in the past.)
The reason NixOS+Nixpkgs are not fully reproducible, despite all of the guarantees Nix gives you, is simply that some derivations have non-determinism in their build process. For example, the order in which operations complete in a parallel build can end up encoded in the final package.
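A toy sketch of that failure mode, in Python. The object names and the `link` helpers are made up for illustration and are not taken from any real build system; the point is only that an artifact assembled in job-completion order can differ between runs, while one assembled in a fixed order does not.

```python
import hashlib


def link(objects_in_completion_order):
    """Concatenate compiled objects in whatever order the jobs finished."""
    return b"".join(objects_in_completion_order)


def link_sorted(named_objects):
    """Concatenate objects in name order, ignoring completion order."""
    return b"".join(data for _name, data in sorted(named_objects))


def digest(artifact):
    return hashlib.sha256(artifact).hexdigest()


# Two runs of the "same" parallel build; the jobs just finished in a
# different order (simulated here with two hand-written lists).
run_a = [("a.o", b"AAAA"), ("b.o", b"BBBB")]
run_b = [("b.o", b"BBBB"), ("a.o", b"AAAA")]

naive_a = digest(link([data for _name, data in run_a]))
naive_b = digest(link([data for _name, data in run_b]))
stable_a = digest(link_sorted(run_a))
stable_b = digest(link_sorted(run_b))

print(naive_a == naive_b)    # False: completion order leaked into the output
print(stable_a == stable_b)  # True: a fixed order removes the difference
```

Nix's sandbox does not prevent this kind of thing on its own, because the inputs are identical in both runs; the fix has to happen inside the package's build.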
Unfortunately, I don't think there's funding or a ton of interest going into improving the reproducibility of NixOS at the moment, so progress towards squashing reproducibility issues has been slow for a while. You can see relatively up-to-date progress on getting the install media fully reproducible here:
StageX also correctly points out that the Nix bootstrap is inferior to some of the more extreme reproducibility projects. The Nix bootstrap is fairly large, unfortunately. The Guix team has put a substantial amount of effort into minimizing the bootstrap seed and improving the reproducibility of packages. The Nixpkgs bootstrap seed (for my machine, anyway) is currently 27 MiB. The GuixSD bootstrap seed is, I believe, 357 bytes, which is a stunning accomplishment.
StageX considers NixOS trust to be centralized and GuixSD trust to be distributed; this is likely because of the Hydra binary cache, which Nix is typically configured to trust by default. You can turn off the Hydra cache to remove this centralized entity, at the obvious cost of needing to build almost everything from scratch. I'm not sure what "distributed" trust actually means here, versus "decentralized".
StageX uses OCI image building as a base. It also doesn't seem to talk about sandboxing anywhere, so I presume StageX is using Dockerfile OCI builds as its only sandboxing, which still allows Internet access. Having Internet access during builds is convenient, but it makes it pretty hard to guarantee that all inputs are accounted for. Their Rust example is pretty interesting:
> RUN ["cargo", "add", "regex"]
There's nothing inherently wrong about this, but despite all of the effort to make the base StageX OCI images reproducible, if you were to build this exact same OCI image months apart, you would presumably be liable to get different results here: you could, for example, get an entirely different version of the regex crate. With Nix, if you make a derivation to build a Rust package, you have to account for the Cargo dependencies in the build, as Nix builds aren't allowed to access the Internet, with the exception of fixed-output derivations. While this doesn't make Nix derivations bit-for-bit reproducible, it does ensure that every (external) input is bit-for-bit identical across builds of the same exact derivation, something you can't really easily achieve without a custom build tool like Nix or Guix.
If it were possible to sneak an impurity into a Nix build, it would likely be treated as a CVE. (There are some exceptions on macOS due to limitations in Darwin sandboxing, but on Linux I believe this holds true. None of the exceptions would make it easy to accidentally introduce impurities on macOS, though; you'd pretty much need to do it on purpose.)
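To make the fixed-output idea concrete, here is a minimal Python sketch. It is not how Nix implements it; `accept_fetched_input`, `HashMismatch`, and the fake crate contents are all hypothetical. It only illustrates the rule that fetched content is accepted if and only if it matches a hash that was declared before the fetch happened.

```python
import hashlib


class HashMismatch(Exception):
    """Raised when fetched bytes do not match the hash declared up front."""


def accept_fetched_input(data, expected_sha256):
    """Return the data only if its SHA-256 matches the pinned value."""
    actual = hashlib.sha256(data).hexdigest()
    if actual != expected_sha256:
        raise HashMismatch(f"expected {expected_sha256}, got {actual}")
    return data


# Stand-ins for what the network might return today vs. months later.
crate_today = b"hypothetical regex crate tarball, version 1"
crate_later = b"hypothetical regex crate tarball, version 2"

# The hash is written down once, alongside the build recipe.
pinned = hashlib.sha256(crate_today).hexdigest()

accept_fetched_input(crate_today, pinned)  # accepted

try:
    accept_fetched_input(crate_later, pinned)
    upstream_change_detected = False
except HashMismatch:
    upstream_change_detected = True

print(upstream_change_detected)  # True: the build fails instead of drifting
```

An unpinned `cargo add regex` has no equivalent of `pinned`, so a changed upstream silently becomes a changed build.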
Even that aside, StageX uses the same impressive 357-byte bootstrap seed as GuixSD, so it is pretty cool for what it does. It's just a bit lower in scope than Nixpkgs and GuixSD. Nixpkgs is probably the largest single software repository ever built, with over 100,000 packages, all of which have to follow this schema of hermetic builds.
that Rust example is gonna bite us in the ass until the day i die, i need to remove it.
The Keyfork project is probably the best example of how an _actual_ Rust project is developed and shipped with stagex (disclaimer, I'm a maintainer of both). Actual Rust programs are built using the following steps:
1. Before building, the stagex tooling downloads and verifies a hash-locked version of the source package
* Additionally, all dependencies for the package (compilers _and_ linked libraries) are verified to have been built.
2. Packages and pallets (collections of packages) are unpacked `FROM scratch` into a bare container
3. `cargo fetch --locked` is then invoked to fetch all the dependencies.
4. `RUN --network=none` is used when compiling the actual binary to ensure no network access happens _after_ the `cargo fetch` stage. Admittedly, it is not ideal to allow turning network access on and off throughout a build, but `--network=none` has helped us identify some odd cases where network access _does_ happen.
5. Once the binary is built, the binary is added on top of the base "filesystem" package, and is considered "done".
Unless some source file gets completely yoinked off the internet (which has happened, and we've had to "rebuild the world" because of it), every stagex package should be 100% bit-for-bit reproducible even if run several years down the line.
There may be some cases where we miss a datestamp or something similar, but hopefully as time goes on, we get the infrastructure to mock system times and throw other wrenches to test how reproducible these packages really are.
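For anyone wondering what a missed datestamp looks like in practice, here is a small Python sketch. It is not stagex tooling; the `pack` helper and the file contents are made up. It packs the same files into a tar archive twice and shows that the bytes only match when the modification time is pinned, which is the idea behind the `SOURCE_DATE_EPOCH` convention from the Reproducible Builds project.

```python
import io
import tarfile


def pack(files, mtime):
    """Pack {name: bytes} into an uncompressed tar with a fixed mtime.

    Entries are added in sorted name order, and TarInfo's default owner
    fields (uid/gid 0, empty names) are left alone, so the only thing
    that varies between calls is the mtime argument.
    """
    buf = io.BytesIO()
    with tarfile.open(fileobj=buf, mode="w", format=tarfile.USTAR_FORMAT) as tar:
        for name in sorted(files):
            info = tarfile.TarInfo(name=name)
            info.size = len(files[name])
            info.mtime = mtime
            tar.addfile(info, io.BytesIO(files[name]))
    return buf.getvalue()


files = {"bin/tool": b"\x7fELF-pretend-binary", "README": b"hello\n"}

pinned_first = pack(files, mtime=0)
pinned_second = pack(files, mtime=0)
clock_based = pack(files, mtime=1_700_000_000)  # stands in for "time of build"

print(pinned_first == pinned_second)  # True: pinned timestamp, identical bytes
print(pinned_first == clock_based)    # False: the timestamp leaked into the archive
```

Mocking the system clock during a test build is a way to flush out the second case before it shows up years later.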
I probably should've mentioned that I don't actually have any familiarity with StageX. I did write that at some point but must've accidentally removed it from my reply while still working on it. Even so, I had a feeling the example wasn't a good illustration of how to actually use it properly, and I feel a little bad, because I didn't really mean to critique StageX over that particular issue; I just thought it was a good example of how Nix differs (Nix enforces purity, Dockerfile builds don't). It seems like with StageX the goal is to ensure that the build is bit-for-bit reproducible, as this would be a relatively good assurance that the inputs are also reproducible. On the other hand, it might be relatively hard to actually debug what went wrong in the more subtle cases where the inputs are not reproducible, since presumably the main symptom will be the output differing unexpectedly.
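As a rough illustration of what that debugging starts from, here is a hypothetical Python sketch (not something StageX or Nix ships) that hashes every file in two build outputs and reports which paths differ. It narrows down where the difference landed, but it says nothing about which input caused it, which is the hard part.

```python
import hashlib


def manifest(tree):
    """Map each path in a {path: bytes} build output to its SHA-256."""
    return {path: hashlib.sha256(data).hexdigest() for path, data in tree.items()}


def differing_paths(build_a, build_b):
    """Return the sorted paths that are missing or different between builds."""
    ma, mb = manifest(build_a), manifest(build_b)
    return sorted(p for p in ma.keys() | mb.keys() if ma.get(p) != mb.get(p))


# Two made-up outputs of the same recipe, built at different times.
build_a = {"bin/tool": b"same bytes", "share/doc/README": b"built 2024-01-01"}
build_b = {"bin/tool": b"same bytes", "share/doc/README": b"built 2024-06-01"}

print(differing_paths(build_a, build_b))  # ['share/doc/README']
```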
I'm definitely biased as a person who works on Nix stuff, but I am not an absolutist when it comes to any of these things. Based on what I'm reading about it, I'd happily rely on StageX if I wanted reproducible OCI builds (and didn't feel like using Nix to do it, which has plenty of complexities of its own, as nice as it can be).