TL;DR: Standardizing the loading process of Zarf packages.

Overview

Zarf natively supports creating the following package sources:

  • Local Tarball (.tar and .tar.zst)
    • Via zarf package create <dir> -o <dir>; whether the resulting tarball is compressed is determined by metadata.uncompressed in zarf.yaml
  • Split Tarball (.part...)
    • Via zarf package create <dir> --max-package-size <size> -o <dir> (or interactively if --confirm isn't passed)
  • OCI (oci://)
    • Via zarf package publish <source> oci:// or zarf package create <dir> -o oci://...
  • In-cluster (Deployed)
    • Post zarf package deploy <source> the package is shown in zarf package list

However, Zarf's ability to load these sources has been inconsistent, varying with the action being performed. For example:

  • Split tarball packages could be created and deployed, but not inspected or removed
  • In-cluster packages could be removed (by name), but not inspected
  • HTTP(S) URLs could be deployed, but not inspected or removed

And so the team was tasked with standardizing the loading process of Zarf packages so that:

  • deploy
  • inspect
  • remove
  • publish
  • pull
  • mirror-resources

would all work with all of the preceding sources.

A common interface

To accomplish this, the team turned to behavior-driven development and created a standard interface for all package sources to implement.

In Go, interfaces are satisfied implicitly: if a type defines every method of an interface with matching signatures, it is considered to implement that interface, with no explicit declaration needed. For package sources, this means that any type that implements the PackageSource interface can be used as a package source.

// PackageSource is an interface for package sources.
//
// While this interface defines three functions (LoadPackage, LoadPackageMetadata, and Collect), only one of them should be used within a packager function.
//
// These functions currently do not promise repeatability due to the side effect nature of loading a package.
//
// Signature and integrity validation is up to the implementation of the package source.
//
//  `sources.ValidatePackageSignature` and `sources.ValidatePackageIntegrity` can be leveraged for this purpose.
type PackageSource interface {
  // LoadPackage loads a package from a source.
  LoadPackage(dst *layout.PackagePaths) error

  // LoadPackageMetadata loads a package's metadata from a source.
  LoadPackageMetadata(dst *layout.PackagePaths, wantSBOM bool, skipValidation bool) error

  // Collect relocates a package from its source to a tarball in a given destination directory.
  Collect(destinationDirectory string) (tarball string, err error)
}

The team went through a ton of iterations on the naming, inputs, and outputs of each of the preceding functions, but the end result is that each function is responsible for a single task:

  • LoadPackage and LoadPackageMetadata load a package from a source, whilst populating the PackagePaths struct with "loaded" paths
  • Collect transforms the source into a tarball, and returns the path to the tarball

Signature validation and package integrity checks are also performed within each implementation. This gives library users of Zarf full control over how they want to handle signatures and package integrity.

Result

With an implementation in place for each source, any source can now be used for any action. For example:

# inspecting an in-cluster package
$ zarf package inspect init

# removing an OCI package 
# (🦄 is an alias to `ghcr.io/defenseunicorns/packages`, just `defenseunicorns` works as well)
$ zarf package remove oci://🦄/init:v0.29.2-amd64

# pulling an HTTPS package
$ zarf package pull https://pkg.zarf.dev/hello-world

# deploying a split tarball package
$ zarf package deploy ./zarf-package-multi-part-arm64.tar.zst.part000

# publishing OCI to OCI
# this uses the same code that bundle uses, so layers never touch your disk!
# it pipes the GET request from the source to the PUT request to the destination
$ zarf package publish oci://🦄/init:v0.29.2-amd64 oci://docker.io/waynestarr
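
The commands above pass very different source strings to the same subcommands, so something has to decide which implementation handles each one. The following is a rough sketch of how such a decision could be made from the string alone. The classifySource helper, its return values, and its matching rules are assumptions made for illustration, not Zarf's actual logic.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// partSuffix matches split-tarball parts such as ".part000".
var partSuffix = regexp.MustCompile(`\.part\d{3}$`)

// classifySource is a hypothetical helper that maps a source string to the
// kind of package source that would load it.
func classifySource(src string) string {
	switch {
	case strings.HasPrefix(src, "oci://"):
		return "oci"
	case strings.HasPrefix(src, "http://"), strings.HasPrefix(src, "https://"):
		return "url"
	case partSuffix.MatchString(src):
		return "split"
	case strings.HasSuffix(src, ".tar"), strings.HasSuffix(src, ".tar.zst"):
		return "tarball"
	default:
		// Assumption: anything else is the name of a deployed package.
		return "cluster"
	}
}

func main() {
	sources := []string{
		"init",
		"oci://🦄/init:v0.29.2-amd64",
		"https://pkg.zarf.dev/hello-world",
		"./zarf-package-multi-part-arm64.tar.zst.part000",
	}
	for _, src := range sources {
		fmt.Printf("%-8s %s\n", classifySource(src), src)
	}
}
```

The split check comes before the plain tarball check so that a path ending in .tar.zst.part000 is not mistaken for a single tarball.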

There are a ton more tweaks and changes under the hood to ensure feature parity of each source, but the end result is that Zarf is now more consistent and predictable when it comes to loading packages.