Before this change, the lower and upper parents provided to the cache
manager Diff method were not cloned, which resulted in some code paths
incorrectly providing them directly as the parents of the returned ref.
This meant that if they were released after the call to Diff, the diff
ref could become incorrectly invalidated.
Now, the lower and upper are cloned and unit test coverage has been
added to test that ref release is handled correctly.
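As a rough illustration of why cloning matters, here is a minimal
sketch of reference-counted handles. The ref type, newRef, Clone,
Release, Alive and diff below are hypothetical stand-ins, not the real
cache manager API; the point is only that the returned ref holds its
own handles on both parents.

```go
package main

import "fmt"

// ref is a hypothetical reference-counted handle, not BuildKit's real type.
type ref struct {
	name  string
	count *int // number of live handles sharing the same underlying record
}

func newRef(name string) *ref {
	n := 1
	return &ref{name: name, count: &n}
}

// Clone returns a new handle and bumps the shared count.
func (r *ref) Clone() *ref {
	*r.count++
	return &ref{name: r.name, count: r.count}
}

// Release drops one handle.
func (r *ref) Release() { *r.count-- }

// Alive reports whether any handle still holds the record.
func (r *ref) Alive() bool { return *r.count > 0 }

// diffRef is a hypothetical diff ref that keeps handles on its parents.
type diffRef struct{ lower, upper *ref }

// diff clones both parents so the returned ref owns its own handles
// instead of borrowing the caller's.
func diff(lower, upper *ref) *diffRef {
	return &diffRef{lower: lower.Clone(), upper: upper.Clone()}
}

func main() {
	lower, upper := newRef("lower"), newRef("upper")
	d := diff(lower, upper)

	// The caller releases its handles after the call to diff.
	lower.Release()
	upper.Release()

	// The diff ref's parents are still held by the clones.
	fmt.Println(d.lower.Alive(), d.upper.Alive()) // prints "true true"
}
```

Without the Clone calls in diff, both counts would drop to zero after
the caller's releases, which is the invalidation described above.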
Signed-off-by: Erik Sipsma <erik@sipsma.dev>
Before this, there could be a crash during a call to finalize a ref that
occurred after the snapshot was committed but before committing the
metadata that indicates the immutable ref no longer had an
equalMutable. This resulted in a situation where any future calls to
finalize that ref would fail.
Now, if that situation happens, the cache will notice when it's
initially loaded that the ref has an equalMutable that's missing its
snapshot and that its own snapshot exists. It will then just use the
correctly committed snapshot and clear the equalMutable field.
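The load-time check can be sketched roughly as follows. The record
struct, the snapshots map and the repair function are hypothetical
simplifications for illustration; the real cache metadata and
snapshotter lookups are more involved.

```go
package main

import "fmt"

// record is a hypothetical, simplified stand-in for the metadata of an
// immutable ref.
type record struct {
	snapshotID   string
	equalMutable string // snapshot ID of the mutable twin; empty if none
}

// repair clears a stale equalMutable when its snapshot is missing but the
// record's own committed snapshot exists. It reports whether the record
// was changed.
func repair(r *record, snapshots map[string]bool) bool {
	if r.equalMutable == "" {
		return false
	}
	if !snapshots[r.equalMutable] && snapshots[r.snapshotID] {
		r.equalMutable = ""
		return true
	}
	return false
}

func main() {
	// State left behind by a crash: the snapshot was committed, but the
	// metadata still points at the (now gone) mutable snapshot.
	snapshots := map[string]bool{"imm-1": true}
	r := &record{snapshotID: "imm-1", equalMutable: "mut-1"}

	fmt.Println(repair(r, snapshots), r.equalMutable == "") // prints "true true"
}
```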
Signed-off-by: Erik Sipsma <erik@sipsma.dev>
This adds test coverage for ensuring the readonly parameter is honored
as expected in the ref Mount methods. There was a regression introduced
during #2335 that went unnoticed until identified and fixed in #2562.
This test coverage should help prevent similar regressions in the
future.
Signed-off-by: Erik Sipsma <erik@sipsma.dev>
This allows you to create refs that are single layers representing the
diff between any two arbitrary refs. The primary use case for this is
to allow users to extract the changes created by ops like Exec and
rebase them elsewhere through MergeOp. However, there is no restriction
on the inputs to DiffOp and the resulting ref's layer is simply the
layer created by running the differ on the two input refs
(specifically, the same differ used during exports).
A Diff ref can be mounted by itself, in which case it is defined as the
result of applying the diff to Scratch. Most use cases though will use
Diff refs as the input to a MergeOp, in which case the diff is just
applied on top of the lower merge inputs, as was the case before.
In cases like Diff(A, A->B->C) (i.e. cases where the diff is between two
refs where the lower is an ancestor of upper), the diff will be defined
as the layers separating the two refs. In other cases, the diff is just
a single layer, not re-used from the inputs, representing the diff
between the two refs (which can be defined as the layer "Diff(A,B)" that
satisfies "Merge(A, Diff(A,B)) == B").
Note that there is technically a meaningful difference between the
"unmerge" behavior of extracting the layers separating diffs and the
"simple diff" of just running the differ on the two refs. Namely, in the
case where there are "intermediate deletes" (i.e. deletes that exist in
the layers between A and B but not in the direct diff of A and B),
then the simple diff and unmerge can create different results when
plugged into a MergeOp. This is due to the fact that intermediate
deletes will apply to the merge when using the unmerge behavior, but not
when using the simple diff. This is on top of the fact that the simple
diff inherently has a "flattening" behavior where multiple layers are
squashed into a single one.
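To make the difference concrete, here is a toy model in which a
filesystem is a map from path to content and a layer maps a path to
new content, with nil standing for a delete. The fs and layer types and
the apply and simpleDiff functions are illustrative assumptions, not
the real differ or MergeOp implementation.

```go
package main

import "fmt"

// fs is a toy filesystem: path -> content.
type fs map[string]string

// layer maps a path to its new content; a nil value is a delete.
type layer map[string]*string

func str(s string) *string { return &s }

// apply returns a copy of base with the layers applied in order.
func apply(base fs, layers ...layer) fs {
	out := fs{}
	for p, c := range base {
		out[p] = c
	}
	for _, l := range layers {
		for p, c := range l {
			if c == nil {
				delete(out, p)
			} else {
				out[p] = *c
			}
		}
	}
	return out
}

// simpleDiff computes the single layer that turns lower into upper.
func simpleDiff(lower, upper fs) layer {
	d := layer{}
	for p, c := range upper {
		if old, ok := lower[p]; !ok || old != c {
			d[p] = str(c)
		}
	}
	for p := range lower {
		if _, ok := upper[p]; !ok {
			d[p] = nil
		}
	}
	return d
}

func main() {
	a := fs{"/a": "1"}
	l1 := layer{"/foo": str("tmp")} // creates /foo
	l2 := layer{"/foo": nil}        // intermediate delete of /foo
	b := apply(a, l1, l2)           // B = A -> l1 -> l2

	base := fs{"/foo": "keep"} // merge target that already has /foo

	unmerged := apply(base, l1, l2)         // unmerge: replay the layers
	simple := apply(base, simpleDiff(a, b)) // simple diff: one flat layer

	_, inUnmerged := unmerged["/foo"]
	_, inSimple := simple["/foo"]
	fmt.Println(inUnmerged, inSimple) // prints "false true"
}
```

In this model the simple diff of A and B is empty, so /foo survives in
the merge target, while replaying the separating layers removes it.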
So, in the case where lower is an ancestor of upper, we choose to follow
the unmerge behavior, but it's possible users may prefer the simple diff
behavior. As of right now, they won't be able to do so, but if needed we
can add the ability to choose which behavior is followed in the future.
This could be done through a flag provided to DiffOp or possibly by
adapting llb.Copy to support this type of behavior with the same
efficiency as DiffOp.
Signed-off-by: Erik Sipsma <erik@sipsma.dev>
This fixes an issue where merge refs were incorrectly setting their
chain IDs to their last input's ID. This resulted in errors where
GetByBlob thought the merge ref and the final input ref were equivalent.
Now, merge refs have their chain IDs computed by digesting each blob in
the full chain.
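A minimal sketch of the idea, assuming the usual OCI-style recurrence
ChainID(L0..Ln) = Digest(ChainID(L0..Ln-1) + " " + DiffID(Ln)). The
chainID helper and the sample digests are made up for illustration, and
the exact computation used for merge refs may differ from this.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// chainID folds the given digests left to right, so the result depends
// on every entry in the chain and not only on the last one.
func chainID(digests []string) string {
	if len(digests) == 0 {
		return ""
	}
	id := digests[0]
	for _, d := range digests[1:] {
		sum := sha256.Sum256([]byte(id + " " + d))
		id = fmt.Sprintf("sha256:%x", sum)
	}
	return id
}

func main() {
	// Placeholder digests, not real blob digests.
	inputs := []string{"sha256:aaa", "sha256:bbb", "sha256:ccc"}

	merged := chainID(inputs)        // digest over the full chain
	lastOnly := chainID(inputs[2:])  // what the bug effectively used

	fmt.Println(merged != lastOnly) // prints "true"
}
```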
Signed-off-by: Erik Sipsma <erik@sipsma.dev>
This consists of just the base MergeOp with support for merging LLB
results that include deletions using hardlinks as the efficient path
and copies as fallback.
Signed-off-by: Erik Sipsma <erik@sipsma.dev>
It turns out that while BuildKit code did not need this method to
be public, moby code does still use it, so we have to re-add it
after its removal in #2216 (commit b85ef15).
This commit is not a revert because some of the changes are
still desirable, namely the removal of the "commit" parameter
which didn't serve any purpose.
Signed-off-by: Erik Sipsma <erik@sipsma.dev>
Currently, eStargz compression doesn't preserve the original tar metadata
(header bytes and their order). This causes `TestGetRemote` to fail because
an uncompressed blob converted from a gzip blob has a different digest
from one converted from an eStargz blob, even if their original tars
(computed by the differ) are the same.
This commit solves this issue by fixing eStargz to preserve original tar's
metadata that is modified by eStargz.
Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
eStargz support has been removed from this test because the
implementation does not guarantee digest stability. The only
reason the test passed was the exceptions in the variant map,
which ignored cases where timing caused the digest to come out
wrong. This needs to be addressed in a follow-up if we want to
keep eStargz support.
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
There are a few goals with this refactor:
1. Remove external access to fields that no longer make sense and/or
won't make sense soon due to other potential changes. For example,
there can now be multiple blobs associated with a ref (for different
compression types), so the fact that you could access the "Blob"
field from the Info method on Ref incorrectly implied there was just
a single blob for the ref. This is on top of the fact that there is
no need for external access to blob digests.
2. Centralize use of cache metadata inside the cache package.
Previously, many parts of the code outside the cache package could
obtain the bolt storage item for any ref and read/write it directly.
This made it hard to understand what fields are used and when. Now,
the Metadata method has been removed from the Ref interface and
replaced with getters+setters for metadata fields we want to expose
outside the package, which makes it much easier to track and
understand. Similar changes have been made to the metadata search
interface.
3. Use a consistent getter+setter interface for metadata, replacing
the mix of interfaces like Metadata(), Size(), Info() and other
inconsistencies.
Signed-off-by: Erik Sipsma <erik@sipsma.dev>
Previously, the flightcontrol group was being given a key just set to
the ref's ID, which meant that concurrent calls using different values
of compressionType, createIfNeeded and forceCompression would
incorrectly be de-duplicated.
The change here splits up the flightcontrol group into a few separate
calls and ensures that all the correct input variables are put into the
flightcontrol keys.
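The key composition can be sketched like this. The blobKey function and
its format string are hypothetical; the only point is that every input
that changes the result of the call has to be part of the key.

```go
package main

import "fmt"

// blobKey builds a de-duplication key from the ref ID plus every
// parameter that affects the outcome of the call.
func blobKey(refID, compressionType string, createIfNeeded, forceCompression bool) string {
	return fmt.Sprintf("%s-%s-%t-%t", refID, compressionType, createIfNeeded, forceCompression)
}

func main() {
	// Same ref, different parameters: these calls must not share a key.
	k1 := blobKey("ref-1", "gzip", true, false)
	k2 := blobKey("ref-1", "estargz", true, false)
	k3 := blobKey("ref-1", "gzip", true, true)

	fmt.Println(k1 != k2, k1 != k3) // prints "true true"
}
```

With a key of just "ref-1", all three calls above would have been
collapsed into one and two of them would have received a result
computed for the wrong parameters.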
Signed-off-by: Erik Sipsma <erik@sipsma.dev>
GetByBlob checks to see if there are any other blobs with the same
(uncompressed) ChainID and, if so, reuses their unpacked snapshot if it
exists.
The problem was that when this code found a match, it tried to get the
matching record, but couldn't do so when the match was lazy because the
caller doesn't necessarily have descriptor handlers set up for it.
This commit changes the behavior to just ignore any match with the same
ChainID that's also lazy as they just aren't usable for the
snapshot-reuse optimization.
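In simplified form, the new matching rule looks something like the
following. The blobRecord struct and reusableSnapshot function are
hypothetical names used only for this sketch.

```go
package main

import "fmt"

// blobRecord is a hypothetical, simplified view of a cached blob.
type blobRecord struct {
	ID      string
	ChainID string
	Lazy    bool // true when the content has not been pulled yet
}

// reusableSnapshot returns the first record with the given ChainID that
// is not lazy. Lazy matches are skipped because the caller may not have
// the descriptor handlers needed to load them.
func reusableSnapshot(records []blobRecord, chainID string) (blobRecord, bool) {
	for _, r := range records {
		if r.ChainID == chainID && !r.Lazy {
			return r, true
		}
	}
	return blobRecord{}, false
}

func main() {
	records := []blobRecord{
		{ID: "ref-a", ChainID: "chain-1", Lazy: true},
		{ID: "ref-b", ChainID: "chain-1", Lazy: false},
	}
	match, ok := reusableSnapshot(records, "chain-1")
	fmt.Println(match.ID, ok) // prints "ref-b true"
}
```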
Signed-off-by: Erik Sipsma <erik@sipsma.dev>
Finalize was only used outside the cache package in one place, which
called it with the commit arg set to false. The code path followed
when commit==false turned out to essentially be a no-op because
it set "retain cache" to true if it was already set to true.
It was thus safe to remove the only external call to it and remove it
from the interface. This should be helpful for future efforts to
simplify the equal{Mutable,Immutable} fields in cacheRecord, which exist
due to the "lazy commit" feature that Finalize is tied into.
Signed-off-by: Erik Sipsma <erik@sipsma.dev>
Containerd's mounter doesn't yet support bind-mounts on Windows.
BuildKit short-cuts this for read-write mounts, but not read-only
mounts.
Signed-off-by: Paul "TBBle" Hampson <Paul.Hampson@Pobox.com>
```
[5/5] RUN --mount=target=/go/src/github.com/moby/buildkit gometalinter ...
0.435 util/rootless/specconv/specconv_linux.go:1:⚠️ file is not goimported (goimports)
1.320 cache/manager.go:1:⚠️ file is not goimported (goimports)
1.335 cache/manager_test.go:1:⚠️ file is not goimported (goimports)
1.337 cache/migrate_v2.go:1:⚠️ file is not goimported (goimports)
1.342 cache/refs.go:1:⚠️ file is not goimported (goimports)
1.454 cache/remotecache/registry/registry.go:1:⚠️ file is not goimported (goimports)
2.285 cmd/buildctl/build.go:1:⚠️ file is not goimported (goimports)
3.082 executor/oci/user.go:1:⚠️ file is not goimported (goimports)
4.333 session/content/content_test.go:1:⚠️ file is not goimported (goimports)
4.614 snapshot/containerd/content.go:1:⚠️ file is not goimported (goimports)
4.721 solver/errdefs/vertex.go:1:⚠️ file is not goimported (goimports)
6.066 util/network/cniprovider/cni.go:1:⚠️ file is not goimported (goimports)
ERROR: executor failed running [/bin/sh -c gometalinter --config=gometalinter.json ./...]: buildkit-runc did not terminate successfully
```
Signed-off-by: Sebastiaan van Stijn <github@gone.nl>
This enhancement allows unpacking an image into containerd
snapshotter storage with `--output type=image,<.>=<.>,unpack=true`.
To support this feature, we need to extend the Snapshotter with a
`Name() string` function, because setting the gc label for the
snapshotter requires the snapshotter name.
fix: #908
Signed-off-by: Wei Fu <fuweid89@gmail.com>
Replaces the previous mutable.Freeze logic with
commits that can live together with mutable data.
A Finalize method is added for cases where the
implementation needs to make sure that the immutable
ref is flushed to the driver. Refs are automatically
finalized when writable layers are created on
top of them.
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>