This just makes sure the logic for the layer conversion is all in one
place and settable by a common option.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
Before this change buildkit was changing the media type for
non-distributable layers to normal layers.
It was also clearing out the urls to get those blobs.
Now the layer mediatype and URL's are preserved.
If a layer blob is seen more than once, if it has extra URL's they will
be appended to the stored value.
On export there is now a new exporter option to preserve the
non-distributable data values.
All URL's seen by buildkit will be added to the exported content.
Signed-off-by: Brian Goff <cpuguy83@gmail.com>
This allows clients to specify that LLB states should be grouped in
progress output under a custom name. Status updates for all vertexes in
the group will show up under a single vertex in the output.
The intended use cases are for Dockerfile COPY's that use MergeOp as a
backend and for grouping some other internal vertexes during frontend
builds.
Signed-off-by: Erik Sipsma <erik@sipsma.dev>
Now, when a merge or diff ref is unlazied, the progress will show up
under the vertex for the merge/diff ref. Additionally, any ancestors of
the op that also need to be unlazied as part of unlazying the merge/diff
will show status updates under its vertex in the progress.
Signed-off-by: Erik Sipsma <erik@sipsma.dev>
compression-level option can be set on export to
define the preferred speed vs compression ratio. The
value is a number dependent on the compression algorithm.
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
Before this change, the lower and upper parents provided to the cache
manager Diff method were not cloned, which resulted in some code paths
incorrectly providing them directly as the parents of the returned ref.
This meant that if they were released after the call to Diff, the diff
ref could become incorrectly invalidated.
Now, the lower and upper are cloned and unit test coverage has been
added to test that ref release is handled correctly.
Signed-off-by: Erik Sipsma <erik@sipsma.dev>
Before this, there could be crash during a call to finalize a ref that
occured after the snapshot was committed but before committing the
metadata that indicates the immutable ref no longer had an
equalMutable. This resulted in a situation where any future calls to
finalize that ref would fail.
Now, if that situation happens, the cache will notice when it's
initially loaded that the ref has an equalMutable that's missing its
snapshot and that its own snapshot exists. It will then just use the
correctly committed snapshot and clear the equalMutable field.
Signed-off-by: Erik Sipsma <erik@sipsma.dev>
This adds test coverage for ensuring the readonly parameter is honored
as expected in the ref Mount methods. There was a regression introduced
during #2335 that went unnoticed until identified and fixed in #2562.
This test coverage should help prevent similar regressions in the
future.
Signed-off-by: Erik Sipsma <erik@sipsma.dev>
This change enables inline cache to work as expected with MergeOp by
supporting a new type of result, DiffResult, which enables results to be
specified as a specific ordered set of layers inside an image.
Previously, results could only be specified with a singe layer index,
which meant that they had to start at the image's base layer and end at
that index. That meant that merge inputs couldn't be specified as they
are often a subset of the image layers that don't begin at the base.
Signed-off-by: Erik Sipsma <erik@sipsma.dev>
This allows you to create refs that are single layers representing the
diff between any two arbitrary refs. The primary use case for this is
to allows users to extract the changes created by ops like Exec and
rebase them elsewhere through MergeOp. However, there is no restriction
on the inputs to DiffOp and the resulting ref's layer is simply the
layer created by running the differ on the two inputs refs
(specifically, the same differ used during exports).
A Diff ref can be mounted by itself, in which case it is defined as the
result of applying the diff to Scratch. Most use cases though will use
Diff refs as the input to a MergeOp, in which case the diff is just
applied on top of the lower merge inputs, as was the case before.
In cases like Diff(A, A->B->C) (i.e. cases where the diff is between two
refs where the lower is an ancestor of upper), the diff will be defined
as the layers separating the two refs. In other cases, the diff is just
a single layer, not re-used from the inputs, representing the diff
between the two refs (which can be defined as the layer "Diff(A,B)" that
satisfies "Merge(A, Diff(A,B)) == B").
Note that there is technically a meaningful difference between the
"unmerge" behavior of extracting the layers separating diffs and the
"simple diff" of just running the differ on the two refs. Namely, in the
case where there are "intermediate deletes" (i.e. deletes that only
exist in layers between A and B but not between A and B by themselves),
then the simple diff and unmerge can create different results when
plugged into a MergeOp. This is due to the fact that intermediate
deletes will apply to the merge when using the unmerge behavior, but not
when using the simple diff. This is on top of the fact that the simple
diff inherently has a "flattening" behavior where multiple layers are
squashed into a single one.
So, in the case where lower is an ancestor of upper, we choose to follow
the unmerge behavior, but it's possible users may prefer the simple diff
behavior. As of right now, they won't be able to do so, but if needed we
can add the ability to choose which behavior is followed in the future.
This could be done through a flag provided to DiffOp or possibly by
adapting llb.Copy to support this type of behavior with the same
efficiency as DiffOp.
Signed-off-by: Erik Sipsma <erik@sipsma.dev>
Before this change, test cases were running with an env var that forces
the overlay differ to be on even when the native snapshotter was being
used, which resulted in failures. Now, that env var is skipped when
using the native snapshotter.
Additionally, this includes a related change to skip even trying to use
the overlay differ when the native snapshotter is in use. Previously,
the blob creation code first tried to use the overlay differ and then
failed and fell back to the double-walking differ. Now, it just jumps
right to the double-walking differ when the native snapshotter is in
use.
Signed-off-by: Erik Sipsma <erik@sipsma.dev>
switch to using newer MatchesUsingParentResults methods which were
introduced in https://github.com/moby/moby/pull/43037
Signed-off-by: Alex Couture-Beil <alex@earthly.dev>
Before this, if you try to get a ref with an equal blobchain in
GetByBlob but hit a missing provider, the error was just returned. While
we never expect this situation to happen (you shouldn't be able to hit
this line if you didn't already have providers for each blob in the
chain), it technically shouldn't fail the build as you can just continue
on without re-using the ref with equal blobchainID.
Now, we log this at error level but allow the build to continue.
Signed-off-by: Erik Sipsma <erik@sipsma.dev>
This fixes an issue where merge refs were incorrectly setting their
chain IDs to their last input's ID. This resulted in errors where
GetByBlob thought the merge ref and the final input ref were equivalent.
Now, merge refs have their chain IDs computed by digesting each blob in
the full chain.
Signed-off-by: Erik Sipsma <erik@sipsma.dev>
This consists of just the base MergeOp with support for merging LLB
results that include deletions using hardlinks as the efficient path
and copies as fallback.
Signed-off-by: Erik Sipsma <erik@sipsma.dev>
This is mostly just preparation for merge-op. The existing
Extract method is updated to be usable for unlazying any type of refs
rather than just lazy blobs. The way views are created is simplified and
centralized in one location.
Signed-off-by: Erik Sipsma <erik@sipsma.dev>
The Parent method will no longer make sense with forthcoming Merge and
Diff support as refs will become capable of having multiple parents. It
was also only ever used externally to get the full chain of refs for
each layer in the ref's chain.
The newly added LayerChain method replaces Parents with a method that
just returns a slice of refs for each layer in the ref's chain. This
will work more seamlessly with Merge and Diff (in which case it returns
the "flattened" ancestors of the ref) in addition to being a bit easier
to use for the exiting cases anyways.
Signed-off-by: Erik Sipsma <erik@sipsma.dev>
Before this, descriptor handlers were not included with calls to the
exporter, which then sometimes called LoadRef and failed to get a ref
because it was lazy. This change results in the DescHandlers of the
already loaded refs to get plugged into context so they can be re-used
by the exporter if it needs to load the ref again.
Signed-off-by: Erik Sipsma <erik@sipsma.dev>
It turns out that while Buildkit code did not need this method to
be public, moby code does still use it, so we have to re-add it
after its removal in #2216 (commit b85ef15).
This commit is not a revert because some of the changes are
still desireable, namely the removal of the "commit" parameter
which didn't serve any purpose.
Signed-off-by: Erik Sipsma <erik@sipsma.dev>
FromRemote now calls CheckDescriptor to validate
if the blob still exists. Otherwise cache loading
fallback does not get triggered because cache is
actually lazily pulled in only on exporting phase.
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>