Commit Graph

341 Commits (dd992414a36bbd19644c6008dd862ff2e7828d36)

Author SHA1 Message Date
CrazyMax 3df7865fbb
Merge pull request #2571 from tonistiigi/lint-update
hack: update linter to v1.43
2022-01-20 17:23:10 +01:00
Erik Sipsma 417ba9e0ed cache: improve diff ref release logic.
Before this change, the lower and upper parents provided to the cache
manager Diff method were not cloned, which resulted in some code paths
incorrectly providing them directly as the parents of the returned ref.
This meant that if they were released after the call to Diff, the diff
ref could become incorrectly invalidated.

Now, the lower and upper are cloned and unit test coverage has been
added to test that ref release is handled correctly.

Signed-off-by: Erik Sipsma <erik@sipsma.dev>
2022-01-19 14:03:49 -08:00
Tonis Tiigi dc21885891 hack: enable more linters
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2022-01-19 12:20:30 -08:00
Tonis Tiigi 01e935cff5 hack: update linter to v1.43
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2022-01-19 11:48:57 -08:00
Erik Sipsma eb935e525f cache: handle crash after snapshot commit
Before this, there could be crash during a call to finalize a ref that
occured after the snapshot was committed but before committing the
metadata that indicates the immutable ref no longer had an
equalMutable. This resulted in a situation where any future calls to
finalize that ref would fail.

Now, if that situation happens, the cache will notice when it's
initially loaded that the ref has an equalMutable that's missing its
snapshot and that its own snapshot exists. It will then just use the
correctly committed snapshot and clear the equalMutable field.

Signed-off-by: Erik Sipsma <erik@sipsma.dev>
2022-01-18 12:22:52 -08:00
Erik Sipsma 0eca53360b cache: add coverage for Mount readonly parameter.
This adds test coverage for ensuring the readonly parameter is honored
as expected in the ref Mount methods. There was a regression introduced
during #2335 that went unnoticed until identified and fixed in #2562.
This test coverage should help prevent similar regressions in the
future.

Signed-off-by: Erik Sipsma <erik@sipsma.dev>
2022-01-17 16:47:44 -08:00
Erik Sipsma 6471f316d3
Merge pull request #2562 from ktock/refmountcachero
cache: do not ignore readonly option
2022-01-16 16:37:34 -08:00
Kohei Tokunaga f8c9a756e3 cache: do not ignore readonly option
Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
2022-01-16 13:20:17 +09:00
Erik Sipsma ce012ab70c remote cache: support arbitrary layers as results
This change enables inline cache to work as expected with MergeOp by
supporting a new type of result, DiffResult, which enables results to be
specified as a specific ordered set of layers inside an image.
Previously, results could only be specified with a singe layer index,
which meant that they had to start at the image's base layer and end at
that index. That meant that merge inputs couldn't be specified as they
are often a subset of the image layers that don't begin at the base.

Signed-off-by: Erik Sipsma <erik@sipsma.dev>
2022-01-11 13:27:05 -08:00
Erik Sipsma 5c4dcb2741 cache: add support for Diff refs.
This allows you to create refs that are single layers representing the
diff between any two arbitrary refs. The primary use case for this is
to allows users to extract the changes created by ops like Exec and
rebase them elsewhere through MergeOp. However, there is no restriction
on the inputs to DiffOp and the resulting ref's layer is simply the
layer created by running the differ on the two inputs refs
(specifically, the same differ used during exports).

A Diff ref can be mounted by itself, in which case it is defined as the
result of applying the diff to Scratch. Most use cases though will use
Diff refs as the input to a MergeOp, in which case the diff is just
applied on top of the lower merge inputs, as was the case before.

In cases like Diff(A, A->B->C) (i.e. cases where the diff is between two
refs where the lower is an ancestor of upper), the diff will be defined
as the layers separating the two refs. In other cases, the diff is just
a single layer, not re-used from the inputs, representing the diff
between the two refs (which can be defined as the layer "Diff(A,B)" that
satisfies "Merge(A, Diff(A,B)) == B").

Note that there is technically a meaningful difference between the
"unmerge" behavior of extracting the layers separating diffs and the
"simple diff" of just running the differ on the two refs. Namely, in the
case where there are "intermediate deletes" (i.e. deletes that only
exist in layers between A and B but not between A and B by themselves),
then the simple diff and unmerge can create different results when
plugged into a MergeOp. This is due to the fact that intermediate
deletes will apply to the merge when using the unmerge behavior, but not
when using the simple diff. This is on top of the fact that the simple
diff inherently has a "flattening" behavior where multiple layers are
squashed into a single one.

So, in the case where lower is an ancestor of upper, we choose to follow
the unmerge behavior, but it's possible users may prefer the simple diff
behavior. As of right now, they won't be able to do so, but if needed we
can add the ability to choose which behavior is followed in the future.
This could be done through a flag provided to DiffOp or possibly by
adapting llb.Copy to support this type of behavior with the same
efficiency as DiffOp.

Signed-off-by: Erik Sipsma <erik@sipsma.dev>
2022-01-06 11:05:51 -08:00
Tõnis Tiigi b9c4e0b302
Merge pull request #2486 from alexcb/acb/update-fsutil-and-docker
update fsutil and docker
2021-12-10 21:35:31 -08:00
Erik Sipsma abf373a3b6 cache: Disable overlay diff for native snapshotter
Before this change, test cases were running with an env var that forces
the overlay differ to be on even when the native snapshotter was being
used, which resulted in failures. Now, that env var is skipped when
using the native snapshotter.

Additionally, this includes a related change to skip even trying to use
the overlay differ when the native snapshotter is in use. Previously,
the blob creation code first tried to use the overlay differ and then
failed and fell back to the double-walking differ. Now, it just jumps
right to the double-walking differ when the native snapshotter is in
use.

Signed-off-by: Erik Sipsma <erik@sipsma.dev>
2021-12-09 21:02:44 -08:00
Alex Couture-Beil c55a0b888c
use newer MatchesUsingParentResults
switch to using newer MatchesUsingParentResults methods which were
introduced in https://github.com/moby/moby/pull/43037

Signed-off-by: Alex Couture-Beil <alex@earthly.dev>
2021-12-08 09:41:55 -08:00
Erik Sipsma 441f1e7b27 cache: log missing providers for blobchainID ref
Before this, if you try to get a ref with an equal blobchain in
GetByBlob but hit a missing provider, the error was just returned. While
we never expect this situation to happen (you shouldn't be able to hit
this line if you didn't already have providers for each blob in the
chain), it technically shouldn't fail the build as you can just continue
on without re-using the ref with equal blobchainID.

Now, we log this at error level but allow the build to continue.

Signed-off-by: Erik Sipsma <erik@sipsma.dev>
2021-12-02 12:28:28 -08:00
Erik Sipsma 5872bf3dd1 cache: fix merge ref chain IDs.
This fixes an issue where merge refs were incorrectly setting their
chain IDs to their last input's ID. This resulted in errors where
GetByBlob thought the merge ref and the final input ref were equivalent.

Now, merge refs have their chain IDs computed by digesting each blob in
the full chain.

Signed-off-by: Erik Sipsma <erik@sipsma.dev>
2021-12-02 12:21:44 -08:00
Erik Sipsma d73e62f878 Add initial MergeOp implementation.
This consists of just the base MergeOp with support for merging LLB
results that include deletions using hardlinks as the efficient path
and copies as fallback.

Signed-off-by: Erik Sipsma <erik@sipsma.dev>
2021-11-18 11:10:48 -08:00
Erik Sipsma 9321ec2f82 Refactor cache record mount cache.
This is mostly just preparation for merge-op. The existing
Extract method is updated to be usable for unlazying any type of refs
rather than just lazy blobs. The way views are created is simplified and
centralized in one location.

Signed-off-by: Erik Sipsma <erik@sipsma.dev>
2021-11-17 11:02:16 -08:00
Erik Sipsma 03ed0548ef cache: Replace Parent method with LayerChain.
The Parent method will no longer make sense with forthcoming Merge and
Diff support as refs will become capable of having multiple parents. It
was also only ever used externally to get the full chain of refs for
each layer in the ref's chain.

The newly added LayerChain method replaces Parents with a method that
just returns a slice of refs for each layer in the ref's chain. This
will work more seamlessly with Merge and Diff (in which case it returns
the "flattened" ancestors of the ref) in addition to being a bit easier
to use for the exiting cases anyways.

Signed-off-by: Erik Sipsma <erik@sipsma.dev>
2021-11-17 11:02:16 -08:00
Erik Sipsma 2fcce87cf4 Include DescHandler CacheOpts during export.
Before this, descriptor handlers were not included with calls to the
exporter, which then sometimes called LoadRef and failed to get a ref
because it was lazy. This change results in the DescHandlers of the
already loaded refs to get plugged into context so they can be re-used
by the exporter if it needs to load the ref again.

Signed-off-by: Erik Sipsma <erik@sipsma.dev>
2021-11-17 11:02:16 -08:00
Erik Sipsma 982f4b8687 Move overlay diff to util package.
This allows the overlay diff logic to be re-used by the snapshot package
as part of merge+diff op.

Signed-off-by: Erik Sipsma <erik@sipsma.dev>
2021-11-15 14:19:37 -08:00
Tonis Tiigi faf5ad9e96 push: workaround deadlock in containerd pusher
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2021-11-12 17:33:08 -08:00
Akihiro Suda 0279989d7f
Merge pull request #2433 from tonistiigi/gha-export-fix
gha: fix handling removed blobs on reexport
2021-11-01 13:23:57 +09:00
Tonis Tiigi 962c287b1b gha: fix handling removed blobs on reexport
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2021-10-28 20:30:48 -07:00
CrazyMax 54b8ff2fc8
go fmt: add //go:build
Signed-off-by: CrazyMax <crazy-max@users.noreply.github.com>
2021-10-28 13:26:43 +02:00
Tõnis Tiigi 4b86211bed
Merge pull request #2405 from ktock/cachecompression
Propagate compression options to the inline cache export
2021-10-27 20:58:53 -07:00
Erik Sipsma 3adb271954 Re-add Finalize method to ImmutableRef.
It turns out that while Buildkit code did not need this method to
be public, moby code does still use it, so we have to re-add it
after its removal in #2216 (commit b85ef15).

This commit is not a revert because some of the changes are
still desireable, namely the removal of the "commit" parameter
which didn't serve any purpose.

Signed-off-by: Erik Sipsma <erik@sipsma.dev>
2021-10-25 10:51:30 -07:00
Kohei Tokunaga f9e0346b34 Propagate compression options to the inline cache export
Co-authored-by: Tonis Tiigi <tonistiigi@gmail.com>
Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
2021-10-22 15:43:32 +09:00
Kohei Tokunaga 5c27a53a15 Converter: make sure uncompressed digest annotation is set
Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
2021-10-05 15:02:05 +09:00
Tõnis Tiigi 78e88560e1
Merge pull request #2387 from tonistiigi/gha-fixes
fixes for github actions cache
2021-10-04 08:53:08 -07:00
Akihiro Suda 10947b040c
Merge pull request #2388 from ktock/containerd-overlaydiff-bufcopy
Differ: write diff to the content store over bufio writer
2021-10-04 17:05:49 +09:00
Kohei Tokunaga 9ffd7f24cb Do not enable overlayfs differ for fuse-overlayfs-snapshotter
Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
2021-10-04 13:50:05 +09:00
Kohei Tokunaga 4f3e74c0ca Differ: write diff to the content store over bufio writer
Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
2021-10-04 11:43:00 +09:00
Tonis Tiigi 066a011c01 gha: handle missing blob gracefully
FromRemote now calls CheckDescriptor to validate
if the blob still exists. Otherwise cache loading
fallback does not get triggered because cache is
actually lazily pulled in only on exporting phase.

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2021-10-02 14:52:09 -07:00
Tonis Tiigi dad6751112 gha: handle already exist error on save
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2021-10-01 22:33:47 -07:00
Tõnis Tiigi deb1440fe6
Merge pull request #2372 from jgiannuzzi/fix-issue-2198
Fix flakiness during import of a cache with empty layers removed
2021-09-23 23:19:09 -07:00
Kohei Tokunaga da821a471c Fix estargz compression loses the original tar metadata
Currently, eStargz compression doesn't preserve the original tar metadata
(header bytes and their order). This causes failure of `TestGetRemote` because
an uncompressed blob converted from a gzip blob provides different digset
against the one converted from eStargz blob even if their original tar (computed
by differ) are the same.
This commit solves this issue by fixing eStargz to preserve original tar's
metadata that is modified by eStargz.

Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
2021-09-22 10:50:57 +09:00
Jonathan Giannuzzi 2c540bdc9d Fix issues #1980 and #2198
Signed-off-by: Jonathan Giannuzzi <jonathan@giannuzzi.me>
2021-09-21 15:57:00 +01:00
Tonis Tiigi 35fcb28a00 Clean up old TODOs
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2021-09-14 22:28:08 -07:00
Tõnis Tiigi f5eb400a85
Merge pull request #2318 from aaronlehmann/follow-links-includedpaths
Follow links in includedPaths to resolve incorrect caching when source path is behind symlink
2021-09-08 10:43:26 -07:00
Aaron Lehmann e9e6cec838 Use getFollowLinksWalked
Signed-off-by: Aaron Lehmann <alehmann@netflix.com>
2021-09-08 09:14:48 -07:00
Aaron Lehmann 98f54ff22c Handle the case of multiple path component symlinks (including last component) in wildcard prefix
Signed-off-by: Aaron Lehmann <alehmann@netflix.com>
2021-09-07 16:46:00 -07:00
Aaron Lehmann ddd18de18e Add test case for symlink which is not final path component before wildcard
Signed-off-by: Aaron Lehmann <alehmann@netflix.com>
2021-09-07 13:41:19 -07:00
Tonis Tiigi a5e0b865f8 update getremote test for zstd
Estargz support has been removed from this test as
implementation does not guarantee digest stability
and only reason it passed were the exceptions in the
test via variant map that ignored cases where timing
resulted the digest to go wrong. This needs to be
addressed in the follow up if we want to keep estargz
support.

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2021-09-06 17:58:47 -07:00
Tonis Tiigi 8b5c4d74ef exporter: support creating blobs with zstd compression
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2021-09-05 22:43:15 -07:00
Kohei Tokunaga d586efd5db Compute diff from the upper dir of overlayfs-based snapshotter
Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
2021-09-02 17:58:12 +09:00
Erik Sipsma a9f1980ebb Refactor cache metadata interface.
There are a few goals with this refactor:
1. Remove external access to fields that no longer make sense and/or
   won't make sense soon due to other potential changes. For example,
   there can now be multiple blobs associated with a ref (for different
   compression types), so the fact that you could access the "Blob"
   field from the Info method on Ref incorrectly implied there was just
   a single blob for the ref. This is on top of the fact that there is
   no need for external access to blob digests.
2. Centralize use of cache metadata inside the cache package.
   Previously, many parts of the code outside the cache package could
   obtain the bolt storage item for any ref and read/write it directly.
   This made it hard to understand what fields are used and when. Now,
   the Metadata method has been removed from the Ref interface and
   replaced with getters+setters for metadata fields we want to expose
   outside the package, which makes it much easier to track and
   understand. Similar changes have been made to the metadata search
   interface.
3. Use a consistent getter+setter interface for metadata, replacing
   the mix of interfaces like Metadata(), Size(), Info() and other
   inconsistencies.

Signed-off-by: Erik Sipsma <erik@sipsma.dev>
2021-08-25 19:15:09 +00:00
Kohei Tokunaga f8d30d567e Add `estargz` compression type
Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
2021-08-24 14:10:09 +09:00
Aaron Lehmann 8021a3e667 Use fixed fileutils matching functions
This is important for two reasons:

1) Keeps caching logic consistent with recent fsutil changes to use
   these functions (also vendored here).

2) Allows us to move forward with removal of the original buggy Matches
   implementation in moby/moby.

Signed-off-by: Aaron Lehmann <alehmann@netflix.com>
2021-08-21 11:17:09 -07:00
Aaron Lehmann 56c0981f1e Follow links in includedPaths to resolve incorrect caching when source path is behind symlink
As discussed in #2300, includedPaths does not resolve symlinks when
looking up the source path in the prefix tree. If the user requests a
path that involves symlinks (for example, /a/foo when a symlink /a -> /b
exists), includedPaths will not find it, and will expect nothing to be
copied. This does not match the actual copy behavior implemented in
fsutil, which will follow symlinks in prefix components of a given path,
so it can end up caching an empty result even though the copy will
produce a non-empty result, which is quite bad.

To fix this, use getFollowLinks to resolve the path before walking it.
In the wildcard case, this is done to the non-wildcard prefix of the
path (if any), which matches the behavior in fsutil.

Fixes the repro case here:
https://gist.github.com/aaronlehmann/64054c9a2cff0d27e200cc107bba3d69

Fixes #2300

Signed-off-by: Aaron Lehmann <alehmann@netflix.com>
2021-08-18 11:54:48 -07:00
Erik Sipsma 23ec51611f cache: Fix flightcontrol use in computeBlobChain.
Previously, the flightcontrol group was being given a key just set to
the ref's ID, which meant that concurrent calls using different values
of compressionType, createIfNeeded and forceCompression would
incorrectly be de-duplicated.

The change here splits up the flightcontrol group into a few separate
calls and ensures that all the correct input variables are put into the
flightcontrol keys.

Signed-off-by: Erik Sipsma <erik@sipsma.dev>
2021-08-16 17:26:36 +00:00