Commit Graph

21 Commits (0aa688233438a4c75896cee6bd2ca1bf28d1dcbc)

Author SHA1 Message Date
Brian Goff 78bb7137ee Do not re-tag non-distributable blob descriptors
Before this change buildkit was changing the media type for
non-distributable layers to normal layers.
It was also clearing out the urls to get those blobs.

Now the layer mediatype and URL's are preserved.
If a layer blob is seen more than once, if it has extra URL's they will
be appended to the stored value.

On export there is now a new exporter option to preserve the
non-distributable data values.
All URL's seen by buildkit will be added to the exported content.

Signed-off-by: Brian Goff <cpuguy83@gmail.com>
2022-02-10 01:12:42 +00:00
Erik Sipsma 5c4dcb2741 cache: add support for Diff refs.
This allows you to create refs that are single layers representing the
diff between any two arbitrary refs. The primary use case for this is
to allows users to extract the changes created by ops like Exec and
rebase them elsewhere through MergeOp. However, there is no restriction
on the inputs to DiffOp and the resulting ref's layer is simply the
layer created by running the differ on the two inputs refs
(specifically, the same differ used during exports).

A Diff ref can be mounted by itself, in which case it is defined as the
result of applying the diff to Scratch. Most use cases though will use
Diff refs as the input to a MergeOp, in which case the diff is just
applied on top of the lower merge inputs, as was the case before.

In cases like Diff(A, A->B->C) (i.e. cases where the diff is between two
refs where the lower is an ancestor of upper), the diff will be defined
as the layers separating the two refs. In other cases, the diff is just
a single layer, not re-used from the inputs, representing the diff
between the two refs (which can be defined as the layer "Diff(A,B)" that
satisfies "Merge(A, Diff(A,B)) == B").

Note that there is technically a meaningful difference between the
"unmerge" behavior of extracting the layers separating diffs and the
"simple diff" of just running the differ on the two refs. Namely, in the
case where there are "intermediate deletes" (i.e. deletes that only
exist in layers between A and B but not between A and B by themselves),
then the simple diff and unmerge can create different results when
plugged into a MergeOp. This is due to the fact that intermediate
deletes will apply to the merge when using the unmerge behavior, but not
when using the simple diff. This is on top of the fact that the simple
diff inherently has a "flattening" behavior where multiple layers are
squashed into a single one.

So, in the case where lower is an ancestor of upper, we choose to follow
the unmerge behavior, but it's possible users may prefer the simple diff
behavior. As of right now, they won't be able to do so, but if needed we
can add the ability to choose which behavior is followed in the future.
This could be done through a flag provided to DiffOp or possibly by
adapting llb.Copy to support this type of behavior with the same
efficiency as DiffOp.

Signed-off-by: Erik Sipsma <erik@sipsma.dev>
2022-01-06 11:05:51 -08:00
Erik Sipsma d73e62f878 Add initial MergeOp implementation.
This consists of just the base MergeOp with support for merging LLB
results that include deletions using hardlinks as the efficient path
and copies as fallback.

Signed-off-by: Erik Sipsma <erik@sipsma.dev>
2021-11-18 11:10:48 -08:00
Erik Sipsma a9f1980ebb Refactor cache metadata interface.
There are a few goals with this refactor:
1. Remove external access to fields that no longer make sense and/or
   won't make sense soon due to other potential changes. For example,
   there can now be multiple blobs associated with a ref (for different
   compression types), so the fact that you could access the "Blob"
   field from the Info method on Ref incorrectly implied there was just
   a single blob for the ref. This is on top of the fact that there is
   no need for external access to blob digests.
2. Centralize use of cache metadata inside the cache package.
   Previously, many parts of the code outside the cache package could
   obtain the bolt storage item for any ref and read/write it directly.
   This made it hard to understand what fields are used and when. Now,
   the Metadata method has been removed from the Ref interface and
   replaced with getters+setters for metadata fields we want to expose
   outside the package, which makes it much easier to track and
   understand. Similar changes have been made to the metadata search
   interface.
3. Use a consistent getter+setter interface for metadata, replacing
   the mix of interfaces like Metadata(), Size(), Info() and other
   inconsistencies.

Signed-off-by: Erik Sipsma <erik@sipsma.dev>
2021-08-25 19:15:09 +00:00
Erik Sipsma 1b30fd146b cache: Remove ImageRef from DescHandlers
Signed-off-by: Erik Sipsma <erik@sipsma.dev>
2020-08-05 17:18:43 -07:00
Erik Sipsma 55cbd19dec Add support for lazily-pulled blobs in cache manager.
This allows the layers of images to only be pulled if/once they are actually
required.

Signed-off-by: Erik Sipsma <erik@sipsma.dev>
2020-08-05 17:18:43 -07:00
Tonis Tiigi 339d4b2fef leaseutil: mark temporary leases with timestamps
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2019-10-16 10:35:50 -07:00
Tonis Tiigi 688e2c2272 cache: update components to new lease based cache manager
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2019-10-16 10:32:04 -07:00
Tonis Tiigi b9ec2bc4d4 cache: update manager to use leases instead of root labels
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2019-10-16 10:31:51 -07:00
John Howard 2de2c04c8e Revendoring to move boltdb to bbolt
Signed-off-by: John Howard <jhoward@microsoft.com>
2018-09-18 11:18:08 -07:00
Tonis Tiigi 489246dd28 cache: support for internal/frontend record type
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2018-07-26 22:54:53 -07:00
Tonis Tiigi 57b96a0ee5 cache: add record type field to usage record
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2018-07-26 22:54:39 -07:00
Tonis Tiigi bc765861be diff: implement windows layer support for linux
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2018-07-16 16:33:21 -07:00
Tonis Tiigi b8b55a8c22 image: export reproducible timestamps
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2018-05-04 15:51:19 -07:00
Tonis Tiigi 63ce643468 cache: add prune support
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2018-01-04 23:09:05 -08:00
Tonis Tiigi 6ffb48395b image: set up history properly for images
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2017-12-08 20:00:36 -08:00
Tonis Tiigi 8738929b8c solver: implement content based cache support
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2017-08-04 10:03:26 -07:00
Tonis Tiigi 38ac00090c cache: add more metadata values to snapshots
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2017-07-26 11:28:20 -07:00
Tonis Tiigi 2ef0971822 cache: remove unused branches
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2017-07-19 19:02:58 -07:00
Tonis Tiigi eac79f7c7e cache: refactor reference reuse and caching
Replaces previous mutable.Freeze logic with
commits that can live together with mutable data.
Finalize method is added if the implementation
needs to make sure that the immutable ref is
flushed to the driver. Refs are automaitcally
finalized when writable layers are created on
top of them.

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2017-07-18 16:01:01 -07:00
Tonis Tiigi 336cfe07fa persistent metadata helpers for snapshots
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
2017-07-04 21:55:20 -07:00