Currently, eStargz compression doesn't preserve the original tar metadata
(header bytes and their order). This causes failure of `TestGetRemote` because
an uncompressed blob converted from a gzip blob provides different digset
against the one converted from eStargz blob even if their original tar (computed
by differ) are the same.
This commit solves this issue by fixing eStargz to preserve original tar's
metadata that is modified by eStargz.
Signed-off-by: Kohei Tokunaga <ktokunaga.mail@gmail.com>
Estargz support has been removed from this test as
implementation does not guarantee digest stability
and only reason it passed were the exceptions in the
test via variant map that ignored cases where timing
resulted the digest to go wrong. This needs to be
addressed in the follow up if we want to keep estargz
support.
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
There are a few goals with this refactor:
1. Remove external access to fields that no longer make sense and/or
won't make sense soon due to other potential changes. For example,
there can now be multiple blobs associated with a ref (for different
compression types), so the fact that you could access the "Blob"
field from the Info method on Ref incorrectly implied there was just
a single blob for the ref. This is on top of the fact that there is
no need for external access to blob digests.
2. Centralize use of cache metadata inside the cache package.
Previously, many parts of the code outside the cache package could
obtain the bolt storage item for any ref and read/write it directly.
This made it hard to understand what fields are used and when. Now,
the Metadata method has been removed from the Ref interface and
replaced with getters+setters for metadata fields we want to expose
outside the package, which makes it much easier to track and
understand. Similar changes have been made to the metadata search
interface.
3. Use a consistent getter+setter interface for metadata, replacing
the mix of interfaces like Metadata(), Size(), Info() and other
inconsistencies.
Signed-off-by: Erik Sipsma <erik@sipsma.dev>
This is important for two reasons:
1) Keeps caching logic consistent with recent fsutil changes to use
these functions (also vendored here).
2) Allows us to move forward with removal of the original buggy Matches
implementation in moby/moby.
Signed-off-by: Aaron Lehmann <alehmann@netflix.com>
As discussed in #2300, includedPaths does not resolve symlinks when
looking up the source path in the prefix tree. If the user requests a
path that involves symlinks (for example, /a/foo when a symlink /a -> /b
exists), includedPaths will not find it, and will expect nothing to be
copied. This does not match the actual copy behavior implemented in
fsutil, which will follow symlinks in prefix components of a given path,
so it can end up caching an empty result even though the copy will
produce a non-empty result, which is quite bad.
To fix this, use getFollowLinks to resolve the path before walking it.
In the wildcard case, this is done to the non-wildcard prefix of the
path (if any), which matches the behavior in fsutil.
Fixes the repro case here:
https://gist.github.com/aaronlehmann/64054c9a2cff0d27e200cc107bba3d69Fixes#2300
Signed-off-by: Aaron Lehmann <alehmann@netflix.com>
Previously, the flightcontrol group was being given a key just set to
the ref's ID, which meant that concurrent calls using different values
of compressionType, createIfNeeded and forceCompression would
incorrectly be de-duplicated.
The change here splits up the flightcontrol group into a few separate
calls and ensures that all the correct input variables are put into the
flightcontrol keys.
Signed-off-by: Erik Sipsma <erik@sipsma.dev>
The transformation to rootedPatterns seems very wrong and inconsistent
with what the copy logic did. Change it to match the copy logic, and add
more testing.
Signed-off-by: Aaron Lehmann <alehmann@netflix.com>
This adds a little extra testing around ** patterns, and adds a
(currently skipped) test for copying directories under symlinks (#2300).
It removes an extra call to `filepath.FromSlash` in `shouldIncludePath`
and an unused argument to that function.
Signed-off-by: Aaron Lehmann <alehmann@netflix.com>
The previous implementation had many issues. Eg. on fetch, even if
the data already existed and no remote connections were needed
the request would still be waiting in the queue. Or if two fetches
of same blob happened together they would take up two places in queue
although there was only one remote request.
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
This is a safer alternative until we figure out why
http.Transport based limiting fails.
Some connections like cache export/import do not have a
domain key atm and these connections use global pool.
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
GetByBlob checks to see if there are any other blobs with the same
(uncompressed) ChainID and, if so, reuses their unpacked snapshot if it
exists.
The problem is if this code finds a match, it was trying to get the
matching record, but couldn't do so when the match is lazy because the
caller doesn't necessarily have descriptor handlers setup for it.
This commit changes the behavior to just ignore any match with the same
ChainID that's also lazy as they just aren't usable for the
snapshot-reuse optimization.
Signed-off-by: Erik Sipsma <erik@sipsma.dev>
Finalize was only used outside the cache package in one place, which
called it with the commit arg set to false. The code path followed
when commit==false turned out to essentially be a no-op because
it set "retain cache" to true if it was already set to true.
It was thus safe to remove the only external call to it and remove it
from the interface. This should be helpful for future efforts to
simplify the equal{Mutable,Immutable} fields in cacheRecord, which exist
due to the "lazy commit" feature that Finalize is tied into.
Signed-off-by: Erik Sipsma <erik@sipsma.dev>
The locks usage is mixed up because two locks separate locks
are actually needed. With a specific lock, calls to SetValue
can be protected.
Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>