Fix DiffID computation to use uncompressed layer digest #587
cluster2600 wants to merge 5 commits into apple:main
Conversation
Resolves apple#467. Previously, any arbitrary string could be passed as a nameserver in DNS configuration, which would silently result in an invalid /etc/resolv.conf inside the container. This change adds a DNS.validate() method that ensures every nameserver string is a valid IPv4 or IPv6 address (using the existing ContainerizationExtras parsers). The method is called from Vminitd.configureDNS() before applying the configuration. Tests added to DNSTests.swift covering valid IPv4, IPv6, mixed, empty nameserver lists, and invalid hostname/address rejection. Signed-off-by: Maxime Grenu <maxime.grenu@gmail.com>
The validate() method uses ContainerizationError which lives in its own module and must be explicitly imported. Signed-off-by: Maxime Grenu <maxime.grenu@gmail.com>
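The nameserver check described in these two commits can be sketched roughly as follows. This is a minimal illustration, not the PR's code: it uses POSIX `inet_pton` as a stand-in for the ContainerizationExtras parsers, and `InvalidNameserver`, `isIPAddress`, and `validateNameservers` are hypothetical names (the PR itself throws `ContainerizationError` from `DNS.validate()`).

```swift
import Foundation

/// Hypothetical error type for this sketch; the PR throws ContainerizationError.
struct InvalidNameserver: Error, Equatable {
    let value: String
}

/// Returns true when `s` parses as a literal IPv4 or IPv6 address.
/// Assumption: inet_pton stands in for the ContainerizationExtras parsers.
func isIPAddress(_ s: String) -> Bool {
    var v4 = in_addr()
    var v6 = in6_addr()
    return inet_pton(AF_INET, s, &v4) == 1 || inet_pton(AF_INET6, s, &v6) == 1
}

/// Throws on the first nameserver that is not an IP literal.
/// An empty list is accepted, matching the "empty nameserver lists" test case.
func validateNameservers(_ nameservers: [String]) throws {
    for nameserver in nameservers where !isIPAddress(nameserver) {
        throw InvalidNameserver(value: nameserver)
    }
}
```

Rejecting hostnames up front means a bad value fails at configuration time instead of producing an unusable /etc/resolv.conf inside the container.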
The OCI Image Specification requires DiffIDs to be the SHA256 digest of the uncompressed layer content. InitImage.create() was incorrectly using the digest of the compressed gzip layer. Add ContentWriter.diffID(of:) which decompresses the gzip stream and computes SHA256 of the raw content. The implementation parses the gzip header (handling FEXTRA, FNAME, FCOMMENT, FHCRC flags) and feeds the raw deflate stream to Apple's Compression framework. Signed-off-by: Maxime Grenu <maxime@cluster2600.com> Signed-off-by: Maxime Grenu <maxime.grenu@gmail.com>
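The header parsing mentioned in this commit message follows the field order in RFC 1952. A minimal sketch of locating the start of the raw deflate stream might look like the following; `deflateOffset(in:)` and `GzipHeaderError` are hypothetical names, and the sketch works on an in-memory byte array rather than on the PR's actual data types.

```swift
/// Hypothetical error for this sketch; not the PR's error type.
enum GzipHeaderError: Error {
    case truncated
}

/// Returns the offset of the first deflate byte in a gzip member (RFC 1952).
func deflateOffset(in bytes: [UInt8]) throws -> Int {
    // Fixed header: ID1 ID2 CM FLG MTIME(4) XFL OS = 10 bytes.
    guard bytes.count >= 10 else { throw GzipHeaderError.truncated }
    let flags = bytes[3]
    var offset = 10

    // FEXTRA (bit 2): 2-byte little-endian length, then that many bytes.
    if flags & 0x04 != 0 {
        guard bytes.count >= offset + 2 else { throw GzipHeaderError.truncated }
        let extraLength = Int(bytes[offset]) | (Int(bytes[offset + 1]) << 8)
        offset += 2 + extraLength
    }

    // FNAME (bit 3), then FCOMMENT (bit 4): zero-terminated strings.
    let stringFlags: [UInt8] = [0x08, 0x10]
    for flag in stringFlags where flags & flag != 0 {
        guard offset <= bytes.count,
              let terminator = bytes[offset...].firstIndex(of: 0)
        else { throw GzipHeaderError.truncated }
        offset = terminator + 1
    }

    // FHCRC (bit 1): 2-byte CRC16 of the header.
    if flags & 0x02 != 0 {
        offset += 2
    }

    guard offset <= bytes.count else { throw GzipHeaderError.truncated }
    return offset
}
```

Everything from the returned offset up to the 8-byte trailer is the deflate stream; the DiffID is the SHA256 of what that stream decompresses to, not of the gzip bytes themselves.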
    import Foundation
    import Testing

    @testable import ContainerizationOCI
Question: I think we may need a test that validates the gzip trailer, to ensure no digest is returned for malformed data.
Good shout — the implementation wasn't validating the gzip trailer at all. I've added CRC32 + ISIZE verification after decompression (throws gzipTrailerMismatch on failure) and two new tests: one for a truncated archive with the trailer chopped off, and one for a corrupted CRC32 field.
Verify the gzip trailer (CRC32 checksum and original size) after decompression to reject malformed or truncated archives. Add tests for truncated gzip (missing trailer) and corrupted CRC. Signed-off-by: Maxime Grenu <maxime.grenu@gmail.com>
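The trailer check described here compares the last 8 bytes of the gzip member (CRC32, then ISIZE, both little-endian) against values computed from the decompressed output. A minimal sketch, assuming the member ends exactly at the end of the buffer and that the uncompressed bytes are already available; `gzipCRC32`, `verifyTrailer`, and `GzipTrailerError` are hypothetical names rather than the PR's API (the PR throws `gzipTrailerMismatch`).

```swift
/// Hypothetical errors for this sketch.
enum GzipTrailerError: Error {
    case truncated
    case mismatch
}

/// Bitwise CRC-32 with the reflected polynomial 0xEDB88320, as used by gzip.
func gzipCRC32(_ bytes: [UInt8]) -> UInt32 {
    var crc: UInt32 = 0xFFFF_FFFF
    for byte in bytes {
        crc ^= UInt32(byte)
        for _ in 0..<8 {
            crc = (crc & 1) != 0 ? (crc >> 1) ^ 0xEDB8_8320 : crc >> 1
        }
    }
    return crc ^ 0xFFFF_FFFF
}

/// Decodes a little-endian 32-bit value from a 4-byte slice.
func readLE32(_ bytes: ArraySlice<UInt8>) -> UInt32 {
    bytes.reversed().reduce(UInt32(0)) { ($0 << 8) | UInt32($1) }
}

/// Checks the CRC32 and ISIZE fields at the end of `member` against `uncompressed`.
/// Assumption: `member` holds exactly one gzip member with no trailing data.
func verifyTrailer(ofMember member: [UInt8], against uncompressed: [UInt8]) throws {
    // 10-byte fixed header plus 8-byte trailer is the smallest possible member.
    guard member.count >= 18 else { throw GzipTrailerError.truncated }
    let trailer = member.suffix(8)
    let expectedCRC = readLE32(trailer.prefix(4))
    let expectedSize = readLE32(trailer.suffix(4))
    // ISIZE stores the uncompressed length modulo 2^32.
    let actualSize = UInt32(truncatingIfNeeded: uncompressed.count)
    guard gzipCRC32(uncompressed) == expectedCRC, actualSize == expectedSize else {
        throw GzipTrailerError.mismatch
    }
}
```

A table-driven CRC would be faster on large layers; the bitwise loop is used here only to keep the sketch short.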
    /// - Parameter url: The URL of the gzip-compressed file.
    /// - Returns: The SHA256 digest of the uncompressed content.
    public static func diffID(of url: URL) throws -> SHA256.Digest {
        let compressedData = try Data(contentsOf: url)
Question: If I'm reading this right, Data(contentsOf: url) loads the entire compressed file into memory, which is then fed into decompressGzip. Isn't that different from "using a streaming approach for memory efficiency" per the comment?
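For contrast with `Data(contentsOf:)`, a chunked read keeps only one buffer of the compressed file resident at a time. The following is a minimal sketch of that pattern, not a proposed patch: `forEachChunk(of:chunkSize:_:)` is a hypothetical helper, and in a real implementation the closure would feed a streaming decompressor whose output updates an incremental hasher.

```swift
import Foundation

/// Reads `url` in fixed-size chunks and hands each chunk to `body`,
/// so at most `chunkSize` bytes of the file are buffered per read.
func forEachChunk(
    of url: URL,
    chunkSize: Int = 64 * 1024,
    _ body: (Data) throws -> Void
) throws {
    let handle = try FileHandle(forReadingFrom: url)
    defer { try? handle.close() }
    // read(upToCount:) returns nil or empty data at end of file.
    while let chunk = try handle.read(upToCount: chunkSize), !chunk.isEmpty {
        try body(chunk)
    }
}
```

With this shape, peak memory is bounded by the chunk size plus the decompressor's working state, independent of the layer size.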
    }

    let start = data.startIndex
    let flags = data[start + 3]
Question: why do the current changes skip the compression method (CM) field entirely? (ref https://datatracker.ietf.org/doc/html/rfc1952#page-5)
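For reference, RFC 1952 defines CM = 8 as deflate, reserves values 0 through 7, and requires a decompressor to report an error when any reserved FLG bit (bits 5 to 7) is set. A minimal sketch of checking the fixed header before reading the flags; `validateFixedHeader` and `GzipFormatError` are hypothetical names, not part of the PR.

```swift
/// Hypothetical errors for this sketch.
enum GzipFormatError: Error, Equatable {
    case tooShort
    case badMagic
    case unsupportedMethod(UInt8)
    case reservedFlagsSet
}

/// Validates ID1, ID2, CM, and the reserved FLG bits of a gzip header.
func validateFixedHeader(_ bytes: [UInt8]) throws {
    guard bytes.count >= 10 else { throw GzipFormatError.tooShort }
    // ID1 = 0x1f, ID2 = 0x8b identify the gzip format.
    guard bytes[0] == 0x1f, bytes[1] == 0x8b else { throw GzipFormatError.badMagic }
    // CM = 8 is deflate, the only method defined by RFC 1952.
    guard bytes[2] == 8 else { throw GzipFormatError.unsupportedMethod(bytes[2]) }
    // FLG bits 5 to 7 are reserved and must be zero.
    guard bytes[3] & 0xE0 == 0 else { throw GzipFormatError.reservedFlagsSet }
}
```

Running a check like this first means non-gzip input or an unknown compression method is rejected with a specific error instead of being handed to the deflate decoder.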
Summary
- Fix InitImage.create() to compute DiffIDs from the uncompressed layer content per the OCI Image Specification
- Add ContentWriter.diffID(of:), which decompresses gzip data and computes SHA256 of the raw content
Changes
- ContentWriter.swift: Add diffID(of:) static method with gzip header parsing and streaming decompression via Apple's Compression framework
- InitImage.swift: Replace compressed digest with uncompressed digest for Rootfs.diffIDs
- DiffIDTests.swift: 6 tests covering correctness, determinism, error handling, large layers, and output format
Context
Resolves the TODO at InitImage.swift:56:
Test plan