Summary
Internally built software artifacts have known and verifiable provenance, establishing a documented chain of custody from source code commit through build and into deployment.
Description
Software artifact provenance answers the question “where did this artifact come from?” for any artifact at any time. Provenance records are created at build time and capture the source commit, repository state, build environment details, build log references, and the resulting artifact identity. These records are immutable once written and stored in a tamper-evident system that prevents retroactive modification.
By linking the cryptographic identity of an artifact (from content addressable identities) to its source and build metadata, software artifact provenance ensures that only artifacts built from known source code through an authorised build process.
Requirements
- Each artifact MUST be traceable back to a specific source code commit, build process, and build environment
- Provenance records MUST be created at build time and be immutable once written
- Provenance records MUST include:
- Cryptographic hash of the output artifact
- Source commit reference
- Sufficient build environment context to support traceability, for example:
- Build system URL
- Build log reference
- Builder identity
- Build tool versions
- Timestamp
Examples & Commentary
- No artifact should be deployed to production without a corresponding provenance record
- Use a dedicated provenance store or attestation service to maintain build records independently of the CI/CD system
- Implement deployment gates that verify an artifact has a valid provenance record before allowing promotion to production environments
- Periodically audit provenance records against running deployments to confirm that all production artifacts have known origins
- Distinguish between human-friendly identifiers (semantic versioning, commit references) for navigation and cryptographic hashes for security and compliance purposes
- Where human-readable identifiers are used, they should ideally be mapped to their corresponding cryptographic identities in an immutable way, such that the mapping itself cannot be altered or reassigned after the fact