# Integrity Verification Workflow Design This document defines the process for SHA256 checksum verification on artifact downloads, including failure handling and retry mechanisms. ## Overview Orchard uses content-addressable storage where the artifact ID is the SHA256 hash of the content. This design leverages that property to provide configurable integrity verification during downloads. ## Current State | Aspect | Status | |--------|--------| | Download streams content directly from S3 | ✅ Implemented | | Artifact ID is the SHA256 hash | ✅ Implemented | | S3 key derived from SHA256 hash | ✅ Implemented | | Verification during download | ❌ Not implemented | | Checksum headers in response | ❌ Not implemented | | Retry mechanism on failure | ❌ Not implemented | | Failure handling beyond S3 errors | ❌ Not implemented | ## Verification Modes The verification mode is selected via query parameter `?verify=` or server-wide default via `ORCHARD_VERIFY_MODE`. | Mode | Performance | Integrity | Use Case | |------|-------------|-----------|----------| | `none` | ⚡ Fastest | Client-side | Trusted networks, high throughput | | `header` | ⚡ Fast | Client-side | Standard downloads, client verification | | `stream` | 🔄 Moderate | Post-hoc server | Logging/auditing, non-blocking | | `pre` | 🐢 Slower | Guaranteed | Critical downloads, untrusted storage | | `strict` | 🐢 Slower | Guaranteed + Alert | Security-sensitive, compliance | ### Mode: None (Default) **Behavior:** - Stream content directly from S3 with no server-side processing - Maximum download performance - Client is responsible for verification **Headers Returned:** ``` X-Checksum-SHA256: Content-Length: ``` **Flow:** ``` Client Request → Lookup Artifact → Stream from S3 → Client ``` ### Mode: Header **Behavior:** - Stream content directly from S3 - Include comprehensive checksum headers - Client performs verification using headers **Headers Returned:** ``` X-Checksum-SHA256: Content-Length: Digest: sha-256= ETag: "" X-Content-SHA256: ``` **Flow:** ``` Client Request → Lookup Artifact → Add Headers → Stream from S3 → Client Verifies ``` **Client Verification Example:** ```bash # Download and verify curl -OJ https://orchard/project/foo/bar/+/v1.0.0 EXPECTED=$(curl -sI https://orchard/project/foo/bar/+/v1.0.0 | grep X-Checksum-SHA256 | cut -d' ' -f2) ACTUAL=$(sha256sum downloaded_file | cut -d' ' -f1) [ "$EXPECTED" = "$ACTUAL" ] && echo "OK" || echo "MISMATCH" ``` ### Mode: Stream (Post-Hoc Verification) **Behavior:** - Wrap S3 stream with `HashingStreamWrapper` - Compute SHA256 incrementally while streaming to client - Verify hash after stream completes - Log verification result - Cannot reject content (already sent to client) **Headers Returned:** ``` X-Checksum-SHA256: Content-Length: X-Verify-Mode: stream Trailer: X-Verified ``` **Trailers (if client supports):** ``` X-Verified: true|false X-Computed-SHA256: ``` **Flow:** ``` Client Request → Lookup Artifact → Wrap Stream → Stream to Client ↓ Compute Hash Incrementally ↓ Verify After Complete → Log Result ``` **Implementation:** ```python class HashingStreamWrapper: def __init__(self, stream, expected_hash: str, on_complete: Callable): self.stream = stream self.hasher = hashlib.sha256() self.expected_hash = expected_hash self.on_complete = on_complete def __iter__(self): for chunk in self.stream: self.hasher.update(chunk) yield chunk # Stream complete, verify computed = self.hasher.hexdigest() self.on_complete(computed == self.expected_hash, computed) ``` ### Mode: Pre-Verify (Blocking) **Behavior:** - Download entire content from S3 to memory/temp file - Compute SHA256 hash before sending to client - On match: stream verified content to client - On mismatch: retry from S3 (up to N times) - If retries exhausted: return 500 error **Headers Returned:** ``` X-Checksum-SHA256: Content-Length: X-Verify-Mode: pre X-Verified: true ``` **Flow:** ``` Client Request → Lookup Artifact → Download from S3 → Compute Hash ↓ Hash Matches? ↓ ↓ Yes No ↓ ↓ Stream to Client Retry? ↓ Yes → Loop No → 500 Error ``` **Memory Considerations:** - For files < `ORCHARD_VERIFY_MEMORY_LIMIT` (default 100MB): buffer in memory - For larger files: use temporary file with streaming hash computation - Cleanup temp files after response sent ### Mode: Strict **Behavior:** - Same as pre-verify but with no retries - Fail immediately on any mismatch - Quarantine artifact on failure (mark as potentially corrupted) - Trigger alert/notification on failure - For security-critical downloads **Headers Returned (on success):** ``` X-Checksum-SHA256: Content-Length: X-Verify-Mode: strict X-Verified: true ``` **Error Response (on failure):** ```json { "error": "integrity_verification_failed", "message": "Artifact content does not match expected checksum", "expected_hash": "", "computed_hash": "", "artifact_id": "", "action_taken": "quarantined" } ``` **Quarantine Process:** 1. Mark artifact `status = 'quarantined'` in database 2. Log security event to audit_logs 3. Optionally notify via webhook/email 4. Artifact becomes unavailable for download until resolved ## Failure Detection ### Failure Types | Failure Type | Detection Method | Severity | |--------------|------------------|----------| | Hash mismatch | Computed SHA256 ≠ Expected | Critical | | Size mismatch | Actual bytes ≠ `Content-Length` | High | | S3 read error | boto3 exception | Medium | | Truncated content | Stream ends early | High | | S3 object missing | `NoSuchKey` error | Critical | | ETag mismatch | S3 ETag ≠ expected | Medium | ### Detection Implementation ```python class VerificationResult: success: bool failure_type: Optional[str] # hash_mismatch, size_mismatch, etc. expected_hash: str computed_hash: Optional[str] expected_size: int actual_size: Optional[int] error_message: Optional[str] retry_count: int ``` ## Retry Mechanism ### Configuration | Environment Variable | Default | Description | |---------------------|---------|-------------| | `ORCHARD_VERIFY_MAX_RETRIES` | 3 | Maximum retry attempts | | `ORCHARD_VERIFY_RETRY_DELAY_MS` | 100 | Base delay between retries | | `ORCHARD_VERIFY_RETRY_BACKOFF` | 2.0 | Exponential backoff multiplier | | `ORCHARD_VERIFY_RETRY_MAX_DELAY_MS` | 5000 | Maximum delay cap | ### Backoff Formula ``` delay = min(base_delay * (backoff ^ attempt), max_delay) ``` Example with defaults: - Attempt 1: 100ms - Attempt 2: 200ms - Attempt 3: 400ms ### Retry Flow ```python async def download_with_retry(artifact, max_retries=3): for attempt in range(max_retries + 1): try: content = await fetch_from_s3(artifact.s3_key) computed_hash = compute_sha256(content) if computed_hash == artifact.id: return content # Success # Hash mismatch log.warning(f"Verification failed, attempt {attempt + 1}/{max_retries + 1}") if attempt < max_retries: delay = calculate_backoff(attempt) await asyncio.sleep(delay / 1000) else: raise IntegrityError("Max retries exceeded") except S3Error as e: if attempt < max_retries: delay = calculate_backoff(attempt) await asyncio.sleep(delay / 1000) else: raise ``` ### Retryable vs Non-Retryable Failures **Retryable:** - S3 read timeout - S3 connection error - Hash mismatch (may be transient S3 issue) - Truncated content **Non-Retryable:** - S3 object not found (404) - S3 access denied (403) - Artifact not in database - Strict mode failures ## Configuration Reference ### Environment Variables ```bash # Verification mode (none, header, stream, pre, strict) ORCHARD_VERIFY_MODE=none # Retry settings ORCHARD_VERIFY_MAX_RETRIES=3 ORCHARD_VERIFY_RETRY_DELAY_MS=100 ORCHARD_VERIFY_RETRY_BACKOFF=2.0 ORCHARD_VERIFY_RETRY_MAX_DELAY_MS=5000 # Memory limit for pre-verify buffering (bytes) ORCHARD_VERIFY_MEMORY_LIMIT=104857600 # 100MB # Strict mode settings ORCHARD_VERIFY_QUARANTINE_ON_FAILURE=true ORCHARD_VERIFY_ALERT_WEBHOOK=https://alerts.example.com/webhook # Allow per-request mode override ORCHARD_VERIFY_ALLOW_OVERRIDE=true ``` ### Per-Request Override When `ORCHARD_VERIFY_ALLOW_OVERRIDE=true`, clients can specify verification mode: ``` GET /api/v1/project/foo/bar/+/v1.0.0?verify=pre GET /api/v1/project/foo/bar/+/v1.0.0?verify=none ``` ## API Changes ### Download Endpoint **Request:** ``` GET /api/v1/project/{project}/{package}/+/{ref}?verify={mode} ``` **New Query Parameters:** | Parameter | Type | Default | Description | |-----------|------|---------|-------------| | `verify` | string | from config | Verification mode | **New Response Headers:** | Header | Description | |--------|-------------| | `X-Checksum-SHA256` | Expected SHA256 hash | | `X-Verify-Mode` | Active verification mode | | `X-Verified` | `true` if server verified content | | `Digest` | RFC 3230 digest header | ### New Endpoint: Verify Artifact **Request:** ``` POST /api/v1/project/{project}/{package}/+/{ref}/verify ``` **Response:** ```json { "artifact_id": "abc123...", "verified": true, "expected_hash": "abc123...", "computed_hash": "abc123...", "size_match": true, "expected_size": 1048576, "actual_size": 1048576, "verification_time_ms": 45 } ``` ## Logging and Monitoring ### Log Events | Event | Level | When | |-------|-------|------| | `verification.success` | INFO | Hash verified successfully | | `verification.failure` | ERROR | Hash mismatch detected | | `verification.retry` | WARN | Retry attempt initiated | | `verification.quarantine` | ERROR | Artifact quarantined | | `verification.skip` | DEBUG | Verification skipped (mode=none) | ### Metrics | Metric | Type | Description | |--------|------|-------------| | `orchard_verification_total` | Counter | Total verification attempts | | `orchard_verification_failures` | Counter | Failed verifications | | `orchard_verification_retries` | Counter | Retry attempts | | `orchard_verification_duration_ms` | Histogram | Verification time | ### Audit Log Entry ```json { "action": "artifact.download.verified", "resource": "project/foo/package/bar/artifact/abc123", "user_id": "user@example.com", "details": { "verification_mode": "pre", "verified": true, "retry_count": 0, "duration_ms": 45 } } ``` ## Security Considerations 1. **Strict Mode for Sensitive Data**: Use strict mode for artifacts containing credentials, certificates, or security-critical code. 2. **Quarantine Isolation**: Quarantined artifacts should be moved to a separate S3 prefix or bucket for forensic analysis. 3. **Alert on Repeated Failures**: Multiple verification failures for the same artifact may indicate storage corruption or tampering. 4. **Audit Trail**: All verification events should be logged for compliance and forensic purposes. 5. **Client Trust**: In `none` and `header` modes, clients must implement their own verification for security guarantees. ## Implementation Phases ### Phase 1: Headers Only - Add `X-Checksum-SHA256` header to all downloads - Add `verify=header` mode support - Add configuration options ### Phase 2: Stream Verification - Implement `HashingStreamWrapper` - Add `verify=stream` mode - Add verification logging ### Phase 3: Pre-Verification - Implement buffered verification - Add retry mechanism - Add `verify=pre` mode ### Phase 4: Strict Mode - Implement quarantine mechanism - Add alerting integration - Add `verify=strict` mode ## Client Integration Examples ### curl with Verification ```bash #!/bin/bash URL="https://orchard.example.com/api/v1/project/myproject/mypackage/+/v1.0.0" # Get expected hash from headers EXPECTED=$(curl -sI "$URL" | grep -i "X-Checksum-SHA256" | tr -d '\r' | cut -d' ' -f2) # Download file curl -sO "$URL" FILENAME=$(basename "$URL") # Verify ACTUAL=$(sha256sum "$FILENAME" | cut -d' ' -f1) if [ "$EXPECTED" = "$ACTUAL" ]; then echo "✓ Verification passed" else echo "✗ Verification FAILED" echo " Expected: $EXPECTED" echo " Actual: $ACTUAL" exit 1 fi ``` ### Python Client ```python import hashlib import requests def download_verified(url: str) -> bytes: # Get headers first head = requests.head(url) expected_hash = head.headers.get('X-Checksum-SHA256') expected_size = int(head.headers.get('Content-Length', 0)) # Download content response = requests.get(url) content = response.content # Verify size if len(content) != expected_size: raise ValueError(f"Size mismatch: {len(content)} != {expected_size}") # Verify hash actual_hash = hashlib.sha256(content).hexdigest() if actual_hash != expected_hash: raise ValueError(f"Hash mismatch: {actual_hash} != {expected_hash}") return content ``` ### Server-Side Verification ```bash # Force server to verify before sending curl -O "https://orchard.example.com/api/v1/project/myproject/mypackage/+/v1.0.0?verify=pre" # Check if verification was performed curl -I "https://orchard.example.com/api/v1/project/myproject/mypackage/+/v1.0.0?verify=pre" | grep X-Verified # X-Verified: true ```