From 2686fdcb89692d81f8c76d1a600a5c81ea9bf96a Mon Sep 17 00:00:00 2001
From: Mondo Diaz
Date: Mon, 15 Dec 2025 14:00:32 -0600
Subject: [PATCH] Add integrity verification workflow design document

---
 CHANGELOG.md                          |   2 +
 docs/design/integrity-verification.md | 504 ++++++++++++++++++++++++++
 2 files changed, 506 insertions(+)
 create mode 100644 docs/design/integrity-verification.md

diff --git a/CHANGELOG.md b/CHANGELOG.md
index 2044bf8..2b5a4ea 100644
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -6,6 +6,8 @@ The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
 and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
 
 ## [Unreleased]
+### Added
+- Added integrity verification workflow design document (#24)
 
 ## [0.2.0] - 2025-12-15
 ### Changed
diff --git a/docs/design/integrity-verification.md b/docs/design/integrity-verification.md
new file mode 100644
index 0000000..5153f3e
--- /dev/null
+++ b/docs/design/integrity-verification.md
@@ -0,0 +1,504 @@
+# Integrity Verification Workflow Design
+
+This document defines the process for SHA256 checksum verification on artifact downloads, including failure handling and retry mechanisms.
+
+## Overview
+
+Orchard uses content-addressable storage where the artifact ID is the SHA256 hash of the content. This design leverages that property to provide configurable integrity verification during downloads.
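As a minimal sketch of the property the overview relies on (an illustration, not Orchard's actual code), an artifact ID can be recomputed from the bytes alone and compared with the ID the content is stored under. The helper names `artifact_id_for` and `matches_artifact_id` are hypothetical.

```python
import hashlib


def artifact_id_for(content: bytes) -> str:
    """Return the content address: the lowercase hex SHA256 digest of the bytes."""
    return hashlib.sha256(content).hexdigest()


def matches_artifact_id(content: bytes, artifact_id: str) -> bool:
    """Check that content still hashes to the ID it is stored under."""
    # Hex digests may arrive in either case, so normalize before comparing
    return artifact_id_for(content) == artifact_id.strip().lower()
```

Because the ID is derived from the content, any single changed byte produces a different ID, which is what lets every verification mode below reuse the artifact ID as the expected checksum.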
+
+## Current State
+
+| Aspect | Status |
+|--------|--------|
+| Download streams content directly from S3 | ✅ Implemented |
+| Artifact ID is the SHA256 hash | ✅ Implemented |
+| S3 key derived from SHA256 hash | ✅ Implemented |
+| Verification during download | ❌ Not implemented |
+| Checksum headers in response | ❌ Not implemented |
+| Retry mechanism on failure | ❌ Not implemented |
+| Failure handling beyond S3 errors | ❌ Not implemented |
+
+## Verification Modes
+
+The verification mode is selected per request via the `?verify=` query parameter, or server-wide via `ORCHARD_VERIFY_MODE`.
+
+| Mode | Performance | Integrity | Use Case |
+|------|-------------|-----------|----------|
+| `none` | ⚡ Fastest | Client-side | Trusted networks, high throughput |
+| `header` | ⚡ Fast | Client-side | Standard downloads, client verification |
+| `stream` | 🔄 Moderate | Post-hoc server | Logging/auditing, non-blocking |
+| `pre` | 🐢 Slower | Guaranteed | Critical downloads, untrusted storage |
+| `strict` | 🐢 Slower | Guaranteed + Alert | Security-sensitive, compliance |
+
+### Mode: None (Default)
+
+**Behavior:**
+- Stream content directly from S3 with no server-side processing
+- Maximum download performance
+- Client is responsible for verification
+
+**Headers Returned:**
+```
+X-Checksum-SHA256:
+Content-Length:
+```
+
+**Flow:**
+```
+Client Request → Lookup Artifact → Stream from S3 → Client
+```
+
+### Mode: Header
+
+**Behavior:**
+- Stream content directly from S3
+- Include comprehensive checksum headers
+- Client performs verification using headers
+
+**Headers Returned:**
+```
+X-Checksum-SHA256:
+Content-Length:
+Digest: sha-256=
+ETag: ""
+X-Content-SHA256:
+```
+
+**Flow:**
+```
+Client Request → Lookup Artifact → Add Headers → Stream from S3 → Client Verifies
+```
+
+**Client Verification Example:**
+```bash
+# Download to a known filename, then compare against the advertised checksum
+URL="https://orchard/project/foo/bar/+/v1.0.0"
+curl -fsS -o downloaded_file "$URL"
+# Header names are case-insensitive and header lines end in CRLF: match with -i, strip \r
+EXPECTED=$(curl -fsSI "$URL" | grep -i '^X-Checksum-SHA256:' | tr -d '\r' | cut -d' ' -f2)
+ACTUAL=$(sha256sum downloaded_file | cut -d' ' -f1)
+[ "$EXPECTED" = "$ACTUAL" ] && echo "OK" || echo "MISMATCH"
+```
+
+### Mode: Stream (Post-Hoc Verification)
+
+**Behavior:**
+- Wrap S3 stream with `HashingStreamWrapper`
+- Compute SHA256 incrementally while streaming to client
+- Verify hash after stream completes
+- Log verification result
+- Cannot reject content (already sent to client)
+
+**Headers Returned:**
+```
+X-Checksum-SHA256:
+Content-Length:
+X-Verify-Mode: stream
+Trailer: X-Verified, X-Computed-SHA256
+```
+
+**Trailers (if client supports):**
+```
+X-Verified: true|false
+X-Computed-SHA256:
+```
+
+HTTP/1.1 only allows trailers with chunked transfer encoding, so a response that sends them cannot also include `Content-Length`.
+
+**Flow:**
+```
+Client Request → Lookup Artifact → Wrap Stream → Stream to Client
+                                        ↓
+                           Compute Hash Incrementally
+                                        ↓
+                              Verify After Complete → Log Result
+```
+
+**Implementation:**
+```python
+import hashlib
+from typing import Callable, Iterable, Iterator
+
+
+class HashingStreamWrapper:
+    """Hash chunks as they pass through to the client, then report the result."""
+
+    def __init__(self, stream: Iterable[bytes], expected_hash: str,
+                 on_complete: Callable[[bool, str], None]):
+        self.stream = stream
+        self.hasher = hashlib.sha256()
+        self.expected_hash = expected_hash
+        self.on_complete = on_complete
+
+    def __iter__(self) -> Iterator[bytes]:
+        for chunk in self.stream:
+            self.hasher.update(chunk)
+            yield chunk
+        # Reached only when the stream is fully consumed; a client disconnect skips verification
+        computed = self.hasher.hexdigest()
+        self.on_complete(computed == self.expected_hash, computed)
+```
+
+### Mode: Pre-Verify (Blocking)
+
+**Behavior:**
+- Download the entire content from S3 into memory or a temporary file
+- Compute SHA256 hash before sending to client
+- On match: stream verified content to client
+- On mismatch: retry from S3 (up to N times)
+- If retries exhausted: return 500 error
+
+**Headers Returned:**
+```
+X-Checksum-SHA256:
+Content-Length:
+X-Verify-Mode: pre
+X-Verified: true
+```
+
+**Flow:**
+```
+Client Request → Lookup Artifact → Download from S3 → Compute Hash
+                                                           ↓
+                                                     Hash Matches?
+                                                      ↓         ↓
+                                                     Yes        No
+                                                      ↓         ↓
+                                         Stream to Client    Retry?
+                                                                ↓
+                                                          Yes → Loop
+                                                          No → 500 Error
+```
+
+**Memory Considerations:**
+- For files < `ORCHARD_VERIFY_MEMORY_LIMIT` (default 100MB): buffer in memory
+- For larger files: use a temporary file with streaming hash computation
+- Clean up temporary files after the response is sent
+
+### Mode: Strict
+
+**Behavior:**
+- Same as pre-verify but with no retries
+- Fail immediately on any mismatch
+- Quarantine artifact on failure (mark as potentially corrupted)
+- Trigger alert/notification on failure
+- Intended for security-critical downloads
+
+**Headers Returned (on success):**
+```
+X-Checksum-SHA256:
+Content-Length:
+X-Verify-Mode: strict
+X-Verified: true
+```
+
+**Error Response (on failure):**
+```json
+{
+  "error": "integrity_verification_failed",
+  "message": "Artifact content does not match expected checksum",
+  "expected_hash": "",
+  "computed_hash": "",
+  "artifact_id": "",
+  "action_taken": "quarantined"
+}
+```
+
+**Quarantine Process:**
+1. Mark artifact `status = 'quarantined'` in database
+2. Log security event to audit_logs
+3. Optionally notify via webhook/email
+4. Artifact becomes unavailable for download until resolved
+
+## Failure Detection
+
+### Failure Types
+
+| Failure Type | Detection Method | Severity |
+|--------------|------------------|----------|
+| Hash mismatch | Computed SHA256 ≠ Expected | Critical |
+| Size mismatch | Actual bytes ≠ `Content-Length` | High |
+| S3 read error | boto3 exception | Medium |
+| Truncated content | Stream ends early | High |
+| S3 object missing | `NoSuchKey` error | Critical |
+| ETag mismatch | S3 ETag ≠ expected | Medium |
+
+### Detection Implementation
+
+```python
+from dataclasses import dataclass
+from typing import Optional
+
+
+@dataclass
+class VerificationResult:
+    success: bool
+    failure_type: Optional[str]  # hash_mismatch, size_mismatch, etc.
+    expected_hash: str
+    computed_hash: Optional[str]
+    expected_size: int
+    actual_size: Optional[int]
+    error_message: Optional[str]
+    retry_count: int
+```
+
+## Retry Mechanism
+
+### Configuration
+
+| Environment Variable | Default | Description |
+|---------------------|---------|-------------|
+| `ORCHARD_VERIFY_MAX_RETRIES` | 3 | Maximum retry attempts |
+| `ORCHARD_VERIFY_RETRY_DELAY_MS` | 100 | Base delay between retries (ms) |
+| `ORCHARD_VERIFY_RETRY_BACKOFF` | 2.0 | Exponential backoff multiplier |
+| `ORCHARD_VERIFY_RETRY_MAX_DELAY_MS` | 5000 | Maximum delay cap (ms) |
+
+### Backoff Formula
+
+```
+delay = min(base_delay * (backoff ^ attempt), max_delay)
+```
+
+Here `attempt` is the zero-based retry index, matching the loop in the retry flow below.
+
+Example with defaults:
+- Retry 1 (`attempt = 0`): 100ms
+- Retry 2 (`attempt = 1`): 200ms
+- Retry 3 (`attempt = 2`): 400ms
+
+### Retry Flow
+
+```python
+import asyncio
+import hashlib
+import logging
+
+log = logging.getLogger(__name__)
+
+
+class IntegrityError(Exception):
+    """Raised when content still fails verification after all retries."""
+
+
+def calculate_backoff(attempt, base_ms=100, factor=2.0, max_ms=5000):
+    # Delay in milliseconds; attempt is the zero-based retry index
+    return min(base_ms * factor ** attempt, max_ms)
+
+
+# fetch_from_s3 and S3Error are assumed to be provided by the storage layer
+async def download_with_retry(artifact, max_retries=3):
+    for attempt in range(max_retries + 1):
+        try:
+            content = await fetch_from_s3(artifact.s3_key)
+        except S3Error:
+            # NOTE: this sketch retries every S3Error; non-retryable ones
+            # (404, 403) should be re-raised immediately
+            if attempt == max_retries:
+                raise
+        else:
+            computed_hash = hashlib.sha256(content).hexdigest()
+            if computed_hash == artifact.id:
+                return content  # Success
+            log.warning("Verification failed, attempt %d/%d", attempt + 1, max_retries + 1)
+            if attempt == max_retries:
+                raise IntegrityError("Max retries exceeded")
+
+        # Back off before the next attempt
+        await asyncio.sleep(calculate_backoff(attempt) / 1000)
+```
+
+### Retryable vs Non-Retryable Failures
+
+**Retryable:**
+- S3 read timeout
+- S3 connection error
+- Hash mismatch (may be transient S3 issue)
+- Truncated content
+
+**Non-Retryable:**
+- S3 object not found (404)
+- S3 access denied (403)
+- Artifact not in database
+- Strict mode failures
+
+## Configuration Reference
+
+### Environment Variables
+
+```bash
+# Verification mode (none, header, stream, pre, strict)
+ORCHARD_VERIFY_MODE=none
+
+# Retry settings
+ORCHARD_VERIFY_MAX_RETRIES=3
+ORCHARD_VERIFY_RETRY_DELAY_MS=100
+ORCHARD_VERIFY_RETRY_BACKOFF=2.0
+ORCHARD_VERIFY_RETRY_MAX_DELAY_MS=5000
+
+# Memory limit for pre-verify buffering (bytes)
+ORCHARD_VERIFY_MEMORY_LIMIT=104857600 # 100MB
+
+# Strict mode settings
+ORCHARD_VERIFY_QUARANTINE_ON_FAILURE=true
+ORCHARD_VERIFY_ALERT_WEBHOOK=https://alerts.example.com/webhook
+
+# Allow per-request mode override
+ORCHARD_VERIFY_ALLOW_OVERRIDE=true
+```
+
+### Per-Request Override
+
+When `ORCHARD_VERIFY_ALLOW_OVERRIDE=true`, clients can specify verification mode:
+
+```
+GET /api/v1/project/foo/bar/+/v1.0.0?verify=pre
+GET /api/v1/project/foo/bar/+/v1.0.0?verify=none
+```
+
+## API Changes
+
+### Download Endpoint
+
+**Request:**
+```
+GET /api/v1/project/{project}/{package}/+/{ref}?verify={mode}
+```
+
+**New Query Parameters:**
+| Parameter | Type | Default | Description |
+|-----------|------|---------|-------------|
+| `verify` | string | from config | Verification mode |
+
+**New Response Headers:**
+| Header | Description |
+|--------|-------------|
+| `X-Checksum-SHA256` | Expected SHA256 hash |
+| `X-Verify-Mode` | Active verification mode |
+| `X-Verified` | `true` if server verified content |
+| `Digest` | RFC 3230 digest header |
+
+### New Endpoint: Verify Artifact
+
+**Request:**
+```
+POST /api/v1/project/{project}/{package}/+/{ref}/verify
+```
+
+**Response:**
+```json
+{
+  "artifact_id": "abc123...",
+  "verified": true,
+  "expected_hash": "abc123...",
+  "computed_hash": "abc123...",
+  "size_match": true,
+  "expected_size": 1048576,
+  "actual_size": 1048576,
+  "verification_time_ms": 45
+}
+```
+
+## Logging and Monitoring
+
+### Log Events
+
+| Event | Level | When |
+|-------|-------|------|
+| `verification.success` | INFO | Hash verified successfully |
+| `verification.failure` | ERROR | Hash mismatch detected |
+| `verification.retry` | WARN | Retry attempt initiated |
+| `verification.quarantine` | ERROR | Artifact quarantined |
+| `verification.skip` | DEBUG | Verification skipped (mode=none) |
+
+### Metrics
+
+| Metric | Type | Description |
+|--------|------|-------------|
+| `orchard_verification_total` | Counter | Total verification attempts |
+| `orchard_verification_failures` | Counter | Failed verifications |
+| `orchard_verification_retries` | Counter | Retry attempts |
+| `orchard_verification_duration_ms` | Histogram | Verification time |
+
+### Audit Log Entry
+
+```json
+{
+  "action": "artifact.download.verified",
+  "resource": "project/foo/package/bar/artifact/abc123",
+  "user_id": "user@example.com",
+  "details": {
+    "verification_mode": "pre",
+    "verified": true,
+    "retry_count": 0,
+    "duration_ms": 45
+  }
+}
+```
+
+## Security Considerations
+
+1. **Strict Mode for Sensitive Data**: Use strict mode for artifacts containing credentials, certificates, or security-critical code.
+
+2. **Quarantine Isolation**: Quarantined artifacts should be moved to a separate S3 prefix or bucket for forensic analysis.
+
+3. **Alert on Repeated Failures**: Multiple verification failures for the same artifact may indicate storage corruption or tampering.
+
+4. **Audit Trail**: All verification events should be logged for compliance and forensic purposes.
+
+5. **Client Trust**: In `none` and `header` modes, clients must implement their own verification for security guarantees.
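One way the "alert on repeated failures" consideration could be approached is a per-artifact sliding-window counter. The sketch below is only an illustration: the class name `FailureTracker`, the threshold of 3, and the one-hour window are assumptions, not settings defined by this design.

```python
import time
from collections import defaultdict, deque
from typing import Optional


class FailureTracker:
    """Track recent verification failures per artifact in a sliding time window."""

    def __init__(self, threshold: int = 3, window_seconds: float = 3600.0):
        # threshold and window_seconds are illustrative defaults, not Orchard settings
        self.threshold = threshold
        self.window_seconds = window_seconds
        self._failures = defaultdict(deque)

    def record_failure(self, artifact_id: str, now: Optional[float] = None) -> bool:
        """Record one failure; return True when the artifact should trigger an alert."""
        now = time.monotonic() if now is None else now
        recent = self._failures[artifact_id]
        recent.append(now)
        # Drop failures that have aged out of the window
        while recent and now - recent[0] > self.window_seconds:
            recent.popleft()
        return len(recent) >= self.threshold
```

A caller would invoke `record_failure(artifact.id)` wherever `verification.failure` is logged and raise the alert when it returns `True`. This version keeps state in process memory, so a multi-instance deployment would need shared storage instead.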
+
+## Implementation Phases
+
+### Phase 1: Headers Only
+- Add `X-Checksum-SHA256` header to all downloads
+- Add `verify=header` mode support
+- Add configuration options
+
+### Phase 2: Stream Verification
+- Implement `HashingStreamWrapper`
+- Add `verify=stream` mode
+- Add verification logging
+
+### Phase 3: Pre-Verification
+- Implement buffered verification
+- Add retry mechanism
+- Add `verify=pre` mode
+
+### Phase 4: Strict Mode
+- Implement quarantine mechanism
+- Add alerting integration
+- Add `verify=strict` mode
+
+## Client Integration Examples
+
+### curl with Verification
+```bash
+#!/bin/bash
+URL="https://orchard.example.com/api/v1/project/myproject/mypackage/+/v1.0.0"
+
+# Get expected hash from headers (-f makes curl fail on HTTP errors)
+EXPECTED=$(curl -fsSI "$URL" | grep -i "^X-Checksum-SHA256:" | tr -d '\r' | cut -d' ' -f2)
+[ -n "$EXPECTED" ] || { echo "✗ No X-Checksum-SHA256 header in response"; exit 1; }
+
+# Download file
+curl -fsSO "$URL"
+FILENAME=$(basename "$URL")
+
+# Verify
+ACTUAL=$(sha256sum "$FILENAME" | cut -d' ' -f1)
+
+if [ "$EXPECTED" = "$ACTUAL" ]; then
+    echo "✓ Verification passed"
+else
+    echo "✗ Verification FAILED"
+    echo "  Expected: $EXPECTED"
+    echo "  Actual:   $ACTUAL"
+    exit 1
+fi
+```
+
+### Python Client
+```python
+import hashlib
+import requests
+
+def download_verified(url: str, timeout: float = 30.0) -> bytes:
+    # One GET returns both the checksum headers and the body
+    response = requests.get(url, timeout=timeout)
+    response.raise_for_status()
+    content = response.content
+
+    expected_hash = response.headers.get('X-Checksum-SHA256')
+    if expected_hash is None:
+        raise ValueError("Response is missing the X-Checksum-SHA256 header")
+
+    # Verify size when the server reports one
+    expected_size = response.headers.get('Content-Length')
+    if expected_size is not None and len(content) != int(expected_size):
+        raise ValueError(f"Size mismatch: {len(content)} != {expected_size}")
+
+    # Verify hash
+    actual_hash = hashlib.sha256(content).hexdigest()
+    if actual_hash != expected_hash.lower():
+        raise ValueError(f"Hash mismatch: {actual_hash} != {expected_hash}")
+
+    return content
+```
+
+### Server-Side Verification
+```bash
+# Force server to verify before sending
+curl -O "https://orchard.example.com/api/v1/project/myproject/mypackage/+/v1.0.0?verify=pre"
+
+# Check if verification was performed
+curl -sI "https://orchard.example.com/api/v1/project/myproject/mypackage/+/v1.0.0?verify=pre" | grep -i X-Verified
+# X-Verified: true
+```
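The Python client above holds the whole artifact in memory before hashing it. For large artifacts, a client could hash the body chunk by chunk instead. The following is a sketch under that assumption; the helper name `verify_chunks` is hypothetical, and it takes any iterable of byte chunks so it does not depend on a particular HTTP library.

```python
import hashlib
from typing import Iterable


def verify_chunks(chunks: Iterable[bytes], expected_hash: str, expected_size: int) -> int:
    """Hash chunks incrementally; return total bytes or raise ValueError on mismatch."""
    hasher = hashlib.sha256()
    total = 0
    for chunk in chunks:
        hasher.update(chunk)
        total += len(chunk)
        # A real client would also write each chunk to disk here
    if total != expected_size:
        raise ValueError(f"Size mismatch: {total} != {expected_size}")
    actual_hash = hasher.hexdigest()
    if actual_hash != expected_hash.lower():
        raise ValueError(f"Hash mismatch: {actual_hash} != {expected_hash}")
    return total
```

With `requests`, the chunks could come from `requests.get(url, stream=True).iter_content(chunk_size=65536)`, with `expected_hash` and `expected_size` read from the `X-Checksum-SHA256` and `Content-Length` response headers.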