4 Commits

Author SHA1 Message Date
Mondo Diaz
ebcd1944bf Merge remote-tracking branch 'origin/main' into feature/integrity-verification-design 2025-12-15 12:43:05 -06:00
Mondo Diaz
b0d65f3509 Add integrity verification workflow design document
Define SHA256 checksum verification process for artifact downloads:
- Five verification modes: none, header, stream, pre, strict
- Failure detection for hash/size mismatch, S3 errors, truncation
- Retry mechanism with exponential backoff
- Quarantine process for strict mode failures
- Configuration options and client integration examples
2025-12-15 12:30:18 -06:00
Dane Moss
0eb2deb4ca Merge branch 'update_urls' into 'main'
update URLs to point to BSF

Closes #46

See merge request esv/bsf/bsf-integration/orchard/orchard-mvp!14
2025-12-15 11:30:07 -07:00
Dane Moss
3fe421f31d update URLs to point to BSF 2025-12-15 11:30:07 -07:00
4 changed files with 514 additions and 6 deletions

@@ -7,7 +7,11 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
## [Unreleased]
## [0.2.0] - 2025-12-15
### Changed
- Updated images to use internal container BSF proxy (#46)
### Added
- Added integrity verification workflow design document (#24)
- Added `format` and `platform` fields to packages table (#16)
- Added `checksum_md5` and `metadata` JSONB fields to artifacts table (#16)
- Added `updated_at` field to tags table (#16)

@@ -1,5 +1,5 @@
# Frontend build stage
-FROM node:20-alpine AS frontend-builder
+FROM containers.global.bsf.tools/node:20-alpine AS frontend-builder
ARG NPM_REGISTRY=https://deps.global.bsf.tools/artifactory/api/npm/registry.npmjs.org/
@@ -19,7 +19,7 @@ COPY frontend/ ./
RUN npm run build
# Runtime stage
-FROM python:3.12-slim
+FROM containers.global.bsf.tools/python:3.12-slim
# Disable proxy cache
RUN echo 'Acquire::http::Pipeline-Depth 0;\nAcquire::http::No-Cache true;\nAcquire::BrokenProxy true;\n' > /etc/apt/apt.conf.d/99fixbadproxy

@@ -36,7 +36,7 @@ services:
restart: unless-stopped
postgres:
-image: postgres:16-alpine
+image: containers.global.bsf.tools/postgres:16-alpine
environment:
- POSTGRES_USER=orchard
- POSTGRES_PASSWORD=orchard_secret
@@ -56,7 +56,7 @@ services:
restart: unless-stopped
minio:
-image: minio/minio:latest
+image: containers.global.bsf.tools/minio/minio:latest
command: server /data --console-address ":9001"
environment:
- MINIO_ROOT_USER=minioadmin
@@ -76,7 +76,7 @@ services:
restart: unless-stopped
minio-init:
-image: minio/mc:latest
+image: containers.global.bsf.tools/minio/mc:latest
depends_on:
minio:
condition: service_healthy
@@ -91,7 +91,7 @@ services:
- orchard-network
redis:
-image: redis:7-alpine
+image: containers.global.bsf.tools/redis:7-alpine
command: redis-server --appendonly yes
volumes:
- redis-data:/data

@@ -0,0 +1,504 @@
# Integrity Verification Workflow Design
This document defines the process for SHA256 checksum verification on artifact downloads, including failure handling and retry mechanisms.
## Overview
Orchard uses content-addressable storage where the artifact ID is the SHA256 hash of the content. This design leverages that property to provide configurable integrity verification during downloads.
## Current State
| Aspect | Status |
|--------|--------|
| Download streams content directly from S3 | ✅ Implemented |
| Artifact ID is the SHA256 hash | ✅ Implemented |
| S3 key derived from SHA256 hash | ✅ Implemented |
| Verification during download | ❌ Not implemented |
| Checksum headers in response | ❌ Not implemented |
| Retry mechanism on failure | ❌ Not implemented |
| Failure handling beyond S3 errors | ❌ Not implemented |
## Verification Modes
The verification mode is selected per request via the `?verify=<mode>` query parameter (when `ORCHARD_VERIFY_ALLOW_OVERRIDE=true`); otherwise the server-wide default from `ORCHARD_VERIFY_MODE` applies.
| Mode | Performance | Integrity | Use Case |
|------|-------------|-----------|----------|
| `none` | ⚡ Fastest | Client-side | Trusted networks, high throughput |
| `header` | ⚡ Fast | Client-side | Standard downloads, client verification |
| `stream` | 🔄 Moderate | Post-hoc server | Logging/auditing, non-blocking |
| `pre` | 🐢 Slower | Guaranteed | Critical downloads, untrusted storage |
| `strict` | 🐢 Slower | Guaranteed + Alert | Security-sensitive, compliance |
### Mode: None (Default)
**Behavior:**
- Stream content directly from S3 with no server-side processing
- Maximum download performance
- Client is responsible for verification
**Headers Returned:**
```
X-Checksum-SHA256: <expected_hash>
Content-Length: <expected_size>
```
**Flow:**
```
Client Request → Lookup Artifact → Stream from S3 → Client
```
### Mode: Header
**Behavior:**
- Stream content directly from S3
- Include comprehensive checksum headers
- Client performs verification using headers
**Headers Returned:**
```
X-Checksum-SHA256: <expected_hash>
Content-Length: <expected_size>
Digest: sha-256=<base64_encoded_hash>
ETag: "<sha256_hash>"
X-Content-SHA256: <expected_hash>
```
**Flow:**
```
Client Request → Lookup Artifact → Add Headers → Stream from S3 → Client Verifies
```
**Client Verification Example:**
```bash
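# Download to a known filename, then verify against the checksum header
curl -s -o downloaded_file https://orchard/project/foo/bar/+/v1.0.0
# Header names are case-insensitive and the value ends in CR, so match loosely and strip it
EXPECTED=$(curl -sI https://orchard/project/foo/bar/+/v1.0.0 | grep -i '^X-Checksum-SHA256:' | tr -d '\r' | cut -d' ' -f2)
ACTUAL=$(sha256sum downloaded_file | cut -d' ' -f1)
[ "$EXPECTED" = "$ACTUAL" ] && echo "OK" || echo "MISMATCH"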
```
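The `Digest` header listed for this mode carries the base64 encoding of the raw 32-byte digest, not of the hex string, so its value cannot be copied from `X-Checksum-SHA256` directly. A minimal sketch of the conversion, assuming the artifact ID is a hex SHA256 string; `digest_header_value` is a hypothetical helper name, not part of Orchard:

```python
import base64
import hashlib


def digest_header_value(hex_sha256: str) -> str:
    """Convert a hex SHA256 string into an RFC 3230 style Digest value."""
    raw = bytes.fromhex(hex_sha256)  # 32 raw digest bytes
    if len(raw) != 32:
        raise ValueError("expected a 64-character hex SHA256 string")
    return "sha-256=" + base64.b64encode(raw).decode("ascii")


# Example: derive both header forms from the same content
content = b"hello"
hex_hash = hashlib.sha256(content).hexdigest()
headers = {
    "X-Checksum-SHA256": hex_hash,
    "Digest": digest_header_value(hex_hash),
}
```

A client that verifies against `Digest` has to decode the base64 value back to raw bytes before comparing it with its own digest.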
### Mode: Stream (Post-Hoc Verification)
**Behavior:**
- Wrap S3 stream with `HashingStreamWrapper`
- Compute SHA256 incrementally while streaming to client
- Verify hash after stream completes
- Log verification result
- Cannot reject content (already sent to client)
**Headers Returned:**
```
X-Checksum-SHA256: <expected_hash>
Content-Length: <expected_size>
X-Verify-Mode: stream
Trailer: X-Verified, X-Computed-SHA256
```
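**Trailers (if client supports; over HTTP/1.1 trailers require chunked transfer encoding, which cannot be combined with `Content-Length`):**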
```
X-Verified: true|false
X-Computed-SHA256: <computed_hash>
```
**Flow:**
```
Client Request → Lookup Artifact → Wrap Stream → Stream to Client
                                   Compute Hash Incrementally
                                   Verify After Complete → Log Result
```
**Implementation:**
```python
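import hashlib
from typing import Callable, Iterable, Iterator


class HashingStreamWrapper:
    """Pass chunks through unchanged while hashing them incrementally."""

    def __init__(self, stream: Iterable[bytes], expected_hash: str,
                 on_complete: Callable[[bool, str], None]):
        self.stream = stream
        self.hasher = hashlib.sha256()
        self.expected_hash = expected_hash
        self.on_complete = on_complete

    def __iter__(self) -> Iterator[bytes]:
        for chunk in self.stream:
            self.hasher.update(chunk)
            yield chunk
        # Stream complete, verify. This point is not reached if the
        # consumer stops iterating early (e.g. client disconnect).
        computed = self.hasher.hexdigest()
        self.on_complete(computed == self.expected_hash, computed)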
```
### Mode: Pre-Verify (Blocking)
**Behavior:**
- Download entire content from S3 to memory/temp file
- Compute SHA256 hash before sending to client
- On match: stream verified content to client
- On mismatch: retry from S3 (up to N times)
- If retries exhausted: return 500 error
**Headers Returned:**
```
X-Checksum-SHA256: <expected_hash>
Content-Length: <expected_size>
X-Verify-Mode: pre
X-Verified: true
```
**Flow:**
```
Client Request → Lookup Artifact → Download from S3 → Compute Hash
                                                     Hash Matches?
                                                      ↓         ↓
                                                     Yes        No
                                                      ↓         ↓
                                          Stream to Client    Retry?
                                                              Yes → Loop
                                                              No → 500 Error
```
**Memory Considerations:**
- For files < `ORCHARD_VERIFY_MEMORY_LIMIT` (default 100MB): buffer in memory
- For larger files: use temporary file with streaming hash computation
- Cleanup temp files after response sent
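One way to get the memory-then-disk behavior described above is Python's `tempfile.SpooledTemporaryFile`, which holds data in memory until it exceeds `max_size` and then rolls over to a temporary file. The sketch below is illustrative only: `buffer_and_hash` and `MEMORY_LIMIT` are hypothetical names, and the S3 body is assumed to arrive as an iterable of byte chunks.

```python
import hashlib
import tempfile
from typing import BinaryIO, Iterable, Tuple

MEMORY_LIMIT = 100 * 1024 * 1024  # mirrors the ORCHARD_VERIFY_MEMORY_LIMIT default


def buffer_and_hash(
    chunks: Iterable[bytes], memory_limit: int = MEMORY_LIMIT
) -> Tuple[BinaryIO, str, int]:
    """Buffer chunks and return (file, hex_sha256, size).

    The hash is computed while the chunks are written, so the content
    is only read once regardless of where it ends up being stored.
    """
    buf = tempfile.SpooledTemporaryFile(max_size=memory_limit)
    hasher = hashlib.sha256()
    size = 0
    try:
        for chunk in chunks:
            hasher.update(chunk)
            buf.write(chunk)
            size += len(chunk)
    except Exception:
        buf.close()  # releases the memory buffer or deletes the temp file
        raise
    buf.seek(0)  # rewind so the caller can stream the verified content
    return buf, hasher.hexdigest(), size
```

Closing the returned file object after the response is sent removes any on-disk backing, which covers the cleanup step.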
### Mode: Strict
**Behavior:**
- Same as pre-verify but with no retries
- Fail immediately on any mismatch
- Quarantine artifact on failure (mark as potentially corrupted)
- Trigger alert/notification on failure
- For security-critical downloads
**Headers Returned (on success):**
```
X-Checksum-SHA256: <expected_hash>
Content-Length: <expected_size>
X-Verify-Mode: strict
X-Verified: true
```
**Error Response (on failure):**
```json
{
"error": "integrity_verification_failed",
"message": "Artifact content does not match expected checksum",
"expected_hash": "<expected>",
"computed_hash": "<computed>",
"artifact_id": "<id>",
"action_taken": "quarantined"
}
```
**Quarantine Process:**
1. Mark artifact `status = 'quarantined'` in database
2. Log security event to audit_logs
3. Optionally notify via webhook/email
4. Artifact becomes unavailable for download until resolved
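The quarantine steps can be sketched as follows. This is an assumption-heavy illustration, not Orchard's implementation: it uses an in-memory SQLite database as a stand-in for the real database, and the `artifacts`/`audit_logs` column layout, the function names, and the `notify` callback are all hypothetical.

```python
import json
import sqlite3
from datetime import datetime, timezone
from typing import Callable, Optional


def quarantine_artifact(
    db: sqlite3.Connection,
    artifact_id: str,
    computed_hash: str,
    notify: Optional[Callable[[dict], None]] = None,
) -> dict:
    """Mark an artifact as quarantined and record an audit event."""
    event = {
        "action": "verification.quarantine",
        "artifact_id": artifact_id,
        "computed_hash": computed_hash,
        "at": datetime.now(timezone.utc).isoformat(),
    }
    with db:  # one transaction: status change and audit entry commit together
        db.execute(
            "UPDATE artifacts SET status = 'quarantined' WHERE id = ?",
            (artifact_id,),
        )
        db.execute(
            "INSERT INTO audit_logs (action, details) VALUES (?, ?)",
            (event["action"], json.dumps(event)),
        )
    if notify is not None:
        notify(event)  # optional webhook/email hook
    return event


def is_downloadable(db: sqlite3.Connection, artifact_id: str) -> bool:
    """Quarantined (or unknown) artifacts are not served."""
    row = db.execute(
        "SELECT status FROM artifacts WHERE id = ?", (artifact_id,)
    ).fetchone()
    return row is not None and row[0] != "quarantined"


# Demo with a hypothetical schema
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE artifacts (id TEXT PRIMARY KEY, status TEXT DEFAULT 'active')")
db.execute("CREATE TABLE audit_logs (action TEXT, details TEXT)")
db.execute("INSERT INTO artifacts (id) VALUES ('abc123')")
quarantine_artifact(db, "abc123", computed_hash="def456")
```

Writing the status change and the audit entry in one transaction avoids a state where an artifact is quarantined without a matching audit record.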
## Failure Detection
### Failure Types
| Failure Type | Detection Method | Severity |
|--------------|------------------|----------|
| Hash mismatch | Computed SHA256 ≠ Expected | Critical |
| Size mismatch | Actual bytes ≠ `Content-Length` | High |
| S3 read error | boto3 exception | Medium |
| Truncated content | Stream ends early | High |
| S3 object missing | `NoSuchKey` error | Critical |
| ETag mismatch | S3 ETag ≠ expected | Medium |
### Detection Implementation
```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class VerificationResult:
    success: bool
    failure_type: Optional[str]  # hash_mismatch, size_mismatch, etc.
    expected_hash: str
    computed_hash: Optional[str]
    expected_size: int
    actual_size: Optional[int]
    error_message: Optional[str]
    retry_count: int
```
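As a rough sketch of how the hash, size, and truncation checks from the failure table could be applied to a stream, the function below counts bytes and hashes in a single pass. The name `detect_failure` and the `"truncated"` label are assumptions made for illustration; S3-level errors are out of scope here.

```python
import hashlib
from typing import Iterable, Optional, Tuple


def detect_failure(
    chunks: Iterable[bytes], expected_hash: str, expected_size: int
) -> Tuple[Optional[str], str, int]:
    """Return (failure_type, computed_hash, actual_size).

    failure_type is None when both size and hash match.
    """
    hasher = hashlib.sha256()
    actual_size = 0
    for chunk in chunks:
        hasher.update(chunk)
        actual_size += len(chunk)
    computed = hasher.hexdigest()

    if actual_size < expected_size:
        failure = "truncated"  # stream ended early
    elif actual_size > expected_size:
        failure = "size_mismatch"
    elif computed != expected_hash:
        failure = "hash_mismatch"
    else:
        failure = None
    return failure, computed, actual_size
```

Checking size before hash gives a more specific failure type, since a truncated stream always produces a hash mismatch as well.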
## Retry Mechanism
### Configuration
| Environment Variable | Default | Description |
|---------------------|---------|-------------|
| `ORCHARD_VERIFY_MAX_RETRIES` | 3 | Maximum retry attempts |
| `ORCHARD_VERIFY_RETRY_DELAY_MS` | 100 | Base delay between retries |
| `ORCHARD_VERIFY_RETRY_BACKOFF` | 2.0 | Exponential backoff multiplier |
| `ORCHARD_VERIFY_RETRY_MAX_DELAY_MS` | 5000 | Maximum delay cap |
### Backoff Formula
```
delay = min(base_delay * (backoff ^ attempt), max_delay)
```
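Example with defaults, where `attempt` is the zero-based index used in the retry loop:
- Attempt 0 (first retry): 100ms
- Attempt 1 (second retry): 200ms
- Attempt 2 (third retry): 400ms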
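A minimal sketch of the formula, assuming `attempt` is counted from zero (as in `range(max_retries + 1)`), which is what yields 100ms for the first delay. The parameter names are illustrative and the defaults mirror the configuration table above.

```python
def calculate_backoff(
    attempt: int,
    base_delay_ms: float = 100,
    backoff: float = 2.0,
    max_delay_ms: float = 5000,
) -> float:
    """Delay in milliseconds to wait after a failed zero-based attempt."""
    return min(base_delay_ms * (backoff ** attempt), max_delay_ms)


# Delays for the first three failed attempts with the default settings
delays = [calculate_backoff(a) for a in range(3)]
```

With these defaults the cap only takes effect from attempt 6 onward (100 × 2⁶ = 6400ms, which exceeds 5000ms), so it matters mainly when `ORCHARD_VERIFY_MAX_RETRIES` is raised.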
### Retry Flow
```python
import asyncio

# fetch_from_s3, compute_sha256, calculate_backoff, log, S3Error and
# IntegrityError are assumed to be defined elsewhere in the service.
async def download_with_retry(artifact, max_retries=3):
    for attempt in range(max_retries + 1):
        try:
            content = await fetch_from_s3(artifact.s3_key)
            computed_hash = compute_sha256(content)
            if computed_hash == artifact.id:
                return content  # Success
            # Hash mismatch
            log.warning(f"Verification failed, attempt {attempt + 1}/{max_retries + 1}")
            if attempt < max_retries:
                delay = calculate_backoff(attempt)  # milliseconds
                await asyncio.sleep(delay / 1000)
            else:
                raise IntegrityError("Max retries exceeded")
        except S3Error as e:
            # Note: non-retryable S3 errors (404, 403) should be re-raised
            # immediately; that check is omitted here for brevity.
            if attempt < max_retries:
                delay = calculate_backoff(attempt)  # milliseconds
                await asyncio.sleep(delay / 1000)
            else:
                raise
```
### Retryable vs Non-Retryable Failures
**Retryable:**
- S3 read timeout
- S3 connection error
- Hash mismatch (may be transient S3 issue)
- Truncated content
**Non-Retryable:**
- S3 object not found (404)
- S3 access denied (403)
- Artifact not in database
- Strict mode failures
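The classification above could be encoded as a small decision function. This is a sketch under assumptions: the failure-type strings and the name `should_retry` are hypothetical, and any failure type not listed as retryable is treated as non-retryable.

```python
# Hypothetical failure-type labels for the retryable cases listed above
RETRYABLE = frozenset(
    {"s3_timeout", "s3_connection_error", "hash_mismatch", "truncated"}
)


def should_retry(
    failure_type: str, mode: str, attempt: int, max_retries: int = 3
) -> bool:
    """Decide whether another fetch attempt is allowed."""
    if mode == "strict":
        return False  # strict mode fails immediately
    if failure_type not in RETRYABLE:
        return False  # not found, access denied, unknown failures
    return attempt < max_retries  # attempt is zero-based
```

Defaulting unknown failure types to non-retryable keeps an unexpected error from being retried silently.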
## Configuration Reference
### Environment Variables
```bash
# Verification mode (none, header, stream, pre, strict)
ORCHARD_VERIFY_MODE=none
# Retry settings
ORCHARD_VERIFY_MAX_RETRIES=3
ORCHARD_VERIFY_RETRY_DELAY_MS=100
ORCHARD_VERIFY_RETRY_BACKOFF=2.0
ORCHARD_VERIFY_RETRY_MAX_DELAY_MS=5000
# Memory limit for pre-verify buffering (bytes)
ORCHARD_VERIFY_MEMORY_LIMIT=104857600 # 100MB
# Strict mode settings
ORCHARD_VERIFY_QUARANTINE_ON_FAILURE=true
ORCHARD_VERIFY_ALERT_WEBHOOK=https://alerts.example.com/webhook
# Allow per-request mode override
ORCHARD_VERIFY_ALLOW_OVERRIDE=true
```
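A sketch of how these variables might be parsed into a typed settings object. `VerifyConfig` and `load_verify_config` are hypothetical names, and the fallback values simply mirror the example values above, which is an assumption where the document does not state a default.

```python
import os
from dataclasses import dataclass
from typing import Mapping, Optional

VALID_MODES = ("none", "header", "stream", "pre", "strict")


def _as_bool(value: str) -> bool:
    """Interpret common truthy spellings; anything else is False."""
    return value.strip().lower() in ("1", "true", "yes", "on")


@dataclass(frozen=True)
class VerifyConfig:
    mode: str
    max_retries: int
    retry_delay_ms: int
    retry_backoff: float
    retry_max_delay_ms: int
    memory_limit: int
    quarantine_on_failure: bool
    alert_webhook: Optional[str]
    allow_override: bool


def load_verify_config(env: Mapping[str, str] = os.environ) -> VerifyConfig:
    """Read ORCHARD_VERIFY_* settings, rejecting unknown modes early."""
    def get(name: str, default: str) -> str:
        return env.get("ORCHARD_VERIFY_" + name, default)

    mode = get("MODE", "none")
    if mode not in VALID_MODES:
        raise ValueError(f"invalid ORCHARD_VERIFY_MODE: {mode!r}")
    return VerifyConfig(
        mode=mode,
        max_retries=int(get("MAX_RETRIES", "3")),
        retry_delay_ms=int(get("RETRY_DELAY_MS", "100")),
        retry_backoff=float(get("RETRY_BACKOFF", "2.0")),
        retry_max_delay_ms=int(get("RETRY_MAX_DELAY_MS", "5000")),
        memory_limit=int(get("MEMORY_LIMIT", "104857600")),
        quarantine_on_failure=_as_bool(get("QUARANTINE_ON_FAILURE", "true")),
        alert_webhook=env.get("ORCHARD_VERIFY_ALERT_WEBHOOK"),
        allow_override=_as_bool(get("ALLOW_OVERRIDE", "true")),
    )
```

Passing a plain dict instead of `os.environ` makes the loader easy to exercise in tests.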
### Per-Request Override
When `ORCHARD_VERIFY_ALLOW_OVERRIDE=true`, clients can specify verification mode:
```
GET /api/v1/project/foo/bar/+/v1.0.0?verify=pre
GET /api/v1/project/foo/bar/+/v1.0.0?verify=none
```
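The override rule can be sketched as a small resolver. The document does not say what happens when a client sends `?verify=` while overrides are disabled; this sketch assumes the parameter is ignored in that case, and `resolve_verify_mode` is a hypothetical name.

```python
from typing import Optional

VALID_MODES = ("none", "header", "stream", "pre", "strict")


def resolve_verify_mode(
    requested: Optional[str], default_mode: str, allow_override: bool
) -> str:
    """Pick the effective verification mode for one request."""
    if requested is None or not allow_override:
        return default_mode  # assumption: a disallowed override is ignored
    if requested not in VALID_MODES:
        raise ValueError(f"unknown verify mode: {requested!r}")
    return requested
```

Rejecting unknown modes instead of falling back to the default keeps a typo such as `?verify=strct` from silently weakening verification.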
## API Changes
### Download Endpoint
**Request:**
```
GET /api/v1/project/{project}/{package}/+/{ref}?verify={mode}
```
**New Query Parameters:**
| Parameter | Type | Default | Description |
|-----------|------|---------|-------------|
| `verify` | string | from config | Verification mode |
**New Response Headers:**
| Header | Description |
|--------|-------------|
| `X-Checksum-SHA256` | Expected SHA256 hash |
| `X-Verify-Mode` | Active verification mode |
| `X-Verified` | `true` if server verified content |
| `Digest` | RFC 3230 digest header |
### New Endpoint: Verify Artifact
**Request:**
```
POST /api/v1/project/{project}/{package}/+/{ref}/verify
```
**Response:**
```json
{
"artifact_id": "abc123...",
"verified": true,
"expected_hash": "abc123...",
"computed_hash": "abc123...",
"size_match": true,
"expected_size": 1048576,
"actual_size": 1048576,
"verification_time_ms": 45
}
```
## Logging and Monitoring
### Log Events
| Event | Level | When |
|-------|-------|------|
| `verification.success` | INFO | Hash verified successfully |
| `verification.failure` | ERROR | Hash mismatch detected |
| `verification.retry` | WARN | Retry attempt initiated |
| `verification.quarantine` | ERROR | Artifact quarantined |
| `verification.skip` | DEBUG | Verification skipped (mode=none) |
### Metrics
| Metric | Type | Description |
|--------|------|-------------|
| `orchard_verification_total` | Counter | Total verification attempts |
| `orchard_verification_failures` | Counter | Failed verifications |
| `orchard_verification_retries` | Counter | Retry attempts |
| `orchard_verification_duration_ms` | Histogram | Verification time |
### Audit Log Entry
```json
{
"action": "artifact.download.verified",
"resource": "project/foo/package/bar/artifact/abc123",
"user_id": "user@example.com",
"details": {
"verification_mode": "pre",
"verified": true,
"retry_count": 0,
"duration_ms": 45
}
}
```
## Security Considerations
1. **Strict Mode for Sensitive Data**: Use strict mode for artifacts containing credentials, certificates, or security-critical code.
2. **Quarantine Isolation**: Quarantined artifacts should be moved to a separate S3 prefix or bucket for forensic analysis.
3. **Alert on Repeated Failures**: Multiple verification failures for the same artifact may indicate storage corruption or tampering.
4. **Audit Trail**: All verification events should be logged for compliance and forensic purposes.
5. **Client Trust**: In `none` and `header` modes, clients must implement their own verification for security guarantees.
## Implementation Phases
### Phase 1: Headers Only
- Add `X-Checksum-SHA256` header to all downloads
- Add `verify=header` mode support
- Add configuration options
### Phase 2: Stream Verification
- Implement `HashingStreamWrapper`
- Add `verify=stream` mode
- Add verification logging
### Phase 3: Pre-Verification
- Implement buffered verification
- Add retry mechanism
- Add `verify=pre` mode
### Phase 4: Strict Mode
- Implement quarantine mechanism
- Add alerting integration
- Add `verify=strict` mode
## Client Integration Examples
### curl with Verification
```bash
#!/bin/bash
URL="https://orchard.example.com/api/v1/project/myproject/mypackage/+/v1.0.0"
# Get expected hash from headers
EXPECTED=$(curl -sI "$URL" | grep -i "X-Checksum-SHA256" | tr -d '\r' | cut -d' ' -f2)
# Download file
curl -sO "$URL"
FILENAME=$(basename "$URL")
# Verify
ACTUAL=$(sha256sum "$FILENAME" | cut -d' ' -f1)
if [ "$EXPECTED" = "$ACTUAL" ]; then
    echo "✓ Verification passed"
else
    echo "✗ Verification FAILED"
    echo " Expected: $EXPECTED"
    echo " Actual: $ACTUAL"
    exit 1
fi
```
### Python Client
```python
import hashlib

import requests


def download_verified(url: str) -> bytes:
    # Get headers first
    head = requests.head(url)
    head.raise_for_status()
    expected_hash = head.headers.get('X-Checksum-SHA256')
    if expected_hash is None:
        raise ValueError("Server did not return X-Checksum-SHA256")
    expected_size = int(head.headers.get('Content-Length', 0))

    # Download content
    response = requests.get(url)
    response.raise_for_status()
    content = response.content

    # Verify size
    if len(content) != expected_size:
        raise ValueError(f"Size mismatch: {len(content)} != {expected_size}")

    # Verify hash
    actual_hash = hashlib.sha256(content).hexdigest()
    if actual_hash != expected_hash:
        raise ValueError(f"Hash mismatch: {actual_hash} != {expected_hash}")
    return content
```
### Server-Side Verification
```bash
# Force server to verify before sending
curl -O "https://orchard.example.com/api/v1/project/myproject/mypackage/+/v1.0.0?verify=pre"
# Check if verification was performed
curl -I "https://orchard.example.com/api/v1/project/myproject/mypackage/+/v1.0.0?verify=pre" | grep X-Verified
# X-Verified: true
```