Add deduplication design doc, file size limits, and validation tests

- Add max_file_size (10GB) and min_file_size (1 byte) config options
- Add file size validation to regular and resumable upload endpoints
- Create comprehensive deduplication design document covering:
  - SHA256 algorithm selection rationale and migration path
  - Content-addressable storage model
  - S3 key derivation and prefix sharding
  - Duplicate detection workflow
  - Reference counting lifecycle
  - Edge cases and error handling
  - Collision detection strategy
  - Performance considerations
  - Operations runbook
- Add tests for empty file rejection and file size validation
This commit is contained in:
Mondo Diaz
2026-01-05 15:35:21 -06:00
parent 32115fc1c5
commit 55a38ad850
5 changed files with 664 additions and 0 deletions

View File

@@ -38,6 +38,10 @@ class Settings(BaseSettings):
s3_read_timeout: int = 60 # Read timeout in seconds
s3_max_retries: int = 3 # Max retry attempts for transient failures
# Upload settings
max_file_size: int = 10 * 1024 * 1024 * 1024 # 10GB default max file size
min_file_size: int = 1 # Minimum 1 byte (empty files rejected)
# Download settings
download_mode: str = "presigned" # "presigned", "redirect", or "proxy"
presigned_url_expiry: int = (