orchard/CHANGELOG.md
Mondo Diaz 3a61576764 Fix S3 client to support IRSA credentials (#54)
Only pass explicit credentials to boto3 if they're actually set.
This allows the default credential chain (including IRSA web identity
tokens) to be used when no access key is configured.

Also adds CHANGELOG entries for AWS services configuration.
2026-01-21 20:27:47 +00:00


# Changelog
All notable changes to this project will be documented in this file.
The format is based on [Keep a Changelog](https://keepachangelog.com/en/1.1.0/),
and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0.html).
## [Unreleased]
### Added
- Added AWS Secrets Manager CSI driver support for database credentials (#54)
- Added SecretProviderClass template for Secrets Manager integration (#54)
- Added IRSA service account annotations for prod and stage environments (#54)
### Changed
- Configured stage and prod to use AWS RDS instead of PostgreSQL subchart (#54)
- Configured stage and prod to use AWS S3 instead of MinIO subchart (#54)
- Changed prod deployment from manual to automatic on version tags (#54)
- Updated S3 client to support IRSA credentials when no explicit keys provided (#54)
- Changed prod image pullPolicy to Always (#54)
- Added proxy-body-size annotation to prod ingress for large uploads (#54)
### Removed
- Disabled PostgreSQL subchart for stage and prod environments (#54)
- Disabled MinIO subchart for stage and prod environments (#54)
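
The IRSA change above (#54) comes down to passing static credentials to boto3 only when they are configured. The following is a minimal sketch of that idea, not the project's actual code: the helper name `build_s3_client_kwargs` and its parameters are hypothetical, while the keyword names match `boto3.client`.

```python
from typing import Optional


def build_s3_client_kwargs(
    access_key_id: Optional[str] = None,
    secret_access_key: Optional[str] = None,
    endpoint_url: Optional[str] = None,
    region_name: Optional[str] = None,
) -> dict:
    """Build keyword arguments for boto3.client("s3", **kwargs).

    Hypothetical helper: the name and parameters are illustrative only.
    """
    kwargs: dict = {}
    if endpoint_url:
        kwargs["endpoint_url"] = endpoint_url
    if region_name:
        kwargs["region_name"] = region_name
    # Pass static credentials only when both are set. Omitting them lets
    # boto3 fall back to its default credential chain, which includes the
    # web identity token used by IRSA.
    if access_key_id and secret_access_key:
        kwargs["aws_access_key_id"] = access_key_id
        kwargs["aws_secret_access_key"] = secret_access_key
    return kwargs
```

With boto3 installed, the result would be used as `boto3.client("s3", **build_s3_client_kwargs(...))`. Empty strings are treated the same as unset values, so a blank access key in the environment does not override the default chain.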
### Added
- Added comprehensive upload/download tests for size boundaries (1B to 1GB) (#38)
- Added concurrent upload/download tests (2, 5, 10 parallel operations) (#38)
- Added data integrity tests (binary, text, unicode, compressed content) (#38)
- Added chunk boundary tests for edge cases (#38)
- Added `@pytest.mark.large` and `@pytest.mark.concurrent` test markers (#38)
- Added `generate_content()` and `generate_content_with_hash()` test helpers (#38)
- Added `sized_content` fixture for generating test content of specific sizes (#38)
- Added upload API tests: upload without tag, artifact creation verification, S3 object creation (#38)
- Added download API tests: `tag:` prefix resolution, 404 for nonexistent project/package/artifact (#38)
- Added download header tests: Content-Type, Content-Length, Content-Disposition, ETag, X-Checksum-SHA256 (#38)
- Added error handling tests: timeout behavior, checksum validation, resource cleanup, graceful error responses (#38)
- Added version API tests: version creation, auto-detection, listing, download by version prefix (#38)
- Added integrity verification tests: round-trip hash verification, client-side verification workflow, size variants (1KB-10MB) (#40)
- Added consistency check endpoint tests with response format validation (#40)
- Added corruption detection tests: bit flip, truncation, appended content, size mismatch, missing S3 objects (#40)
- Added Digest header tests (RFC 3230) and verification mode tests (#40)
- Added integrity verification documentation (`docs/integrity-verification.md`) (#40)
- Added conditional request support for downloads (If-None-Match, If-Modified-Since) returning 304 Not Modified (#42)
- Added caching headers to downloads: Cache-Control (immutable), Last-Modified (#42)
- Added 416 Range Not Satisfiable response for invalid range requests (#42)
- Added download completion logging with bytes transferred and throughput (#42)
- Added client disconnect handling during streaming downloads (#42)
- Added streaming download tests: range requests, conditional requests, caching headers, download resume (#42)
- Added upload duration and throughput metrics (`duration_ms`, `throughput_mbps`) to upload response (#43)
- Added upload progress logging for large files (hash computation and multipart upload phases) (#43)
- Added client disconnect handling during uploads with proper cleanup (#43)
- Added upload progress tracking endpoint `GET /upload/{upload_id}/progress` for resumable uploads (#43)
- Added large file upload tests (10MB, 100MB, 1GB) with multipart upload verification (#43)
- Added upload cancellation and timeout handling tests (#43)
- Added comprehensive API documentation for upload endpoints with curl, Python, and JavaScript examples (#43)
- Added `package_versions` table for immutable version tracking separate from mutable tags (#56)
  - Versions are set at upload time via explicit `version` parameter or auto-detected from filename/metadata
  - Version detection priority: explicit parameter > package metadata > filename pattern
  - Versions are immutable once created (unlike tags which can be moved)
- Added version API endpoints (#56):
  - `GET /api/v1/project/{project}/{package}/versions` - List all versions for a package
  - `GET /api/v1/project/{project}/{package}/versions/{version}` - Get specific version details
  - `DELETE /api/v1/project/{project}/{package}/versions/{version}` - Delete a version (admin only)
- Added version support to upload endpoint via `version` form parameter (#56)
- Added `version:X.Y.Z` prefix for explicit version resolution in download refs (#56)
- Added version field to tag responses (shows which version the artifact has, if any) (#56)
- Added migration `007_package_versions.sql` with ref_count triggers and data migration from semver tags (#56)
- Added production deployment job triggered by semantic version tags (v1.0.0) with manual approval gate (#63)
- Added production Helm values file with persistence enabled (20Gi PostgreSQL, 100Gi MinIO) (#63)
- Added integration tests for production deployment (#63)
- Added GitLab CI pipeline for feature branch deployments to dev namespace (#51)
- Added `deploy_feature` job with dynamic hostnames and unique release names (#51)
- Added `cleanup_feature` job with `on_stop` for automatic cleanup on merge (#51)
- Added `values-dev.yaml` Helm values for lightweight ephemeral environments (#51)
- Added main branch deployment to stage environment (#51)
- Added post-deployment integration tests (#51)
- Added internal proxy configuration for npm, pip, helm, and apt (#51)
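
The version detection priority listed above (explicit parameter > package metadata > filename pattern) can be sketched as follows. This is an illustration under assumptions: the function name, the `version` metadata key, and the filename regex are hypothetical and may differ from what Orchard actually uses.

```python
import re
from typing import Mapping, Optional

# Assumed filename pattern: the first X.Y.Z triple found in the name.
_FILENAME_VERSION = re.compile(r"(\d+\.\d+\.\d+)")


def detect_version(
    explicit: Optional[str],
    metadata: Optional[Mapping[str, object]],
    filename: Optional[str],
) -> Optional[str]:
    """Pick a version using the documented priority order."""
    # 1. An explicit `version` parameter always wins.
    if explicit:
        return explicit
    # 2. Otherwise use package metadata (assumed key: "version").
    if metadata and metadata.get("version"):
        return str(metadata["version"])
    # 3. Finally, fall back to a pattern match on the filename.
    if filename:
        match = _FILENAME_VERSION.search(filename)
        if match:
            return match.group(1)
    return None
```

For example, `detect_version(None, None, "pkg-1.0.0.tar.gz")` falls through to the filename step, while passing any explicit value short-circuits the other two sources.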
### Changed
- CI integration tests now run full pytest suite (~350 tests) against deployed environment instead of 3 smoke tests
- CI production deployment uses lightweight smoke tests only (no test data creation in prod)
- CI pipeline improvements: shared pip cache, `interruptible` flag on test jobs, retry on integration tests
- Simplified deploy verification to health check only (full checks done by integration tests)
- Extracted environment URLs to global variables for maintainability
- Made `cleanup_feature` job standalone (no longer inherits deploy template dependencies)
- Renamed `integration_test_prod` to `smoke_test_prod` for clarity
- Updated download ref resolution to check versions before tags (version → tag → artifact ID) (#56)
- Deploy jobs now require all security scans to pass before deployment (added test_image, app_deps_scan, cve_scan, cve_sbom_analysis, app_sbom_analysis to dependencies) (#63)
- Increased deploy job timeout from 5m to 10m (#63)
- Added `--atomic` flag to Helm deployments for automatic rollback on failure
- Adjusted dark mode color palette to use lighter background tones for better readability and reduced eye strain (#52)
- Replaced project card grid with sortable data table on Home page for better handling of large project lists
- Replaced package card grid with sortable data table on Project page for consistency
- Replaced SortDropdown with table header sorting on Package page for consistency
- Enabled sorting on supported table columns (name, created, updated) via clickable headers
- Updated browser tab title to "Orchard" with custom favicon
- Improved pod naming: Orchard pods now named `orchard-{env}-server-*` for clarity (#51)
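
The updated download ref resolution order (version → tag → artifact ID, #56) is sketched below using plain dictionaries in place of database lookups. The function name and data shapes are hypothetical. The `version:` and `tag:` prefixes come from the entries above, and treating an artifact ID as a 64-character hex SHA256 is an assumption based on the checksum entries elsewhere in this changelog.

```python
import re
from typing import Mapping, Optional

# Assumption: artifact IDs are lowercase hex SHA256 digests.
_ARTIFACT_ID = re.compile(r"^[0-9a-f]{64}$")


def resolve_ref(
    ref: str,
    versions: Mapping[str, str],
    tags: Mapping[str, str],
) -> Optional[str]:
    """Resolve a download ref to an artifact ID, or None if unknown."""
    # Explicit prefixes bypass the fallback order entirely.
    if ref.startswith("version:"):
        return versions.get(ref[len("version:"):])
    if ref.startswith("tag:"):
        return tags.get(ref[len("tag:"):])
    # Unprefixed refs: versions are checked before tags.
    if ref in versions:
        return versions[ref]
    if ref in tags:
        return tags[ref]
    # Last resort: treat the ref as a literal artifact ID.
    if _ARTIFACT_ID.match(ref):
        return ref
    return None
```

When a tag and a version share a name, the unprefixed ref resolves to the version, and `tag:<name>` remains the way to reach the tag.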
### Fixed
- Fixed CI integration test rate limiting: added configurable `ORCHARD_LOGIN_RATE_LIMIT` env var, relaxed to 1000/minute for dev/stage
- Fixed duplicate `TestSecurityEdgeCases` class definition in test_auth_api.py
- Fixed integration tests auth: session-scoped client, configurable credentials via env vars, fail-fast on auth errors
- Fixed 413 Request Entity Too Large errors on uploads by adding `proxy-body-size: "0"` nginx annotation to Orchard ingress
- Fixed CI tests that require direct S3 access: added `@pytest.mark.requires_direct_s3` marker and excluded from CI
- Fixed ref_count triggers not being created: added auto-migration for tags ref_count trigger functions
- Fixed Content-Disposition header encoding for non-ASCII filenames using RFC 5987 (#38)
- Fixed deploy jobs running even when tests or security scans fail (changed rules from `when: always` to `when: on_success`) (#63)
- Fixed python_tests job not using internal PyPI proxy (#63)
- Fixed `cleanup_feature` job failing when branch is deleted (`GIT_STRATEGY: none`) (#51)
- Fixed gitleaks false positives with fingerprints for historical commits (#51)
- Fixed integration tests running when deploy fails (`when: on_success`) (#51)
- Fixed static file serving for favicon and other files in frontend dist root
- Fixed deploy jobs running when secrets scan fails (added `secrets` to deploy dependencies)
- Fixed dev environment memory requests to equal limits per cluster Kyverno policy
- Fixed init containers missing resource limits (Kyverno policy compliance)
- Fixed Python SyntaxWarning for invalid escape sequence in database migration regex pattern
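
For the Content-Disposition fix above (#38), one common way to apply RFC 5987 is to send an ASCII fallback in `filename` plus a percent-encoded UTF-8 value in `filename*`. The sketch below shows that approach; the function name is hypothetical, and the project's real implementation may build the header differently.

```python
from urllib.parse import quote


def content_disposition(filename: str) -> str:
    """Build an attachment Content-Disposition header value."""
    # ASCII fallback: non-ASCII characters become "?", and characters
    # that would break the quoted string are replaced with "_".
    fallback = (
        filename.encode("ascii", "replace")
        .decode("ascii")
        .replace('"', "_")
        .replace("\\", "_")
    )
    header = f'attachment; filename="{fallback}"'
    # Add the RFC 5987 extended parameter only when the fallback lost
    # information; clients that understand it prefer filename*.
    if fallback != filename:
        header += f"; filename*=UTF-8''{quote(filename, safe='')}"
    return header
```

Plain ASCII names produce only the `filename` parameter, so existing clients see no change for the common case.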
### Removed
- Removed unused `store_streaming()` method from storage.py (#51)
## [0.4.0] - 2026-01-12
### Added
- Added user authentication system with session-based login (#50)
  - `users` table with password hashing (bcrypt), admin flag, active status
  - `sessions` table for web login sessions (24-hour expiry)
  - `auth_settings` table for future OIDC configuration
  - Default admin user created on first boot (username: admin, password: admin)
- Added auth API endpoints (#50)
  - `POST /api/v1/auth/login` - Login with username/password
  - `POST /api/v1/auth/logout` - Logout and clear session
  - `GET /api/v1/auth/me` - Get current user info
  - `POST /api/v1/auth/change-password` - Change own password
- Added API key management with user ownership (#50)
  - `POST /api/v1/auth/keys` - Create API key (format: `orch_<random>`)
  - `GET /api/v1/auth/keys` - List user's API keys
  - `DELETE /api/v1/auth/keys/{id}` - Revoke API key
  - Added `owner_id`, `scopes`, `description` columns to `api_keys` table
- Added admin user management endpoints (#50)
  - `GET /api/v1/admin/users` - List all users
  - `POST /api/v1/admin/users` - Create user
  - `GET /api/v1/admin/users/{username}` - Get user details
  - `PUT /api/v1/admin/users/{username}` - Update user (admin/active status)
  - `POST /api/v1/admin/users/{username}/reset-password` - Reset password
- Added `auth.py` module with AuthService class and FastAPI dependencies (#50)
- Added auth schemas: LoginRequest, LoginResponse, UserResponse, APIKeyResponse (#50)
- Added migration `006_auth_tables.sql` for auth database tables (#50)
- Added frontend Login page with session management (#50)
- Added frontend API Keys management page (#50)
- Added frontend Admin Users page (admin-only) (#50)
- Added AuthContext for frontend session state (#50)
- Added user menu to Layout header with login/logout (#50)
- Added 15 integration tests for auth system (#50)
- Added reusable `DragDropUpload` component for artifact uploads (#8)
  - Drag-and-drop file selection with visual feedback
  - Click-to-browse fallback
  - Multiple file upload support with queue management
  - Real-time progress indicators with speed and ETA
  - File type and size validation (configurable)
  - Concurrent upload handling (configurable max concurrent)
  - Automatic retry with exponential backoff for network errors
  - Individual file status (pending, uploading, complete, failed)
  - Retry and remove actions per file
  - Auto-dismiss success messages after 5 seconds
- Integrated DragDropUpload into PackagePage replacing basic file input (#8)
- Added frontend testing infrastructure with Vitest and React Testing Library (#14)
  - Configured Vitest for React/TypeScript with jsdom
  - Added 24 unit tests for DragDropUpload component
  - Tests cover: rendering, drag-drop events, file validation, upload queue, progress, errors
- Added chunked upload support for large files (#9)
  - Files >100MB automatically use chunked upload API (10MB chunks)
  - Client-side SHA256 hash computation via Web Crypto API
  - localStorage persistence for resume after browser close
  - Deduplication check at upload init phase
- Added offline detection and network resilience (#12)
  - Automatic pause when browser goes offline
  - Auto-resume when connection restored
  - Offline banner UI with status message
  - XHR abort on network loss to prevent hung requests
- Added download by artifact ID feature (#10)
  - Direct artifact ID input field on package page
  - Hex-only input validation with character count
  - File size and filename displayed in tag list
- Added backend security tests (#15)
  - Path traversal prevention tests for upload/download
  - Malformed request handling tests
  - Checksum validation tests
  - 10 new security-focused integration tests
- Added download verification with `verify` and `verify_mode` query parameters (#26)
  - `?verify=true&verify_mode=pre` - Pre-verification: verify before streaming (guaranteed no corrupt data)
  - `?verify=true&verify_mode=stream` - Streaming verification: verify while streaming (logs error if mismatch)
- Added checksum response headers to all download endpoints (#27)
  - `X-Checksum-SHA256` - SHA256 hash of the artifact
  - `X-Content-Length` - File size in bytes
  - `X-Checksum-MD5` - MD5 hash (if available)
  - `ETag` - Artifact ID (SHA256)
  - `Digest` - RFC 3230 format sha-256 hash (base64)
  - `X-Verified` - Verification status (true/false/pending)
- Added `checksum.py` module with SHA256 utilities (#26)
  - `compute_sha256()` and `compute_sha256_stream()` functions
  - `HashingStreamWrapper` for incremental hash computation
  - `VerifyingStreamWrapper` for stream verification
  - `verify_checksum()` and `verify_checksum_strict()` functions
  - `ChecksumMismatchError` exception with context
- Added `get_verified()` and `get_stream_verified()` methods to storage layer (#26)
- Added `logging_config.py` module with structured logging (#28)
  - JSON logging format for production
  - Request ID tracking via context variables
  - Verification failure logging with full context
- Added `log_level` and `log_format` settings to configuration (#28)
- Added 62 unit tests for checksum utilities and verification (#29)
- Added 17 integration tests for download verification API (#29)
- Added global artifacts endpoint `GET /api/v1/artifacts` with project/package/tag/size/date filters (#18)
- Added global tags endpoint `GET /api/v1/tags` with project/package/search/date filters (#18)
- Added wildcard pattern matching (`*`) for tag filters across all endpoints (#18)
- Added comma-separated multi-value support for tag filters (#18)
- Added `search` parameter to `/api/v1/uploads` for filename search (#18)
- Added `tag` filter to `/api/v1/uploads` endpoint (#18)
- Added `sort` and `order` parameters to `/api/v1/uploads` endpoint (#18)
- Added `min_size` and `max_size` filters to package artifacts endpoint (#18)
- Added `sort` and `order` parameters to package artifacts endpoint (#18)
- Added `from` and `to` date filters to package tags endpoint (#18)
- Added `GlobalArtifactResponse` and `GlobalTagResponse` schemas (#18)
- Added S3 object verification before database commit during upload (#19)
- Added S3 object cleanup on database commit failure (#19)
- Added upload duration tracking (`duration_ms` field) (#19)
- Added `User-Agent` header capture during uploads (#19)
- Added `X-Checksum-SHA256` header support for client-side checksum verification (#19)
- Added `status`, `error_message`, `client_checksum` columns to uploads table (#19)
- Added `upload_locks` table for future concurrent upload conflict detection (#19)
- Added consistency check endpoint `GET /api/v1/admin/consistency-check` (#19)
- Added `PUT /api/v1/projects/{project}` endpoint for project updates with audit logging (#20)
- Added `PUT /api/v1/project/{project}/packages/{package}` endpoint for package updates with audit logging (#20)
- Added `artifact.download` audit logging to download endpoint (#20)
- Added `ProjectHistory` and `PackageHistory` models with database triggers (#20)
- Added migration `004_history_tables.sql` for project/package history (#20)
- Added migration `005_upload_enhancements.sql` for upload status tracking (#19)
- Added 9 integration tests for global artifacts/tags endpoints (#18)
- Added global uploads query endpoint `GET /api/v1/uploads` with project/package/user/date filters (#18)
- Added project-level uploads endpoint `GET /api/v1/project/{project}/uploads` (#18)
- Added `has_more` field to pagination metadata for easier pagination UI (#18)
- Added `upload_id`, `content_type`, `original_name`, `created_at` fields to upload response (#19)
- Added audit log API endpoints with filtering and pagination (#20)
  - `GET /api/v1/audit-logs` - list all audit logs with action/resource/user/date filters
  - `GET /api/v1/projects/{project}/audit-logs` - project-scoped audit logs
  - `GET /api/v1/project/{project}/{package}/audit-logs` - package-scoped audit logs
- Added upload history API endpoints (#20)
  - `GET /api/v1/project/{project}/{package}/uploads` - list upload events for a package
  - `GET /api/v1/artifact/{id}/uploads` - list all uploads of a specific artifact
- Added artifact provenance endpoint `GET /api/v1/artifact/{id}/history` (#20)
  - Returns full artifact history including packages, tags, and upload events
- Added audit logging for project.create, package.create, tag.create, tag.update, artifact.upload actions (#20)
- Added `AuditLogResponse`, `UploadHistoryResponse`, `ArtifactProvenanceResponse` schemas (#20)
- Added `TagHistoryDetailResponse` schema with artifact metadata (#20)
- Added 31 integration tests for audit log, history, and upload query endpoints (#22)
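
The checksum response headers listed above (#27) can all be derived from a single SHA256 digest. The sketch below shows one way to compute them. The function name is hypothetical, the quoted `ETag` form is an assumption based on HTTP entity-tag syntax, and `X-Checksum-MD5` and `X-Verified` are omitted for brevity.

```python
import base64
import hashlib


def checksum_headers(content: bytes) -> dict:
    """Compute download checksum headers for a byte string."""
    digest = hashlib.sha256(content)
    hex_digest = digest.hexdigest()
    # RFC 3230 Digest values carry the base64 of the raw digest bytes,
    # not the base64 of the hex string.
    b64_digest = base64.b64encode(digest.digest()).decode("ascii")
    return {
        "X-Checksum-SHA256": hex_digest,
        "X-Content-Length": str(len(content)),
        # Assumption: the artifact ID is sent as a quoted entity tag.
        "ETag": f'"{hex_digest}"',
        "Digest": f"sha-256={b64_digest}",
    }
```

A client can verify a download by hashing the body and comparing the hex result to `X-Checksum-SHA256`, or by base64-decoding the `Digest` value and comparing raw bytes.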
### Changed
- Standardized audit action naming to `{entity}.{action}` pattern (project.delete, package.delete, tag.delete) (#20)
- Added `StorageBackend` protocol/interface for backend-agnostic storage (#33)
- Added `health_check()` method to storage backend with `/health` endpoint integration (#33)
- Added `verify_integrity()` method for post-upload hash validation (#33)
- Added S3 configuration options: `s3_verify_ssl`, `s3_connect_timeout`, `s3_read_timeout`, `s3_max_retries` (#33)
- Added `S3StorageUnavailableError` and `HashCollisionError` exception types (#33)
- Added hash collision detection by comparing file sizes during deduplication (#33)
- Added garbage collection endpoint `POST /api/v1/admin/garbage-collect` for orphaned artifacts (#36)
- Added orphaned artifacts listing endpoint `GET /api/v1/admin/orphaned-artifacts` (#36)
- Added global storage statistics endpoint `GET /api/v1/stats` (#34)
- Added storage breakdown endpoint `GET /api/v1/stats/storage` (#34)
- Added deduplication metrics endpoint `GET /api/v1/stats/deduplication` (#34)
- Added per-project statistics endpoint `GET /api/v1/projects/{project}/stats` (#34)
- Added per-package statistics endpoint `GET /api/v1/project/{project}/packages/{package}/stats` (#34)
- Added per-artifact statistics endpoint `GET /api/v1/artifact/{id}/stats` (#34)
- Added cross-project deduplication endpoint `GET /api/v1/stats/cross-project` (#34)
- Added timeline statistics endpoint `GET /api/v1/stats/timeline` with daily/weekly/monthly periods (#34)
- Added stats export endpoint `GET /api/v1/stats/export` with JSON/CSV formats (#34)
- Added summary report endpoint `GET /api/v1/stats/report` with markdown/JSON formats (#34)
- Added Dashboard page at `/dashboard` with storage and deduplication visualizations (#34)
- Added pytest infrastructure with mock S3 client for unit testing (#35)
- Added unit tests for SHA256 hash calculation (#35)
- Added unit tests for duplicate detection and deduplication behavior (#35)
- Added integration tests for upload scenarios and ref_count management (#35)
- Added integration tests for S3 verification and failure cleanup (#35)
- Added integration tests for all stats endpoints (#35)
- Added integration tests for cascade deletion ref_count behavior (package/project delete) (#35)
- Added integration tests for tag update ref_count adjustments (#35)
- Added integration tests for garbage collection endpoints (#35)
- Added integration tests for file size validation (#35)
- Added test dependencies to requirements.txt (pytest, pytest-asyncio, pytest-cov, httpx, moto) (#35)
- Added `ORCHARD_MAX_FILE_SIZE` config option (default: 10GB) for upload size limits (#37)
- Added `ORCHARD_MIN_FILE_SIZE` config option (default: 1 byte, rejects empty files) (#37)
- Added file size validation to upload and resumable upload endpoints (#37)
- Added comprehensive deduplication design document (`docs/design/deduplication-design.md`) (#37)
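
The hash collision detection entry above (#33) compares file sizes during deduplication. A minimal sketch of that check follows, with a dictionary standing in for the artifacts table. `HashCollisionError` is named in this changelog, but its definition here and the `check_duplicate` helper are hypothetical.

```python
from typing import Mapping


class HashCollisionError(Exception):
    """Raised when two uploads share a SHA256 but differ in size."""


def check_duplicate(sha256: str, size: int, known_sizes: Mapping[str, int]) -> bool:
    """Return True if the upload duplicates a stored artifact.

    `known_sizes` maps stored artifact hashes to their sizes in bytes.
    """
    stored_size = known_sizes.get(sha256)
    if stored_size is None:
        # No artifact with this hash yet: a new upload, not a duplicate.
        return False
    if stored_size != size:
        # Same hash, different size: the content cannot be identical.
        raise HashCollisionError(
            f"hash {sha256} stored with size {stored_size}, got {size}"
        )
    return True
```

A size comparison is cheap and catches only collisions where the lengths differ, so it serves as a sanity check, not a full content comparison.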
### Fixed
- Fixed Helm chart `minio.ingress` conflicting with Bitnami MinIO subchart by renaming to `minioIngress` (#48)
- Fixed JSON report serialization error for Decimal types in `GET /api/v1/stats/report` (#34)
- Fixed resumable upload double-counting ref_count when tag provided (removed manual increment, SQL triggers handle it) (#35)
## [0.3.0] - 2025-12-15
### Changed
- Changed default download mode from `proxy` to `presigned` for better performance (#48)
### Added
- Added presigned URL support for direct S3 downloads (#48)
- Added `ORCHARD_DOWNLOAD_MODE` config option (`presigned`, `redirect`, `proxy`) (#48)
- Added `ORCHARD_PRESIGNED_URL_EXPIRY` config option (default: 3600 seconds) (#48)
- Added `?mode=` query parameter to override download mode per-request (#48)
- Added `/api/v1/project/{project}/{package}/+/{ref}/url` endpoint for getting presigned URLs (#48)
- Added `PresignedUrlResponse` schema with URL, expiry, checksums, and artifact metadata (#48)
- Added MinIO ingress support in Helm chart for presigned URL access (#48)
- Added `orchard.download.mode` and `orchard.download.presignedUrlExpiry` Helm values (#48)
- Added integrity verification workflow design document (#24)
- Added `sha256` field to API responses for clarity (alias of `id`) (#25)
- Added `checksum_sha1` field to artifacts table for compatibility (#25)
- Added `s3_etag` field to artifacts table for S3 verification (#25)
- Compute and store MD5, SHA1, and S3 ETag alongside SHA256 during upload (#25)
- Added `Dockerfile.local` and `docker-compose.local.yml` for local development (#25)
- Added migration script `003_checksum_fields.sql` for existing databases (#25)
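
The download mode entries above (#48) describe a configured default with a per-request `?mode=` override. The sketch below shows one way that precedence could work. The function name and error handling are hypothetical; the mode names, the `ORCHARD_DOWNLOAD_MODE` variable, and the `presigned` default come from this changelog.

```python
import os
from typing import Mapping, Optional

VALID_MODES = ("presigned", "redirect", "proxy")
DEFAULT_MODE = "presigned"


def resolve_download_mode(
    query_mode: Optional[str],
    env: Mapping[str, str] = os.environ,
) -> str:
    """Choose the download mode for one request."""
    # A per-request ?mode= value overrides the configured default.
    if query_mode is not None:
        if query_mode not in VALID_MODES:
            raise ValueError(f"unsupported download mode: {query_mode!r}")
        return query_mode
    # Otherwise use ORCHARD_DOWNLOAD_MODE, falling back to the default.
    configured = env.get("ORCHARD_DOWNLOAD_MODE", DEFAULT_MODE)
    if configured not in VALID_MODES:
        raise ValueError(f"unsupported ORCHARD_DOWNLOAD_MODE: {configured!r}")
    return configured
```

Passing `env` explicitly keeps the function easy to test without touching the real process environment.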
## [0.2.0] - 2025-12-15
### Added
- Added `format` and `platform` fields to packages table (#16)
- Added `checksum_md5` and `metadata` JSONB fields to artifacts table (#16)
- Added `updated_at` field to tags table (#16)
- Added `tag_name`, `user_agent`, `duration_ms`, `deduplicated`, `checksum_verified` fields to uploads table (#16)
- Added `change_type` field to tag_history table (#16)
- Added composite indexes for common query patterns (#16)
- Added GIN indexes on JSONB fields for efficient JSON queries (#16)
- Added partial index for public projects (#16)
- Added database triggers for `updated_at` timestamps (#16)
- Added database triggers for maintaining artifact `ref_count` accuracy (#16)
- Added CHECK constraints for data integrity (`size > 0`, `ref_count >= 0`) (#16)
- Added migration script `002_schema_enhancements.sql` for existing databases (#16)
### Changed
- Updated images to use internal container BSF proxy (#46)
## [0.1.0] - 2025-12-12
### Added
- Added Prosper docker template config (#45)
### Changed
- Changed the Dockerfile npm build arg to use the deps.global.bsf.tools URL as the default registry (#45)