orchard/CHANGELOG.md
Mondo Diaz 3a61576764 Fix S3 client to support IRSA credentials (#54)
Only pass explicit credentials to boto3 if they're actually set.
This allows the default credential chain (including IRSA web identity
tokens) to be used when no access key is configured.

Also adds CHANGELOG entries for AWS services configuration.
2026-01-21 20:27:47 +00:00


Changelog

All notable changes to this project will be documented in this file.

The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.

[Unreleased]

Added

  • Added AWS Secrets Manager CSI driver support for database credentials (#54)
  • Added SecretProviderClass template for Secrets Manager integration (#54)
  • Added IRSA service account annotations for prod and stage environments (#54)

Changed

  • Configured stage and prod to use AWS RDS instead of PostgreSQL subchart (#54)
  • Configured stage and prod to use AWS S3 instead of MinIO subchart (#54)
  • Changed prod deployment from manual to automatic on version tags (#54)
  • Updated S3 client to support IRSA credentials when no explicit keys provided (#54)
  • Changed prod image pullPolicy to Always (#54)
  • Added proxy-body-size annotation to prod ingress for large uploads (#54)
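
The IRSA change above can be illustrated with a minimal sketch. It assumes the client is created with `boto3.client("s3", **kwargs)`; the helper name `build_s3_client_kwargs` and its parameters are hypothetical and are not taken from the Orchard code base. The idea is to omit the credential arguments entirely when no access key is configured, so that boto3 falls back to its default credential chain (which includes IRSA web identity tokens).

```python
from typing import Optional


def build_s3_client_kwargs(
    endpoint_url: Optional[str] = None,
    region: Optional[str] = None,
    access_key_id: Optional[str] = None,
    secret_access_key: Optional[str] = None,
) -> dict:
    """Build keyword arguments for boto3.client("s3", **kwargs).

    Hypothetical helper: the name and parameters are illustrative only.
    """
    kwargs = {}
    if endpoint_url:
        kwargs["endpoint_url"] = endpoint_url
    if region:
        kwargs["region_name"] = region
    # Pass explicit credentials only when both values are actually set.
    # Leaving them out lets boto3 use its default credential chain.
    if access_key_id and secret_access_key:
        kwargs["aws_access_key_id"] = access_key_id
        kwargs["aws_secret_access_key"] = secret_access_key
    return kwargs
```

With boto3 installed, the result would typically be used as `boto3.client("s3", **build_s3_client_kwargs(region="us-east-1"))`.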

Removed

  • Disabled PostgreSQL subchart for stage and prod environments (#54)
  • Disabled MinIO subchart for stage and prod environments (#54)

Added

  • Added comprehensive upload/download tests for size boundaries (1B to 1GB) (#38)
  • Added concurrent upload/download tests (2, 5, 10 parallel operations) (#38)
  • Added data integrity tests (binary, text, unicode, compressed content) (#38)
  • Added chunk boundary tests for edge cases (#38)
  • Added @pytest.mark.large and @pytest.mark.concurrent test markers (#38)
  • Added generate_content() and generate_content_with_hash() test helpers (#38)
  • Added sized_content fixture for generating test content of specific sizes (#38)
  • Added upload API tests: upload without tag, artifact creation verification, S3 object creation (#38)
  • Added download API tests: tag: prefix resolution, 404 for nonexistent project/package/artifact (#38)
  • Added download header tests: Content-Type, Content-Length, Content-Disposition, ETag, X-Checksum-SHA256 (#38)
  • Added error handling tests: timeout behavior, checksum validation, resource cleanup, graceful error responses (#38)
  • Added version API tests: version creation, auto-detection, listing, download by version prefix (#38)
  • Added integrity verification tests: round-trip hash verification, client-side verification workflow, size variants (1KB-10MB) (#40)
  • Added consistency check endpoint tests with response format validation (#40)
  • Added corruption detection tests: bit flip, truncation, appended content, size mismatch, missing S3 objects (#40)
  • Added Digest header tests (RFC 3230) and verification mode tests (#40)
  • Added integrity verification documentation (docs/integrity-verification.md) (#40)
  • Added conditional request support for downloads (If-None-Match, If-Modified-Since) returning 304 Not Modified (#42)
  • Added caching headers to downloads: Cache-Control (immutable), Last-Modified (#42)
  • Added 416 Range Not Satisfiable response for invalid range requests (#42)
  • Added download completion logging with bytes transferred and throughput (#42)
  • Added client disconnect handling during streaming downloads (#42)
  • Added streaming download tests: range requests, conditional requests, caching headers, download resume (#42)
  • Added upload duration and throughput metrics (duration_ms, throughput_mbps) to upload response (#43)
  • Added upload progress logging for large files (hash computation and multipart upload phases) (#43)
  • Added client disconnect handling during uploads with proper cleanup (#43)
  • Added upload progress tracking endpoint GET /upload/{upload_id}/progress for resumable uploads (#43)
  • Added large file upload tests (10MB, 100MB, 1GB) with multipart upload verification (#43)
  • Added upload cancellation and timeout handling tests (#43)
  • Added comprehensive API documentation for upload endpoints with curl, Python, and JavaScript examples (#43)
  • Added package_versions table for immutable version tracking separate from mutable tags (#56)
    • Versions are set at upload time via explicit version parameter or auto-detected from filename/metadata
    • Version detection priority: explicit parameter > package metadata > filename pattern
    • Versions are immutable once created (unlike tags which can be moved)
  • Added version API endpoints (#56):
    • GET /api/v1/project/{project}/{package}/versions - List all versions for a package
    • GET /api/v1/project/{project}/{package}/versions/{version} - Get specific version details
    • DELETE /api/v1/project/{project}/{package}/versions/{version} - Delete a version (admin only)
  • Added version support to upload endpoint via version form parameter (#56)
  • Added version:X.Y.Z prefix for explicit version resolution in download refs (#56)
  • Added version field to tag responses (shows which version the artifact has, if any) (#56)
  • Added migration 007_package_versions.sql with ref_count triggers and data migration from semver tags (#56)
  • Added production deployment job triggered by semantic version tags (v1.0.0) with manual approval gate (#63)
  • Added production Helm values file with persistence enabled (20Gi PostgreSQL, 100Gi MinIO) (#63)
  • Added integration tests for production deployment (#63)
  • Added GitLab CI pipeline for feature branch deployments to dev namespace (#51)
  • Added deploy_feature job with dynamic hostnames and unique release names (#51)
  • Added cleanup_feature job with on_stop for automatic cleanup on merge (#51)
  • Added values-dev.yaml Helm values for lightweight ephemeral environments (#51)
  • Added main branch deployment to stage environment (#51)
  • Added post-deployment integration tests (#51)
  • Added internal proxy configuration for npm, pip, helm, and apt (#51)
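
The version detection priority noted for #56 (explicit parameter > package metadata > filename pattern) could look roughly like the sketch below. The function name `detect_version`, the `"version"` metadata key, and the `X.Y.Z` filename regex are assumptions made for illustration; the actual detection rules may differ.

```python
import re
from typing import Mapping, Optional

# Assumed filename pattern: the first X.Y.Z triple found in the name.
_FILENAME_VERSION = re.compile(r"(\d+\.\d+\.\d+)")


def detect_version(
    explicit: Optional[str],
    metadata: Optional[Mapping[str, object]],
    filename: Optional[str],
) -> Optional[str]:
    """Return the version using the priority explicit > metadata > filename."""
    # 1. An explicit version parameter always wins.
    if explicit:
        return explicit
    # 2. Otherwise use a version recorded in the package metadata, if any.
    if metadata and metadata.get("version"):
        return str(metadata["version"])
    # 3. Finally fall back to a pattern match on the filename.
    match = _FILENAME_VERSION.search(filename or "")
    return match.group(1) if match else None
```

When none of the three sources yields a version, the sketch returns `None`, which corresponds to an upload that carries no version.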

Changed

  • CI integration tests now run the full pytest suite (~350 tests) against the deployed environment instead of 3 smoke tests
  • CI production deployment uses lightweight smoke tests only (no test data creation in prod)
  • CI pipeline improvements: shared pip cache, interruptible flag on test jobs, retry on integration tests
  • Simplified deploy verification to health check only (full checks done by integration tests)
  • Extracted environment URLs to global variables for maintainability
  • Made cleanup_feature job standalone (no longer inherits deploy template dependencies)
  • Renamed integration_test_prod to smoke_test_prod for clarity
  • Updated download ref resolution to check versions before tags (version → tag → artifact ID) (#56)
  • Deploy jobs now require all security scans to pass before deployment (added test_image, app_deps_scan, cve_scan, cve_sbom_analysis, app_sbom_analysis to dependencies) (#63)
  • Increased deploy job timeout from 5m to 10m (#63)
  • Added --atomic flag to Helm deployments for automatic rollback on failure
  • Adjusted dark mode color palette to use lighter background tones for better readability and reduced eye strain (#52)
  • Replaced project card grid with sortable data table on Home page for better handling of large project lists
  • Replaced package card grid with sortable data table on Project page for consistency
  • Replaced SortDropdown with table header sorting on Package page for consistency
  • Enabled sorting on supported table columns (name, created, updated) via clickable headers
  • Updated browser tab title to "Orchard" with custom favicon
  • Improved pod naming: Orchard pods now named orchard-{env}-server-* for clarity (#51)
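
The download ref resolution order from #56 (version → tag → artifact ID) can be sketched with in-memory lookups. The function `resolve_ref` and its dictionary and set arguments are hypothetical stand-ins for database queries; only the `version:` and `tag:` prefixes and the lookup order come from the entries in this changelog.

```python
from typing import Collection, Mapping, Optional


def resolve_ref(
    ref: str,
    versions: Mapping[str, str],
    tags: Mapping[str, str],
    artifact_ids: Collection[str],
) -> Optional[str]:
    """Resolve a download ref to an artifact ID, or None if nothing matches."""
    # Explicit prefixes restrict the lookup to a single namespace.
    if ref.startswith("version:"):
        return versions.get(ref[len("version:"):])
    if ref.startswith("tag:"):
        return tags.get(ref[len("tag:"):])
    # Unprefixed refs: versions are checked first, then tags, then artifact IDs.
    if ref in versions:
        return versions[ref]
    if ref in tags:
        return tags[ref]
    if ref in artifact_ids:
        return ref
    return None
```

Because versions are checked first, a tag that shares its name with a version is only reachable through the `tag:` prefix in this sketch.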

Fixed

  • Fixed CI integration test rate limiting: added configurable ORCHARD_LOGIN_RATE_LIMIT env var, relaxed to 1000/minute for dev/stage
  • Fixed duplicate TestSecurityEdgeCases class definition in test_auth_api.py
  • Fixed integration tests auth: session-scoped client, configurable credentials via env vars, fail-fast on auth errors
  • Fixed 413 Request Entity Too Large errors on uploads by adding proxy-body-size: "0" nginx annotation to Orchard ingress
  • Fixed CI tests that require direct S3 access: added @pytest.mark.requires_direct_s3 marker and excluded from CI
  • Fixed ref_count triggers not being created: added auto-migration for tags ref_count trigger functions
  • Fixed Content-Disposition header encoding for non-ASCII filenames using RFC 5987 (#38)
  • Fixed deploy jobs running even when tests or security scans fail (changed rules from when: always to when: on_success) (#63)
  • Fixed python_tests job not using internal PyPI proxy (#63)
  • Fixed cleanup_feature job failing when branch is deleted (GIT_STRATEGY: none) (#51)
  • Fixed gitleaks false positives with fingerprints for historical commits (#51)
  • Fixed integration tests running when deploy fails (when: on_success) (#51)
  • Fixed static file serving for favicon and other files in frontend dist root
  • Fixed deploy jobs running when secrets scan fails (added secrets to deploy dependencies)
  • Fixed dev environment memory requests to equal limits, as required by the cluster Kyverno policy
  • Fixed init containers missing resource limits (Kyverno policy compliance)
  • Fixed Python SyntaxWarning for invalid escape sequence in database migration regex pattern
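
For the Content-Disposition fix (#38), a minimal sketch of RFC 5987 encoding is shown here. The helper name `content_disposition` and the ASCII fallback strategy (replacing non-ASCII characters with `?`) are assumptions for illustration, not necessarily what Orchard does.

```python
from urllib.parse import quote


def content_disposition(filename: str) -> str:
    """Build a Content-Disposition value that is safe for non-ASCII filenames."""
    # Plain ASCII fallback for clients that ignore the filename* parameter.
    fallback = filename.encode("ascii", "replace").decode("ascii")
    fallback = fallback.replace("\\", "_").replace('"', "_")
    # RFC 5987 extended value: charset, empty language tag, percent-encoded UTF-8.
    encoded = quote(filename, safe="")
    return f"attachment; filename=\"{fallback}\"; filename*=UTF-8''{encoded}"
```

Clients that understand the extended `filename*` parameter should prefer it over the plain `filename` fallback.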

Removed

  • Removed unused store_streaming() method from storage.py (#51)

[0.4.0] - 2026-01-12

Added

  • Added user authentication system with session-based login (#50)
    • users table with password hashing (bcrypt), admin flag, active status
    • sessions table for web login sessions (24-hour expiry)
    • auth_settings table for future OIDC configuration
    • Default admin user created on first boot (username: admin, password: admin)
  • Added auth API endpoints (#50)
    • POST /api/v1/auth/login - Login with username/password
    • POST /api/v1/auth/logout - Logout and clear session
    • GET /api/v1/auth/me - Get current user info
    • POST /api/v1/auth/change-password - Change own password
  • Added API key management with user ownership (#50)
    • POST /api/v1/auth/keys - Create API key (format: orch_<random>)
    • GET /api/v1/auth/keys - List user's API keys
    • DELETE /api/v1/auth/keys/{id} - Revoke API key
    • Added owner_id, scopes, description columns to api_keys table
  • Added admin user management endpoints (#50)
    • GET /api/v1/admin/users - List all users
    • POST /api/v1/admin/users - Create user
    • GET /api/v1/admin/users/{username} - Get user details
    • PUT /api/v1/admin/users/{username} - Update user (admin/active status)
    • POST /api/v1/admin/users/{username}/reset-password - Reset password
  • Added auth.py module with AuthService class and FastAPI dependencies (#50)
  • Added auth schemas: LoginRequest, LoginResponse, UserResponse, APIKeyResponse (#50)
  • Added migration 006_auth_tables.sql for auth database tables (#50)
  • Added frontend Login page with session management (#50)
  • Added frontend API Keys management page (#50)
  • Added frontend Admin Users page (admin-only) (#50)
  • Added AuthContext for frontend session state (#50)
  • Added user menu to Layout header with login/logout (#50)
  • Added 15 integration tests for auth system (#50)
  • Added reusable DragDropUpload component for artifact uploads (#8)
    • Drag-and-drop file selection with visual feedback
    • Click-to-browse fallback
    • Multiple file upload support with queue management
    • Real-time progress indicators with speed and ETA
    • File type and size validation (configurable)
    • Concurrent upload handling (configurable max concurrent)
    • Automatic retry with exponential backoff for network errors
    • Individual file status (pending, uploading, complete, failed)
    • Retry and remove actions per file
    • Auto-dismiss success messages after 5 seconds
  • Integrated DragDropUpload into PackagePage replacing basic file input (#8)
  • Added frontend testing infrastructure with Vitest and React Testing Library (#14)
    • Configured Vitest for React/TypeScript with jsdom
    • Added 24 unit tests for DragDropUpload component
    • Tests cover: rendering, drag-drop events, file validation, upload queue, progress, errors
  • Added chunked upload support for large files (#9)
    • Files >100MB automatically use chunked upload API (10MB chunks)
    • Client-side SHA256 hash computation via Web Crypto API
    • localStorage persistence for resume after browser close
    • Deduplication check at upload init phase
  • Added offline detection and network resilience (#12)
    • Automatic pause when browser goes offline
    • Auto-resume when connection restored
    • Offline banner UI with status message
    • XHR abort on network loss to prevent hung requests
  • Added download by artifact ID feature (#10)
    • Direct artifact ID input field on package page
    • Hex-only input validation with character count
    • File size and filename displayed in tag list
  • Added backend security tests (#15)
    • Path traversal prevention tests for upload/download
    • Malformed request handling tests
    • Checksum validation tests
    • 10 new security-focused integration tests
  • Added download verification with verify and verify_mode query parameters (#26)
    • ?verify=true&verify_mode=pre - Pre-verification: verifies the artifact before streaming, so no corrupt data is sent
    • ?verify=true&verify_mode=stream - Streaming verification: verifies while streaming and logs an error on mismatch
  • Added checksum response headers to all download endpoints (#27)
    • X-Checksum-SHA256 - SHA256 hash of the artifact
    • X-Content-Length - File size in bytes
    • X-Checksum-MD5 - MD5 hash (if available)
    • ETag - Artifact ID (SHA256)
    • Digest - RFC 3230 format sha-256 hash (base64)
    • X-Verified - Verification status (true/false/pending)
  • Added checksum.py module with SHA256 utilities (#26)
    • compute_sha256() and compute_sha256_stream() functions
    • HashingStreamWrapper for incremental hash computation
    • VerifyingStreamWrapper for stream verification
    • verify_checksum() and verify_checksum_strict() functions
    • ChecksumMismatchError exception with context
  • Added get_verified() and get_stream_verified() methods to storage layer (#26)
  • Added logging_config.py module with structured logging (#28)
    • JSON logging format for production
    • Request ID tracking via context variables
    • Verification failure logging with full context
  • Added log_level and log_format settings to configuration (#28)
  • Added 62 unit tests for checksum utilities and verification (#29)
  • Added 17 integration tests for download verification API (#29)
  • Added global artifacts endpoint GET /api/v1/artifacts with project/package/tag/size/date filters (#18)
  • Added global tags endpoint GET /api/v1/tags with project/package/search/date filters (#18)
  • Added wildcard pattern matching (*) for tag filters across all endpoints (#18)
  • Added comma-separated multi-value support for tag filters (#18)
  • Added search parameter to /api/v1/uploads for filename search (#18)
  • Added tag filter to /api/v1/uploads endpoint (#18)
  • Added sort and order parameters to /api/v1/uploads endpoint (#18)
  • Added min_size and max_size filters to package artifacts endpoint (#18)
  • Added sort and order parameters to package artifacts endpoint (#18)
  • Added from and to date filters to package tags endpoint (#18)
  • Added GlobalArtifactResponse and GlobalTagResponse schemas (#18)
  • Added S3 object verification before database commit during upload (#19)
  • Added S3 object cleanup on database commit failure (#19)
  • Added upload duration tracking (duration_ms field) (#19)
  • Added User-Agent header capture during uploads (#19)
  • Added X-Checksum-SHA256 header support for client-side checksum verification (#19)
  • Added status, error_message, client_checksum columns to uploads table (#19)
  • Added upload_locks table for future concurrent upload conflict detection (#19)
  • Added consistency check endpoint GET /api/v1/admin/consistency-check (#19)
  • Added PUT /api/v1/projects/{project} endpoint for project updates with audit logging (#20)
  • Added PUT /api/v1/project/{project}/packages/{package} endpoint for package updates with audit logging (#20)
  • Added artifact.download audit logging to download endpoint (#20)
  • Added ProjectHistory and PackageHistory models with database triggers (#20)
  • Added migration 004_history_tables.sql for project/package history (#20)
  • Added migration 005_upload_enhancements.sql for upload status tracking (#19)
  • Added 9 integration tests for global artifacts/tags endpoints (#18)
  • Added global uploads query endpoint GET /api/v1/uploads with project/package/user/date filters (#18)
  • Added project-level uploads endpoint GET /api/v1/project/{project}/uploads (#18)
  • Added has_more field to pagination metadata for easier pagination UI (#18)
  • Added upload_id, content_type, original_name, created_at fields to upload response (#19)
  • Added audit log API endpoints with filtering and pagination (#20)
    • GET /api/v1/audit-logs - list all audit logs with action/resource/user/date filters
    • GET /api/v1/projects/{project}/audit-logs - project-scoped audit logs
    • GET /api/v1/project/{project}/{package}/audit-logs - package-scoped audit logs
  • Added upload history API endpoints (#20)
    • GET /api/v1/project/{project}/{package}/uploads - list upload events for a package
    • GET /api/v1/artifact/{id}/uploads - list all uploads of a specific artifact
  • Added artifact provenance endpoint GET /api/v1/artifact/{id}/history (#20)
    • Returns full artifact history including packages, tags, and upload events
  • Added audit logging for project.create, package.create, tag.create, tag.update, artifact.upload actions (#20)
  • Added AuditLogResponse, UploadHistoryResponse, ArtifactProvenanceResponse schemas (#20)
  • Added TagHistoryDetailResponse schema with artifact metadata (#20)
  • Added 31 integration tests for audit log, history, and upload query endpoints (#22)
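
The checksum response headers from #27 can be illustrated with a short sketch that derives them from the artifact bytes. The helper `checksum_headers` is hypothetical; quoting the ETag value and the lowercase `sha-256=` Digest token are assumptions, and the MD5 and X-Verified headers are left out for brevity.

```python
import base64
import hashlib


def checksum_headers(content: bytes) -> dict:
    """Derive checksum-related download headers from the artifact bytes."""
    digest = hashlib.sha256(content)
    hex_digest = digest.hexdigest()
    # The Digest header carries the raw hash bytes in base64, not the hex form.
    b64_digest = base64.b64encode(digest.digest()).decode("ascii")
    return {
        "X-Checksum-SHA256": hex_digest,
        "X-Content-Length": str(len(content)),
        # Assumption: the ETag is the artifact ID (SHA256) as a quoted string.
        "ETag": f'"{hex_digest}"',
        "Digest": f"sha-256={b64_digest}",
    }
```

A client can verify a download by hashing the received body and comparing the result with `X-Checksum-SHA256`, or with the base64-decoded `Digest` value.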

Changed

  • Standardized audit action naming to {entity}.{action} pattern (project.delete, package.delete, tag.delete) (#20)

Added

  • Added StorageBackend protocol/interface for backend-agnostic storage (#33)
  • Added health_check() method to storage backend with /health endpoint integration (#33)
  • Added verify_integrity() method for post-upload hash validation (#33)
  • Added S3 configuration options: s3_verify_ssl, s3_connect_timeout, s3_read_timeout, s3_max_retries (#33)
  • Added S3StorageUnavailableError and HashCollisionError exception types (#33)
  • Added hash collision detection by comparing file sizes during deduplication (#33)
  • Added garbage collection endpoint POST /api/v1/admin/garbage-collect for orphaned artifacts (#36)
  • Added orphaned artifacts listing endpoint GET /api/v1/admin/orphaned-artifacts (#36)
  • Added global storage statistics endpoint GET /api/v1/stats (#34)
  • Added storage breakdown endpoint GET /api/v1/stats/storage (#34)
  • Added deduplication metrics endpoint GET /api/v1/stats/deduplication (#34)
  • Added per-project statistics endpoint GET /api/v1/projects/{project}/stats (#34)
  • Added per-package statistics endpoint GET /api/v1/project/{project}/packages/{package}/stats (#34)
  • Added per-artifact statistics endpoint GET /api/v1/artifact/{id}/stats (#34)
  • Added cross-project deduplication endpoint GET /api/v1/stats/cross-project (#34)
  • Added timeline statistics endpoint GET /api/v1/stats/timeline with daily/weekly/monthly periods (#34)
  • Added stats export endpoint GET /api/v1/stats/export with JSON/CSV formats (#34)
  • Added summary report endpoint GET /api/v1/stats/report with markdown/JSON formats (#34)
  • Added Dashboard page at /dashboard with storage and deduplication visualizations (#34)
  • Added pytest infrastructure with mock S3 client for unit testing (#35)
  • Added unit tests for SHA256 hash calculation (#35)
  • Added unit tests for duplicate detection and deduplication behavior (#35)
  • Added integration tests for upload scenarios and ref_count management (#35)
  • Added integration tests for S3 verification and failure cleanup (#35)
  • Added integration tests for all stats endpoints (#35)
  • Added integration tests for cascade deletion ref_count behavior (package/project delete) (#35)
  • Added integration tests for tag update ref_count adjustments (#35)
  • Added integration tests for garbage collection endpoints (#35)
  • Added integration tests for file size validation (#35)
  • Added test dependencies to requirements.txt (pytest, pytest-asyncio, pytest-cov, httpx, moto) (#35)
  • Added ORCHARD_MAX_FILE_SIZE config option (default: 10GB) for upload size limits (#37)
  • Added ORCHARD_MIN_FILE_SIZE config option (default: 1 byte, rejects empty files) (#37)
  • Added file size validation to upload and resumable upload endpoints (#37)
  • Added comprehensive deduplication design document (docs/design/deduplication-design.md) (#37)
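
The hash collision detection from #33 (comparing file sizes during deduplication) might be sketched as follows. `HashCollisionError` is named in this changelog, but its definition here, the function `check_dedup`, and the `known_sizes` mapping are illustrative assumptions.

```python
from typing import Mapping


class HashCollisionError(Exception):
    """Raised when two uploads share a SHA256 but differ in size (stand-in definition)."""


def check_dedup(sha256: str, size: int, known_sizes: Mapping[str, int]) -> bool:
    """Return True if the upload duplicates a stored artifact, False if it is new.

    known_sizes maps stored artifact hashes to their sizes in bytes.
    """
    existing_size = known_sizes.get(sha256)
    if existing_size is None:
        return False  # Hash not seen before: store as a new artifact.
    if existing_size != size:
        # Same hash but different size cannot be the same content.
        raise HashCollisionError(
            f"hash {sha256} is stored with size {existing_size}, got {size}"
        )
    return True  # Same hash and size: treat as a duplicate.
```

A size comparison is a cheap sanity check only; two different files of equal size with the same hash would still pass it.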

Fixed

  • Fixed Helm chart minio.ingress conflicting with Bitnami MinIO subchart by renaming to minioIngress (#48)
  • Fixed JSON report serialization error for Decimal types in GET /api/v1/stats/report (#34)
  • Fixed resumable upload double-counting ref_count when tag provided (removed manual increment, SQL triggers handle it) (#35)

[0.3.0] - 2025-12-15

Changed

  • Changed default download mode from proxy to presigned for better performance (#48)

Added

  • Added presigned URL support for direct S3 downloads (#48)
  • Added ORCHARD_DOWNLOAD_MODE config option (presigned, redirect, proxy) (#48)
  • Added ORCHARD_PRESIGNED_URL_EXPIRY config option (default: 3600 seconds) (#48)
  • Added ?mode= query parameter to override download mode per-request (#48)
  • Added /api/v1/project/{project}/{package}/+/{ref}/url endpoint for getting presigned URLs (#48)
  • Added PresignedUrlResponse schema with URL, expiry, checksums, and artifact metadata (#48)
  • Added MinIO ingress support in Helm chart for presigned URL access (#48)
  • Added orchard.download.mode and orchard.download.presignedUrlExpiry Helm values (#48)
  • Added integrity verification workflow design document (#24)
  • Added sha256 field to API responses for clarity (alias of id) (#25)
  • Added checksum_sha1 field to artifacts table for compatibility (#25)
  • Added s3_etag field to artifacts table for S3 verification (#25)
  • Compute and store MD5, SHA1, and S3 ETag alongside SHA256 during upload (#25)
  • Added Dockerfile.local and docker-compose.local.yml for local development (#25)
  • Added migration script 003_checksum_fields.sql for existing databases (#25)
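
Computing MD5, SHA1, and SHA256 during upload (#25) can be done in a single pass over the data. The sketch below assumes a file-like object with a `read()` method; the function name `compute_checksums` and the chunk size are hypothetical, and the S3 ETag is omitted because it depends on how the object was uploaded.

```python
import hashlib
import io
from typing import BinaryIO, Dict


def compute_checksums(stream: BinaryIO, chunk_size: int = 8 * 1024 * 1024) -> Dict[str, str]:
    """Read the stream once and return hex digests for several algorithms."""
    hashers = {
        "sha256": hashlib.sha256(),
        "sha1": hashlib.sha1(),
        "md5": hashlib.md5(),
    }
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        # Feed the same chunk to every hasher so the data is read only once.
        for hasher in hashers.values():
            hasher.update(chunk)
    return {name: hasher.hexdigest() for name, hasher in hashers.items()}


# Usage: a tiny chunk size forces several reads over an in-memory stream.
example = compute_checksums(io.BytesIO(b"hello"), chunk_size=2)
```

Reading in chunks keeps memory use bounded for large artifacts, since only one chunk is held at a time.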

[0.2.0] - 2025-12-15

Added

  • Added format and platform fields to packages table (#16)
  • Added checksum_md5 and metadata JSONB fields to artifacts table (#16)
  • Added updated_at field to tags table (#16)
  • Added tag_name, user_agent, duration_ms, deduplicated, checksum_verified fields to uploads table (#16)
  • Added change_type field to tag_history table (#16)
  • Added composite indexes for common query patterns (#16)
  • Added GIN indexes on JSONB fields for efficient JSON queries (#16)
  • Added partial index for public projects (#16)
  • Added database triggers for updated_at timestamps (#16)
  • Added database triggers for maintaining artifact ref_count accuracy (#16)
  • Added CHECK constraints for data integrity (size > 0, ref_count >= 0) (#16)
  • Added migration script 002_schema_enhancements.sql for existing databases (#16)

Changed

  • Updated images to use internal container BSF proxy (#46)

[0.1.0] - 2025-12-12

Added

  • Added Prosper docker template config (#45)

Changed

  • Changed the Dockerfile npm build arg to use the deps.global.bsf.tools URL as the default registry (#45)